Add DRBD dynamic resync speed params to design doc
Convert opcode TH code to the use of Field type
This makes the field behaviour more explicit: previously an optional field was detected via a "Maybe" constructor, and one with a default value via a "Just defval" one. With this, field behaviour becomes more explicit...
Unify some file lists in Makefile.am
These were repeated needlessly; I hope I grouped them correctly.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
Add DRBD barriers disk parameters
Add the disk-barriers and meta-barriers parameters described in the design doc.
Style fixes on confd-client
Oops, forgot to check this before initial commit, sorry!
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
NEWS: Add missing space
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
htools: small change in error message in THH.hs
We should also display the value we can't parse, otherwise debugging is very hard.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>
htools: improvements to JSON deserialisation
This fixes two problems:
- first, when we deserialise a big object, showing its value is not useful, as it will hide the actual error message
- second, we shouldn't deserialise a container at once, because then...
htools: add new template haskell system
This system is based on explicit types instead of ad-hoc rules (e.g. instead of deducing an optional field from "Maybe Int", we can now say explicitly OptionalField ''Int). In the first phase, this will be used for the equivalent of lib/objects.py, which has slightly...
Add a small confd client
This can be used to test live servers; currently there's no direct way to interact with a confd server, except for burnin's builtin tests (which were the source of this file).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
A few updates to the confd design (2.1)
While the 2.1 design is old and should be “immutable”, I can't find documentation about the confd protocol anywhere else, so let's correct the design doc.
The patch is mostly style changes, plus a clarification on the ‘query’...
cmdlib: Make use of cluster's new “primary_hypervisor” property
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
objects.Cluster: Add property for primary hypervisor
This is useful for working with a node's hypervisor state, where only the primary hypervisor will be authoritative.
LV stripes parameters for plain and drbd
Add DRBD8 static resync speed disk parameter
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Use disk parameters in Logical Units
Use disk parameters in noded
qa: add gnt-cluster tests related to disk params
Add basic support for disk parameters
objects.py: * add disk parameters to Disk, Cluster, NodeGroup.
constants.py: * add dictionaries that will hold types and default values for disk parameters (for now, empty).
test/ganeti.constants_unittest.py:...
More fixes after commit 78519c106
A quick QA run successfully finished with these changes.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
Fix “node_info” RPC result
Commit 78519c106 broke everything. Here's the fix.
query: Add fields for node's disk/hv state
These fields just return the node attribute's contents. They will be used by the watcher to detect out of date node states.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>...
hv_xen: Report memory used by hypervisor
- Report memory used by hypervisor (“mem_hv” as per resource model design document, “xmem” in htools)
- Also report number of CPUs available to Dom0
- Some other, small changes
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
hv_xen: Export number of CPUs for Dom0
This will be stored in the node object and used for calculations.
Add objects for disk/hv state
- Data objects
- Serialization/deserialization
- Unittests
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
objects.Node: Add static hv/disk state
hv_xen: Use constant for “Domain-0” name
Change “node_info” RPC to accept multiple VGs/hypervisors
Keeping the node state up to date will require information from multiple VGs and hypervisors. Instead of requiring multiple calls this change allows a single call to return all needed information. Existing users...
locking: Allow checking if lock is owned in certain mode
With this patch the “LockSet” and “GanetiLockManager” classes have a new function to check if a single or a group of locks (at a certain level) have been acquired in a specific mode. This will be used for additional...
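The kind of check described here can be illustrated with a minimal sketch. This is not Ganeti's locking code: the class `ToyLockSet`, its `check_owned` method and the meaning of the `shared` argument are assumptions made up for illustration.

```python
class ToyLockSet:
    """Hypothetical stand-in for a lock set; only tracks ownership modes."""

    def __init__(self):
        # Maps lock name -> "shared" or "exclusive" for locks we own
        self._owned = {}

    def acquire(self, name, shared=False):
        self._owned[name] = "shared" if shared else "exclusive"

    def check_owned(self, names, shared=-1):
        """Check whether all given locks are owned in the wanted mode.

        shared < 0 accepts any mode, shared == 0 requires exclusive
        ownership and shared > 0 requires shared ownership (assumed
        convention for this sketch).
        """
        if isinstance(names, str):
            names = [names]
        for name in names:
            mode = self._owned.get(name)
            if mode is None:
                return False
            if shared == 0 and mode != "exclusive":
                return False
            if shared > 0 and mode != "shared":
                return False
        return True
```

A caller could then assert, before touching a node, that the node lock is held exclusively rather than merely held.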
Merge branch 'devel-2.5'
Merge branch 'devel-2.4' into devel-2.5
Merge branch 'stable-2.5' into devel-2.5
ConfigWriter: Fix epydoc error
The parameter is called “mods”, not “modes”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
(cherry picked from commit 1730d4a1ab56ef36d082b614d3d0ab13f3e14a85)
LUGroupAssignNodes: Fix node membership corruption
Note: This bug only manifests itself in Ganeti 2.5, but since the problematic code also exists in 2.4, I decided to fix it there.
If a node was assigned to a new group using “gnt-group assign-nodes” the...
Fix pylint warning on unreachable code
Commit c50452c3186 added an exception when all instances should be evacuated off a node, but did so in a way which made pylint complain about unreachable code.
LUNodeEvacuate: Disallow migrating all instances at once
There is a design issue in the iallocator interface which prevents us from doing this.
Separate OpNodeEvacuate.mode from iallocator
Until now the iallocator constants for node evacuation (IALLOCATOR_NEVAC_*) were also used for the opcode. However, it turned out this was due to a misunderstanding and is incorrect. This patch adds new constants (with the same values) and changes the affected places....
LUNodeEvacuate: Locking fixes
When evacuating a node, only an assertion without informative text was used to check if the necessary node locks had been acquired. This was on top of evaluating the list of nodes without having a node group lock, so this was changed as well....
Fix error when removing node
ConfigWriter.GetAllInstancesInfo returns a dictionary, not a list. Removing a node would fail with “too many values to unpack”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
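The failure mode behind the node removal fix above can be reproduced in isolation. The sketch below does not use Ganeti's ConfigWriter; the function names and the shape of the dictionary are hypothetical and only show why iterating a dictionary as if it were a list of pairs raises this error.

```python
def primary_nodes_buggy(all_instances):
    # Iterating a dict yields its keys, so each key string is unpacked
    # character by character into two names and the unpacking fails.
    return [info["pnode"] for (_, info) in all_instances]


def primary_nodes_fixed(all_instances):
    # .items() yields (name, info) pairs, which is what the loop expects.
    return [info["pnode"] for (_, info) in all_instances.items()]
```

With a dictionary keyed by instance name, the first variant raises ValueError ("too many values to unpack") while the second returns the primary nodes.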
manpages: update beparams explanations
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
constants: reindent a few dicts
Remove BE_MEMORY from beparams but keep compatibility
Queries are already compatible (be/memory is an alias for be/maxmem) and import/exports work. This patch fixes it for cluster init, modify and instance add/start/modify.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
burnin: use mem_size as max and min
unittests: use max/min memory
cmdlib: use MAXMEM for all operations
Since for now we can only start instances at their maximum memory, we modify all checks to use that value. Once we have better support for using a value in between, some of these checks will have to move to minimum memory....
qa: use maximum and minimum memory
test modification of either parameter, but also both at once.
hypervisors: use maximum memory for all operations
ImportExport: use max and min memory params
Import uses the old "memory" parameter to populate the two new ones, if they're not overridden already.
FinalizeExport exports minmem and maxmem, but also memory, as maxmem, to allow importing to older ganeti clusters....
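The compatibility rule described for import and export can be sketched as two small functions. This is not the code from the patch: the function names are hypothetical, and plain dictionaries with the keys "memory", "minmem" and "maxmem" stand in for the real backend parameter handling.

```python
def import_beparams(exported):
    """Fill maxmem/minmem from the old "memory" value unless already set."""
    params = dict(exported)
    legacy = params.pop("memory", None)
    if legacy is not None:
        # setdefault keeps values that were explicitly overridden
        params.setdefault("maxmem", legacy)
        params.setdefault("minmem", legacy)
    return params


def export_beparams(params):
    """Export the new values plus "memory" (as maxmem) for older clusters."""
    exported = dict(params)
    exported["memory"] = exported["maxmem"]
    return exported
```

An export written by an older cluster with only "memory" set to 512 would then import as maxmem = minmem = 512, while an explicit override of either value is left untouched.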
Query: allow query on maximum and minimum memory
be/memory is kept as an alias.
ShowInstanceConfig: show max and min memory
The old "memory" value is kept as maxmem, for now, for retrocompatibility.
instance hooks: pass maximum and minimum memory
Also pass the "memory" value for retrocompatibility, for now.
beparams: add min/max memory values
For now the old "memory" parameter stays there, but it will be removed later. The new values are just taken from the old one, in this patch.
design-resource-model: update disk params section
Simplify design by moving all the parameters to disk template level, explaining why this is sub-optimal. Add notes about DRBD versions, corner cases and parameters application time.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>...
Set DRBD sync speed in DRBD8.Assemble
Instead of relying on clients of the class for setting the device speed (and, in general, the DRBD parameters), move this responsibility inside the Assemble method.
build-rpc: Fail if call is defined more than once
Reapply commit 2a6de57 after merge
In the last merge I erroneously discarded the changes introduced by commit 2a6de57 "Check the results of master IP RPCs". This commit reintroduces them.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix QA breakage caused by merge 0e82dcf9
Patch tested and confirmed to work by Andrea Spadaccini <spadaccio@google.com>.
masterd: Initialize job queue only after RPC client
Otherwise jobs started after an unclean master shutdown will fail as they depend on the RPC client.
masterd: Shutdown only once running jobs have been processed
Until now, if masterd received a fatal signal, it would start shutting down immediately. In the meantime it would hang while jobs are still processed. Clients couldn't connect anymore to retrieve a job's status....
daemon: Support clean daemon shutdown
Instead of aborting the main loop as soon as a fatal signal (SIGTERM or SIGINT) is received, additional logic allows waiting for tasks to finish while I/O is still being processed.
If no callback function is provided the old behaviour--shutting down...
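The shutdown logic described in this commit can be approximated with a small sketch. It is not the code of daemon.Mainloop: the class `ToyMainLoop`, the `shutdown_fn` callback and the `process_io` argument are hypothetical, and stopping right away when no callback is given is an assumption based on the truncated sentence above.

```python
class ToyMainLoop:
    """Hypothetical main loop that can delay shutdown until work is done."""

    def __init__(self, shutdown_fn=None):
        # shutdown_fn returns True once it is safe to stop (assumed contract)
        self._shutdown_fn = shutdown_fn
        # Counter rather than boolean, so repeated signals stay visible
        self._fatal_signals = 0

    def on_fatal_signal(self):
        """Would be called from a SIGTERM/SIGINT handler."""
        self._fatal_signals += 1

    def run(self, process_io):
        """Process I/O until a shutdown is requested and allowed."""
        iterations = 0
        while True:
            if self._fatal_signals > 0:
                if self._shutdown_fn is None or self._shutdown_fn():
                    break
            # Keep serving I/O while remaining tasks finish
            process_io()
            iterations += 1
        return iterations
```

With a callback that reports whether jobs are still running, the loop keeps answering clients after the signal and only exits once the callback returns True.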
daemon: Allow custom maximum timeout for scheduler
This is needed in case the scheduler user (daemon.Mainloop in this case) has other timeouts at the same time. Needed for clean master shutdown.
jqueue: Add code to prepare for queue shutdown
Doing so will prevent job submissions (similar to a drained queue), but won't affect currently running jobs. No further jobs will be executed.
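As a rough illustration of the behaviour described here, the sketch below models a queue that stops accepting and starting jobs once shutdown is being prepared, while leaving running jobs alone. The class, method and exception names are hypothetical and are not taken from jqueue.

```python
class QueueShuttingDown(Exception):
    """Raised when a job is submitted after shutdown preparation."""


class ToyJobQueue:
    def __init__(self):
        self._accepting = True
        self.pending = []
        self.running = []

    def submit(self, job):
        if not self._accepting:
            raise QueueShuttingDown("job queue is shutting down")
        self.pending.append(job)

    def start_next(self):
        """Start the next pending job, unless shutting down or idle."""
        if not self._accepting or not self.pending:
            return None
        job = self.pending.pop(0)
        self.running.append(job)
        return job

    def prepare_shutdown(self):
        """Stop accepting and starting jobs; report if any still run."""
        self._accepting = False
        return bool(self.running)
```

The return value of `prepare_shutdown` lets a caller decide whether it still has to wait for running jobs before exiting.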
workerpool: Export function to check for running tasks
daemon: Use counter instead of boolean for mainloop abortion
Also log a message when a fatal signal was received and use dict.items.
htools: adjust imports for newer compilers
While testing with ghc 7.2, I saw that some imports we are using are very old (from ghc 6.8 time), even though current libraries are using different names.
We fix this and bump minimum documented version to ghc 6.12, as I...
admin.rst update regarding offline state of the instance
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
NEWS update - offline instance state
Backwards compatibility - added admin_up to query
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Man page update: online/offline state of instance
Add small note in admin.rst about confd disabling
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Warn if we enable maintain-node-health without confd
Adapt daemon-util to ENABLE_CONFD
We still allow explicit shutdown of confd, but we prevent manual or automatic start-up.
Adapt watcher for ENABLE_CONFD
If confd is disabled, do not automatically restart it. Furthermore, we can't run maintenance actions if it is disabled so log a warning.
Note that I haven't completely disabled the NodeMaintenance class with ENABLE_CONFD = False because I think they are at two different levels...
Prevent running of confd tests in burnin
Add toggle for enabling/disabling confd
Doesn't do anything yet.
Fix unittest bug related to offline instances
Currently, the code in Node.hs is overly strict: once a node's free memory reaches 0, it will refuse to add any instances (offline or not). I think this is a safe safeguard (I don't expect nodes to run without at least 1MB of free memory), so rather than change this...
htools: reindent the rest of the files
htools: re-indent IAlloc.hs
htools: reindent hspace
htools: reindent hbal
htools: reindent CLI.hs
htools: re-indent QC.hs
htools: re-indent Node.hs
htools: finish re-indenting Cluster.hs
masterd: Don't pass mainloop to server class
It is not used.
workerpool: Allow processing of new tasks to be stopped
This is different from “Quiesce” in the sense that this function just changes an internal flag and doesn't wait for the queue to be empty. Tasks already being processed continue normally, but no new tasks will...
workerpool: Use loop to ignore spurious notifications
This saves us from returning to the worker code when there is no task to be processed.
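The general pattern behind this change is to re-check the condition in a loop after every wakeup. The sketch below shows it with Python's `threading.Condition`; it is not the workerpool code, and the `ToyTaskQueue` class is a hypothetical example.

```python
import collections
import threading


class ToyTaskQueue:
    """Hypothetical queue whose get() tolerates spurious wakeups."""

    def __init__(self):
        self._cond = threading.Condition()
        self._tasks = collections.deque()

    def put(self, task):
        with self._cond:
            self._tasks.append(task)
            self._cond.notify()

    def get(self, timeout=None):
        with self._cond:
            # A wakeup does not guarantee that a task is available, so
            # the check is repeated until one is found or we time out.
            # Note that the timeout applies to each wait separately.
            while not self._tasks:
                if not self._cond.wait(timeout):
                    return None
            return self._tasks.popleft()
```

Because the loop lives inside `get`, the caller only ever sees a real task or a timeout, never an empty wakeup.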
jqueue: Factorize code checking for drained queue
This is in preparation for a clean(er) shutdown of masterd.
LUInstanceCreate: Release unused node locks
After iallocator ran we can release any unused node locks. Since they must be in exclusive mode this should improve parallelization during instance creation.
cmdlib.TLReplaceDisks: Use itertools.count
… instead of a variable which needs to be incremented for every step.
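A minimal sketch of the idea, assuming a hypothetical step logger rather than the actual TLReplaceDisks code: a shared `itertools.count` iterator hands out step numbers, so no code path has to remember to increment a variable.

```python
import itertools


def format_steps(descriptions, total):
    """Number each step using a shared counter starting at 1."""
    steps = itertools.count(1)
    lines = []
    for text in descriptions:
        # next() both returns the current number and advances the counter
        lines.append("STEP %d/%d: %s" % (next(steps), total, text))
    return lines
```

The same iterator could also be passed to helper functions, which keeps the numbering consistent across them.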
htools: rework message display construction
While diagnosing some (unrelated) memory usage in htools, I've stumbled upon some very bad behaviour in checkData: mapAccum is non-strict, and so is the tuple we use, so that results in the list of lists of messages being very bad space-wise (hundreds of MB of memory...
hbal: handle empty node groups
This patch changes an internal assert (which can only be triggered when a node group is empty) into properly handling this case (and returning empty node/instance lists).
While we could handle this in the backend (Cluster.splitNodeGroup)...