hv_xen: Export number of CPUs for Dom0
This will be stored in the node object and used for calculations.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Add objects for disk/hv state
- Data objects
- Serialization/deserialization
- Unittests
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
objects.Node: Add static hv/disk state
hv_xen: Use constant for “Domain-0” name
Change “node_info” RPC to accept multiple VGs/hypervisors
Keeping the node state up to date will require information from multiple VGs and hypervisors. Instead of requiring multiple calls this change allows a single call to return all needed information. Existing users...
locking: Allow checking if lock is owned in certain mode
With this patch the “LockSet” and “GanetiLockManager” classes have a new function to check if a single or a group of locks (at a certain level) have been acquired in a specific mode. This will be used for additional...
Merge branch 'devel-2.5'
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Merge branch 'stable-2.5' into devel-2.5
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
ConfigWriter: Fix epydoc error
The parameter is called “mods”, not “modes”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
(cherry picked from commit 1730d4a1ab56ef36d082b614d3d0ab13f3e14a85)
LUGroupAssignNodes: Fix node membership corruption
Note: This bug only manifests itself in Ganeti 2.5, but since the problematic code also exists in 2.4, I decided to fix it there.
If a node was assigned to a new group using “gnt-group assign-nodes” the...
Fix pylint warning on unreachable code
Commit c50452c3186 added an exception when all instances should be evacuated off a node, but did so in a way which made pylint complain about unreachable code.
LUNodeEvacuate: Disallow migrating all instances at once
There is a design issue in the iallocator interface which prevents us from doing this.
Separate OpNodeEvacuate.mode from iallocator
Until now the iallocator constants for node evacuation (IALLOCATOR_NEVAC_*) were also used for the opcode. However, it turned out this was due to a misunderstanding and is incorrect. This patch adds new constants (with the same values) and changes the affected places....
LUNodeEvacuate: Locking fixes
When evacuating a node, only an assertion without informative text was used to check if the necessary node locks had been acquired. This was on top of evaluating the list of nodes without having a node group lock, so this was changed as well....
Fix error when removing node
ConfigWriter.GetAllInstancesInfo returns a dictionary, not a list. Removing a node would fail with “too many values to unpack”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
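The failure mode described in this commit can be reproduced in isolation. The sketch below is not Ganeti code: `get_all_instances_info` is a hypothetical stand-in that, like the method named above, returns a dictionary, and the instance data is made up for illustration.

```python
def get_all_instances_info():
    """Hypothetical stand-in returning a name -> instance mapping."""
    return {
        "instance1.example.com": {"primary_node": "node1"},
        "instance2.example.com": {"primary_node": "node2"},
    }


def instances_on_node_buggy(node):
    # Iterating over a dict yields its keys (strings). Unpacking a long
    # string into two names raises ValueError: too many values to unpack.
    return [name for (name, inst) in get_all_instances_info()
            if inst["primary_node"] == node]


def instances_on_node_fixed(node):
    # .items() yields (key, value) pairs, which is what the loop expects.
    return [name for (name, inst) in get_all_instances_info().items()
            if inst["primary_node"] == node]
```

Treating the return value as a list of pairs only fails at run time, which is presumably why the error surfaced only when a node was actually removed.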
constants: reindent a few dicts
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Remove BE_MEMORY from beparams but keep compatibility
Queries are already compatible (be/memory is an alias for be/maxmem) and import/exports work. This patch fixes it for cluster init, modify and instance add/start/modify.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
cmdlib: use MAXMEM for all operations
Since for now we can only start instances at their maximum memory, we modify all checks to use that value. Once we have better support for using a value in between, some of these checks will have to move to minimum memory....
hypervisors: use maximum memory for all operations
ImportExport: use max and min memory params
Import uses the old "memory" parameter to populate the two new ones, if they're not overridden already.
FinalizeExport exports minmem and maxmem, but also memory, as maxmem, to allow importing to older Ganeti clusters....
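The compatibility rule described in this commit can be sketched as two small helpers. This is an illustration under assumptions, not the actual import/export code: the function names are hypothetical, the parameters are modelled as a plain dictionary, and dropping the legacy key on import is a choice made here for the sketch.

```python
def fill_memory_params(beparams):
    """Populate minmem/maxmem from a legacy "memory" value on import.

    Values that are already present are treated as overrides and kept.
    """
    result = dict(beparams)
    legacy = result.pop("memory", None)
    if legacy is not None:
        result.setdefault("minmem", legacy)
        result.setdefault("maxmem", legacy)
    return result


def export_memory_params(beparams):
    """Export minmem and maxmem, plus "memory" set to maxmem for older clusters."""
    result = dict(beparams)
    result["memory"] = result["maxmem"]
    return result
```

For example, importing `{"memory": 512}` yields both new parameters set to 512, while an explicit `maxmem` in the input is left untouched.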
Query: allow query on maximum and minimum memory
be/memory is kept as an alias.
ShowInstanceConfig: show max and min memory
The old "memory" value is kept as maxmem, for now, for backwards compatibility.
instance hooks: pass maximum and minimum memory
Also pass the "memory" value for backwards compatibility, for now.
beparams: add min/max memory values
For now the new "memory" parameter stays there, but it will be removed later. The new values are just taken from the old one, in this patch.
Set DRBD sync speed in DRBD8.Assemble
Instead of relying on clients of the class for setting the device speed (and, in general, the DRBD parameters), move this responsibility inside the Assemble method.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>...
Reapply commit 2a6de57 after merge
In the last merge I erroneously discarded the changes introduced by commit 2a6de57 "Check the results of master IP RPCs". This commit reintroduces them.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix QA breakage caused by merge 0e82dcf9
Patch tested and confirmed to work by Andrea Spadaccini <spadaccio@google.com>.
masterd: Initialize job queue only after RPC client
Otherwise jobs started after an unclean master shutdown will fail as they depend on the RPC client.
masterd: Shutdown only once running jobs have been processed
Until now, if masterd received a fatal signal, it would start shutting down immediately. In the meantime it would hang while jobs were still being processed. Clients couldn't connect anymore to retrieve a job's status....
daemon: Support clean daemon shutdown
Instead of aborting the main loop as soon as a fatal signal (SIGTERM or SIGINT) is received, additional logic allows waiting for tasks to finish while I/O is still being processed.
If no callback function is provided the old behaviour--shutting down...
daemon: Allow custom maximum timeout for scheduler
This is needed in case the scheduler user (daemon.Mainloop in this case) has other timeouts at the same time. Needed for clean master shutdown.
jqueue: Add code to prepare for queue shutdown
Doing so will prevent job submissions (similar to a drained queue), but won't affect currently running jobs. No further jobs will be executed.
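The behaviour described in this commit can be modelled with a toy queue. Everything below is hypothetical (class, method and exception names included) and only mirrors the three stated properties: submissions are refused, running jobs are left alone, and pending jobs are no longer started.

```python
class QueueShutdownError(Exception):
    """Raised when a job is submitted to a queue that is shutting down."""


class JobQueueSketch:
    def __init__(self):
        self._accepting = True
        self._pending = []
        self._running = set()

    def submit(self, job):
        if not self._accepting:
            raise QueueShutdownError("Job queue is shutting down")
        self._pending.append(job)

    def start_next(self):
        """Move one pending job to the running set; return it, or None."""
        if not self._accepting or not self._pending:
            return None
        job = self._pending.pop(0)
        self._running.add(job)
        return job

    def finish(self, job):
        self._running.discard(job)

    def prepare_shutdown(self):
        """Stop accepting and starting jobs; return True while jobs still run."""
        self._accepting = False
        return bool(self._running)
```

A caller could poll `prepare_shutdown()` until it returns False before exiting, which is one way to let running jobs finish first.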
workerpool: Export function to check for running tasks
daemon: Use counter instead of boolean for mainloop abortion
Also log a message when a fatal signal was received and use dict.items.
Merge branch 'devel-2.4' into devel-2.5
Backwards compatibility - added admin_up to query
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Warn if we enable maintain-node-health without confd
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Adapt watcher for ENABLE_CONFD
If confd is disabled, do not automatically restart it. Furthermore, we can't run maintenance actions if it is disabled so log a warning.
Note that I haven't completely disabled the NodeMaintenance class with ENABLE_CONFD = False because I think they are at two different levels...
Add toggle for enabling/disabling confd
Doesn't do anything yet.
masterd: Don't pass mainloop to server class
It is not used.
workerpool: Allow processing of new tasks to be stopped
This is different from “Quiesce” in the sense that this function just changes an internal flag and doesn't wait for the queue to be empty. Tasks already being processed continue normally, but no new tasks will...
workerpool: Use loop to ignore spurious notifications
This saves us from returning to the worker code when there is no task to be processed.
jqueue: Factorize code checking for drained queue
This is in preparation for a clean(er) shutdown of masterd.
LUInstanceCreate: Release unused node locks
After iallocator ran we can release any unused node locks. Since they must be in exclusive mode this should improve parallelization during instance creation.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
cmdlib.TLReplaceDisks: Use itertools.count
… instead of a variable which needs to be incremented for every step.
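A minimal sketch of the pattern follows; the step descriptions and the `log_step` callback are invented for illustration and are not taken from TLReplaceDisks. The counter is advanced with `next()` at each step, so conditional steps do not require any manual bookkeeping.

```python
import itertools


def replace_disks_sketch(log_step, early_release=False):
    """Report numbered steps without incrementing a variable by hand."""
    steps_total = 4
    cstep = itertools.count(1)

    log_step(next(cstep), steps_total, "Check device existence")
    log_step(next(cstep), steps_total, "Allocate new storage")
    if early_release:
        log_step(next(cstep), steps_total, "Release locks early")
    log_step(next(cstep), steps_total, "Sync devices")
    if not early_release:
        log_step(next(cstep), steps_total, "Release locks")
```

Whichever branch is taken, the reported step numbers stay consecutive.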
Transition into and out of offline instance state
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Introduce admin_state as 3-values data type
Rename admin_up to admin_state
Fixed typo in _VerifyResultRow
algo: Make a dict from a flat list
This is in preparation to take deeper dict constructs from the command line. You can feed the options list directly constructed of type "identkeyval" to it and it returns a fully deflated dict.
This is mainly needed for the resource model changes where we have to...
locking: Make some aliased methods public
Some methods, such as “_is_owned” and “list_owned”, have been aliased to make them public for a while now. This patch makes the actual implementation public.
SharedLock's “is_owned” needs to be aliased to “_is_owned” to remain...
cmdlib._ReleaseLock: Do nothing if no locks are owned
The locking library doesn't like it when “release()” is called on a lockset or lock which isn't held by the current thread. Instead of modifying the library, which could have other side-effects, this rather simple change avoids errors when a LU simply tries to...
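The guard described in this commit can be sketched against a toy lock set. `LockSetSketch` and `release_locks` are hypothetical and do not reproduce the Ganeti locking library; only the method name `list_owned` is borrowed from the commit log, and the error type is an assumption.

```python
class LockSetSketch:
    """Toy lock set that rejects release() when nothing is held."""

    def __init__(self):
        self._owned = set()

    def acquire(self, names):
        self._owned.update(names)

    def list_owned(self):
        return set(self._owned)

    def release(self, names=None):
        if not self._owned:
            raise RuntimeError("release() called without owning any lock")
        if names is None:
            self._owned.clear()
        else:
            self._owned.difference_update(names)


def release_locks(lockset, names=None):
    """Release locks, doing nothing if none are owned by the caller."""
    if not lockset.list_owned():
        return False
    lockset.release(names)
    return True
```

Keeping the check in the caller leaves the library's strict behaviour unchanged for every other user.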
Use resource lock when setting node parameters
Also acquire instance and resource locks in shared mode (see comment).
Use node resource lock for replacing instance disks
If early-release is not used, the resource lock is kept while waiting for disks to sync.
Hold node resource locks while setting instance parameters
Important for when disks are converted. Release locks once they're not needed anymore. Make liberal use of assertions.
Hold node resource lock while moving instance
Acquire node resource lock when removing instance
Removing an instance affects available disk space and memory.
Use node resource lock when recreating instance disks
Recreating disks conflicts with other disk operations, therefore the node resource lock must be acquired.
LUClusterRepairDiskSizes: Use node resource locks
Since this doesn't really touch the node, but it conflicts with e.g. growing a disk, the resource lock must be acquired.
LUInstanceGrowDisk: Use node resource lock
Also add one more feedback line. Downgrade instance lock to shared mode while we're only waiting for disks to sync. The node lock is released when not needed anymore.
LUInstanceCreate: Hold node resource lock
The node resource lock is released once the disks are in sync (that is, after wiping).
LUNodeQueryvols: Acquire all locks in shared mode
Nothing is being written to.
LUNodeQueryStorage: Acquire all locks in shared mode
Nothing is written to.
cmdlib: Share lock in LUInstanceConsole
No writes are being done.
Document OpNodeMigrate's result for RAPI
- Commit b7a1c8161 changed the LU to generate jobs
- Mention documented results in NEWS
LUNodeQuery: Call implementation's DeclareLocks function
Just in case we ever add locks for querying nodes. Currently _NodeQuery's DeclareLocks is a no-op function.
Use master IP address setup script in backend
Replace the code in backend.ActivateMasterIp and backend.DeactivateMasterIp with the master IP address setup script, either the default one or the one provided by the user.
- Convert to string the netmask parameter in _BuildMasterIpEnv...
Change master IP address RPCs for external script
Change the master IP address RPC call chain to accept the use_external_master_ip_script parameter. Introduces an unused parameter in backend.ActivateMasterIp and backend.DeactivateMasterIp, that will be used in the next commit....
Update cluster verify to check IP address scripts
Update cluster-verify to check the integrity of the default master IP address setup script and the presence and executability of the external one (if currently in use by the cluster).
Add --use-external-mip-script flag
- add a command line flag to gnt-cluster init and modify to change the value of the cluster parameter use_external_mip_script;
- add two constants representing the paths of the default script and of the external script;...
Add use_external_mip_script cluster parameter
Add the use_external_mip_script cluster parameter, that represents whether the master IP address turnup/turndown procedures must use a script provided by the user (True) or the one provided by Ganeti (False)....
Ensure unused ports return to the free port pool
Ensure ports previously allocated by calling ConfigWriter's AllocatePort() are returned to the pool of free ports when no longer needed:
Fix newer pylint's E0611 error in compat.py
These are triggered by our "stay-compatible" approach.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Adding basic elements for the new node params
This patch adds the new fields to the objects.py as well as defines the constants used in the dicts and their type.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
rpc: Fix another result processor
I forgot to change this in commit d9da5065c0.
rpc: Fix issue with “test_delay”'s timeout
I passed the timeout calculation function in the wrong field of the definition. A small change is also needed in “build-rpc” to not abort when writing the docstring.
rpc: Call result processor once for each node result
… instead of calling it with the whole results dictionary. This fixes an issue when replacing disks (and all other cases where result processors are used).
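The difference between the two calling conventions can be sketched as follows. The helper names and the shape of a per-node result (a `(success, payload)` pair) are assumptions made for this illustration and are not taken from the RPC code.

```python
def apply_result_processor(results, processor):
    """Call processor once for each node's result, keeping the node keys."""
    return dict((node, processor(result))
                for (node, result) in results.items())


def payload_or_none(result):
    """Hypothetical processor that expects a single (success, payload) pair."""
    (success, payload) = result
    if success:
        return payload
    return None
```

A processor written for a single result, like `payload_or_none`, fails or misbehaves when it is handed the whole node-to-result dictionary, which matches the kind of issue this commit describes.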
Fail if node/group evacuation can't evacuate instances
If an instance can't be evacuated, only a message would be printed. With this change the operation always aborts. Newly added unittests check for this behaviour.
LUNodeSetParams: Lock affected instances only
Until now LUNodeSetParams would lock all instances if a node's secondary IP address was to be changed and would then release all instances it didn't actually need. With this patch the LU optimistically locks instances and, once it got the locks,...
Check BGL when adding/removing node
RPC/test_delay: Use callable for timeout calculation
This avoids having to override the function in the RPC runner.
rpc: Move post-processor functions into definitions file
This way the generated code no longer contains arbitrary code. Post-processing functions are used by reference.
rpc: Use definitions directly instead of via generated code
Until now “autotools/build-rpc” would read the definition of all RPCs and write them to a new file, “lib/_generated_rpc.py” with some modifications. With this patch the generated code loads the definitions...
Convert RPC definitions to dictionaries
This is in preparation to reducing the amount of generated code.
query: Use new SequenceToDict utility
Improve error handling in netmask change
- check if the master IP with the old netmask is up before attempting to change the netmask (to avoid a failed change netmask resulting in an undesired activation of the master IP);
- improve error messages of the backend function;...
Add master_netmask to Cluster.UpgradeConfig
And also suppress pylint R0902 error about an object instance having more than 20 attributes.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Fix cluster start failure due to missing netmask
If the cluster version is upgraded from a version before commit 5a8648eb609f7e3a8d7ad7f82e93cfdd467a8fb5 to a version after that commit, the master startup will fail because the ssconf file with the master...
LUInstanceRename: Compare name with name
… instead of object with name.
utils.algo: Add utility to convert sequence to dictionary
Useful for converting list of query fields to a dictionary and to convert RPC definitions. Includes duplicate detection.
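A utility of this kind might look like the sketch below. The function name, the default key function and the exception type are assumptions chosen for illustration; the commit does not show the actual signature.

```python
def sequence_to_dict(seq, key=lambda item: item[0]):
    """Convert a sequence to a dictionary, refusing duplicate keys.

    key computes the dictionary key for an item; by default the first
    element of each item is used.
    """
    result = {}
    duplicates = []
    for item in seq:
        k = key(item)
        if k in result:
            duplicates.append(k)
        result[k] = item
    if duplicates:
        raise ValueError("Duplicate keys found: %s" %
                         ", ".join(str(k) for k in duplicates))
    return result
```

For a list of field definitions such as `[("name", "Name"), ("size", "Size")]`, the result maps `"name"` and `"size"` to their full tuples, and a repeated `"name"` entry raises an error instead of silently overwriting the first one.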
rpc: Make “test_delay” RPC actually work
Until now it would just call itself, eventually failing.
LUClusterRepairDiskSizes: Acquire instance locks in exclusive mode
Instances are modified if their disk size doesn't match.
cmdlib: Allow specifying lock level when calculating node locks
This is needed to lock node resources.
Show RPC calls from config in lock monitor
With this patch all RPC calls at runtime of masterd will show up in the lock monitor. There is a chicken-and-egg issue with initializing the configuration with a context since the lock manager, containing the monitor, requires the configuration. This is worked around by setting...
Update synopsis for “gnt-cluster repair-disk-sizes”
Mention that instances can be passed on the CLI when “--help” is used.
Derive IP hooks env variables from RPC parameter
Let the environment variables of the master IP turnup/turndown be derived from the parameter of the RPC itself (that is of type objects.MasterNetworkParameters in both cases).