Add the options attribute to cli.JobExecutor
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Implement generic CLI options->opcode updates
This patch changes SubmitOpCode and SubmitOrSend such that we have a single function that applies generic CLI options to opcode attributes. This will allow, once all scripts pass the opts argument to SubmitOpCode, passing the debug parameter or the dry-run one to the LUs....
Change the debug CLI option to integer/count
This changes from boolean to integer/count (for a future differentiation based on the actual debug level). All uses in the code only test its boolean status, so it still works as an integer value.
Signed-off-by: Iustin Pop <iustin@google.com>...
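A minimal sketch of a count-style option, assuming an optparse-based parser. The option names `-d`/`--debug` and the helper `build_parser` are illustrative and not taken from the patch.

```python
import optparse


def build_parser():
    """Build a parser whose debug option counts its occurrences."""
    parser = optparse.OptionParser()
    # action="count" increments the stored value each time the flag is given
    parser.add_option("-d", "--debug", dest="debug", action="count", default=0)
    return parser


opts, _ = build_parser().parse_args(["-d", "-d"])
level = opts.debug          # integer: number of times -d was passed
enabled = bool(opts.debug)  # existing truth tests keep working unchanged
```

Code that only checks `if opts.debug:` behaves the same with the counter as it did with the boolean.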
Add a generic 'debug_level' attribute to opcodes
Also automatically fix opcodes which have this missing in the LU init routine.
Fix dumpers/loaders after slots cleanup
Commit 154b958 changed (correctly) the slots usage, but this broke dumpers/loaders since we relied directly on the class's own slots field.
To compensate, we introduce a simple function for computing the slots...
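One way such a helper could look is sketched below. The name `all_slots` and the example classes are hypothetical; the actual function added by the commit is not shown in this excerpt.

```python
class Base(object):
    __slots__ = ["a"]


class Child(Base):
    __slots__ = ["b"]


def all_slots(cls):
    """Collect the __slots__ declared by cls and by all of its base classes."""
    names = []
    # Walk the MRO from the most basic class down to cls itself
    for klass in reversed(cls.__mro__):
        # Look in the class's own namespace so inherited values are not repeated
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        names.extend(slots)
    return names
```

Here `Child.__slots__` alone only lists `b`, which is the kind of gap a dumper relying on the class's own field would run into.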
Add an early release lock/storage for disk replace
This patch adds an early_release parameter in the OpReplaceDisks and OpEvacuateNode opcodes, allowing earlier release of storage and more importantly of internal Ganeti locks.
The behaviour of the early release is that any locks and storage on all...
Merge branch 'stable-2.1' into devel-2.1
TLReplaceDisks: Delay iallocator when evacuating node
When evacuating nodes, the iallocator was run for all instances without taking planned changes into consideration. This patch delays part of CheckPrereq and running the iallocator for node evacuation....
Implement debug level across OS-related RPC calls
This doesn't implement the full functionality (we need to add the debug level to the opcodes too), but at least it won't require changing the RPC calls during the 2.1 series.
Second try to fix LUVerifyCluster
My previous patch, commit 785d142, fixed the case where a node is marked offline. With this patch it'll also handle other failures correctly.
LUVerifyCluster: Fix bug with offline nodes
[…]
* Other Notes
  - NOTICE: 1 offline node(s) found.
* Hooks Results
Failure: command execution error:
iteration over non-sequence
Commit a0c9776a introduced an error simulation mode to LUVerifyCluster. Due to a small mistake, offline nodes weren't skipped when checking the...
utils: Fix retry delay calculator
Before this patch, it would always sleep for at least the time specified as the upper limit. Now it actually limits the sleep time.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
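The intended behaviour can be sketched as a growing delay that is capped by an upper limit. The class name `DelayCalculator` and its parameters are assumptions made for illustration, not the code in utils.

```python
class DelayCalculator(object):
    """Return increasing delays that never exceed an upper limit."""

    def __init__(self, start, factor, limit):
        self._next = min(limit, start)
        self._factor = factor
        self._limit = limit

    def __call__(self):
        current = self._next
        # Grow the delay for the next call, capping it at the limit
        self._next = min(self._limit, self._next * self._factor)
        return current


calc = DelayCalculator(start=1.0, factor=2.0, limit=5.0)
delays = [calc() for _ in range(5)]
```

Using `min` rather than `max` for the cap is the whole point: with `max`, every delay would be at least as long as the limit.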
Bump RPC protocol version to 30
Merge remote branch 'origin/stable-2.1' into devel-2.1
Make the snapshot decision based on disk type
… instead of disk size, which is not as reliable. This actually simplifies the code; but it still leaves the possibility of stack overflows if the disk data structure is corrupted.
Fix missing bridge for xen instances
The NIC definitions of Xen instances are missing the target bridge.
This bug was introduced in commit 503b97a9.
Signed-off-by: Alessandro Cincaglini <alessandro.ciancaglini@gmail.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>...
Fix flipping MC flag bug
Currently unofflining or undraining an already functional master candidate node can cause it to demote itself. In order to avoid that we only trigger the self-promotion check if the node is not currently a candidate.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Add capability to use syslog for logging
This patch adds a configure-time parameter that will set the defaults used by all programs, and command-line parameters in the daemons that allow overriding it.
Syslog 'yes' enables syslog in addition to file-based logging, 'only'...
utils.FileLock: handle init errors properly
If the open of the lock file fails (due to whatever reason), 'self' won't have the 'fd' attribute, and thus we fail in Close/__del__, which will ruin proper error reporting:
IOError: [Errno 30] Read-only file system: '/var/lib/ganeti/queue/lock'...
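A minimal sketch of the failure mode and one possible fix, assuming a simplified lock class. `SketchFileLock` is a hypothetical stand-in, not the real utils.FileLock.

```python
class SketchFileLock(object):
    """Simplified lock holder that stays safe if opening the file fails."""

    def __init__(self, path):
        # Define the attribute before opening, so close() and __del__ can
        # always inspect it even when open() raises
        self.fd = None
        self.fd = open(path, "a")

    def close(self):
        if self.fd is not None:
            self.fd.close()
            self.fd = None

    def __del__(self):
        self.close()
```

With the attribute initialised up front, the original I/O error reaches the caller instead of being obscured by a secondary AttributeError during clean-up.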
locking: add/fix @type information
This patch adds missing @type information for all public methods, modifies one to conform to the rest, and removes some information from @param when it's been expressed in @type.
Fix slots definitions
According to http://docs.python.org/reference/datamodel.html#slots
Merge branch 'devel-2.0' into devel-2.1
Conflicts: lib/backend.py - trivial merge...
Add a crude disable for DRBD barriers
Ideally we want to/will have per-device DRBD controls of disk/metadata flushes. In the meantime, we want at least a disable of the barrier functionality for cases where one has battery-backed caches.
Background: DRBD has four mechanisms of handling ordered disk-writes....
LURemoveNode safety in face of wrong node list
LURemoveNode runs under the BGL, which means we're guaranteed that the list of nodes as retrieved in CheckPrereq is still valid in BuildHooksEnv. However, we can make Ganeti handle failures in case the locking is broken (or the node list has been modified otherwise) easily,...
Fix an unsafe formatting bug
This might fix issue 84; in any case, the current situation is that we have a potentially unsafe formatting, which should be fixed.
Ensure all int/float conversions are handled right
int()/float() can raise either ValueError (in case of int("a")), or TypeError (in case of int(None)). We had many bugs over time due to this, and a recent one was just diagnosed, so we go over the codebase...
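The two failure cases named above can be handled together by catching both exception types. The helper name `to_int` and its `default` parameter are illustrative, not part of the codebase.

```python
def to_int(value, default=None):
    """Convert value to int, returning default when conversion is impossible."""
    try:
        return int(value)
    except (ValueError, TypeError):
        # ValueError: unparsable text such as "a"
        # TypeError: unsupported types such as None
        return default
```

Catching only ValueError would still let `int(None)` propagate as an unhandled TypeError.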
KVM: fix pylint warning
Specify string format arguments as logging function parameters
Signed-off-by: Guido Trotter <ultrotter@google.com>
KVM: be more resilient on broken migration answers
Before, when doing kvm live migrations we used to accept an "unknown status" but to reject anything that didn't match our regexp. Since we've seen "info migrate" return a completely empty answer, we'll be more...
cli: Fix bug when not using headers
Commit 9fe72672 added code to not write spaces at the end of each line. Unfortunately it didn't work properly when not printing headers: there would still be spaces.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Switch the SplitKeyVal function to accept escapes
This tiny patch switches the SplitKeyVal function (and thus the command line options like -H, -B, etc.) to UnescapeAndSplit, thus allowing one to use escaped commas in the values of parameters.
confd client: copy the peers in UpdatePeerList
Since the peer list is shuffled by the client, we don't keep a reference to the list which was passed in, but copy it internally.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
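The copy-before-shuffle idea can be sketched as follows. `PeerClient` and `update_peer_list` are hypothetical names used for illustration; the real confd client is more involved.

```python
import random


class PeerClient(object):
    """Keep a private, shuffled copy of the peer list."""

    def __init__(self, peers):
        self._peers = []
        self.update_peer_list(peers)

    def update_peer_list(self, peers):
        # Copy first: shuffling in place must not reorder the caller's list
        self._peers = list(peers)
        random.shuffle(self._peers)
```

Without the copy, `random.shuffle` would silently reorder a list the caller may still be using.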
Add an UnescapeAndSplit function
In many cases, where we accept (usually from the command line) a list of parameters, we remove the use of the separator as a component of any of the elements.
This patch adds a new function that can split strings of the form...
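A rough sketch of escape-aware splitting, assuming a backslash is the escape character (the excerpt does not state this). The function `unescape_and_split` below is a hypothetical illustration and not the implementation added by the patch.

```python
def unescape_and_split(text, sep=","):
    """Split text on sep, treating a backslash-escaped character as literal."""
    parts = []
    current = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally; a trailing backslash is kept
            current.append(next(chars, "\\"))
        elif ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```

With this approach an element can contain the separator itself, as long as it is escaped on input.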
Generate hmac file with a newline at the end
This makes it slightly easier to cut&paste its content.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
jqueue: Don't return negative number for unchecked jobs when archiving
When the queue was empty, the calculation for unchecked jobs while archiving would return -1. ``last_touched`` is set to 0 and the job ID list (``all_job_ids``) is empty. Calculating ``len(all_job_ids) -...
cli.GenerateTable: Don't write EOL spaces
With this change, there won't be unnecessary space characters at the end of lines.
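As a simplified illustration, a padded table row can be stripped of trailing whitespace after joining its cells. `format_row` is a hypothetical helper, not the logic in cli.GenerateTable.

```python
def format_row(values, widths, separator=" "):
    """Pad each cell to its column width without leaving trailing spaces."""
    cells = [value.ljust(width) for value, width in zip(values, widths)]
    # Padding the last column would leave spaces at the end of the line
    return separator.join(cells).rstrip()
```

Note that this simple version would also strip trailing whitespace that belongs to the last value itself.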
Improve logging for workerpool tasks by providing repr
Before it would log something like “starting task (<ganeti.http.client._HttpClientPendingRequest object at 0x2aaaad176790>,)”, which isn't really useful for debugging. Now it'll log “[…]<ganeti.http.client._HttpClientPendingRequest...
workerpool: Simplify log messages
workerpool: Use worker name as thread name
This way it shows up in debug logs.
workerpool: Make worker ID alphanumeric
Having a proper name instead of just a number makes debugging easier.
locking: Fix race condition in LockSet
This patch fixes a race condition when acquiring all locks in a LockSet instance. The list of lock names needs to be sorted to guarantee a consistent locking order, but the names were not sorted when acquiring all locks in the set....
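The ordering requirement can be illustrated with a much simplified lock set. `SimpleLockSet` is a hypothetical class built on `threading.Lock`; Ganeti's LockSet has shared and exclusive modes that are not modelled here.

```python
import threading


class SimpleLockSet(object):
    """Hold named locks and always acquire them in sorted name order."""

    def __init__(self, names):
        self._locks = dict((name, threading.Lock()) for name in names)

    def acquire_all(self):
        acquired = []
        # Sorting gives every thread the same acquisition order, which
        # rules out two threads each holding a lock the other one needs
        for name in sorted(self._locks):
            self._locks[name].acquire()
            acquired.append(name)
        return acquired

    def release_all(self):
        for name in sorted(self._locks, reverse=True):
            self._locks[name].release()
```

Iterating over the dictionary directly would give an order that is not guaranteed to match what other code paths use.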
mcpu: Log lock status with sorted names
Reading and comparing sorted lists is easier when debugging locking problems.
locking: Append to list outside error handling block
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
locking: Don't fail in error handling if lock isn't owned
In case an exception was thrown while acquiring the lock, not necessarily all owned locks are also really acquired. Before this change, an exception could be masked by another exception thrown here. There is no good clean-up strategy...
Normalize MAC addresses to all lower.
This change will normalize the MAC to all lowercase after validation.
Signed-off-by: René Nussbaumer <rn@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
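A small sketch of validate-then-normalize, assuming the common colon-separated MAC notation. The regular expression and the name `normalize_mac` are illustrative assumptions rather than the code from the patch.

```python
import re

# Six colon-separated pairs of hex digits, matched case-insensitively
_MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$", re.I)


def normalize_mac(mac):
    """Validate a MAC address and return it in lowercase."""
    if not _MAC_RE.match(mac):
        raise ValueError("Invalid MAC address: %r" % (mac,))
    # Normalize only after validation has succeeded
    return mac.lower()
```

Storing a single canonical form makes later comparisons between MAC addresses a plain string equality test.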
Introduce a Luxi call for GetTags
This changes from submitting jobs to get the tags (in cli scripts) to queries, which (since the tags query is a cheap one) should be much faster.
The tags queries are already done without locks (in the generic query paths for instances/nodes/cluster), so this shouldn't break tags query...
LURenameCluster: run post hook on all nodes
Since the cluster name might be used for various purposes on nodes, we should let all nodes "know" about a cluster rename by running the post hook on all nodes. This will make cluster rename slightly slower/costlier, but it is not/shouldn't be an operation that is run...
Fix unused imports or add silences where needed
In some cases pylint doesn't parse the import correctly, so we add silences; but there are also many cases of unused imports, which we simply remove.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
bdev: Add a TODO and a pylint silence
A piece of old code in bdev.py uses a for loop over a single variable because we can 'break' out of the loop or exit on the 'else' path. This is not a nice usage of the for loop; it should be converted to a standard if...elif...else structure....
Further pylint disables, mostly for Unused args
Many of our functions have to follow a given API, and thus we have to keep a given signature, but pylint doesn't understand this. Therefore, we silence this warning.
The patch does a few other cleanups.
LUDiagnoseOS._DiagnoseByOS: remove unused arg
The node_list argument to _DiagnoseByOS is not used, and is obsoleted by the fact that the rlist argument already has the valid nodes as keys (assuming RPC behaviour didn't change). Thus, we remove it and silence...
hv_xen/_GetConfigFileDiskData: remove unused arg
The disk template is not needed; all that's used is the disk data. As such, remove this parameter from the function.
jqueue/_CheckRpcResult: log the whole operation
Currently only the rpc call, but not its description (which also shows the argument) is logged. We change this to log failmsg too, and this also silences a warning.
Optparse extenders have to obey a given API
So we just silence the warning.
backend._OSOndiskAPIVersion: remove obsolete arg
The 'name' argument is not used anymore, probably since before 2.0. Since this is an internal function, we can just remove it (from its caller too).
pylint cleanups: dangerous initializers
Plus a silence for a wrong "uninitialized var".
Convert to static methods (where appropriate)
Many methods are simple pure functions that do not depend on the object state. We convert these to staticmethods.
Add targeted pylint disables
This patch should have only:
- pylint disables
- docstring changes
- whitespace changes
Implement all hv functions in hv_chroot/hv_fake
The chroot and fake hypervisors were missing:
- the powercycle node functionality
- proper handling of migration requests
The powercycle was just used as in the other hypervisors (use the standard linux powercycle). The migration for chroot was disabled...
Add some stubs to bdev.FileStorage
This patch adds explicit errors (instead of notimplemented) in FileStorage (and the associated TODOs).
Fix indentation issues
Fix two bugs in ConfigWriter._EnsureUUID
Wrong argument name and wrong number of arguments in string formatting.
KVM: Abstract/rework instance up checks
This patch abstracts the check "is instance stopped" into a separate function, and thus simplifies a couple of higher-level functions. It also moves from manual read of the pidfile to use the (correct abstraction of) _InstancePidAlive....
KVM: Split out the pidfile computation
In some cases we only need the pidfile, but not the pid or the alive status.
Fix an error message
Detected by an 'Unused variable' warning.
Remove many 'Unused variable' warnings
Note there are some cases left which need extra cleanup.
Fix use of the logging functions
The logging functions expand the arguments themselves, thus it's safer to let them do it rather than manual string formatting.
Also re-wraps one comment.
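The difference can be sketched with the standard logging module. The logger name, the `ListHandler` class and the message below are made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in a list, for demonstration purposes."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("sketch.logging-args")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

instance = "instance1"
# Preferred: pass arguments separately and let logging expand them
logger.info("Starting instance %s", instance)
# Suppressed by the level, so the arguments are never formatted at all
logger.debug("Details for %s", instance)
```

Expansion only happens for records that are actually emitted, which is one reason to leave it to the logging functions.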
Add targeted pylint disables
This patch adds targeted pylint disables, where it makes sense (either due to limitations in pylint or due to historical usage), and also a few blanket ones in rapi where all the names are… “different”.
Fix two bugs in seldom-used codepaths
New version of pylint, new bugs found!
Clarify some more wide pylint disables
This removes/updates some module-wide pylint disables.
Implement BuildHooksEnv for NoHooksLU
This just adds a stub function that raises an assertion error; this accomplishes two things:
- silences many pylint warnings
- if we ever stumble upon this, a specific assertion error is (hopefully) clearer than just a not implemented error...
Fix indentation in hv_kvm
Per pylint warnings.
Partial cherry-pick of 6c881c5 from the 2.1 branch
This cherry-picks the utils.FieldSet.Matches changes and the significant jqueue.py change. These are stable in the 2.1 branch and therefore make sense to backport to 2.0 (they are basically cleanups).
Fix a typo in the doc string
MaybeRaise in lib/errors.py had a typo in the doc string
Merge branch 'stable-2.0' into stable-2.1
CreateInstance: allow no ip check with start mode
Since gnt-instance start doesn't do any checks on the IP, it doesn't make much sense to do so in instance create (with start) if the user expressly passes in ‘--no-ip-check’. Removing this requirement eases the...
Command line/RAPI support for --no-name-check
This patch adds --no-name-check to gnt-instance add and gnt-backup import. This is opposite to the opcode parameter (name_check) as it is similar to ip_check and start.
It also adds it to RAPI and gnt-instance batch-create as a parameter in...
Op/LUCreateInstance support for (no) name checks
This adds a new opcode parameter ‘name_check’ (similar to ip_check) that is not required to be present (to ease backwards compatibility for tools).
It also adds a CheckArguments to LUCreateInstance and changes the...
Pass --fqdn to ssh hostname checks
The cluster verify checks for fqdn are done via address lookups, and there we actually use the FQDN. However, for the ssh hostname check which is done at node add time, we rely on the default of the “hostname” command. And Debian for example recently changed the default to return...
Move the hooks file mask into constants.py
This will allow reuse of the same mask for multiple validations.
Security issue: add validation of script names
This patch unifies the search for external scripts to always go through utils.FindFile and implements in that function a restriction on valid chars in file names and (additionally) that the passed name is the...
Improve LUQueryNodes for lockless case
In most uses of LUQueryNodes, we don't take a lock. This means that the instance data is not protected across GetInstanceList and GetInstanceInfo, and this can lead to instances not existing anymore.
Switching to GetAllInstanceInfo means that we get a single,...
Add disk cache control parameter for KVM
This patch adds the 'cache' parameter for KVM; currently this is only customisable at the hypervisor level, so it's the same for all drives (except any CDROM image, which gets the default).
Change pyinotify import for broader compatibility
On some distributions pyinotify is installed in a different way, and the actual module just contains an internal pyinotify entry, which is the actual library. On others the main pyinotify module contains the library...
ClusterMasterQuery: add primary ip field
By allowing also the primary ip field to be fetched directly, we avoid one more confd lookup, or dns request, to find out which address the master node lives at.
confd ClusterMasterQuery: allow fields request
Change the ClusterMasterQuery to allow a query, and if present accept a list of fields to be returned. Currently only name and ip are accepted.
This feature will be used by NLD to route the cluster ip over the nbma....
DRBD: ignore unreadable meta devices
The DRBD driver can ignore dead disks but not dead meta devices (for which it refuses to configure the minor). To handle this case, we check that the meta device is readable and if not we ignore it (the same as when backend._RecursiveAssembleBD didn't find it)....
Simplify utils.ReadFile
The documentation for file objects' read method states that if the size is "negative or omitted", all data is read; thus we can simplify it to have size=-1 by default and not have the if test.
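A sketch of the simplified shape, assuming a text-mode read. The name `read_file` is illustrative and the real utils.ReadFile may differ in details.

```python
def read_file(file_name, size=-1):
    """Return up to size characters of a file, or all of it by default."""
    with open(file_name, "r") as fd:
        # A negative size makes read() return everything up to end of file,
        # so no separate branch for "no size given" is needed
        return fd.read(size)
```

The single `read(size)` call covers both the limited and the unlimited case.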
gnt-cluster verify: Warn if node time diverges too far
The warning will be generated if the clocks diverge by more than 150 seconds. Due to the way the RPC system works, we cannot get exact time differences, e.g. if one of the queried nodes is broken. The comparison is done using a...
KVM: fail when a routed nic has no ip
This shouldn't happen, but if it does it's better to fail at this level,rather than create a broken NIC script, which is hard to debug.
cmdlib: Work around race condition in DRBD before version 8.0.13
DRBD goes into sync mode for a short amount of time after executing the "resize" command. DRBD 8.x below version 8.0.13 contains a bug whereby calling "resize" in sync mode fails.
Remove quotes from CommaJoin and convert to it
This patch removes the quotes from CommaJoin and converts most of the callers (that I could find) to it. Since CommaJoin does str(i) for i in param, we can remove these, thus simplifying slightly a few calls....
Re-add “nic.bridges” field to RAPI bulk instance list
Commit 495cfdf0 removed “nic.bridges” from the default list for bulk instance list RAPI requests.
Revert "Get rid of utils.CommaJoin"
This reverts commit 6915bc28fe053e92aa16cf2d974d205f1140219c based on a thread on ganeti-devel.
Conflicts:
lib/cmdlib.py (due to the error code classification, trivial)
Add check for OpenSSL entropy status
By checking for this explicitly, the errors (SSLEAY_RAND_BYTES, “PRNG not seeded”) will happen in the start-up phase of the daemon and not only when executing remote procedure calls.
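An early check of this kind could look like the sketch below. It uses `ssl.RAND_status()` from the Python standard library as a stand-in, which is an assumption: the excerpt does not show which OpenSSL binding or call the commit uses. `check_entropy` is a hypothetical name.

```python
import ssl


def check_entropy():
    """Fail early if OpenSSL's random number generator is not seeded."""
    # ssl.RAND_status() is true when the PRNG has been seeded with
    # enough data
    if not ssl.RAND_status():
        raise RuntimeError("OpenSSL could not collect enough entropy"
                           " for the PRNG")


# Intended to be called once during daemon start-up
check_entropy()
```

Running the check at start-up surfaces the problem immediately instead of at the first operation that needs random bytes.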
Handle EEXIST in utils.RenameFile
This should fix an issue I've seen exactly once during testing. It might have been caused by parallel RPC calls to archive jobs.
[…] ganeti-noded:112 ERROR Error in RPC call […] File "/usr/lib/python2.4/site-packages/ganeti/backend.py", line 2365, in JobQueueRename...
Remove unused parameter “unlock” from cmdlib._WaitForSync