cli.py: use None as name for tag operations on the cluster
This change is mostly cosmetic. Previously, the literal "cluster" was used for the 'name' field of tag operations on the cluster (as opposed to a node or an instance). Since this field has a type of TMaybeString...
Fix previous merge
A call to _CalculateGroupIPolicy wasn't refactored during the merge.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Merge branch 'devel-2.6'
Merge branch 'stable-2.6' into devel-2.6
jqueue: Return jobs to queue when shutting down
When a job is still waiting for locks and the queue is shutting down, it should be returned and not actually start processing. Until now jobs which transitioned from “queued” to “waiting” were already considered to be running as far as the shutdown code was concerned....
gnt-debug delay: Add "--submit" option
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Make hostname checks uniform between instance rename and add
Currently, we have instance rename doing extra checks on the hostname, to prevent accidental wrong renames; however, instance create doesn't do these checks (issue 291), which (if DNS is misconfigured)...
Improve logging of new job submissions
This addresses issue 290: when receiving new jobs, logging is incomplete, and we don't have the job ID and/or summaries logged. Only later, when the job is queried for or being processed, we know more.
This is not good when troubleshooting, so let's improve the initial...
Improve handling of lock exceptions
There are two issues with lock exceptions right now:
- first, we don't log the original error; this is fine for now (locking.py always returns the same error here), but in general is brittle: if locking.py would start returning more information, we'd...
Fix runtime memory increases
Commit 2c0af7da which added the runtime memory changes functionality had a small typo (wrong name); I've rewritten this to only compute the delta once, for simplicity.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Fix validation of vgname in OpClusterSetParams
This variable can be empty, when we want to disable LVM, so we can't use TMaybeString.
Fixes issue 285.
Fix removal of storage directory on shared file storage
This patch makes _RemoveDisks symmetric to _CreateDisks with respect to file-based storage: _CreateDisks uses "in constants.DTS_FILEBASED", whereas _RemoveDisks was not updated and only uses "== constants.DT_FILE". This results in stale directories left on the...
Switch non-redundant check to disk template-based
Currently, the warning/notice about non-redundant instances in cluster verify is based on a non-empty secondaries list (how old is this?); the proper way to check this nowadays is via DTS_MIRRORED.
Signed-off-by: Iustin Pop <iustin@google.com>...
Fix permission for socket directory
The directory must also be writable by the confd daemon user.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add option to force master-failover without voting
This fixes issue 282.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
backend: Switch to new file storage directory verification
The configuration is no longer used for verifying file storage paths.
Check allowed file storage paths during cluster-verify
Some paths, such as /bin or /usr/lib, should not be used for file storage. This patch implements a check during cluster verification to fail in case such a path has been used.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Make Paramiko an optional dependency for listrunner
With the move away from “setup-ssh”, Paramiko is no longer necessary to configure SSH on nodes.
Remove setup-ssh
It has been superseded by “prepare-node-join”.
gnt-node add: Use prepare-node-join
This patch changes “gnt-node add” to use the newly added “prepare-node-join” tool. Hereby Paramiko is no longer a hard dependency for setting up SSH on nodes.
In “gnt_cluster.py”, a positional parameter is no longer passed as a...
prepare-node-join: Use ssh.GetAllUserFiles
Instead of building the dictionary locally, the global version in “ssh.py” can be used.
ssh: Add function to get all of user's SSH files
This new function returns the file paths for all of a user's SSH-related files (RSA, DSA and authorized_keys).
RunCmd: Support standard input file descriptor
This patch changes “utils.RunCmd” to accept a file-like object or a numeric file descriptor which will be used as the command's standard input. One use-case will be to pass all necessary data to “prepare-node-join”....
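The idea of handing a child process its standard input from a file-like object can be sketched with the standard library alone. This is not the code of “utils.RunCmd”; the helper name run_with_stdin, the demo function and the JSON payload are assumptions made purely for illustration.

```python
import subprocess
import sys
import tempfile


def run_with_stdin(cmd, input_fd):
    """Run cmd with input_fd as its standard input.

    input_fd may be a file-like object with a fileno() method or a
    numeric file descriptor; subprocess.Popen accepts both.
    Hypothetical helper, not Ganeti's utils.RunCmd.
    """
    proc = subprocess.Popen(cmd, stdin=input_fd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    return proc.returncode, stdout, stderr


def demo():
    """Feed an illustrative JSON document to a child that echoes stdin."""
    with tempfile.TemporaryFile() as data:
        data.write(b'{"cluster_name": "example"}')
        data.seek(0)  # rewind so the child reads from the start
        child = "import sys; sys.stdout.write(sys.stdin.read())"
        return run_with_stdin([sys.executable, "-c", child], data)
```

Passing a descriptor rather than a string keeps the data out of the command line, which matters when the input contains key material.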
Factorize job selection in “gnt-job cancel”
This will also be used for changing jobs' priorities. All parameters to the common function are non-optional.
utils.x509: Factorize code to extract X509 certificate
This will be useful in “gnt-node add”.
prepare_node_join: Move daemon SSH files to constants
This dictionary will also be useful in “gnt-node add”.
prepare-node-join: Swap private and public keys
Other places, such as “ssh.GetUserFiles”, use a structure where the private key comes before the public key. Until now prepare-node-join did the opposite, that is the public key came first. To avoid confusion...
prepare-node-join: Use public key directly for auth…_keys
A public key already includes the necessary prefix (“ssh-rsa” or “ssh-dss”), so there is no need to add it again.
ssh.GetUserFiles: Parameter to disable directory check
Without this parameter, either an error would be raised or “.ssh” would have to be created. Now it is possible to retrieve the paths without requiring the “.ssh” directory to exist.
Update instance modify message
Currently the message does not say explicitly that instance-initiated reboots are useless to trigger the use of new parameters, per the thread on the user mailing list. Let's improve it a bit.
Errors.hs: improve field names for ConfigVersionMismatch
Change {exp,act}Code to {exp,act}Ver, which gives a better idea that the integer fields represent version numbers.
Also:
- errors.py: update OpPrereqError's docstring to note that an error code is always expected as the second argument (it was previously...
Remove unused cache implementation
Note that this commit has no Makefile.am changes, as the files were not actually used. So it's better to actually remove them.
bdev: Remove unused import of itertools
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
bdev: Add verification for file storage paths
An earlier version of this patch series verified all paths in cmdlib in the master daemon. With this change all that verification code is moved to bdev to run inside the node daemon. The checks are much stricter...
jqueue: Factorize code to modify job
A new function will be added to change a job's priority.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
cli: Use callback for --priority
If the option is used elsewhere, the numeric value is directly available.
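A minimal sketch of how an optparse callback can store a numeric value for a symbolic option argument. The priority names, their numeric values and the function names are illustrative assumptions, not taken from Ganeti's “cli.py”.

```python
import optparse

# Illustrative mapping only; the real names and values may differ.
_PRIORITY_NAMES = {
    "low": 10,
    "normal": 0,
    "high": -10,
}


def _priority_callback(option, opt_str, value, parser):
    """optparse callback: translate a priority name into its number."""
    try:
        numeric = _PRIORITY_NAMES[value]
    except KeyError:
        raise optparse.OptionValueError(
            "%s: unknown priority %r" % (opt_str, value))
    setattr(parser.values, option.dest, numeric)


def build_parser():
    """Build a parser whose --priority option yields a number."""
    parser = optparse.OptionParser()
    parser.add_option("--priority", dest="priority", type="string",
                      action="callback", callback=_priority_callback,
                      default=_PRIORITY_NAMES["normal"],
                      help="Priority: low, normal or high")
    return parser
```

With this arrangement any command that reuses the option reads opts.priority as an integer and does not have to repeat the lookup.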
jqueue: Add docstring for _DetermineJobDirectories
Somehow this was missed in commit 0422250e.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
jqueue: Fix comments in _SubmitJobUnlocked
Drop SSHS_FORCE constant
It is not actually used.
Improve logging of AssertionErrors
Currently, when we have an assertion error raised from cmdlib, it looks like this:
[cluster] root@node4:~# gnt-instance grow-disk instance1 0 1G
Failure: command execution error:
This is very very confusing. This patch adds a bit of traceback...
tools.prepare_node_join: Fix pep8 errors
Pep8 didn't agree with the indentation.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Add initial implementation of prepare-node-join
This is a new tool as per the design document “design-ssh-setup”. It receives a JSON data structure on its standard input and configures the SSH daemon and root's SSH keys accordingly. Unit tests are included....
ssh.GetUserFiles: RSA support, unit tests
This patch changes “ssh.GetUserFiles” to support two different kinds of SSH keys, RSA and DSA. Before it would always use DSA. Newly written unit tests are included.
Update blockdev's "info" at instance rename
Currently, we set "info" metadata on block devices at device creation time, but we never update it, leading to stale data in case of instance renames. This would not be a big problem in case of regular renames (assuming this is a rare operation), but importing instances...
LVM: remove old tags when adding new ones
This patch adds a small helper function to clear an LV's tags, and calls it at SetInfo time. We need this to be able to correctly track instance renames, once we will call SetInfo at such times.
Add a small bdev helper function
I wanted to write that snippet the third time, which is too much :)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix disk adoption interaction with ipolicy checks
In Ganeti 2.6, disk adoption is broken due to the ipolicy checks being done before we read volume size from remote nodes. We fix this by simply moving these checks to after the disk adoption code which updates the disk size; it's not that nice that we fail a (almost)...
Compare significant fields only for simple SSH keys
For simple SSH keys, that is those without options such as “command="…"”, only the first two parts need to be compared. The third field is a free-form comment.
This patch changes the comparison used in...
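The comparison rule described here can be sketched as follows. This is only an illustration under the assumption that both lines are simple keys (no leading options); the function name same_simple_key is hypothetical and does not appear in the patch.

```python
def same_simple_key(line_a, line_b):
    """Compare two simple authorized_keys lines.

    Only the key type and the key data (the first two
    whitespace-separated fields) are significant; anything after
    them is a free-form comment and is ignored.
    """
    return line_a.split()[:2] == line_b.split()[:2]
```

For example, "ssh-rsa AAAAB3Nz root@node1" and "ssh-rsa AAAAB3Nz backup-key" would be treated as the same key, while a different key body would not.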
ensure-dirs: Don't accept arguments
Before they would just be silently ignored.
ensure-dirs: Fix program name on usage screen
No string replacements are used, so doubling of the percent sign is not necessary.
Before: Usage: %ensure-dirs [--full-run]
After: Usage: ensure-dirs [--full-run]
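The before/after output can be reproduced with optparse, assuming the usage string is an optparse template in which “%prog” is expanded to the program name. The helper name usage_line is hypothetical.

```python
import optparse


def usage_line(template, prog="ensure-dirs"):
    """Return the usage line optparse generates for a usage template."""
    parser = optparse.OptionParser(usage=template, prog=prog)
    return parser.get_usage().strip()


# optparse replaces "%prog" by plain substring substitution, not by
# %-formatting, so a doubled percent sign leaves a stray "%" behind.
doubled = usage_line("%%prog [--full-run]")
single = usage_line("%prog [--full-run]")
```

Here doubled is "Usage: %ensure-dirs [--full-run]" and single is "Usage: ensure-dirs [--full-run]".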
Merge branch 'devel-2.6' into master
cli: Fix small typo
s/it/if/
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Dato Simó <dato@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Group.hs: add 'allTags'; adjust loaders and test data for it
This commit adds a Group.allTags field to store the tags of node groups, and teaches each loader backend in HTools to populate it (additionally, the IAllocator class in lib/cmdlib.py now includes tags for groups too). Test...
Remove support for PUT in noded
This takes care of a FIXME; 2.6 already uses the new method, so we're good during upgrades.
Ignore empty/comment lines in OS variants file
Per a conversation on ganeti@googlegroups.com:
- gnt-os diagnose ; gnt-os list take in consideration blank lines in /etc/ganeti/instance-image/variants.list that could be confusing.
Let's fix this and also let's ignore comment lines....
gnt-job cancel: Confirmation and selection of jobs
New parameters, “--pending”, “--queued” and “--waiting”, are added to select all jobs in the respective state. If one of those options is used and “--force” is not given, the user is asked to confirm the operation....
Add new constant for pending job status
This constant contains the job status' “queued”, “waiting” and “cancelled”.
Conflicts: NEWS: Trivial lib/tools/ensure_dirs.py: constant moved to pathutils...
ensure-dirs: Fix permissions on master socket
A socket shouldn't have its executable bit set.
errors: Document arguments to QueryFilterParseError
Also fix one small mistake in the docstring for QuitGanetiException.
Add support for cpu_cap and cpu_weight Xen params
This patch adds support for Xen's CPU scheduler 'cpu_cap' and 'cpu_weight' parameters.
Ganeti default values (cap: 0=unlimited, weight: 256) are Xen defaults.
cpu_cap is not validated correctly because of actual Ganeti limitation...
LUClusterVerifyGroup: Localize virtual file paths
The check for file consistency didn't properly handle virtual paths in case of a virtual cluster. This didn't cause any breakage as in a standard virtual cluster setup with only one node all files are visible for every node....
Enable query socket usage in gnt-node/gnt-group
This switches gnt-node/gnt-group (and their equivalent RAPI resources)to go over the query socket.
vcluster: Don't virtualize /etc/hosts path
/etc/hosts is a bit special as it's a system-wide file and the virtual cluster/node root doesn't apply. The modification of /etc/hosts should be disabled in virtual clusters. If it isn't, however, the vcluster functions would raise an exception complaining about a path outside of...
cli: Stop hardcoding /etc/hosts path
There is a constant for this purpose.
Move constant for /etc/hosts to pathutils
Needed for coming patches.
gnt-job: List archived jobs if requested
If requested via a filter or by including the “archived” output, archived jobs will be loaded and shown. This is significantly slower than just listing normal jobs, therefore by default they are not loaded at all....
gnt-job list: Add option to include archived jobs
This provides a convenience option to include archived jobs in the output list. It's equivalent to using “-o +archived”, but tab completion is nicer.
jqueue: Correct docstring
The description was not accurate.
jqueue: Add new in-memory attribute for archived jobs
This attribute is set to True for jobs which were restored from an archived file. A new filter will act on this field.
query: Report data type for unary operators
All data kinds (used to restrict the data collected) referenced in a filter can be requested once it's been “compiled”. However, the kinds of fields used in boolean expressions (e.g. ["?", "xyz"]) were not recorded. This patch changes the code accordingly and provides a unit...
verify-disks: Explicitely state nothing has to be done
Example output:
$ gnt-cluster verify-disks
Submitted jobs 4327
Waiting for job 4327 ...
No disks need to be activated.
Add basic unit tests for "gnt-cluster epo"
This patch adds some unit tests for “gnt-cluster epo”. Not everything is covered, but at least the bug fixed in the previous patch is.
Fix pylint breakage due to unused var in gnt_cluster
The usage of that variable was removed in 45a36f36, but accidentally the enumerate() was left in.
cluster epo: Fix bug where IndexError is raised
Updating the “node_query_list” variable would fail if no arguments were passed and the “--all” option wasn't specified. A follow-up patch will add unit tests.
Fix usage of errors.ResolverError
This exception is documented to have three arguments, but in one case we raise it with a simple string argument. Let's fix that.
Remove unused/deprecated error classes
It seems a few of the error classes are no longer used:
- LVMError, deprecated in 8c5533a5 (before ganeti 1.2.2!)
- ConfdRequestError, deprecated in b0dcdc10
- SshKeyError, introduced in the initial open source commit but never used (⁈)...
backend: Use utils.IsBelowDir instead of local code
utils.IsBelowDir is actually tested and doesn't allow writes to “…/queue*”, like the old code here did.
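The “…/queue*” problem is typical of plain prefix checks: a sibling directory such as “queue-old” shares the prefix of “queue”. A minimal sketch of a stricter containment test follows, assuming POSIX-style absolute paths; it is not the code of utils.IsBelowDir, and the name is_below_dir is hypothetical.

```python
import os.path


def is_below_dir(root, other_path):
    """Check whether other_path lies strictly below root.

    Both paths must be absolute. The root is compared with a trailing
    separator so that "/srv/queue-old" does not count as being below
    "/srv/queue".
    """
    if not (os.path.isabs(root) and os.path.isabs(other_path)):
        raise ValueError("Both paths must be absolute")
    prepared_root = os.path.normpath(root).rstrip(os.sep) + os.sep
    # normpath collapses ".." and "." components before comparing
    return os.path.normpath(other_path).startswith(prepared_root)
```

Note that this sketch works on path strings only and does not resolve symbolic links.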
jstore: Nicer error message on non-numeric file content
An error like “invalid literal for int() with base 10” can be quite confusing.
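One way to produce a clearer message is to catch the conversion error and name the offending file. The sketch below is an assumption about the general approach, not the code in jstore; the function name, the exception type used and the message wording are hypothetical.

```python
def parse_numeric_content(path, content):
    """Convert a file's content to an integer.

    On failure, raise an error that names the file instead of letting
    the bare int() message reach the user.
    """
    try:
        return int(content.strip())
    except ValueError:
        raise ValueError("File '%s' contains non-numeric data: %r" %
                         (path, content))
```

The caller still receives an exception, but the message now points at the file that needs to be inspected.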
bdev: Remove unnecessary empty line
My local pylint didn't complain.
Better list of replace-disks arguments + typos fixed
The man page and the built-in help for gnt-instance replace-disks were inconsistent. Also fixed some typos in man pages.
Check fingerprint of file with allowed file storage paths
This makes differences show up in “gnt-cluster verify”.
Explicitly ask for the default iallocator in commands
Now "gnt-instance recreate-disks" uses the default iallocator when "." is specified as the iallocator. For uniformity, the same behavior applies to these commands: gnt-node evacuate gnt-instance migrate...
Support for the default iallocator in replace-disks
"gnt-instance replace-disks" now behaves like the other commands, and uses the default iallocator when "." is passed as the iallocator parameter.
bdev: Add functions to verify file storage paths
- LoadAllowedFileStoragePaths: Loads a list of allowed file storage paths from a file
- CheckFileStoragePath: Checks a path against the list of allowed paths
The unit test for “utils.IsBelowDir” is updated with cases which weren't...
jqueue: Look at archived jobs when watching
First: This enables the use of “gnt-job watch $id” for archived jobs.
Now, the reason for actually making this work is that during sufficiently large group or node evacuations jobs are archived before the client gets to poll for their output. This led to situations where...
backend: Check for shared storage also
If normal file storage was disabled but shared storage enabled, “_TransformFileStorageDir” would still throw an exception.
In “opcodes._CheckStorageType” there's also a check, but I wasn't quite sure what the correct way of handling it was, so I added a TODO comment....
utils.FilterEmptyLinesAndComments: Return list
We don't use generators often and lists are easier to re-use.
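A minimal sketch of a filter that returns a list, so the result can be iterated more than once. It assumes that comment lines start with “#” and that surrounding whitespace is stripped; the actual behaviour of utils.FilterEmptyLinesAndComments may differ, and the name used below is hypothetical.

```python
def filter_empty_lines_and_comments(text):
    """Return the non-empty, non-comment lines of text as a list."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped
            if line and not line.startswith("#")]


variants = filter_empty_lines_and_comments(
    "debian\n\n# disabled for now\n  ubuntu  \n")
```

Because variants is a list, it can be passed to len(), indexed and reused, which a one-shot generator would not allow.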
Wipe added space when growing disks
This patch adds code to wipe newly added disk space when growing disks using “gnt-instance grow-disk”. “New disk space” is defined as the delta between the old block device size (not necessarily equal to the amount recorded in the configuration) and the new recorded size. Extra caution...
cmdlib._WipeDisks: Code formatting
- LogInfo takes *args, no need to replace values right away
- Don't overwrite wipe_chunk_size right after it's been set
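The first point follows the same convention as Python's standard logging module, where the format string and its arguments are passed separately and combined only when the record is emitted. The sketch below uses the standard library as a stand-in for LogInfo; the handler class, function name and message text are illustrative assumptions.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted log messages in a list (for demonstration)."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def log_wipe_progress(logger, disk_index, instance_name):
    """Log a progress message without formatting it up front."""
    # Arguments are passed separately; the "%" substitution happens
    # inside the logging machinery only if the record is emitted.
    logger.info("Wiping disk %d for instance %s",
                disk_index, instance_name)
```

Writing logger.info("..." % (disk_index, instance_name)) would give the same text, but the string would be built even when the message is filtered out.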
Factorize removing comments and empty lines from string
This will also be used for verifying the file storage directory.
ssconf: Fix mistake made in commit ee501db
Move a function from backend to ssconf
The “WriteSsconfFiles” function is used to write ssconf files. By moving it we can avoid importing backend into bootstrap. The latter is imported by CLI programs and backend doesn't have much to do with them.
Show old primary/secondary node on disk replacement
People unfamiliar with Ganeti's internals might be confused with the different hostnames showing up later in the process.
cmdlib: Change wording of messages during disk wipe
Error messages don't need to say “please” and it's already obvious some investigation is needed. LogWarning already logs the message using “logging.error” internally.
Remove constant for disk wipe block size
It is dangerous to have this block size as a global constant as that could give the impression of it being easily changed. Doing so without further adjustments to how “dd” is called will lead to disks not being wiped properly....