cli: Use ToStdout/ToStderr instead of print
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Iustin Pop <iustin@google.com>(cherry picked from commit 03298ebe95f393fc77f8acce62e5aa0b95a2bdd0)
Fix small typo in gnt-node
The iallocator option is '-I' not '-i'.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Simplify handling of boolean args in rapi
This patch replaces hardcoded boolean-type args withbool(_checkIntVariable). There should be no other cases of this left, Ithink.
Fix checks in LUSetNodeParms for the master node
There was a check already in the LU for the master node, however iswasn't correct. This patch disallows any role changes on the master nodevia LUSetNodeParms (and as this LU can't change anything else, it...
Improve the example startup script
Currently, the supplised script has two issues: - it doesn't use start-stop-daemon --start correctly, leading to messages like "ganeti.errors.GenericError: /var/run/ganeti/ganeti-rapi.pid contains a live process" in the logs...
Fix insserv dependencies
(import of a Debian patch)
This patch removes xend from the list of dependencies.
Ganeti doesn't need xend running to startup, it will only need it later(and only if xen is used as virtualisation technology). It also removes'Xen' from the description in the init script....
Fix a typo in InitCluster
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Iustin Pop <iustin@google.com>(cherry picked from commit 022c3a0b36cb60644b6861ff27ad59202883963c)
Ignore results from drained nodes in iallocator
Since drained nodes could be (partially or fully) broken in iallocator,we ignore results from these nodes when building the cluster map inpreparation for sending it to the script.
This is a cheap change for the stable branch; ideally we should not...
Ship the ethers hook
doc/examples/hooks/ethers has been added without being shipped in thereleased tarball. Putting a stop to this.
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Ethers hook, compatibility with old lockfile
Remove "-l" option since some ancient systems ship a version of lockfile-progsnot supporting it.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Remove a few unused imports from noded/masterd
Signed-off-by: Guido Trotter <ultrotter@google.com>
Upgrade be/hv params with default values
From time to time we're adding new be or hv parameters. With this patchmissing parameters get set to the default value when loading the clusterobject. This patch version also considers the case when hv/be params...
Implement the KERNEL_PATH parameter for xen-hvm
For the xen-hvm hypervisor, the KERNEL_PATH parameter is needed buttoday is hardcoded to a constants in the xen hypervisor library (argh!).
This patch moves this to a hypervisor constant with the default value...
Move HVM's device_model to a hypervisor parameter
This moves yet another hardcoded value to a hypervisor parameter. Iremoved the 64/32 difference as it doesn't seem valid to me - it's moreof a local site config rather than arch config.
Signed-off-by: Iustin Pop <iustin@google.com>...
objects: add configuration upgrade system
Add a very basic configuration update mechanism to objects.An object can define the UpgradeConfig method, which will be called atinit time, and use it to fill in missing defaults in the configuration.In the future we may want to make it more complex, for example adding...
Add cluster-init --no-etc-hosts parameter
If --no-etc-hosts is passed in at cluster init time we set a newparameter in the cluster's object to false, and avoid adding nodes tothe hosts file. The UpgradeConfig function is used to set the value toTrue, when upgrading from an old configuration version....
Merge branch 'master' into next
Update NEWS and version for 2.0.3 release
example ethers hook: use lockfile-progs
Rather than writing our own locking routing, use the one implemented bythe lockfile-create program.
ethers hook lock: use logger not echo
Overwrite debugging 'echo's
Signed-off-by: Luca Bigliardi <shammash@google.com>Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
ethers hook: reduce the probability of data loss
The hook was exiting immediately if lock was not acquired, entering a timedloop to have more chances when acquiring the lock.
Signed-off-by: Luca Bigliardi <shammash@google.com>Signed-off-by: Guido Trotter <ultrotter@google.com>...
devel/upload: revert rsync -p
The permissions replications also will change the permissions on the /and /usr directories, which is bad. This reverts it to the originalbehaviour.
export: add meaningful exit code
Currently ‘gnt-backup export’ always returns exit code zero, even in theface of complete failure during backup (only failure to stop/start theinstance will cause job failure and thus non-zero exit code). This isbad, since one cannot script the backup....
Fix detecting of errors in export
This should fix issue 61, by explicitely calling bash (which is is now anon-explicit dependency) and setting the pipefail command.
Implement gnt-cluster check-disk-sizes
This patch adds a new opcode and lu for checking disk sizes. Currentlyit does only top-level disk verification, and also doesn't checkprimary/secondary node size mismatches (these two are added as TODOs inthe Exec() function of the LU)....
rpc: add rpc call for getting disk size
Note that this exports the disk size as bdev returns it, in bytes. Thevalue will be converted to MiB in cmdlib.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
bdev: Add function for reading actual disk size
This patch adds a GetActualSize for block devices that returns theactual disk size. It is done using blockdev (and stat for file storage).
While this could be done via reading /sys/block/N/size, that is not as...
Implement --ignore-size in activate-disks
This patch modified OpActivateDisks, LUActivateDisks and gnt-instanceactivate-disks to support and pass this option to_AssembleInstanceDisks.
The patch is quite trivial I think; there should be no issues from it...
Add ignore size support in _AssembleInstanceDisks
This patch adds an optional parameter to _AssembleInstanceDisks thatallows ignoring of size information by making a copy of the diskstructure and setting the size to zero.
Add a objects.Disk.UnsetSize() method
This method recursively resets the size of the disk and its children tozero.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
bdev: allow ignoring of size in Assemble()
This patch changes the DRBD8 class (the only one to use the size inAssemble) to ignore the size in Assemble when a zero size is passed.This will allow activation of disks even when the size recorded in theconfiguration is wrong....
Fix instance import net option
This is identical to dc30b0e4 but applied to gnt-backup. Thanks to userocaner for catching it.
Simplify the devel/upload script
Instead of multiple uploads to each node, this script copies everythingas needed to the temporary directory, exactly as to be installed in thedestination machine, then runs only one rsync per host.
This is more dangerous (we can break /etc now), but for development...
Add a Copy method to object.ConfigObject
This small patch adds a simple Copy method that is can be used for'throw-away' copies of objects.
Add “gnt-job watch” command
This command can be used to follow the output of a job. It's usefultogether with the --submit parameter for other commands.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
jqueue: Fix error when WaitForJobChange gets invalid ID
When JobQueue.WaitForJobChange gets an invalid or no longer existing job ID ittries to return job_info and log_entries, both of which aren't defined yet.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
jqueue: Update message for cancelling running job
Extend call_node_start_master rpc with no_voting
When the parameter is set to True and start_daemons is also True,ganeti-masterd will be started with the new --no-voting --yes-do-itoptions.
This new option is set to True only on masterfailover, when no_voting is...
lvmstrap: Change diskinfo to use GenerateTable
This way the produced table is formatted nicely.
Signed-off-by: Stephen Shirley <diamond@google.com>Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Get rid of constants.RAPI_ENABLE
This constant is unused, except in qa. Removing it since it's always True.
This patch also removes the unused qa_rapi.PrintRemoteAPIWarningfunction, and removes a comment about temporary constants "until we havecluster parameters"....
Remove references to utils.debug
Various modules set it to True when called in debugging mode, but theutils module supports no such global.
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
ganeti-rapi, replace hardcoded exit value
substitute exit(1) with exit(constants.EXIT_FAILURE).Also fix a wrongly indented line.
Add the bind-address option to ganeti-rapi
noded: Abstract hard-coded sys.exit value
On machines without the ssl file noded exists '5'.Changing this to constants.EXIT_NOTCLUSTER.
Also utils.GetNodeDaemonPort hasn't risen errors.ConfigurationError fora while, so removing that try/except block....
Add an example "ethers" hook
This hook can be used to update /etc/ethers with instance's macaddresses. A dhcp server on the nodes can then serve to the instancestheir correct address. (This has been tested with dnsmasq's dhcpimplementation)
burnin: move batch init/commit into a decorator
Many burnin steps initialize the batch queue at the beginning and commitit at the end of their operation. This patch moves this code to adecorator, in order to reduce redundant code.
burnin: move instance alive checks to a decorator
Many burn steps to a manual check of instance aliveness, via duplicatecode. This patch moves this code to a decorator.
burnin: Implement retryable operations
Some burnin steps are idempotent: e.g. reinstalling an instance (fromburning p.o.v.) can be done multiple times without any side-effects thatwould affect later burnin steps. As such, failing the whole burninprocess due a reinstall failure is undesirable....
Ignore vim swap files
burnin: fix removal errors hiding real errors
A long-standing bug in burnin makes errors during the removal phase(e.g. because an import has failed, or because the initial creation hasfailed) hide the original error.
This patch suppresses removal errors if we are already in ‘has_err’...
backend: Only build once the list of upload files
The list of upload files is built currently at every UploadFile() call.This patch moves it to a separate variable which is initialized onlyonce.
This won't make much difference but I regard it as cleanup....
Fix gnt-instance reinstall
Commit 55efe6dabe48e5c37dc1ff6099e0bb8afde7a468 "Convert instancereinstall to multi instance model" actually broke instance reinstall forsingle-instance cases. This one-liner fixes it.
Fix a couple of epydoc warnings
It seems epydoc needs fully-qualified references, and doesn't deal withrelative ones (not even in the current module) if there are anyambiguities.
There are other epydoc warnings, in the rapi docstrings, but those areleft as-is as they're removed in 2.1....
job queue: fix loss of finalized opcode result
Currently, unclean master daemon shutdown overwrites all of a job'sopcode status and result with error/None. This is incorrect, since theany already finished opcode(s) should have their status and resultpreserved, and only not-yet-processed opcodes should be marked as...
Switch gnt-debug submit-job to JobExecutor
Currently gnt-debug submits jobs individually, but in 2.1 JobExecutoruses the optimized SubmitManyJobs luxi call and as such should be usedwhenever multiple jobs need to be submitted.
This patch converts gnt-debug submit-job to use it and also removes an...
gnt-instance batch-create: use the job executor
This small patch changed the batch create functionality to use the jobexecutor instead of single-job submits.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>...
Convert instance reinstall to multi instance model
This patch converts ‘gnt-instance reinstall’ from single-instance tomulti-instance model; since this is dangerours, it's required to pass“--force --force-multiple” to skip the confirmation.
Modify cli.JobExecutor to use SubmitManyJobs
This patch changes the generic "multiple job executor" to use the manyjobs submit model, which automatically makes all its users use the newmodel.
This makes, for example, startup/shutdown of a full cluster much more...
Add a luxi call for multi-job submit
As a workaround for the job submit timeouts that we have, this patchadds a new luxi call for multi-job submit; the advantage is that all thejobs are added in the queue and only after the workers can startprocessing them....
job queue: fix interrupted job processing
If a job with more than one opcodes is being processed, and the masterdaemon crashes between two opcodes, we have the first N opcodes markedsuccessful, and the rest marked as queued. This means that the overall...
Fix an error path in job queue worker's RunTask
In case the job fails, we try to set the job's run_op_idx to -1.However, this is a wrong variable, which wasn't detected until theslots addition. The correct variable is run_op_index.
Add slots on objects in jqueue
Adding slots to _QueuedOpCode decreases memory usage (of these objects)by roughly four times. It is a lesser change for _QueuedJobs.
ganeti.initd: Pass $*_ARGS to programs when restarting them
Optimizie OpCode loading
This patch converts the opcode loading to a pre-built map (at importtime) instead of iteration over the globals dict at each call.
Microbenchmarks show that this should be around three times faster, andburnin still passes.
Yet another fallout from the pylint fixes
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Olivier Tharan <olive@google.com>
Fix another issue with hypervisor_name change
Update NEWS and version for 2.0.2 release
Improve the description of node flags in man page
[iustin@google.com: slightly reworded the explanation for offline andchanged the commit message]Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Add enabled hypervisors to TestConfigRunner
This parameter is now mandatory for the cluster config to work.
Add a few more checks to verify config
- Check that the enabled hypervisors list is valid- Check that the master node is a valid node
Make sure enabled_hypervisors list is valid
Change default stripe count to 1
In order not to change the default during a stable series, we modifyconfigure.ac to default to one stripe, in effect keeping the status quo(well, minus the LVM Attach() changes).
Use full-stripe size in LVM growth
LVM has issues when growing stripped volumes, so it's best to specifythe growth in exact multiples of the full stripe size (as precise aspossible). For this we need to do a couple of changes: - in LVM Attach(), we query additionally the VG extent size and the LV...
Remove ConfigWriter.InitConfig
It's been replaced by a simpler bootstrap.InitConfig function, whichdoes the same job, and is currently unused.
Remove SimpleConfigWriter.SetMasterNode
This function is not used.
_GenerateDiskTemplate: use base_index in the name
Currently if a disk is added later the base_index is not considered, andall the disks are called disk0. This patch fixes it.
ganeti-masterd: avoid SimpleConfigReader
SimpleStore is a lot less heavyweight than SimpleConfigReader, and tojust get the master name we can use that. This is the only usage ofSimpleConfigReader currently, but we're not going to delete the class,as new usages will come in for ganeti-confd (in 2.1). Using it there,...
cmdlib: Fix typo in LUQueryClusterInfo
This was broken by my pylint fixes patch.
RAPI: implement instance reinstall
This patch adds instance reinstall to RAPI, with two optional parameters: - ‘os', in order to change the OS on reinstall - ‘nostartup’, in order to leave the instance down after reinstall
The call will first shutdown the instance, the reinstall it, and unless...
Create a new --no-voting option for masterfailover
This allows failing over in certain corner cases, such as a 2 nodecluster with one node down. The man page is also updated to documentthis dangerous option and how to recover from this situation.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
ganeti-masterd: allow non-interactive --no-voting
This will be used by ganeti-noded to start ganeti-masterd in a--no-voting masterfailover.
Fix pylint warnings
Add custom pylintrc
bootstrap: Don't leak file descriptor when generating SSL certificate
Fix problem with EAGAIN on socket connection in clients
If a user used ^Z to stop the program, poll() in socket.recv would returnEAGAIN due to SIGSTOP. This patch changes luxi.Transport.Recv to ignore EAGAIN.
Fix some typos
Increase maximum accepted size for a DRBD meta dev
With the change to stripped LVs, the actual size of a meta device (whichis small) can be more than we expected (for non-stripped LVs). Thispatch increases from 160MB to 1GB the accepted size, and updates the...
backend.DemoteFromMC: don't fail for missing files
If the config file is missing when the DemoteFromMC() function iscalled, it will raise a ProgrammerError. Instead of changing theutils.CreateBackup() file which is called from multiple places, for nowwe only change the DemoteFromMC() function to not call it if the file is...
Fix node readd issues
This patch fixes a few node readd issues.
Currently, the node readd consists of two opcodes: - OpSetNodeParms, which resets the offline/drained flags - OpAddNode (with readd=True), which reconfigures the node
The problem is that between these two, the configuration is inconsistent...
Cleanup config data when draining nodes
Currently, when draining nodes we reset their master candidate flag, butwe don't instruct them to demote themselves. This leads to “ERROR: file'/var/lib/ganeti/config.data' should not exist on non master candidates...
Allow GetMasterCandidateStats to ignore some nodes
This patch modifies ConfigWriter.GetMasterCandidateStats to allow it toignore some nodes in the calculation, so that we can use it to predictcluster state without some nodes (which we know we will modify, and thus...
Fix error message for extra files on non MC nodes
Currently the message for extraneous files on non master candidates isconfusing, to say the least. This makes it hopefully more clear.
Fix adjustement of candidates in cluster modify
The code for adjusting the candidate pool size was done after the configupdate, and this means we triggered the save of the config file withoutfixing the candidate pool, which aborts with an error.
The patch just moves it above. The old comment was valid, but we anyway...
Add a new node list field
This patch adds a ‘role’ node list field, which shows a one-characternode status. This is a simpler way to see the node status than selectingall the flags individually.
Fix HTTP server library handling of credentials
Currently the http library only checks credentials when authenticationis required. This means that any credentials are accepted on the rootresource, for example, which makes problems hard to diagnose - the...
Fix a typo in backend.InstanceReboot docstring
The documentation for the reboot was wrong. This patch fixes it andupdates the docstring with more details.
Fix handling of 'vcpus' in instance list
Currently running “gnt-instance list -o+vcpus” fails with a cryptic message: Unhandled Ganeti error: vcpus
This is due to multiple issues: - in some corner cases cmdlib.py raises an errors.ParameterError but this is not handled by cli.py...
Fix checking for valid OS in instance create
The current check in LUCreateInstance.CheckPrereq() is wrong - it only checksif we got an OS, but not if we got a valid OS. This patch fixes it.