UploadFile: allow ancillary files
Currently UploadFile is restricted to a static set of files, and thus gnt-cluster redist-conf (silently) fails to upload all config files. With this patch we add the new static files we distribute, and all hypervisor-provided ancillary files....
Convert UploadFile (and its callers) to new rpc
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add a node powercycle command
This (somewhat big) patch adds support for remotely rebooting the nodes via whatever support the hypervisor has for such a concept.
For KVM/fake (and containers in the future) this just uses sysrq plus a ‘reboot’ call if the sysrq method failed. For Xen, it first tries the...
Add a new CONFIRM_OPT option to cli.py
Today we are not very consistent as to what ‘--force’ represents: sometimes confirmation, sometimes forcing a possibly dangerous option, etc.
This patch adds a new ‘--yes’ option that should be used for all simple confirmations of the kind “yes, I really want to remove the instance”....
IsNormAbsPath and users, use "normalized" term
We used to refer to normalized paths as "normal", which might be confusing. This fixes the syntax in all current IsNormAbsPath users and in the docstring.
Add utils.IsNormAbsPath function
Currently most of the time we check for an absolute path, but that doesn't protect us from some invalid paths. In some places we should be more strict, and this function should help us do so.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
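A minimal sketch of the stricter check this entry describes, assuming it combines os.path.isabs with a comparison against the normalized form of the path. The name is_norm_abs_path is illustrative only, and the real utils.IsNormAbsPath may be written differently.

```python
import os.path


def is_norm_abs_path(path):
    """Return True if path is absolute and already in normalized form.

    Illustrative helper (assumption), not the Ganeti implementation.
    """
    # isabs alone accepts paths such as "/etc/../etc/ganeti"; comparing
    # against normpath rejects ".." components and trailing slashes.
    return os.path.isabs(path) and os.path.normpath(path) == path
```

With this check, "/etc/ganeti" passes, while "/etc/../etc/ganeti", "/etc/ganeti/" and "etc/ganeti" are all rejected.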
Hypervisors: make absolute path checking strict
Use the new utils.IsNormAbsPath function, rather than just os.path.isabs.
Modify cli.JobExecutor to use SubmitManyJobs
This patch changes the generic "multiple job executor" to use the many jobs submit model, which automatically makes all its users use the new model.
This makes, for example, startup/shutdown of a full cluster much more...
gnt-instance batch-create: use the job executor
This small patch changes the batch create functionality to use the job executor instead of single-job submits.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Convert instance reinstall to multi instance model
This patch converts ‘gnt-instance reinstall’ from single-instance to multi-instance model; since this is dangerous, it's required to pass “--force --force-multiple” to skip the confirmation.
Signed-off-by: Iustin Pop <iustin@google.com>...
KVM: add the network script to the ancillary files
_RedistributeAncillaryFiles function
This function is shared between AddNode and RedistributeConfig, and used to redistribute additional files which are inherently part of the cluster configuration.
_RedistributeAncillaryFiles: add hypervisor files
Each hypervisor can declare additional files to be shipped to all nodes.
Xen: add ancillary files
Remove the HTS_COPY_VNC_PASSWORD constant/feature
Currently just for xen-hvm we copy the vnc password on node-add. This will be changed for 2.1 with a more advanced gnt-cluster redist-conf functionality which is going to be used by node-add as well.
KVM: replace hardcoded network script path
Currently the kvm automatic network script can be overridden by a user-supplied /etc/ganeti/kvm-vif-bridge script. We keep this functionality but move the hardcoded path to a constant, dependent also on SYSCONFDIR....
Add a luxi call for multi-job submit
As a workaround for the job submit timeouts that we have, this patch adds a new luxi call for multi-job submit; the advantage is that all the jobs are added to the queue and only afterwards can the workers start processing them....
Doc fixes for RAPI
After moving the documentation from the .py files to .rst, we had some cleanups to do.
This fixes the formatting of the comments, improves them a little, and removes deprecated info (DOC_URI) from the python source.
Merge branch 'master' into branch-2.1
Release 2.0rc5
Move to data-based hvparam checks instead of code
Currently the hypervisor parameters are checked using hard-coded snippets in each hypervisor. However, most parameter checks fall into three cases:
  - file check
  - directory check
  - string value in a set...
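A rough sketch of what a data-based check for the three cases above could look like. The table layout, the parameter names and the check_params helper are all assumptions made for illustration; the actual Ganeti data structures are not shown in this entry.

```python
import os

# Hypothetical table: parameter name -> (kind of check, extra data).
# Names and values are examples only, not the real Ganeti definitions.
PARAM_CHECKS = {
    "kernel_path": ("file", None),
    "cdrom_image_dir": ("dir", None),
    "boot_order": ("set", frozenset(["cd", "disk", "network"])),
}


def check_params(params, checks=PARAM_CHECKS):
    """Return a list of error strings for the given parameter dict."""
    errors = []
    for name, (kind, data) in sorted(checks.items()):
        if name not in params:
            continue  # parameters that are not passed are not checked
        value = params[name]
        if kind == "file" and not os.path.isfile(value):
            errors.append("%s: %r is not a file" % (name, value))
        elif kind == "dir" and not os.path.isdir(value):
            errors.append("%s: %r is not a directory" % (name, value))
        elif kind == "set" and value not in data:
            errors.append("%s: %r is not one of %s"
                          % (name, value, ", ".join(sorted(data))))
    return errors
```

The point of the design is that adding a parameter means adding a table row rather than another hand-written snippet in each hypervisor class.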
Merge commit 'origin/next' into branch-2.1
Move more hypervisor strings into constants
This patch adds constants for the mouse and boot order strings; while there are still some issues remaining, we're trying to clean up hardcoded strings from the hypervisors.
Since the formatting of frozensets is currently wrong, we also add an...
watcher: try to restart the master if down
Bugs in either our code or in associated libraries can bring the master daemon down, and this (due to the 2.0 architecture) stops all work on the cluster.
Since the watcher already does periodic checks on the cluster, we modify...
IAllocator: export total disk size for instances
This patch adds a ‘disk_space_total’ key for current instances, similar to the key for the new instance in case of new allocations.
Add -H/-B startup parameters to gnt-instance
This patch modifies the start instance script, opcode and logical unit to support temporary startup parameters.
Different from 1.2, where only the kernel arguments supported changes (and thus xen-pvm specific), this version supports changing all...
call_instance_start: add optional hv/be parameters
This patch modifies the rpc.call_instance_start - the master side - to take optional hv/be parameters. The noded side is unchanged and oblivious to the change.
This will allow implementation of single-user capability and such on...
Fix gnt-job list argument handling
Currently QueryJob returns "None" when a wrong job ID is passed. Handle this in gnt-job list, by printing an error for each wrong job, and still giving output for all the jobs which actually do exist.
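A small sketch of the handling described above, assuming the query result is a list aligned with the requested job IDs, with None in place of each unknown job. The function name and the row format are hypothetical.

```python
def split_job_results(job_ids, results):
    """Separate existing jobs from wrong job IDs.

    Assumes results[i] is the row for job_ids[i], or None if that job
    does not exist (illustrative contract, not the luxi API).
    """
    found = []
    missing = []
    for job_id, row in zip(job_ids, results):
        if row is None:
            missing.append(job_id)  # caller prints one error per wrong ID
        else:
            found.append(row)       # caller still lists the valid jobs
    return found, missing
```

A caller would print an error line for every entry in the second list and then format the first list as usual.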
Instance reinstall: don't mix up errors
If the remote info rpc call fails we can't assume that the instance is up.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Don't check memory at startup if instance is up
2.1 design: add VNC console password changes
2.1 design: OS parameters
Initial design for the OS parameter changes proposed for 2.1.
Move HVM's device_model to a hypervisor parameter
This moves yet another hardcoded value to a hypervisor parameter. I removed the 64/32 difference as it doesn't seem valid to me - it's more of a local site config rather than arch config.
Implement the KERNEL_PATH parameter for xen-hvm
For the xen-hvm hypervisor, the KERNEL_PATH parameter is needed but today is hardcoded to a constant in the xen hypervisor library (argh!).
This patch moves this to a hypervisor constant with the default value...
2.1 design: propose redistribute config changes
This patch proposes a mini-design to improve redistribute-config and integrate it better with other logical units.
gnt-cluster modify: fix --no-lvm-storage
Currently doing a gnt-cluster modify --no-lvm-storage is silently ignored, as it passes a None value in vg_name, which is the same as not modifying that parameter. Explicitly set the passed value to '', so the non-true not-None value can be evaluated to actually remove a volume...
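The distinction between "not passed" and "remove" can be illustrated with a short sketch. The helper name and the choice of None as the stored value for "no volume group" are assumptions for the example, not taken from the patch.

```python
def apply_vg_name(current, requested):
    """Return the volume group name after a modify request.

    Illustrative semantics: None means the option was not given, any
    other false value (such as '') means remove, anything else is the
    new name.
    """
    if requested is None:
        return current      # parameter not passed: leave unchanged
    if not requested:
        return None         # '' passed: disable LVM storage (assumed repr)
    return requested        # a real name: switch to it
```

Passing None from the command line for --no-lvm-storage therefore hits the first branch, which is why the option appeared to be ignored.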
LUSetClusterParams: improve volume group removal
Currently LUSetClusterParams will remove the volume group if the vg_name field passed in is not true, but not None. Setting the target volume group to False or the empty string, though, is a bad idea because it's...
gnt-cluster info: show more cluster parameters
Even if we cannot modify all of them, they are useful information about the current cluster.
LUQueryClusterInfo: return a few more fields
Some fields can be set at cluster init, and perhaps even modified with SetClusterParams, but there's no way to know them. With this patch we export them in the cluster info query.
Specify another type of core changes
If a change modifies the way all/most LUs work it should also be considered core.
KVM: Abstract runtime file removal in a function
This removes some code which was duplicated in shutdown and migrate.
Move the glossary to a separate file
Currently we have an insignificant glossary at the end of the design-2.0 document. This patch moves it to a separate file with the goal that it will grow and all files can refer to it.
Some small doc updates
We change some formatting to sphinx-specific, to show how the documentation can be improved.
KVMHypervisor: return memory and cpus as integers
Currently the KVM hypervisor returns strings for the memory and cpu values, while the xen hypervisor returns integers. Make this uniform by converting the values to integers in KVM as well.
LUSetInstanceParam: don't assume memory is integer
LUSetInstanceParam currently assumes that the 'memory' value of a call_instance_info result is an integer, while the rest of the code explicitly converts it to int(). Converting it to int works around a...
Switch the documentation to sphinx
This big patch converts the documentation build system to sphinx (http://sphinx.pocoo.org/). Since that uses reStructuredText sources too, there is no change (yet) in the documents themselves, just in the build system....
Convert from auto-generated RAPI docs to static
This patch removes the autogeneration of the RAPI docs from the code (based on docstrings) and moves the current autogenerated output to the rapi.rst file.
The reasons behind this are multiple:
  - the build system becomes a little more simple (this could have been...
Add the new DRBD test files to the Makefile
These were forgotten in commit 01e2ce3a6e4ca68983f50dedaddd0d0fc7b77026, and caused “make distcheck” to fail.
Fix QA and documentation about no initrd case
In Ganeti 1.2, “none” was used to signify no initrd. In 2.0 we have changed to “no_” as a prefix (i.e. “-H no_initrd_path”) and thus we document this in the manpage.
The QA suite is changed accordingly.
Remove an unused function
The _TransformPath function is not used anymore in 2.0, let's remove it.
Exporting the instance network_port on the RAPI
Patch for adding network_port to the instance attributes exported by the RAPI.
[iustin@google.com: slightly changed the formatting]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Minor patch to rapi documentation
Minor patch to clarify the URL necessary for accessing the RAPI.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Small doc change in README
The version is 2.0, and we don't build PDFs by default, only HTML files.
Remove some superfluous imports
This is for Python 2.6 compatibility.
Make Python interpreter selectable for test scripts
The Python interpreter used to run the test cases is hard-coded to be /usr/bin/python. If we use the first one from $PATH instead, it is much easier to test ganeti with other Python versions.
Inform the OS create script of reinstalls
Sometimes reinstalls are slightly different than new installs. For example certain partitions may need to be preserved across reinstalls. In order to do that on a per-os basis we pass in the INSTANCE_REINSTALL variable to inform the create script about when a reinstall is...
Add initial 2.1 design doc
This document contains a skeleton for the 2.1 design process. For now it just has introductory paragraphs and a structure for the various areas' design, but some sections still don't have a text, as we're still in the early design phases....
Pass optional arguments to the daemons
These can be set in the defaults file, default to no arguments being passed, and make it easy for local installation to customize the way the ganeti daemons are called.
ganeti.initd: include defaults file, if present
In the example init script we'll execute an optional defaults file to make it easier to add local customizations to the ganeti startup.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin <iustin@google.com>
Fix ;; indentation in the main initd loop
Currently two of the ;; ending the case bodies are not indented at all. Reindent all of them to the body of the loop, as is done elsewhere in the init script.
Avoid DeprecationWarning on Python >= 2.6
Python 2.6 complains about module 'sha' being deprecated. It makes execution of Ganeti commands a bit annoying, and when you run 'ganeti-watcher' in cron jobs, you get a mail message after every execution.
Tests pass under Python 2.6 and Python 2.4....
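One common way to avoid this warning while staying compatible with older interpreters is a guarded import, sketched below. Whether the patch does exactly this is an assumption; hex_digest is an illustrative helper.

```python
try:
    # hashlib is available from Python 2.5 onwards and does not warn.
    from hashlib import sha1
except ImportError:
    # Fallback for Python 2.4, where only the old 'sha' module exists.
    import sha
    sha1 = sha.new


def hex_digest(data):
    """Return the SHA-1 hex digest of a byte string."""
    return sha1(data).hexdigest()
```

Both hashlib.sha1 and sha.new accept the initial data as their first argument, so callers do not need to know which module was picked.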
ganeti-noded: add bind address option
This allows ganeti-noded to bind only on one interface rather than all the ones on the machine. The default behaviour doesn't change.
Fix compatibility with DRBD 8.2
This patch adds (and suppresses) the extra ipv4/ipv6 words before the actual address that newer DRBD versions add.
[iustin@google.com: slightly changed the patch to conform to style guide, and changed the commit message]
Signed-off-by: Iustin Pop <iustin@google.com>...
Fix compatibility with DRBD 8.3
DRBD 8.3 changes two more things compared to 8.2:
  - /proc/drbd format changed in multiple ways; the part we're interested in is the ‘st:’ to ‘ro:’ change (named “Renamed 'state' to 'role'” in the changelog)
  - “drbdsetup /dev/drbdN show” changed the ‘device’ stanza from:...
RunCmd: log command line for missing cmd case
In case of missing programs, currently utils.RunCmd doesn't show any information to help debugging, only 'No such file or directory'. This patch adds error handling for the ENOENT case such that at least we have...
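A minimal sketch of the idea, assuming the command is started with subprocess.Popen and that the ENOENT case is logged before the error is re-raised. The run_cmd function here is a simplified stand-in, not the real utils.RunCmd.

```python
import errno
import logging
import subprocess


def run_cmd(argv):
    """Run argv and return (exit code, stdout, stderr).

    Simplified stand-in for illustration only.
    """
    try:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError as err:
        if err.errno == errno.ENOENT:
            # Without this, the caller only sees "No such file or
            # directory" with no hint about which program was missing.
            logging.error("Can't execute '%s': not found (%s)",
                          " ".join(argv), err)
        raise
    out, err_out = proc.communicate()
    return proc.returncode, out, err_out
```

The exception still propagates, so existing callers keep their behaviour; only the log gains the command line.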
Abstract Linux node information in hv_base
Currently both hv_fake and hv_kvm implement practically identical code to get the node information. Since future container-like hypervisors will also need this functionality, this patch moves it into the base class (as a separate function) which can then be called from classes...
Fix argument checking in LUSetClusterParams
This patch fixes two issues with LUSetClusterParams and argument checking.
First, this LU used the wrong function name (CheckParameters instead of CheckArguments), which means that no parameter checking was done at all;...
Small optimisation in utils.WriteFile
Currently we always try to remove the new file, even if the rename succeeded. This patch tracks the existence of the new file and doesn't try to remove it if we managed to rename it.
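The pattern can be sketched as follows, assuming a write-to-temporary-then-rename scheme. The write_file function and the do_remove flag are illustrative names and leave out the ownership, mode and backup handling a real implementation would need.

```python
import os
import tempfile


def write_file(path, data):
    """Atomically replace path with data (bytes); simplified sketch."""
    dir_name, base_name = os.path.split(path)
    fd, new_name = tempfile.mkstemp(prefix=base_name + ".new",
                                    dir=dir_name or ".")
    do_remove = True  # the temporary file exists until it is renamed
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.rename(new_name, path)
        do_remove = False  # rename succeeded, nothing left to clean up
    finally:
        if do_remove:
            os.unlink(new_name)
```

On the success path this saves one unlink call that would always fail with ENOENT.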
Fix luxi serialization in ganeti-masterd
Currently, lib/luxi.py uses lib/serializer.py for encoding/decoding messages, but the master daemon uses the simplejson module directly. This is wrong as any non-trivial change to serializer.py will break the master daemon....
Allow gnt-debug submit-job to take multiple args
Currently “gnt-debug submit-job” takes a single argument and has non-trivial startup costs; in order to exercise the job system, it is better to be able to submit multiple jobs with a single invocation of the script....
Include node name in hypervisor validation errors
The current validation routine just says "failed", without specifying the node name. This is very confusing, and we should log the node name too.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>
Fix gnt-cluster getmaster on non-master nodes
The current implementation of “gnt-cluster getmaster” doesn't work on non-master nodes, which is a regression from 1.2. This patch implements it (again) via ssconf.
Release 2.0rc4
Reviewed-by: ultrotter
Update gnt-instance(8) for info
Add the --all argument, and reword a bit the basic information.
Reviewed-by: iustinp
gnt-instance info --all
Don't show all instances info by default, but require --all to be passed for this time-consuming operation.
LUDiagnoseOS: change locking and error handling
Since the “list OSes” call is exported via RAPI, this can be used pretty easily to DoS the master daemon during long jobs.
The implementation of LUDiagnoseOS makes an RPC call to all nodes; we lock nodes here in order to prevent node removal....
Fix verify-disks with broken volume groups
When a remote node returns invalid LVM data, we check it, but we don't stop; instead we continue with the rest of the checks (which require a valid volume group). This raises an internal error and breaks verify disks.
This seems unchanged for a long while, I don't know why it surfaced just...
Prevent errors when xenvg is broken cluster verify
When vg_name is not returned at all, we currently abort with an internal error. This is because we don't catch KeyError.
This patch adds a custom message for this case, and also adds KeyError to the list of caught exceptions, just for safety....
A bunch of doc and other small fixes
This patch fixes a couple of issues, reported both externally and internally:
  - missing SGML tags (Issue 54), report and patch by superdupont
  - wrong variable used in the init.d script, report and patch by Karsten Keil <karsten-keil@t-online.de>...
Trivial typo fix in error message
Release 2.0rc3
Burnin tests were successful, release rc3.
Reviewed-by: imsnah
Distribute built documentation
This patch changes the way documentation is built in order to distribute the generated output in the 'dist' archive, and thus no longer requiring the presence of the docbook/rst toolchains during build time. This will lower the requirements for installation and also makes the...
Disable synchronous (locking) queries
This patch raises an error in the master daemon in case the user requests a locking query; accordingly, all clients were modified to send only lockless queries. This is a short-term fix; for a proper fix the clients should be modified to submit a job when the user requests a...
Fix the output of watcher on non-master nodes
Currently the watcher spews error messages on non-master nodes. This cleans it up.
Change the watcher to use jobs instead of queries
As per the mailing list discussion, this patch changes the watcher to use a single job (two opcodes) for getting the cluster state (node list and instance list); it will then compute the needed actions based on...
Fix Xen soft reboot via polling
This patch fixes the Xen soft reboot ("xm reboot") by polling for a specific time for either a changed domain ID or decreased CPU run-time.
This should prevent the race conditions discussed on the mailing list for reboots....
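The polling approach can be sketched as below. Everything here is an assumption made for illustration: the wait_for_reboot name, the get_state callable returning a (domain ID, CPU time) tuple or None, and the retry count and delay.

```python
import time


def wait_for_reboot(get_state, old_id, old_cpu, retries=15, delay=1.0,
                    sleep=time.sleep):
    """Poll until the domain looks rebooted; return True on success.

    get_state is a hypothetical callable returning (domain_id, cpu_time)
    or None while the domain is not visible.
    """
    for _ in range(retries):
        state = get_state()
        if state is not None:
            new_id, new_cpu = state
            # A new domain ID or a CPU time lower than before means the
            # guest has restarted since the reboot was requested.
            if new_id != old_id or new_cpu < old_cpu:
                return True
        sleep(delay)
    return False
```

Bounding the loop by a fixed number of retries keeps the caller from blocking forever when the guest ignores the reboot request.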
Add a new ssconf file with the cluster tags
Since the cluster tags are/should be more-or-less static, add them as an ssconf key, so that querying them is possible without creating a job/requiring the masterd to be running.
Add some more debugging info to masterd
This patch will log data about queries, which are today completely invisible (at the default log level) in the master log file.
Release 2.0rc2
This updates the NEWS file and bumps up the version number.
Fix _NOQUOTE regexp
Allow expressions longer than one character to match.
Mainloop: avoid calculating timeout every time
set timeout_needs_update to False after calculating the timeout.
Raise on invalid gnt-cluster queue commands
kvm: use the correct vnc bind address
There is a bug in kvm: when binding vnc to a specific address, the constant 'vnc_bind_address' is passed in instead of the actual requested address. This patch fixes it.
Add the 2.0-specific node flags to the design doc
This patch adds the newly-introduced node flags to the design document,as they currently are missing from there.
The patch also reduces the TOC depth to 3, as it was too big.
Fix the --net option to gnt-instance add
Similar to the --disk fixes a while ago, --net is broken too. This patchfixes it.
Xen: Remove one hardcoded constant
s/"vnc_bind_address"/constants.HV_VNC_BIND_ADDRESS/
watcher: fix startup sequence locking the master
Currently, the watcher startup sequence does:
  - open a luxi client
  - get the instance list
  - get the node boot ids
  - open and lock the status file, and:
    - archive jobs
    - restart the down instances...
Handle ghost instances in temp DRBD map
Currently cluster-verify doesn't handle the (admittedly invalid) case where we have reservations for instances that were removed in the meantime.
This patch adds a check for this and prevents code errors in cluster-verify in...
Fix error handling in replace-disks with new node
Currently the _CreateSingleBlockDev function only raises OpExecError and not BlockDeviceError. This means that we don't release the instance's temporary minors properly, and this creates problems later if the instance is removed...