Fix error handling in replace-disks with new node
Currently the _CreateSingleBlockDev function only raises OpExecError and not BlockDeviceError. This means that we don't release the instance's temporary minors properly, and this creates problems later if the instance is removed...
Fix serial_no field on instances
The instance objects did not get a serial_no field. This patch adds a new constant for the field name and uses it for all three cases (cluster, nodes, instances).
Reviewed-by: imsnah
Update gnt-cluster(8) for be/hyp parameter syntax
Now it displays:
--hypervisor-parameters hypervisor:hv-param=value [ ,hv-param=value ... ]
--backend-parameters be-param=value [ ,be-param=value ... ]
Sorry for the super-long lines :( Is there a better way to insert spaces...
Complete the cfgupgrade script for 2.0 migrations
This patch makes the cfgupgrade script handle:
  - instance changes
  - disk changes
  - further cluster fixes
  - configuration checks at the end, in non-dry-run mode
Reviewed-by: ultrotter
First run at cfgupgrade for 2.0 upgrades
This patch makes cfgupgrade work on an empty cluster (i.e. no instances), up to the point that the config file can be converted from 1.2 to 2.0. This is not yet complete, though.
Fix bash completion for cluster copyfile/command
“copyfile” takes a file argument, so we enable file-completion for it. “gnt-cluster command” takes a command, so we enable command completion.
Release 2.0rc1
This patch updates the NEWS file and increases the version to 2.0 rc1.
Export tags to cluster verify hooks
This patch exports the cluster and node tags to the cluster verify hook scripts. The tags are exported as a space-separated list, which allows easy parsing from the shell (e.g. “for tag in $GANETI_CLUSTER_TAGS; do...”) and therefore requires the previous “Don't allow spaces in tag...
Don't allow spaces in tag names
This patch restricts the use of spaces in tags, as this does not allow nice exporting of tags to the environment in hooks. One can use underscores or dashes instead of spaces.
Reviewed-by: schreiberal
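A minimal sketch of why the two tag patches above depend on each other, assuming hypothetical helper names (validate_tag and export_tags are illustrations, not Ganeti functions; only the GANETI_CLUSTER_TAGS variable name comes from the messages above):

```python
def validate_tag(tag):
    """Reject tags containing whitespace (hypothetical check, not Ganeti's own)."""
    if any(ch.isspace() for ch in tag):
        raise ValueError("tag %r contains whitespace" % tag)
    return tag


def export_tags(tags):
    """Join validated tags into one space-separated string for a hook environment."""
    return " ".join(validate_tag(tag) for tag in sorted(tags))


# Example environment entry; the tag values are made up for illustration.
hook_env = {"GANETI_CLUSTER_TAGS": export_tags(["web_frontend", "rack-12"])}
```

Because no tag can contain a space, splitting the exported value on whitespace (as a shell "for" loop does) recovers exactly the original tags.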
Update the iallocator documentation
This updates the iallocator documentation to 2.0, bumps up the iallocator version (and moves a constant to lib/constants.py), and fixes a style issue in install.rst.
Fix a bug in utils.EnsureDirs
This fixes a bug introduced in rev 2562 and also fixes the indentation.
A doc update and a small indentation fix
This adds a small paragraph about the “master” role of a node, and fixes a wrong indentation in the bash completion file.
Use EnsureDirs in KVM as well.
The KVM hypervisor also has code to ensure that a list of directories exists. Substitute it with our new utils function.
Reviewed-by: iustinp
Create runtime dir in bootstrap
Some hypervisors (KVM) need RUN_GANETI_DIR to exist even at cluster init time. This patch creates it in InitCluster just before hv parameter checking. Since the code to make a list of directories is already repeated twice in the code, and this would be the third time, we abstract it into...
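The kind of helper described above can be sketched as follows. This is an illustration under stated assumptions, not the actual utils.EnsureDirs code: the function name ensure_dirs, the (path, mode) argument shape, and the error type are all hypothetical.

```python
import errno
import os


def ensure_dirs(dirs):
    """Create each (path, mode) directory if missing and enforce its mode."""
    for path, mode in dirs:
        try:
            os.mkdir(path, mode)
        except OSError as err:
            # An already existing entry is fine; anything else is a real error.
            if err.errno != errno.EEXIST:
                raise
        if not os.path.isdir(path):
            raise RuntimeError("%s exists but is not a directory" % path)
        # mkdir is subject to the umask, so set the mode explicitly.
        os.chmod(path, mode)
```

Calling it repeatedly with the same list is harmless, which is what lets several callers (bootstrap, a hypervisor, a daemon) share one helper.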
LUVerifyCluster: Handle the "no volume group" case
If we're only file based and our volume group is set to "None" there's no point in asking nodes for their volume groups, logical volumes, and drbd devices, and checking those.
Convert the RAPI document to restructured text
This patch changes the RAPI document, and the RAPI resources autogenerated-documentation to restructured text. This meant changing the autogen tool.
The new fragment can be included via RST directives, and doesn't need...
Fix some epydoc style issues
99% of the epydoc return tags are "@return:", but each of the modified files had one "@returns:" line. We fix this for consistency.
Convert the install document to restructured text.
This switches back to the hardcoding of the version number, as we don't yet have a wrapper for rst files that passes them through replace-sed-vars.
Fix the Makefile after the bash_completion patch
I've somehow left these two out. Sorry!
Add bash-completion rules
This is an incomplete bash completion file for ganeti commands (gnt-*) and the burnin tool. It is based on previous work by Minghua Ye <yeminghua@google.com> for Ganeti 1.1, which wasn't used because of the lack of ssconf keys (which allow easy inspection by the shell of the...
Fix typos in utils.WriteFile's docstring
Fix mixed pvm/hvm clusters and instance listing
The current implementation of the combining of the instance lists will only do this for instances for which all four fields match in both hypervisors; however, this is broken for the dynamic fields (state, times) which can change between the invocations of the two different...
Fix xen-hvm and KERNEL_ARGS
xen-hvm doesn't have KERNEL_ARGS, and I just blindly changed all old extra_args to HV_KERNEL_ARGS. This makes xen-hvm work again.
Update some version-related constants
Since we are quite close to final RPC and hooks APIs, we update the hooks and protocol_version constants.
Convert the hooks document to restructured text
This also updates the hooks document to 2.0.
Update some hooks settings
While reviewing the hooks document, I realised we are not correctly exporting the instance properties.
This patch:
  - exports the disk and disk template in all LUs, not only (hardcoded) in the instance create
  - removes the instance create INSTANCE_ prefix on some non-instance...
Remove the extra_args parameter in instance start
This patch removes the extra_args parameter and instead switches the instance to the HV_KERNEL_ARGS hypervisor option.
This is a big change, but it's a needed cleanup: this extra parameter on all RPC calls is not generic and we also need to have a persistent value...
Simplify a little the hypervisor routines
Instead of “instance.hvparams”, we use the shorter name “hvp” to improve readability.
Add definitions for the root_args hypervisor param
This patch adds a new hypervisor parameter for the hypervisors that can actually start an instance with external kernels.
Convert iallocator.sgml to restructured text
This is a no-contents change; this doc will need an update to conform to 2.0 message contents (and the code will also need an increase to version 2 of the iallocator protocol).
Convert the admin guide to restructured text
The RST format holds a little bit less information, as all the <file class="directory"> and <userinput> tags are gone, however we're not really losing important context here. And it's way easier to read and update....
gnt-instance info: remove hvattr descriptions
Having hvattr descriptions is only confusing for the user, because even if they explain better what an attribute is about, they don't help in deciding what keyword should be used to actually set it. If in the...
Make gnt-instance info work with offline nodes
This simply makes LUQueryInstanceData return the same information as for a static query when one or both of the nodes are down.
dumb-allocator: avoid allocating on drained nodes
This was forgotten when drained nodes were added.
Also generate HTML format for the man pages
This would help in generating online-viewable docs, which could link to the man pages.
Update version numbers to beta2
Note that the RAPI change is in a docstring (i.e. example), not in code.
Show more details for failed xen commands
This patch also logs the output of the xm commands in case of failures; some corner cases were forgotten in the last redo.
Update the install and admin documents
This is not a real update, just a quick pass changing the obvious parts.
QA: add support for burnin rename
This patch adds support for optionally doing the rename burnin test, and adds an example to the sample QA file. To disable, either remove or specify an empty rename target.
Fix some bugs in reboot
There are two issues fixed in this patch:
  - first, the recent RPC changes caused loss of data in hard reboot type; we weren't reporting any results from the stop/start instance calls;
  - second, in soft or hard reboots, we didn't initialize the disk...
Burnin: fix rename
In rename, we must stop different names in the first and second phases, so we create two different opcodes for this purpose (instead of using the same one twice, which doesn't work).
Update NEWS for beta 2
Convert IOErrors for /proc/drbd into our errors
If /proc/drbd can't be opened, this raises an IOError, but all the error-handling behaviour in backend treats only BlockDeviceErrors. This creates a plain failure in cluster verify and in other RPC calls.
This patch simply converts EnvironmentErrors into BlockDeviceErrors, and...
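The conversion described above can be sketched like this. It is only an illustration: BlockDeviceError is redefined locally as a stand-in, and the helper name read_proc_file is hypothetical rather than taken from the Ganeti code.

```python
class BlockDeviceError(Exception):
    """Stand-in for the block device error type named in the message above."""


def read_proc_file(path="/proc/drbd"):
    """Return the lines of a status file, converting I/O failures."""
    try:
        with open(path) as handle:
            return handle.read().splitlines()
    except EnvironmentError as err:
        # Callers only handle BlockDeviceError, so translate the low-level
        # error instead of letting it escape as a plain failure.
        raise BlockDeviceError("Can't read %s: %s" % (path, err))
```

With this shape, code that already catches BlockDeviceError needs no changes to cope with an unreadable status file.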
DEVNOTES: we have no --enable-rapi anymore
Remove it from the suggested development ./configure line
Convert default root partition to msdos style
As discussed, with 2.0 the msdos partition style should be the default in the instance OS, so we're changing the default instance params accordingly. A followup patch will update the debootstrap os.
watcher: fix checking of boot IDs
The recent change (commit 2151) to the watcher to make it handle offline nodes also saves the offline attribute to the state file, but this is not needed and also breaks the checking of the boot ID. This patch simply removes it, restoring the correct behaviour....
watcher: autoarchive old jobs
This patch adds auto-archiving of jobs older than 6 hours to the watcher.
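A rough sketch of the age check behind such auto-archiving, under stated assumptions: the function name, the (job_id, end_time) input shape, and measuring age from the job's end time are all hypothetical; only the six-hour threshold comes from the message above.

```python
MAX_JOB_AGE = 6 * 3600  # six hours, in seconds


def jobs_to_archive(jobs, now, max_age=MAX_JOB_AGE):
    """Return IDs of finished jobs that ended more than max_age seconds ago.

    jobs is an iterable of (job_id, end_time) pairs; end_time is a Unix
    timestamp, or None for jobs that have not finished yet.
    """
    return [job_id for job_id, end_time in jobs
            if end_time is not None and now - end_time > max_age]
```

Jobs that are still running (no end time) are never selected, regardless of how long ago they were submitted.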
RAPI: documentation updates
This patch fixes the version and makes some updates to the RAPI resources docs.
RAPI: fixes related to write mode
This patch fixes many small issues related to write functions:
  - update documentation w.r.t. how to add users
  - update the instance add function for latest API
  - add instance delete
  - fix addition of tags
  - update some error messages...
Some small improvements to the fake hypervisor
This patch modifies the fake hypervisor to subtract the memory “used” by “running” instances from the free memory, so the actual node information changes based on the running instances.
Also some style changes and fixes are added....
Implement the backward-compatible ‘-s’ disk option
This patch adds back to the instance creation commands (gnt-instance add, gnt-backup import) the ‘-s’ short form option for specifying a single-disk instance.
Also a small bug in gnt-backup import is fixed....
SetInstanceParams: export nic changes to hooks
Currently we export the old instance "as is" and any nic changes get lost, so hooks won't know of a different ip, bridge, or mac address. This patch fixes it by putting the nics in the override dict, if any changes are done....
Remove two fixed FIXME and convert one to TODO
The cli FIXME is not something broken, but rather some better handling feature we'd rather have, and the two backend FIXME are done (disks have their read only parameter set, and the error is raised and thus reaches...
RAPI: format error messages as JSON
This patch changes the format of the HTTP error messages from text/html, whichis hard to parse from RAPI clients, to JSON which can be automatically parsed.
The error message is an object which always contains three keys:...
Make RAPI return 502/504 errors for luxi errors
This changes the RAPI error codes for luxi errors; a timeout error is now reported properly as 504, while any other luxi error is reported as 502.
It would be good to convert even more errors into proper return codes in...
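The mapping described above can be sketched as follows. The exception classes and the function name are hypothetical stand-ins, not the real luxi module; only the 502/504 split comes from the message.

```python
class LuxiError(Exception):
    """Stand-in for a generic error talking to the master daemon."""


class LuxiTimeoutError(LuxiError):
    """Stand-in for a timeout while talking to the master daemon."""


def http_status_for(err):
    """Map a luxi-level error to an HTTP status code."""
    # Check the more specific class first, since it subclasses LuxiError.
    if isinstance(err, LuxiTimeoutError):
        return 504  # Gateway Timeout
    if isinstance(err, LuxiError):
        return 502  # Bad Gateway
    # Anything else is not a luxi error and is left to other handlers.
    raise err
```

Both codes describe RAPI acting as a gateway in front of the master daemon, which is why they fit better than a generic server error.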
Fix ganeti-rapi startup with missing certificate
This patch displays a nicer error message compared to the default stacktrace.
job queue: log the opcode error too
Currently we only log "Error in opcode ...", but we don't log the error itself. This is not good for debugging.
LUSetInstanceParams: Fix nic handling
  - CheckArguments: Use constants.VALUE_NONE rather than hardcoding the string "none"
  - If we're adding a nic, fill the nic_dict with default values
  - Check if the mac is syntactically valid, if we have one
  - Don't allow the mac to be 'auto' when modifying a nic...
ConfigWriter.AddInstance check instance mac
There is a race condition in CreateInstance, since the mac address is generated early and only added to the config (and thus really assured to be unique) only at this point. Since it's possible that another instance...
Instance Creation: generate nics earlier
We want the real nic to be shown to the hooks and the allocators, so we'll generate them in CheckPrereq. We also write a comment about the race condition we generate. This race condition existed even before, so moving this generation will just lengthen it a bit. A separate patch...
Handle better broken disks
While running burnin:
  File "/usr/lib/python2.4/site-packages/ganeti/objects.py", line 497, in str
    val += ", size=%dm)>" % self.size
TypeError: int argument required
This happened while handling another error, so we lose the original...
Update the command line scripts man pages for 2.0
This patch updates the gnt-* scripts to show the new 2.0 syntax. It's not guaranteed to be 80% complete.
Some command line scripts fixes
This patch changes the gnt-node and gnt-job list commands to accept arguments and list only the selected items, which is useful when having many nodes or jobs.
It also removes the “--units” option from gnt-job list as we don't...
Do not check 'None' disk IDs for duplicates
In case of 'None' logical or physical IDs, we don't need to check them for duplicates. This case can happen for DRBD devices in case of newly added disks, for example.
Prevent race condition on MAC addresses
This patch adds a temporary set for MACs that have been requested but are not yet in the configuration (as part of an instance NIC). The MACs of an instance are automatically removed from this set when the instance...
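A minimal sketch of the temporary-set idea, under stated assumptions: the class and method names are hypothetical, and because the message above is cut off before saying when entries leave the temporary set, the sketch assumes that happens when the instance is stored in the configuration.

```python
import threading


class MacRegistry(object):
    """Track configured MACs plus MACs handed out but not yet stored."""

    def __init__(self, configured=()):
        self._lock = threading.Lock()
        self._configured = set(configured)
        self._temporary = set()

    def reserve(self, mac):
        """Reserve a MAC, refusing one that is configured or already reserved."""
        with self._lock:
            if mac in self._configured or mac in self._temporary:
                raise ValueError("MAC %s is already in use" % mac)
            self._temporary.add(mac)

    def commit(self, macs):
        """Assumed step: move an instance's MACs into the configuration."""
        with self._lock:
            for mac in macs:
                self._temporary.discard(mac)
                self._configured.add(mac)
```

The point of the extra set is that a second request arriving between generation and storage sees the MAC as taken, closing the window described in the earlier messages about instance creation.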
Always use the same short option for iallocator
This patch changes the scripts so that the short name for the “--iallocator” option is always ‘-I’.
Some batcher fixes
Currently the batcher hypervisor parameter must be a dict with one element (e.g. {"xen-hvm": { "acpi": true }}). This is overly complex and hard to validate correctly; the patch splits it in two:
  - one "hypervisor" string parameter, with the name of the hypervisor...
Some small fixes
This patch removes the admin_ram LUQueryInstances field (it is broken anyway) and fixes the VNC address checks in the Xen Hypervisor.
Fix LUQueryInstances fields.
The query fields are now regular expressions. We need to quote the dots, otherwise invalid fields will be accepted but they will lose special formatting in the cli scripts.
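The effect of an unquoted dot can be shown with Python's re module. The field name "disk.size" and the helper field_matcher are made up for illustration and are not taken from the LUQueryInstances code.

```python
import re


def field_matcher(pattern):
    """Compile a field pattern so that it must match the whole field name."""
    return re.compile("^(?:%s)$" % pattern)


# Unquoted: the dot matches any character, so a bogus name is accepted.
unquoted = field_matcher("disk.size")
# Quoted: re.escape turns the dot into a literal dot.
quoted = field_matcher(re.escape("disk.size"))

accepts_bogus = bool(unquoted.match("diskXsize"))
rejects_bogus = not quoted.match("diskXsize")
```

Here accepts_bogus and rejects_bogus are both true: only the quoted pattern limits matches to the intended field name.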
Apply the right permissions to /etc/hosts
In the current Ganeti version when modifying /etc/hosts we mistakenly give it the permissions of the temporary file we create to define its content, which is by default 0600. This breaks most non-root applications, and thus must be corrected. This patch forces the mode to...
Fix RPC result handling in _AssembleInstanceDisks
For (status, data)-style RPC calls, the result data is in the ‘payload’ attribute. This was missed in the conversion patch, with the only side effect that gnt-instance activate-disks didn't show a nice output...
Man page updates for the ganeti daemons.
This patch adds new man pages for the master and RAPI daemons, and updates the node daemon and watcher man pages.
master daemon: allow skipping the voting process
This patch introduces a 'force' mode for the master daemon startup where the voting process is not done, but the user has to manually confirm the startup (before forking, of course).
Remove a duplicate line in sed_vars
LOCALSTATEDIR is added twice to the sed variables.
ConfigWriter: add checks for duplicate disk IDs
This patch adds a safety check for duplicate disk logical/physical IDs, in order to prevent possible software bugs.
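A small sketch of such a duplicate check, which also skips None IDs as described in the earlier "Do not check 'None' disk IDs for duplicates" message. The function name and the tuple-shaped IDs are assumptions for illustration, not the ConfigWriter implementation.

```python
def find_duplicate_ids(ids):
    """Return each ID that occurs more than once, ignoring None entries."""
    seen = set()
    duplicates = []
    for item in ids:
        if item is None:
            # Unassigned IDs (e.g. newly added disks) are not comparable.
            continue
        if item in seen and item not in duplicates:
            duplicates.append(item)
        seen.add(item)
    return duplicates
```

A caller could refuse to write the configuration whenever the returned list is non-empty.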
Switch the instance_shutdown rpc to (status, data)
This patch changes the return type from this RPC call to include status information and renames the backend method to match the RPC call name.
The patch is a little bigger than the reboot one, since this call is...
Switch the instance_reboot rpc to (status, data)
This small patch changes the return type from this RPC call to include status information and renames the backend method to match the RPC call name.
FileStorage: abort creating over an existing file
In FileStorage there is a
TODO: decide whether we should check for existing files and abort or not
After Ganeti ate my instance data I decided. Let's abort. In general there is no reason we should overwrite existing files, and...
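One way to get "abort if the file exists" behaviour is an exclusive create, sketched below. This is a swapped-in technique shown for illustration, not necessarily how the FileStorage code does it; the function name and the RuntimeError are hypothetical.

```python
import errno
import os


def create_new_file(path, size):
    """Create a file of the given size, refusing to touch an existing one."""
    try:
        # O_EXCL makes the open fail if the path already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except OSError as err:
        if err.errno == errno.EEXIST:
            raise RuntimeError("File %s already exists, not overwriting" % path)
        raise
    try:
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
```

Doing the existence check and the creation in a single system call avoids a window in which another process could create the file between the two steps.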
gnt-instance fix a typo in AddInstance
It's hvparams, not opts.hvparams.
_GenerateDiskTemplate: correct file disk index
Currently when adding disks the base for the index is not taken into account, and disk 0 is added twice.
gnt-cluster, pass hvparams directly to dict()
If hvparams is not set it will be [], so dict() will transform it to an empty dict, which is safe in all cases.
ganeti-noded: Create LOCK_DIR if missing
We need this directory for locks, so if for any reason it's not there we'll create it. The permissions are the standard /var/lock permissions.
HTS_USE_VNC, rename and remove KVM
Currently we use the HTS_USE_VNC constant only to copy the vnc password file. While KVM uses vnc, it currently has no password support, nor will we be in time to add it for 2.0, so we are renaming the constant to HTS_COPY_VNC_PASSWORD and only putting Xen HVM in it. In the future...
Sort instance data in gnt-node info
The patch sorts the instance list in gnt-node info output, in order to make it more readable (and stable).
Some fixes to node add and re-add
The patch changes the pre-checks in node-add and re-add:
  - if the node is not already in the cluster, refuse to re-add
  - when re-adding, reuse the secondary IP from the cluster configuration
  - when re-adding, reset the offline and drained flags, so that RPC...
Instance parameters: force typing
We want all the hv/be parameters to have a known type, rather than a random mix of empty string, boolean values, and None, so we declare the type of each variable and we enforce/convert it.
- Add some new constants for enforceable value types...
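The declare-then-enforce approach can be sketched as below. Everything here is an assumption for illustration: the TYPE_* constant names, the force_types helper, the accepted boolean spellings, and the "memory" parameter are hypothetical and may differ from what the patch actually does.

```python
# Hypothetical type markers; the real constant names may differ.
TYPE_BOOL = "bool"
TYPE_INT = "int"
TYPE_STRING = "string"


def force_types(params, types):
    """Return a copy of params with every value converted to its declared type."""
    result = {}
    for key, value in params.items():
        if key not in types:
            raise KeyError("Unknown parameter %r" % key)
        kind = types[key]
        if kind == TYPE_BOOL:
            if isinstance(value, str):
                lowered = value.lower()
                if lowered in ("true", "yes"):
                    value = True
                elif lowered in ("false", "no"):
                    value = False
                else:
                    raise ValueError("Invalid boolean %r for %s" % (value, key))
            else:
                value = bool(value)
        elif kind == TYPE_INT:
            value = int(value)
        elif kind == TYPE_STRING:
            value = "" if value is None else str(value)
        else:
            raise ValueError("Unknown type %r for %s" % (kind, key))
        result[key] = value
    return result
```

For example, force_types({"acpi": "yes", "memory": "512"}, {"acpi": TYPE_BOOL, "memory": TYPE_INT}) yields a dict with a real boolean and a real integer, so later code never has to guess which representation it received.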
Implement modification of the drained flag
This patch adds LU and cli-level support for modification of the node drained flag. It is similar to the offline changes.
Prevent allocations on drained nodes
This patch adds checks for drained nodes in the logical units that allocate or move instances around. We also update an error message (which was not style-compliant).
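The check amounts to treating drained nodes like offline ones when choosing a target. A minimal sketch, assuming a hypothetical dict-based node description rather than the real node objects:

```python
def allocatable_nodes(nodes):
    """Return the sorted names of nodes that may receive new instances.

    nodes maps a node name to a dict of flags; missing flags count as False.
    """
    return [name for name, flags in sorted(nodes.items())
            if not flags.get("offline") and not flags.get("drained")]
```

A logical unit could then reject a requested target node that is absent from this list before doing any work.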
cluster verify: show correctly drained nodes
This patch slightly changes the output of gnt-cluster verify for drained nodes, and also adds a note with the total number of drained nodes (similar to the offline nodes note).
ConfigWriter: handle the drained node flag
This patch changes the master candidate pool computations in ConfigWriter to properly handle drained nodes. They are now excluded from counting towards the reachable number of candidates.
The patch also adds verification of consistency for the node status....
burnin: do not use drained nodes
This patch updates burnin not to use drained nodes (similar to the handling of offline nodes).
dumb allocator: do not use drained nodes
This patch changes the dumb allocator not to use drained nodes (similar to offline nodes).
Allow query of the drained node attribute
This patch exports the drained attribute:
  - LUQueryNodes now accepts the drained field
  - RAPI exports it for node objects
  - gnt-node info now shows it (along with the newly-added master_candidate and offline flags)...
Add a ‘drained’ attribute to node objects
This attribute will be used to prevent any allocation on the node (any of replace-disks with this node as the new secondary, failover to the node, migration to the node).
The patch adds the attribute and initializes it correctly in cluster...
Some error message cleanups
Cleanup of DRBD8._CheckMetaSize
This patch converts the _CheckMetaSize method to raise exceptions instead of logging and returning False. This fits now in the new rpc return types, so it's a cheap change.
Change the disk assembly to raise exceptions
This big patch converts the bdev Assemble() methods and the supporting functions to raise exceptions instead of returning False. This is a big patch, since the assembly functions touch other functions: add children,...
Change BlockDev.Remove() failure result
Currently, the Remove() methods of block devices return True/False. This doesn't permit any error detail reporting.
This patch changes the return type to None for success, and raises BlockDeviceError in case of failure. This permits the details to be...
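The pattern in these conversion patches (raise with details at the device level, report (status, data) at the RPC level) can be sketched as follows. FakeDevice, rpc_blockdev_remove, and the locally defined BlockDeviceError are hypothetical stand-ins, not the bdev or backend code.

```python
class BlockDeviceError(Exception):
    """Stand-in for the block device error type named in the message above."""


class FakeDevice(object):
    """Toy device whose removal can be made to fail."""

    def __init__(self, removable=True):
        self._removable = removable

    def remove(self):
        """Return None on success; raise BlockDeviceError with details on failure."""
        if not self._removable:
            raise BlockDeviceError("Can't remove device: it is still in use")


def rpc_blockdev_remove(device):
    """Wrap the device call into a (status, data) result."""
    try:
        device.remove()
    except BlockDeviceError as err:
        return (False, str(err))
    return (True, None)
```

Compared with a bare True/False, the failing case now carries the reason, so the caller can show it to the user instead of a generic failure.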
Switch the blockdev_remove rpc to (status, data)
This converts the backend and cmdlib modules to a (status, data) implementation of the blockdev_remove rpc call. bdev.py is not yet converted, so we don't actually have error information.
We also fix a bug in _RemoveDisks by not reusing a variable....
Change BlockDev.Shutdown() failure result
Currently, the Shutdown() methods of block devices return True/False. This doesn't permit any error detail reporting.