Clarify job ID-related type checks, add unittests
Instead of a rather complicated expression only “JobId” is output. JobID lists (like generated by “SubmitManyJobs”) are limited to two-itemlists. Unittests are added.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Change OpClusterVerifyConfig's result, verify results
This patch removes the list of node groups (not used anymore sincecommit fcad7225e3fc) from OpClusterVerifyConfig's result and adds resultverification to all OpClusterVerify* opcodes.
Fixed error in Makefile.am, changing spaces with tabs
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Added the test data files for netutils to Makefile.am
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Use LU-generated jobs for verifying cluster
This patch moves the logic for verifying the various node groups in acluster into the master daemon. Job dependencies are used to ensure theconfiguration, which requires the BGL, is verified first.
With this change it will be possible to expose whole-cluster...
opcodes: Use variables for verification parameters
Just some cleanup before the 2.5 release.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
mcpu: Specify actual received type on opcode issue
This helped me debug an issue with opcodes.
Use resource kind as OpQuery*'s description
This gives a hint as to what's queried. “QUERY” or“QUERY” are way better than just “QUERY”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: René Nussbaumer <rn@google.com>
Added helper functions in netutils and related constants
Added the following functions to netutils:- IsValidInterface- GetInterfaceIpAddresses- _GetIpAddressesFromIpOutput
Added the following static methods to netutils.IPAddress:- GetAddressFamilyFromVersion...
Fix epydoc error in rlib2.py
I blindly assumed epydoc would use normal reST, but turns out it usesits own “epytext” in our configuration. Since the latter doesn't supportblockquotes, I just make the paragraph a literal block.
Fix typo in rlib2's docstring
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Benjamin Lipton <benlipton@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Documentation fixes and clarification
- In README, refer to “install.rst”, not “install.html”- In rapi.rst, wrap line longer than 72 characters- In rlib2.py, update and clarify description of POST vs. PUT
gnt-instance: Rename SHUTDOWN* to EXPAND*
Once upon a time these constants were only used for stopping instances,but pretty soon they became more useful. Let's rename them.
List returned fields in RAPI documentation
Also replace console types with constants.
rlib2: Exclude oplog/opresult from bulk job list
These fields can get rather large. Excluding them from the big bulk listreduces the amount of data. They are still available via per-jobrequests.
rlib: Expose node group tags
Commit 1ffd26739d3 added support for tagging node groups. Also add acheck for exposed fields.
rapi: Bulk support for jobs
This was requested in issue 181.
Fixed an error in the documentation of _GetKVMVersion
Fixed an epydoc compilation error that I introduced with last commit.
Mention globbing filters in ganeti(7) manpage
Removed code duplication for calls to _GetKVMVersion
Fix epydoc breakage caused by f8638e288c7a
Changed NET_PORT_CHECK to REQ_NET_PORT_CHECK, to improve consistency
I originally made this change because I needed the OPT_NET_PORT_CHECK,and I am committing it even if I don't need anymore OPT_NET_PORT_CHECKbecause IMO it improves the consistency of the name of the wrappers....
Added check for the ip command at configure time
Also, corrected a few places where the ip command was hardcoded.
Detect globbing patterns as query arguments
Short: this patch enables the use of “gnt-instance list '*.site'”.
Detailed description: This patch changes the command line interface codeto try to deduce the kind of filter from the arguments to a “list”command. If it's a list of plain names an old-style name filter is used....
cluster-merge: implement params delta mercifulness
Sometimes it's good to tell the user about parameter differences butthen proceed anyway. Strictness is still enforced for those parametersthat would break the cluster (volume group name, storage dir if file...
cluster-merge: consider file storage enable state
There's no point in checking whether the file storage dir in the twoclusters is the same if file storage is not even enabled
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Allow fixing of split instances via relocate
Currently, the IAllocator code requests strictly that the (set of) groups ofthe nodes we're relocating from is equal to the set of groups we'rerelocating to.
This, however, makes is impossible to fix split instances, since (by...
Revert deprecation of evacuate mode in hail
As discussed offline, the new node-change mode could be used forevacuation, but it's not directly useful as it returns a list ofopcodes; therefore, we need to partially revert commits fbe5fcf and5b53ca7 that removed it (and multi-evacuate, which remains removed)....
Further cleanup after multi-evacuate removal
Commit f0edfcf6 removed the parsing of multi-evacuate result, but thecode went from:
if mode in (multi-evac, relocate): … if mode relocate: …
to:
if mode relocate: … if mode == relocate...
Fix bug in IAllocator parsing of Evacuate result
Commit 342f9172 added stricter checks for the iallocator result inevacuate mode, but it does this irrespective of the resultstatus. When the result has failed and (according to the design) thelist of nodes is empty, this code will trigger the following:...
Implement globbing operator for filters
The operators “=*” and “!*” do globbing in filters, e.g.:
$ gnt-instance list --no-headers -o name 'name =* "*.site"'inst1.site.example.com
Zero DRBD metadata before creation
The docstring of the DRBD8 class says:
… The meta device is checked for valid size and is zeroed on create.
which is not done today, hence we havehttp://code.google.com/p/ganeti/issues/detail?id=182:
node1# mkreiserfs -f /dev/xenvg/t8...
Remove iallocator's “multi-evacuate” mode
It is no longer used and has been deprecated in 2.5.
confd.querylib: Remove long-deprecated query mode
This was never used by a stable version.
Add docstring to cmdlib.TLReplaceDisks._FindFaultyDisks
watcher: Fix breakage caused by 9bb69bb52fb9
The first argument to str.split is the separator, not the maximum numberof splits.
LUGroupVerifyDisks: Use _CheckInstanceNodeGroups' result
… instead of getting the list of instances once again from theconfiguration.
cmdlib: Factorize checking node groups' instances
Include hooks.rst in version check
Bump version to 2.5.0~beta1
watcher: Write per-group instance status, merge into global one
Each per-group watcher process writes its own instance status file. Oncethat's done it tries to acquire an exclusive lock on the global file andwill proceed to read all status file, merging them based on each file's...
cleaner: Remove watcher's instance status file after 21 days
utils.ReadFile: Add pre-read callback
This will be used by the watcher to store the file's fstat(2). It mustbe done from the filehandle.
Merge branch 'stable-2.4'
Signed-off-by: René Nussbaumer <rn@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Bumping version to 2.4.3
Fixed a typo in utils/process.py
Signed-off-by: Agata Murawska <agatamurawska@google.com>Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Fix unittest failure after list_owned changes
We just need an object that has a list_owned method.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Remove 15-second sleep from LUInstanceCreate
Remove 15 second sleep when wait_for_sync is not set. LUInstanceCreate alreadycalls _WaitForSync with oneshot=True, which already performs an internalwait-loop for disks to start syncing.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>...
Add a readability alias
lu.glm.list_owned becomes lu.owned_locks, which is clearer for thereader.
Also rename three variables (which were before named owned_locks) tomake clearer what they track.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: René Nussbaumer <rn@google.com>
Fix broken object references in docstrings
The module is called “objects”, not “object”.
Add “gnt-instance change-group” command
Add opcode to change instance's group
This is quite similar to evacuating a group, but the lockingis different.
Factorize checking instance's node groups
Update the NEWS file for 2.4.3
Signed-off-by: René Nussbaumer <rn@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
ganeti-cleaner: Remove old watcher state files
Watcher state files can stay around if node groups are removed. Withthis patch they're removed after 21 days.
Remove WATCHER_STATEFILE constant
cfgupgrade: Remove old watcher state file
ganeti-watcher: Split for node groups
This patch brings a huge change to ganeti-watcher to make it aware ofnode groups. Each node group is processed in its own subprocess,reducing the impact of long-running operations.
The global watcher state file, $datadir/ganeti/watcher.data, is replaced...
Lock potential target nodes for group evacuation
All potential target nodes should be locked while calculatinga group evacuation.
Small changes in group evacuation
- Use OpPrereqError in CheckPrereq- Clarify command synopsis
cmdlib: Factorize getting iallocator
The same logic will be used for changing an instance's group.
Add design document for Ganeti 2.5
Including the designs which were actually implemented.
Pause DRBD sync for OS install if not wait_for_sync
When wait_for_sync is set to False in LUInstanceCreate, Ganeti lets DRBD syncin the background while performing the rest of the installation steps,including OS installation.
However, OS installation is a very disk-intensive task that intereferes badly...
Fix documentation of gnt-instance failover
Explain that we only start the instance on the new node if it wasoriginally running.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Small doc patch for gnt-node evacuate
Just explain a bit the relation between node evacuate and instancecommands.
Fix small typo in docstring
Fix typo in NEWS
“--dry-run” starts with two dashes.
Change the backend.InstanceLogName signature
This uses now the component for the transfer (if available), otherwise(e.g. in installs/renames) nothing.
Instance transfer: export component name to backend
This modifies the RPC layer to export the component name too to thebackend, so that it can be used in log files and messages.
Instance transfer: add argument for the 'component'
Currently, transfer data is done mainly with just the instance name,but when we have instances with multiple disks this is not enough todistinguish between the different transfers being done for theinstance....
Optimise use of repeated/looping GetInstanceInfo
Similar to the previous patch, this adds a helper function toeliminate repeated calls info ConfigWriter.
Optimise use of repeated/looping GetNodeInfo
This adds a new ConfigWriter.GetMultiNodeInfo function and replacesmultiple/looping calls to GetNodeInfo with it.
Fix lint errors
It turns out that the only use of the operator module was foritemgetter, so patch eb62069e should have removed that import too.
gnt-node.rst: Fix a typo
Add two more compat functions
operator.itemgetter(0) → fstoperator.itemgetter(1) → snd
snd is not used yet, but it makes sense to add both.
Add a flag to burnin to allow specifying VCPU count.
Signed-off-by: Pedro Macedo <pmacedo@google.com>Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Fix types passed to IAllocator
Iallocator mode reloc, parameter reloc_from takes a list; half of thecode already forced this parameter to list, we add the other two caseswhere it is needed.
htools: change absolute to relative symlinks
Currently we use absolute symlinks, but this doesn't work when weinstall remotely (due to install first to local temp dir, then rsyncto remote machines). To fix, we change to manually-computed relativepaths, which is not best, but it works....
jqueue: Add short delay before detecting job changes
By sleeping for 100ms after receiving a notification for a changed jobfile the job is given some additional time to change again. Thissignificantly reduces the number of LUXI calls for WaitForJobChanges...
Add primary/second nodes' group as query fields
These will be very useful for ganeti-watcher as it needs to retrieveinstances by group.
Fix doclint failures
Commit 54ca6e4b2 renamed some arguments, but didn't also renames themin the docstrings.
watcher: Separate function for writing instance status file
For now this will do another query to the master daemon, but with thesplit for node groups this issue will go away.
watcher: Make RAPI error messages less technical
watcher.state: Use strings, not objects
Until now the state class would receive instances as objects(ganeti.watcher.Instance), but this is not necessary. By using stringsthe interface is simplified.
This patch also simplifies some code accessing the internal structures,...
watcher: Raise error on unknown hook status
Also, remove punctuation from one error message.
watcher: Reformat constants
Make them match with style guide.
Add new watcher constants
WATCHER_STATEFILE will be removed at the end of thispatch series.
Fix formatting of frozensets
Signed-off-by: Stephen Shirley <diamond@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
cli: Add constant for node group option
ganeti-watcher will use this constant to pass the option to itself forprocessing all node groups.
Replace %r with '%s' in masterd/instance.py
I still don't know why Michael is a fan of %r, but in the meantimethis patch changes:
WARNING: import u'import-2011-07-29_01_39_33-y3gZKV' on node1 failed:Exited with status 1
into:
WARNING: import 'import-2011-07-29_01_39_33-y3gZKV' on node1 failed:...
Add "reboot_behavior" hypervisor flag
During instance installations, you do not want the instance to rebootand start again with the same parameters, as that will most likelyre-start the install process. Therefore, when the instance requests areboot it should instead shutdown. This flag allows this to be...
Removed non-existing -t option from the gnt-cluster man page
Clear the OS scripts environment
The OS scripts currently run with the whole noded environment; this isdifferent from the hooks which run with a cleared one and most likelyan oversight.
This might create problems when upgrading, so it needs to be clearly...
watcher: Split state class into separate module
Rename watcher's constant for instance status file
“upfile” is a bad name.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Fixed a typo in the installation tutorial
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
watcher: Split node maintenance into separate module
The node maintenance class is standalone.
Fixed doc compilation under Sphinx 1.0.7
Sphinx 1.0.7 complains if an indented block in .warning starts with :option.This fixes it.
Merge branch 'devel-2.4'
Remove requirement for variants on OS API v15+
This removes:
- the check in backend that such OSes have a variants file or if it exists that is non-empty; in order for this to work, we also rework the logic in backend._TryOSFromDisk to allow for optional OS files...