Documentation fixes and clarification
- In README, refer to “install.rst”, not “install.html”
- In rapi.rst, wrap line longer than 72 characters
- In rlib2.py, update and clarify description of POST vs. PUT
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
gnt-instance: Rename SHUTDOWN* to EXPAND*
Once upon a time these constants were only used for stopping instances, but pretty soon they became more useful. Let's rename them.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
rlib2: Exclude oplog/opresult from bulk job list
These fields can get rather large. Excluding them from the big bulk list reduces the amount of data. They are still available via per-job requests.
rlib: Expose node group tags
Commit 1ffd26739d3 added support for tagging node groups. Also add a check for exposed fields.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
rapi: Bulk support for jobs
This was requested in issue 181.
Fixed an error in the documentation of _GetKVMVersion
Fixed an epydoc compilation error that I introduced with last commit.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Removed code duplication for calls to _GetKVMVersion
Fix epydoc breakage caused by f8638e288c7a
Changed NET_PORT_CHECK to REQ_NET_PORT_CHECK, to improve consistency
I originally made this change because I needed OPT_NET_PORT_CHECK, and I am committing it even though I no longer need OPT_NET_PORT_CHECK, because IMO it improves the consistency of the name of the wrappers....
Added check for the ip command at configure time
Also, corrected a few places where the ip command was hardcoded.
Detect globbing patterns as query arguments
Short: this patch enables the use of “gnt-instance list '*.site'”.
Detailed description: This patch changes the command line interface code to try to deduce the kind of filter from the arguments to a “list” command. If it's a list of plain names an old-style name filter is used....
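The detection idea can be sketched roughly as follows. This is only an illustration: the helper names `looks_like_glob` and `choose_filter_kind` are hypothetical, and the rule "any argument with a metacharacter selects a glob filter" is an assumption, not necessarily what the patch does.

```python
import re

# Characters that make an argument look like a globbing pattern.
# Hypothetical helper; not Ganeti's actual implementation.
_GLOB_CHARS_RE = re.compile(r"[*?\[]")


def looks_like_glob(arg):
    """Return True if the argument contains a globbing metacharacter."""
    return bool(_GLOB_CHARS_RE.search(arg))


def choose_filter_kind(args):
    """Pick "glob" if any argument looks like a pattern, else "names"."""
    if any(looks_like_glob(arg) for arg in args):
        return "glob"
    return "names"
```

With this sketch, `choose_filter_kind(["*.site"])` selects the glob filter, while a list of plain instance names keeps the name filter.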
Allow fixing of split instances via relocate
Currently, the IAllocator code requests strictly that the (set of) groups of the nodes we're relocating from is equal to the set of groups we're relocating to.
This, however, makes it impossible to fix split instances, since (by...
Further cleanup after multi-evacuate removal
Commit f0edfcf6 removed the parsing of multi-evacuate result, but the code went from:
if mode in (multi-evac, relocate): … if mode relocate: …
to:
if mode relocate: … if mode == relocate...
Fix bug in IAllocator parsing of Evacuate result
Commit 342f9172 added stricter checks for the iallocator result in evacuate mode, but it does this irrespective of the result status. When the result has failed and (according to the design) the list of nodes is empty, this code will trigger the following:...
Implement globbing operator for filters
The operators “=*” and “!*” do globbing in filters, e.g.:
$ gnt-instance list --no-headers -o name 'name =* "*.site"'
inst1.site.example.com
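As a rough illustration of what a globbing comparison does, the sketch below uses shell-style matching from Python's standard `fnmatch` module. Whether the filter code uses the same matching rules is an assumption, and `glob_filter` is a hypothetical helper; `negate=True` stands in for the “!*” operator.

```python
import fnmatch


def glob_filter(values, pattern, negate=False):
    """Return values that match (or, if negate is set, do not match) pattern.

    Hypothetical helper using shell-style, case-sensitive matching.
    """
    return [value for value in values
            if fnmatch.fnmatchcase(value, pattern) != negate]


names = [
    "inst1.site.example.com",
    "inst2.other.example.com",
    "web.site.example.com",
]

matching = glob_filter(names, "*.site.example.com")
not_matching = glob_filter(names, "*.site.example.com", negate=True)
```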
Zero DRBD metadata before creation
The docstring of the DRBD8 class says:
… The meta device is checked for valid size and is zeroed on create.
which is not done today, hence we have http://code.google.com/p/ganeti/issues/detail?id=182:
node1# mkreiserfs -f /dev/xenvg/t8...
Remove iallocator's “multi-evacuate” mode
It is no longer used and has been deprecated in 2.5.
confd.querylib: Remove long-deprecated query mode
This was never used by a stable version.
Add docstring to cmdlib.TLReplaceDisks._FindFaultyDisks
watcher: Fix breakage caused by 9bb69bb52fb9
The first argument to str.split is the separator, not the maximum number of splits.
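The argument order is easy to get wrong, as this small sketch shows. The sample line is made up for illustration and is not the watcher's actual file format.

```python
line = "inst1  up  1312000000"

# Correct: separator first (None means "any whitespace"), then maxsplit.
name, rest = line.split(None, 1)

# Wrong: a bare number is taken as the separator and raises TypeError.
try:
    line.split(1)
    split_error = None
except TypeError as err:
    split_error = err
```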
LUGroupVerifyDisks: Use _CheckInstanceNodeGroups' result
… instead of getting the list of instances once again from the configuration.
cmdlib: Factorize checking node groups' instances
utils.ReadFile: Add pre-read callback
This will be used by the watcher to store the file's fstat(2). It must be done from the filehandle.
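A minimal sketch of the pattern, assuming a callback that receives the open file object. `read_file` and its `preread` parameter are hypothetical stand-ins and may not match the real signature of utils.ReadFile.

```python
import os
import tempfile


def read_file(path, preread=None):
    """Read a file, calling preread(fh) on the open handle before reading.

    Hypothetical stand-in for utils.ReadFile.
    """
    with open(path, "r") as fh:
        if preread is not None:
            preread(fh)
        return fh.read()


stat_results = []


def _store_stat(fh):
    # fstat(2) on the descriptor describes exactly the file being read,
    # even if the path is replaced in the meantime.
    stat_results.append(os.fstat(fh.fileno()))


with tempfile.NamedTemporaryFile("w", suffix=".data", delete=False) as tmp:
    tmp.write("status\n")

try:
    content = read_file(tmp.name, preread=_store_stat)
finally:
    os.unlink(tmp.name)
```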
watcher: Write per-group instance status, merge into global one
Each per-group watcher process writes its own instance status file. Once that's done it tries to acquire an exclusive lock on the global file and will proceed to read all status files, merging them based on each file's...
Merge branch 'stable-2.4'
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Fixed a typo in utils/process.py
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Remove 15-second sleep from LUInstanceCreate
Remove 15-second sleep when wait_for_sync is not set. LUInstanceCreate already calls _WaitForSync with oneshot=True, which already performs an internal wait-loop for disks to start syncing.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>...
Add a readability alias
lu.glm.list_owned becomes lu.owned_locks, which is clearer for the reader.
Also rename three variables (which were previously named owned_locks) to make clearer what they track.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Fix broken object references in docstrings
The module is called “objects”, not “object”.
Add “gnt-instance change-group” command
Add opcode to change instance's group
This is quite similar to evacuating a group, but the locking is different.
Factorize checking instance's node groups
Remove WATCHER_STATEFILE constant
ganeti-watcher: Split for node groups
This patch brings a huge change to ganeti-watcher to make it aware of node groups. Each node group is processed in its own subprocess, reducing the impact of long-running operations.
The global watcher state file, $datadir/ganeti/watcher.data, is replaced...
Lock potential target nodes for group evacuation
All potential target nodes should be locked while calculating a group evacuation.
Small changes in group evacuation
- Use OpPrereqError in CheckPrereq
- Clarify command synopsis
cmdlib: Factorize getting iallocator
The same logic will be used for changing an instance's group.
Pause DRBD sync for OS install if not wait_for_sync
When wait_for_sync is set to False in LUInstanceCreate, Ganeti lets DRBD sync in the background while performing the rest of the installation steps, including OS installation.
However, OS installation is a very disk-intensive task that interferes badly...
Fix documentation of gnt-instance failover
Explain that we only start the instance on the new node if it was originally running.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix small typo in docstring
Change the backend.InstanceLogName signature
This now uses the component for the transfer (if available), otherwise (e.g. in installs/renames) nothing.
Instance transfer: export component name to backend
This modifies the RPC layer to export the component name too to the backend, so that it can be used in log files and messages.
Instance transfer: add argument for the 'component'
Currently, transfer data is done mainly with just the instance name, but when we have instances with multiple disks this is not enough to distinguish between the different transfers being done for the instance....
Fix lint errors
It turns out that the only use of the operator module was for itemgetter, so patch eb62069e should have removed that import too.
Optimise use of repeated/looping GetNodeInfo
This adds a new ConfigWriter.GetMultiNodeInfo function and replaces multiple/looping calls to GetNodeInfo with it.
Optimise use of repeated/looping GetInstanceInfo
Similar to the previous patch, this adds a helper function to eliminate repeated calls into ConfigWriter.
Add two more compat functions
operator.itemgetter(0) → fst
operator.itemgetter(1) → snd
snd is not used yet, but it makes sense to add both.
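For illustration, the two helpers behave like the itemgetter calls they replace. Here they are defined simply as aliases, which is an assumption about how the compat module implements them.

```python
import operator

# Assumed definitions; the real compat module may implement these differently.
fst = operator.itemgetter(0)
snd = operator.itemgetter(1)

pairs = [("node1", 3), ("node2", 1)]

names = [fst(pair) for pair in pairs]
counts = [snd(pair) for pair in pairs]

# Both also work as sort keys.
by_count = sorted(pairs, key=snd)
```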
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Fix types passed to IAllocator
Iallocator mode reloc, parameter reloc_from takes a list; half of the code already forced this parameter to list, we add the other two cases where it is needed.
jqueue: Add short delay before detecting job changes
By sleeping for 100ms after receiving a notification for a changed job file the job is given some additional time to change again. This significantly reduces the number of LUXI calls for WaitForJobChanges...
Add primary/second nodes' group as query fields
These will be very useful for ganeti-watcher as it needs to retrieve instances by group.
Fix doclint failures
Commit 54ca6e4b2 renamed some arguments, but didn't also rename them in the docstrings.
watcher: Separate function for writing instance status file
For now this will do another query to the master daemon, but with the split for node groups this issue will go away.
watcher: Make RAPI error messages less technical
watcher.state: Use strings, not objects
Until now the state class would receive instances as objects (ganeti.watcher.Instance), but this is not necessary. By using strings the interface is simplified.
This patch also simplifies some code accessing the internal structures,...
watcher: Raise error on unknown hook status
Also, remove punctuation from one error message.
watcher: Reformat constants
Make them match with style guide.
Add new watcher constants
WATCHER_STATEFILE will be removed at the end of this patch series.
Fix formatting of frozensets
Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
cli: Add constant for node group option
ganeti-watcher will use this constant to pass the option to itself for processing all node groups.
Replace %r with '%s' in masterd/instance.py
I still don't know why Michael is a fan of %r, but in the meantime this patch changes:
WARNING: import u'import-2011-07-29_01_39_33-y3gZKV' on node1 failed: Exited with status 1
into:
WARNING: import 'import-2011-07-29_01_39_33-y3gZKV' on node1 failed:...
Add "reboot_behavior" hypervisor flag
During instance installations, you do not want the instance to reboot and start again with the same parameters, as that will most likely re-start the install process. Therefore, when the instance requests a reboot it should instead shut down. This flag allows this to be...
Clear the OS scripts environment
The OS scripts currently run with the whole noded environment; this is different from the hooks which run with a cleared one and most likely an oversight.
This might create problems when upgrading, so it needs to be clearly...
watcher: Split state class into separate module
Rename watcher's constant for instance status file
“upfile” is a bad name.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
watcher: Split node maintenance into separate module
The node maintenance class is standalone.
Merge branch 'devel-2.4'
Remove requirement for variants on OS API v15+
This removes:
- the check in backend that such OSes have a variants file or if it exists that is non-empty; in order for this to work, we also rework the logic in backend._TryOSFromDisk to allow for optional OS files...
Revert "cli.JobExecutor: Feedback function for info output"
This reverts commit 7421df8e5f2cf31022085b332d1300640ba5854b.
The feedback_fn argument to JobExecutor is used for PollJob, and thus has a fixed signature: a single arg, tuple of (timestamp, log type,...
Fix group verification of offline nodes
Commit aef59ae7 reworked the file verification, but forgot to take into account offline nodes.
The fact that this was not detected yet is due to the fact that we don't test clusters with offline nodes in QA :(
Signed-off-by: Iustin Pop <iustin@google.com>...
Disallow variants for OSes that don't support them
Otherwise we get no variant checks at all, but the variant is still recorded.
Fix OS queries for API v20 w/parameters
OS parameters are a list of tuples, so we can't pass them directly to utils.NiceSort, hence we use a sort key.
This was not detected in QA since QA only tests API v10 :(
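The idea of sorting tuples by a key taken from one element can be sketched like this. `nice_sort_key` is a hypothetical natural-sort key written for this example and is not utils.NiceSort itself; the parameter names and descriptions are made up.

```python
import re

_DIGITS_RE = re.compile(r"(\d+)")


def nice_sort_key(text):
    """Split text into alternating non-digit and integer parts.

    Hypothetical natural-sort key; even positions are always strings and
    odd positions are always integers, so keys compare without type errors.
    """
    return [int(part) if part.isdigit() else part
            for part in _DIGITS_RE.split(text)]


# Made-up (name, description) tuples standing in for OS parameters.
os_params = [("param10", "tenth"), ("param2", "second"), ("alpha", "first")]

# Sort on the parameter name only, not on the whole tuple.
sorted_params = sorted(os_params, key=lambda param: nice_sort_key(param[0]))
```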
Add helper for declaring all locks shared
This patch adds a function for abstracting “dict.fromkeys(locking.LEVELS, 1)”. It also removes a duplicate assignment for the share_locks in LUInstanceQuerydata.
Additionally, it moves the _SupportsOob function to the helper...
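The abstraction amounts to something like the sketch below. The function name `share_all_locks` and the contents of `LEVELS` are placeholders chosen for this example; only the `dict.fromkeys(..., 1)` expression comes from the commit message.

```python
# Placeholder for locking.LEVELS; the real level names may differ.
LEVELS = ["cluster", "instance", "nodegroup", "node"]


def share_all_locks():
    """Return a dict declaring every lock level as shared (value 1).

    Hypothetical helper name; each call returns a fresh dict, so callers
    can modify their copy without affecting others.
    """
    return dict.fromkeys(LEVELS, 1)


share_locks = share_all_locks()
```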
Add ht-based result checks to opcodes
This adds the infrastructure necessary to check opcode results using ht-based functions. Checks are added for two opcodes.
Change OpClusterVerifyDisks to per-group opcodes
Until now verifying disks, which is also used by the watcher, would lock all nodes and instances. With this patch the opcode is changed to operate per node group, requiring fewer locks.
Both “gnt-cluster” and “ganeti-watcher” are changed for the...
cmdlib: Give instance name in error message on group evacuation
cmdlib: Factorize mapping instance LVs to node/volume
cli.JobExecutor: Feedback function for info output
This will be used in the watcher where we don't want to pollute stdout unless in debug mode.
Add OS search path to gnt-cluster info
Otherwise, it's pretty hard to figure it out from the command line.
Signed-off-by: Ben Lipton <benlipton@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Fix recompilation of htools on regen-vcs-version
Currently, most htools code depends on Constants.hs which is generated from constants.py and also depends on _autoconf.py. Also, _autoconf.py depends on vcs-version, which all together means that when 'make...
Add another name for the --yes-do-it option
Most boring patch ever
s/'/"/ in (hopefully) the right places.
Reopen daemon's stdio on SIGHUP
Before this patch daemons would continue to refer to an old logfile for their standard I/O if they had been asked to reopen the log (SIGHUP).
Reopen log file only once after SIGHUP
Commit b6fa9a44 added a re-openable log handler. The log file is reopened when a daemon is sent a HUP signal. Due to a bug in the code, fixed by this patch, the log file would be reopened for every single log message thereafter....
Don't leak file descriptors when setting up daemon output
When a daemon's output is configured using “utils.SetupDaemonFDs”, the function must use dup2(2). Unfortunately the code didn't close the original file descriptors, leaking them in the process.
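The general pattern (open, dup2(2), then close the temporary descriptor) can be sketched as follows. `redirect_fd` is a hypothetical helper and not the code of utils.SetupDaemonFDs.

```python
import os


def redirect_fd(path, target_fd):
    """Make target_fd refer to the file at path.

    Hypothetical helper: opens the file, duplicates the new descriptor
    onto target_fd and closes the temporary descriptor so it is not leaked.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        if fd != target_fd:
            os.dup2(fd, target_fd)
    finally:
        # Without this close the temporary descriptor stays open forever.
        if fd != target_fd:
            os.close(fd)
```

A daemon would call such a helper with descriptor 1 or 2 as the target; any already-open descriptor works for experimenting.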
gnt-instance info: Return static info if node offline
Before this patch “gnt-instance info” would fail with the error message “Error checking node $node: Node is marked offline” if the instance's primary node is marked offline and the user didn't explicitly request...
Ignore offline primary when failing over
When the source node for a failover is marked offline, there's no need to require the user to specify “--ignore-consistency”.
To make it work at all, a number of bugs introduced by the merge of migration and failover are also fixed by this patch....
gnt-instance console: Use query instead of opcode
This means opening the console no longer requires the instance lock, allowing it to be used during long-running operations (e.g. replacing a disk).
Add opcode attribute for comments
This attribute allows programmatic submitters of jobs (e.g. iallocator) to add a comment to each opcode, describing its purpose. Example:
$ gnt-job info 123
Job ID: 123 … Opcodes: OP_INSTANCE_REPLACE_DISKS …...
gnt-node volumes: Fix instance names
Commit 84d7e26b changed “objects.Instance.MapLVsByN” to not just return the LV name, but to include the volume group name (e.g. “xenvg/d67e8700….disk0_data”). This in turn broke the mapping of volume names in LUNodeQueryvols, stopping instance names from being displayed in...
Fix instance failover (missing argument)
More fallout from commit 323f9095b49d.
Implement instance failover via RAPI
No idea why this was missed before.
Make lock monitor more versatile
With this change it'll be possible to register other lock information providers. One use case for this is job dependencies, which can be shown in the output of “gnt-debug locks”, too.
The lock monitor is changed to accept more than one return value from...
locking.GLM: Allow adding locks to monitor
This will be used for exporting job dependencies through the lock monitor.
Export job dependencies through lock monitor
This makes them visible to the user. Example:
$ gnt-debug locks -o name,pending
Name    Pending
job/890 job:891,892
job/892 job:894
Add error state to LUGroupEvacuate's exceptions
Rename *_STATUS_WAITLOCK to …_WAITING
This patch renames the {JOB,OP}_STATUS_WAITLOCK constants to {JOB,OP}_STATUS_WAITING, as per design document for chained jobs.
gnt-group: Add command to evacuate whole group
Add new opcode for evacuating group