Fix small typo in docstring
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix typo in NEWS
“--dry-run” starts with two dashes.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add a flag to burnin to allow specifying VCPU count.
Signed-off-by: Pedro Macedo <pmacedo@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add support for cluster/OS parameters in QA
Currently there is no way to QA with (for example) an initrd because the QA only inits the cluster with the default parameters. This makes it impossible to QA using anything but the default parameters, which doesn't always work....
Add OS search path to gnt-cluster info
Otherwise, it's pretty hard to figure it out from the command line.
Signed-off-by: Ben Lipton <benlipton@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reopen daemon's stdio on SIGHUP
Before this patch daemons would continue to refer to an old logfile for their standard I/O if they had been asked to reopen the log (SIGHUP).
Reopen log file only once after SIGHUP
Commit b6fa9a44 added a re-openable log handler. The log file is reopened when a daemon is sent a HUP signal. Due to a bug in the code, fixed by this patch, the log file would be reopened for every single log message thereafter....
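The intended behaviour (reopen once per SIGHUP, not once per message) can be sketched as follows. This is a minimal illustration, not the handler from the commit: the class name, the `request_reopen` method and the `reopen_count` counter are hypothetical, and the idea that the bug came from a flag that was never cleared is only an assumption.

```python
import logging


class ReopenOnceFileHandler(logging.FileHandler):
    """Hypothetical handler that reopens its file once per request."""

    def __init__(self, filename, **kwargs):
        super().__init__(filename, **kwargs)
        self._reopen = False
        self.reopen_count = 0  # only here to make the behaviour observable

    def request_reopen(self):
        # Cheap enough to be called from a signal handler.
        self._reopen = True

    def emit(self, record):
        if self._reopen:
            # Clear the flag first; leaving it set would reopen the
            # file for every later message (assumed cause of the bug).
            self._reopen = False
            if self.stream is not None:
                self.stream.close()
            self.stream = self._open()
            self.reopen_count += 1
        super().emit(record)
```

A daemon could wire this up with `signal.signal(signal.SIGHUP, lambda signum, frame: handler.request_reopen())`.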
Don't leak file descriptors when setting up daemon output
When a daemon's output is configured using “utils.SetupDaemonFDs”, the function must use dup2(2). Unfortunately the code didn't close the original file descriptors, leaking them in the process.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
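A minimal sketch of the pattern described above: duplicate a freshly opened descriptor onto the target with dup2(2), then close the original so it does not leak. The function name `redirect_fd` and its parameters are hypothetical and do not reflect the actual signature of “utils.SetupDaemonFDs”.

```python
import os


def redirect_fd(path, target_fd):
    """Make target_fd refer to path without leaking the temporary fd."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        if fd != target_fd:
            os.dup2(fd, target_fd)
    finally:
        # After dup2() the original descriptor is no longer needed;
        # forgetting this close is the kind of leak described above.
        if fd != target_fd:
            os.close(fd)
```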
Fix aliases in bash completion
Ever since commit 2d48a3a2 aliases were not included in the bash completion script. This patch also replaces one tab with two spaces.
gnt-node volumes: Fix instance names
Commit 84d7e26b changed “objects.Instance.MapLVsByN” to not just return the LV name, but to include the volume group name (e.g. “xenvg/d67e8700….disk0_data”). This in turn broke the mapping of volume names in LUNodeQueryvols, stopping instance names from being displayed in...
ht: Add new check for numbers
Places which receive floats can usually also deal with integers, e.g. OpTestDelay. Tests are added and the new check function is used for the aforementioned opcode and verifying query results.
Fix off-by-one bug in job serial generation
Commit 009e73d0 (September 2009) changed the job queue to generate multiple job serials at once. Ever since it would return one more than requested.
The “serial” file in the job queue directory is defined to contain the...
Shorten some unbreakable lines in man pages
In order to make the display right on 80-column terminals.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Correct some spelling mistakes
New lintian is even smarter:
- overriden → overridden
- allows to → allows one to
Fix bug in recreate-disks for DRBD instances
The new functionality in 2.4.2 for recreate-disks to change nodes is broken for DRBD instances: it simply changes the nodes without caring for the DRBD minors mapping, which will lead to conflicts in non-empty...
Fix a lint warning
Patch db8e5f1c removed the use of feedback_fn, hence pylint warns now.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
KVM: configure bridged NICs at migration start
Commit 5d9bfd870 moved tap interface handling from KVM to Ganeti, partly to also solve the problem of routed interfaces getting configured too early during live migrations, causing network anomalies. In that...
Fix RAPI documentation regarding master role
Fix bug in drbd8 replace disks on current nodes
Currently the drbd8 replace-disks on the same node (i.e. -p or -s) has a bug in that it does modify the instance disk temporarily before changing it back to the same value. However, we don't need to, and shouldn't do that: what this operation does is simply change the LVM...
LUInstanceCreate: use opcodes.RequireFileStorage
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Don't add ",boot=on" to disks on kvm >= 0.14
Under newer kvm this prevents the vm from starting.
Ah, change!
KVM: fix per-instance stored UID value
When using the pool security model, ExecuteKVMRuntime was storing the instance's UID using str(uid), which would result in storing the LockedUid.__repr__() result:
$ cat /var/run/ganeti/kvm-hypervisor/uid/xxxxxxxxxxxxx...
Add one forgotten element to the file disk path
This was left out during the fix/refactoring
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
LUInstanceCreate: fix file storage dir calculation
- Move the calculation at the beginning of CheckPrereq, since it doesn't modify any state, but still keeps locks
- Only perform the calculation if the actual disk template is file based
- Error out if there is no defined file storage dir...
Check that filestorage is enabled when requested
Remove self.op.file_storage_dir isabs check
As the manpage says, and the code does, self.op.file_storage_dir is an additional relative path under the cluster file storage dir. As such it should not be absolute.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
jqueue: Fix potential race condition when cancelling queued jobs
When a job was cancelled, its status would be changed and the file written again. Since this was a final status, the job file could be moved anytime for archival. If the job was still in the queue, however,...
Fix argument order in ReserveLV and ReserveMAC
ConfigWriter.ReserveLV() and ConfigWriter.ReserveMAC() called TemporaryReservationManager.Reserve() with the ec_id and resource arguments swapped. As a result, two reservation attempts for the same resource type...
TLReplaceDisks: Move assertion checking locks
Commit 1bee66f3 added assertions for ensuring only the necessary locks are kept while replacing disks. One of them makes sure locks have been released during the operation. Unfortunately the commit added the check...
node evac: don't call IAllocator if no instances
Currently we generate an empty list only for the '-n node' invocation, but for iallocator we still call the iallocator (which needs an RPC call, etc.). By moving the computation of instances outside of the if...
RPC/Backend: Make UploadFile uid and gid agnostic
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Resolve uid/gid upon mainloop run
GetEntResolver: Make it possible to resolve uid/gid to name
utils.algo: Add InvertDict to invert a dict
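As a rough illustration of what inverting a dict means, the sketch below swaps keys and values. The function name `invert_dict` and the choice to reject duplicate values are assumptions; the actual InvertDict may behave differently.

```python
def invert_dict(mapping):
    """Return a new dict mapping each value of mapping to its key."""
    inverted = {}
    for key, value in mapping.items():
        if value in inverted:
            # Assumed policy: refuse to silently drop an entry.
            raise ValueError("duplicate value %r" % (value,))
        inverted[value] = key
    return inverted
```

For example, `invert_dict({"root": 0, "daemon": 1})` gives a lookup from numeric ID back to name.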
autotools: Add noded group
Fix errors in hooks documentation
In many cases the opcode ID was incorrect. A unittest for this will be added in the master branch.
Clarify a bit the noded man page
"This can be overriden" can be read as either the port we listen on orthe address we bind to. Replace with "The port" for great clarity!
Note --no-remember in NEWS
Switch QA over to using instance stop --no-remember
Instead of hardcoded Xen commands. This will make it work for all hypervisors, instead of duplicating hypervisor functionality in QA itself.
The timeout has been removed as gnt-instance stop itself will make...
Implement no_remember at RAPI level
Implement no_remember at CLI level
Introduce instance start/stop no_remember attribute
This will allow stopping or starting an instance without changing the remembered state. While this seems counter-intuitive at first (it will create cluster verify errors), it can help in a few corner cases:...
Bump version for the 2.4.2 release
I think we should stop finding bugs and instead release this :)
Preload the string-escape code in noded
This encoding, part of the standard Python installation, is used by the pickle module (in turn used by subprocess when handling failures in program execution). Preloading it means that Python will cache it in memory so that even if the disk goes away or just the...
Abstract ignore_consistency opcode parameter
Two opcodes already use it and we need it for a third, time to add a constant for it.
Fix a bug in LUInstanceMove
The opcode parameter ignore_consistency was used in the LU, but not actually declared in the OpCode. The patch adds it in the opcode and the command line client.
ObQuote — Please, please, can I have static typing?
Signed-off-by: Iustin Pop <iustin@google.com>...
Fix error in iallocator documentation reg. disk mode
The code uses the disk object's “mode” attribute, which uses the constants DISK_RDONLY (“ro”) and DISK_RDWR (“rw”).
Try to prevent instance memory changes N+1 failures
There are multiple bugs with the code checking for N+1 failures in the instance memory changes which need significant changes; in the meantime we can at least:
- change the warning message into an error (--force will skip checks)...
Update NEWS file for the 2.4.2 release
Use floppy disk and a second CDROM on KVM
Hi all,
this patch will add 3 new KVM parameters and a new option.
New Parameters:
- floppy_image_path = "" -> Specify the floppy image to load as floppy disk.
- cdrom2_image_path = "" -> Specify a second cdrom image to load on...
Document the selection of instance kernels
A simple doc patch to document how to configure the kernels for the instances.
Make root_path an optional hypervisor parameter
This will allow us an easy migration to pv-grub, because a set root_path confused pv-grub.
Some man page updates
This adds documentation for both the short and long form of many options (which was inconsistent before: in some cases only the short form was used, in others only the long form).
Note that the standard this patch adopts is to document both forms as...
Add 2 new variables to the OS scripts environment
Add INSTANCE_PRIMARY_NODE and INSTANCE_SECONDARY_NODES. These new values are useful for OS scripts that need to know the nodes where the instance lives.. or has lived.
Add --no-wait-for-sync when converting to drbd
Currently, when converting an instance from plain to DRBD, the instance is blocked during the entire resync period. This patch adds the --no-wait-for-sync option so that the operation finishes as soon as the DRBD sync has started, without waiting for the entire sync. This makes...
Recreate instance disks: allow changing nodes
This patch introduces the option of changing an instance's nodes when doing the disk recreation. The rationale is that currently if an instance lives on a node that has gone down and is marked offline, it's not possible to re-create the disks and reinstall the instance on...
Rename instance: only show new name when different
It makes no sense to show messages like:
Fri May 6 02:04:01 2011 - INFO: Resolved given name 'instance18' to 'instance18'
So we'll skip the message if the resolved name is identical to the requested one....
Fix race condition in LUGroupAssignNodes
The original code would get all node information and their groups before acquiring the necessary locks. With this patch the node information is only retrieved once all locks have been acquired. Groups are locked optimistically and verified after acquiring the node locks....
Re-wrap and fix formatting issues in gnt-instance.rst
This is mostly rewrapping plus fixing a few small issues in gnt-instance.rst.
Documentation for the new parameters for KVM
Options added/updated are: cdrom2_image_path, floppy_image_path, cdrom_disk_type and boot_order.
Signed-off-by: Iustin Pop <iustin@google.com>
[iustin@google.com: small formatting update]
Reviewed-by: Iustin Pop <iustin@google.com>
cmdlib: Fix typo, s/nick/NIC/
A small optimisation in cluster verify
This removes (count of instances + count of nodes) lock acquires/releases.
A few docstring fixes
At least one generates an epydoc error :)
luxi: do not handle KeyboardInterrupt
With the current code, it's possible to mistake a ^C for a protocol error:
node1# gnt-job info 221691
[press ^C]
Unhandled protocol error while talking to the master daemon:
Error while deserializing response:
(and note empty error message)....
Handle EPIPE errors while writing to the terminal
This handles EPIPE errors in two places: ToStream (to catch logging done in GenericMain itself) and in GenericMain (to cover also plain print statements).
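The general technique of tolerating EPIPE when the reader of the terminal or pipe has gone away might look like the sketch below. The helper name `write_ignoring_epipe` is hypothetical; it is not the ToStream or GenericMain code, and returning a boolean is just one possible design.

```python
import errno


def write_ignoring_epipe(stream, text):
    """Write text to stream; return False if the other end went away."""
    try:
        stream.write(text)
        stream.flush()
    except OSError as err:
        # Only a broken pipe is swallowed; other I/O errors
        # (disk full, bad descriptor, ...) still propagate.
        if err.errno == errno.EPIPE:
            return False
        raise
    return True
```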
Cluster verify: check for missing bridges
Currently cluster verify doesn't check for bridge information; the only checks are done at instance create and failover/migrate time. This means a cluster that seems healthy will fail creation jobs.
This patch implements a simple verification that all nodes (in the...
TLReplaceDisks: Use implicit loop for dictionary
Release unneeded locks while replacing disks
If an iallocator is used, “gnt-instance replace-disks” would acquire the locks of all nodes (only the allocator will decide which node to use). Unfortunately the unneeded locks were not released during the operation,...
locking: Export “list_owned” from lock manager
This is analogous to “is_owned” and will be used for assertions.
gnt-instance: Fix typo in error message
The iallocator parameter is “-I”, not “-i”.
mlock: fail gracefully if libc.so.6 cannot be loaded
This allows noded to continue instead of blowing up if the libc major number changes.
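A sketch of failing gracefully when the C library cannot be loaded, using ctypes. The function name `try_mlockall`, its return convention, and the Linux values assumed for MCL_CURRENT and MCL_FUTURE are illustrative assumptions rather than the code from this patch.

```python
import ctypes
import logging

# Assumed Linux values for the mlockall(2) flags.
MCL_CURRENT = 1
MCL_FUTURE = 2


def try_mlockall(libname="libc.so.6"):
    """Lock process memory if possible; return True on success."""
    try:
        libc = ctypes.CDLL(libname, use_errno=True)
    except OSError as err:
        # Library missing or renamed: log and carry on unlocked.
        logging.error("Cannot load %s, memory not locked: %s", libname, err)
        return False
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        logging.error("mlockall failed, errno %d", ctypes.get_errno())
        return False
    return True
```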
Allow creating the DRBD metadev in a different VG
This is a simple change to allow specifying a different VG for the meta device during the creation of instances and addition of disks via gnt-instance modify.
Make _GenerateDRBD8Branch accept different VG names
This is a small change to make this function take a list of VG names, instead of a single one.
Fix WriteFile with unicode data
Unicode is fun, indeed:
>>> len(buffer("abc"))
3
>>> len(buffer(u"abc"))
12
So we can't pass unicode data to buffer(), as the result will be to write the in-memory (usually UTF-32) representation to disk.
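The session above is Python 2. The underlying point, that text must be encoded explicitly before its bytes are written, can be sketched in Python 3 as below. The helper name `to_bytes` and the default of UTF-8 are assumptions; the source does not say which encoding the fix uses.

```python
def to_bytes(data, encoding="utf-8"):
    """Return data as bytes, encoding text explicitly."""
    if isinstance(data, str):
        # Encode deliberately instead of exposing the interpreter's
        # in-memory representation of the string.
        return data.encode(encoding)
    return bytes(data)
```

With this, `len(to_bytes("abc"))` is 3 regardless of how the interpreter stores strings internally.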
Fix for multiple VGs - PlainToDrbd and replace-disks
Converting an instance from 'plain' to 'drbd'. The old code would create the drbd volumes in the default VG and then the renames would fail. This fix pulls the plain VG names from the existing volumes and...
Replace disks: keep the meta device in the same VG
This patch enhances the multi-VG support in replace disks, by keeping the meta device in the same VG, as opposed to moving it to the data device VG (note that we don't have a way to create the meta in a different VG in the first place, but at least we correctly handle a...
Fix punctuation in an error message
IIRC we don't use punctuation at the end of error messages.
Prevent readding of the master node
This breaks Ganeti in multiple ways. If we don't make the check in gnt-node itself, then bootstrap.SetupNodeDaemon will restart the master daemon, making the operation fail:
node1# gnt-node add --readd node1
Cannot communicate with the master daemon....
Improve error messages in cluster verify/OS
A few issues in the clarity of the error messages are fixed:
- "ERROR: node node3: OS API version lenny-image": no preposition between the parameter type and the OS name, changed to "for lenny-image"
- "API version lenny-image differs from reference node node1: 10, 5...
Fix potential data-loss in utils.WriteFile
os.write can do incomplete writes, as long as at least some bytes have been written (like write(2)):
>>> os.write(fd, " " * 1300)
1300
>>> os.write(fd, " " * 1300)...
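Because os.write may write only part of the buffer, a caller that needs everything written has to loop. The sketch below shows that loop in Python 3; the name `write_all` is hypothetical and this is not necessarily how utils.WriteFile was changed.

```python
import os


def write_all(fd, data):
    """Write all of data (bytes) to fd, retrying after short writes."""
    view = memoryview(data)
    pos = 0
    while pos < len(view):
        # os.write returns how many bytes it accepted, which can be
        # fewer than requested; continue from that offset.
        pos += os.write(fd, view[pos:])
    return pos
```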
cli: Fix wrong argument kind for groups
Quote filename in gnt-instance.8
Fix typo in LUGroupAssignNodes
gnt-instance info: automatically request locking
Commit dae661a4 added support for controlling the locking, but it didn't modify the gnt-instance info code, which leads to this command always showing:
Wed Apr 20 04:10:48 2011 - WARNING: Non-static data requested, locks...
Document the dependency on OOB for gnt-node power
Fix master IP activation in failover with no-voting
Thanks to net.for.hub@gmail.com for reporting this. The logic in masterd.CheckMasterd did an early return in case of no_voting, hence skipping the master IP activation. We just change the ifs to not return but simply continue through the function....
disk wiping: fix bug in chunk size computation
The current wipe_chunk_size computation is doing min(int_value, float_value). For small disks (below 10GiB), the actual formula will result in the float value being chosen. This results in very interesting behaviour:...
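To illustrate the type problem, the sketch below forces the result of the min() to an integer. The constant names and values (a 1024 MiB cap and a 10 percent share, which would be consistent with the 10GiB threshold mentioned above) are assumptions, and converting with int() is one possible fix, not necessarily the one in the patch.

```python
# Assumed, illustrative constants; sizes are in MiB.
MAX_WIPE_CHUNK = 1024
MIN_WIPE_CHUNK_PERCENT = 10


def wipe_chunk_size(disk_size_mib):
    """Return an integer chunk size for wiping a disk of the given size."""
    share = disk_size_mib / 100.0 * MIN_WIPE_CHUNK_PERCENT  # a float
    # min(int, float) yields the float for small disks, so convert.
    return int(min(MAX_WIPE_CHUNK, share))
```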
Fix bug in watcher
If “utils.RunParts” were to raise an exception, a log message was written and the code continued to run. Due to the exception the “results” variable would not be defined.
Also change the code to log a backtrace (getting an exception is rather...
Release locks before wiping disks during instance creation
Ganeti 2.3 introduced an optional feature to overwrite an instance's disks on creation. Unfortunately the code kept all locks while doing the wipe, slowing down the creation of multiple instances in parallel....
utils.WriteFile: Close file before renaming
Issue 154 (http://code.google.com/p/ganeti/issues/detail?id=154) reported an “Operation not supported” error when writing instance exports to a mounted CIFS filesystem. Experimentation showed the error to only occur when using rename(2) on an opened file. Various references...
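The ordering the subject line describes, closing the temporary file before calling rename(2), can be sketched as follows. The function name `write_then_rename` and the use of tempfile.mkstemp are assumptions for illustration and do not reproduce utils.WriteFile.

```python
import os
import tempfile


def write_then_rename(path, data):
    """Write data (bytes) to a temporary file, close it, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The file is closed at this point, so the rename never
        # operates on an open file.
        os.rename(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```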
Fix distcheck
README is not copied to the build tree.
Nicer formatting for group query error
Before this patch the message would look like “Some groups do not exist: [u'foo', u'bar']”, now it's “Some groups do not exist: foo, bar”.
gnt-instance.8: Fix wrongly formatted title
Update version in README
Also add a check to Makefile's check-local target.
Merge branch 'stable-2.4' into devel-2.4
LUInstanceQueryData: Don't acquire locks unless requested
Until now LUInstanceQueryData always acquired locks for the instance(s) and nodes involved. In combination with long-running operations this prevented the use of “gnt-instance info”, even with the “--static”...
Increase the lock timeouts before we block-acquire
This has been observed to cause problems on real clusters via the following mechanism:
- a long job (e.g. a replace-disks) is keeping an exclusive lock on an instance
- the watcher starts and submits its query instances opcode which...
daemon.py: move startup log message before prep_fn
Before this, the output in the rapi daemon log was:
2011-04-04 03:09:51,026: ganeti-rapi pid=17447 INFO Reading users file at /var/lib/ganeti/rapi/users
2011-04-04 03:09:51,027: ganeti-rapi pid=17447 INFO ganeti-rapi daemon...
Display the actual memory values in N+1 failures
This changes the display from:
Mon Apr 4 02:29:46 2011 * Verifying N+1 Memory redundancy
Mon Apr 4 02:29:46 2011 - ERROR: node node2: not enough memory to accomodate instance failovers should node node1 fail...
ssh.VerifyNodeHostname: remove the quiet flag
This is not needed for this function, and can interfere with debugging of ssh failures.