cmdlib: Remove acquired_locks attribute from LUs
The “acquired_locks” attribute in LUs is used to keep a list of acquiredlocks at each lock level. This information is already known in the lockmanager, which also happens to be the authoritative source. Removing the...
cmdlib: Use local alias for lock manager
Saves some typing and we'll use it more often in the future.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
opcodes: Add function for compact summary
Depending on the opcode and its parameters, the existing “Summary”function can give a rater long summary. For displaying the summary inlogs and in the lock monitor, it should be shorter. Hence this newfunction is added to just use the opcode ID with common prefixes...
Show locksets in lock monitor
When all locks contained in a set are acquired, the lockset's internallock is acquired with the same mode. With this patch the internal lockwill show up on the lock monitor, named e.g. “instances/[lockset]”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
locking: Make parameter to condition's wait() positional
It is always used in the locking code. Unittests are updated.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
SharedLock: Avoid acquires from sneaking in while notifying
In some rare cases new shared acquires could sneak in through thecondition cached in “__pending_shared” while the code was stillnotifying acquires. This was only working because such a condition...
Fix instance failover/migration w.r.t TLMigrateInstance
Commit 1c6e5787 removed the iallocator and target_node keywordparameters from TLMigrateInstance, but I didn't update their use inLUInstanceFailover and (not fully) in LUInstanceMigrate.
Signed-off-by: Iustin Pop <iustin@google.com>...
Fix DTS_EXT_MIRROR migration
Commit faaabe3c fixed failover behaviour for DTS_INT_MIRROR instances, howeverit broke migration for DTS_EXT_MIRROR instances, by moving iallocator and nodechecks from LUInstanceMigrate to TLMigrateInstance. This has the side-effect...
Use node group locking for replacing disks
This is one of the first opcodes to make use of node group locking. Toget an instance's node groups, the instance's nodes need to be lookedat. Due to a previous design decision nodes are locked after the group,...
config: Add function to determine instance's groups
This will be used for locking only the necessary node group(s)for per-instance operations.
TLMigrateInstance: Fix live migration breakage
Commit 77fcff4 unintentionally incorporated code fromTLMigrateInstance.CheckPrereq into TLMigrateInstance._RunAllocator, presumablyduring a rebase from earlier versions of the patch to the 2.5 codebase. As a...
cmdlib: Update error messages, remove some punctuation
- Clarify some error messages- Remove unnecessary punctuation- Merge two if conditions in one place
Use floppy disk and a second CDROM on KVM
Hi all,this patch will add 3 new KVM parameters and a new option.
New Parameters: - floppy_image_path = "" -> Specify the floppy image to load asfloppy disk. - cdrom2_image_path = "" -> Specify a second cdrom image to load on...
cmdlib: Factorize lock releasing
There will be more lock releasing with upcoming changes, so this willcentralize the logic behind it (what locks to keep, which variables toupdate, etc.).
Merge branch 'devel-2.4'
TLReplaceDisks: Use implicit loop for dictionary
Release unneeded locks while replacing disks
If an iallocator is used, “gnt-instance replace-disks” would acquire thelocks of all nodes (only the allocator will decide which node to use).Unfortunately the unneeded locks were not released during the operation,...
locking: Export “list_owned” from lock manager
This is analog to “is_owned” and will be used for assertions.
gnt-instance: Fix typo in error message
The iallocator parameter is “-I”, not “-i”.
mlock: fail gracefully if libc.so.6 cannot be loaded
This allows noded to continue instead of blowing up if the libc majornumber changes.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
cmdlib: Drop SSH runner from LU base class
It is no longer used.
cmdlib.py: fix indentation in _VerifyNode
Signed-off-by: Adeodato Simo <dato@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
TLMigrateInstance: Fix confusing text
Commit d5cafd31 changed this error message, swapping thetext parts in the process.
LUInstanceRename: Amend comment about lock
Also add an assertion.
iallocator: Relocation nodes must be in same group
Quoting from iallocator.rst: “[…] ``relocate`` request is used when anexisting instance needs to be moved within its node group […]”.
Fix 'unused import' lint error
Sorry!
SetEtcHostsEntry: maintain existing ordering
Currently RemoveEtcHostsEntry keeps the ordering, but SetEtcHostsEntrynot, as it will always write the new entry at the end of file. Ipersonally dislike this as it "uglifies" my custom host files, so thispatch makes it update the record instead in-place so to say instead of...
Convert utils.nodesetup to utils.WriteFile(data=…)
It makes no sense to iteratively write the new etc/hosts file, as wecan pre-compute the desired contents (neither the old nor the newversions are safe against concurrent changes anyway).
Allow creating the DRBD metadev in a different VG
This is a simple change to allow specifying a different VG for themeta device during the creation of instances and addition of disks viagnt-instance modify.
Make _GenerateDRBD8Branch accept different VG names
This is a small change to make this function take a list of VG names,instead of a single one.
Fix WriteFile with unicode data
Unicode is fun, indeed:
len(buffer("abc"))
3
len(buffer(u"abc"))
12
So we can't pass unicode data to buffer(), as the result will be towrite the in-memory (usually UTF-32) representation to disk.
Fix for multiple VGs - PlainToDrbd and replace-disks
Converting an instance from 'plain' to 'drbd'. The old code wouldcreate the drbd volumes in the default VG and then the renames wouldfail. This fix pulls the plain VG names from the existing volumes and...
Replace disks: keep the meta device in the same VG
This patch enhances the multi-VG support in replace disks, by keepingthe meta device in the same VG, as opposed to moving it to the datadevice VG (note that we don't have a way to create the meta in adifferent VG in the first place, but at least we correctly handle a...
Fix punctuation in an error message
IIRC we don't use punctuation at the end of error messages.
Prevent readding of the master node
This breaks Ganeti in multiple ways. If we don't make the check ingnt-node itself, then bootstrap.SetupNodeDaemon will restart themaster daemon, making the operation fail:
node1# gnt-node add --readd node1 Cannot communicate with the master daemon....
Improve error messages in cluster verify/OS
A few issues in the clarity of the error messages are fixed:
- "ERROR: node node3: OS API version lenny-image": no preposition between the parameter type and the OS name, changed to "for lenny-image"
- "API version lenny-image differs from reference node node1: 10, 5...
Fix potential data-loss in utils.WriteFile
os.write can do incomplete writes, as long as at least some bytes havebeen written (like write(2)):
os.write(fd, " " * 1300)
1300
os.write(fd, " " * 1300)...
QA: Add tests for node group tags
RAPI: Add support for tagging node groups
masterd: Add support for tagging node groups
gnt-group: Add commands for tagging groups
cli: Fix wrong argument kind for groups
Quote filename in gnt-instance.8
QA: Adding a config option to disable cluster epo
Signed-off-by: René Nussbaumer <rn@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
TLMigrateInstance: remove 10s sleeps
TLMigrateInstance._ExecMigration contains two 10-second sleeps betweenindividual migration steps.
Apart from prolonging the migration duration by 20s, the second sleepcauses FinalizeMigration to be called 10 seconds after the real...
Fix typo in LUGroupAssignNodes
gnt-instance info: automatically request locking
Commit dae661a4 added support for controlling the locking, but itdidn't modify the gnt-instance info code, which leads to this commandalways showing:
Wed Apr 20 04:10:48 2011 - WARNING: Non-static data requested, locks...
Document the dependency on OOB for gnt-node power
Fix master IP activation in failover with no-voting
Thanks to net.for.hub@gmail.com for reporting this. The logic inmasterd.CheckMasterd did an early return in case of no_voting, henceskipping the master IP activation. We just change the ifs to notreturn but simply continue through the function....
disk wiping: fix bug in chunk size computation
The current wipe_chunk_size computation is doing min(int_value,float_value). For small disks (below 10GiB), the actual formula willresult into the float value being chosen. This results into veryinteresting behaviour:...
Update manpages and other documents with editor settings
No rewrapping is done in this patch, just updates to the settings.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: René Nussbaumer <rn@google.com>
gnt-group list: Query filter support
gnt-node list: Query filter support
Update manpage, quote field names.
gnt-instance list: Query filter support
Fix bug in watcher
If “utils.RunParts” were to raise an exception, a log message waswritten and the code continued to run. Due to the exception the“results” variable would not be defined.
Also change the code to log a backtrace (getting an exception is rather...
opcodes: Change parameter type definition for query filter
The old definition wouldn't accept integers.
cli: Add option to force names to be treated as filter
cli: Add support for parsing query filters
qlang: Add function to distinguish filters from names
cli: Error reporting for query filter parsing
qlang: Add parser for query filter language
With this parser, command line utilities will be able to provide filtersthrough query2 in a simplistic language. Example filters:
name "node3.example.com" master or (name "node4.example.com") be/memory == 128 and name =~ /^web/i...
Update ganeti.7 manpage for query filter language
htools: make some error messages more explicit
Add instance query field for OS parameters
These were not available as a query field before. Update unittestsand description text for the other “..params” fields.
QA: also run gnt-cluster repair-disk-sizes
So that we don't happen again to break this forever without realisingit.
The patch also replaces one ' with ".
Release locks before wiping disks during instance creation
Ganeti 2.3 introduced an optional feature to overwrite an instance'sdisks on creation. Unfortunately the code kept all locks while doing thewipe, slowing down the creation of multiple instances in parallel....
Fix shared_file_storage_dir on upgrades
If the cluster was upgraded from 2.4 or earlier, this key won't exist(it's only set to a correct value on cluster init), so we need toproperly set it to a null string (disabled).
QA: run the redist-conf command
This was (AFAICS) completely missing from the QA suite.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Prevent ssconf values from having non-string values
For whatever reason, my test cluster managed to acquireshared_file_storage_dir with a None value, instead of emptystring. This is not flagged in masterd itself, but the node daemonwill fail in writing the value to disk, as it calls len() on the...
Add some tests for the auto_balance attribute
It tests node add/remove secondary, rather than cluster-level N+1checks, but it's better than nothing.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Adeodato Simo <dato@google.com>
Node operations: take into account auto_balance
This patch changes the add to secondary/remove from secondary code tonot deduct/add the instance's memory if the instance is notauto_balanced.
Read/write auto_balance via Text
This also means another change in the text format; we really shouldmove to json…
The unittests are also update for the new 9-column layout andadditionally a bit of improvement is done.
Read auto_balance via Rapi
Read auto_balance via Luxi
Show the auto_balance flag in the instance listing
cli: Replace hardcoded strings with constants
utils.WriteFile: Close file before renaming
Issue 154 (http://code.google.com/p/ganeti/issues/detail?id=154)reported an “Operation not supported” error when writing instanceexports to a mounted CIFS filesystem. Experimentation showed the errorto only occur when using rename(2) on an opened file. Various references...
Fix distcheck
README is not copied to the build tree.
Nicer formatting for group query error
Before this patc the message would look like “Some groups do not exist:[u'foo', u'bar']”, now it's “Some groups do not exist: foo, bar”.
gnt-instance.8: Fix wrongly formatted title
Some more changes to Makefile.am for htools
I duplicate the BINARY= rule in the ghc invocation in order to be ableto silence the if, which was confusing.
Additionally, a new target for running just the htools unit-tests isprovided.
Add a new attribute to Instance.Instance
This will mirror Ganeti's be/auto_balance one, which we need to use toproperly match N+1 computations.
Update version in README
Also add a check to Makefile's check-local target.
htools: Make opcode naming consistent with Ganeti codebase
This patch just cleans up the htools codebase to make it more consistentwith the naming of the Ganeti codebase.
Merge branch 'stable-2.4' into devel-2.4
OpCodes.hs: make allow_failover optional
And default to False, like in the Python codebase.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: René Nussbaumer <rn@google.com>
htools: add an utility function for JSON parsing
This allows extracting values from a JSON object that might miss, buthave a well-defined default value.
Two small Makefile fixes related to htools
First, fix hs-coverage on non-pristine tree, where the index.html filealready existed, and second, disallow compilation of htools binariesif configure, for some reason, didn't enable them.
htools: Use OpMigrateInstance with allow_failover option
Before hbal decided on the fly if an instance is migratable or not. Aswe implemented failover fallback in commit d5cafd31456 we can start touse that.
Signed-off-by: René Nussbaumer <rn@google.com>...
LUInstanceQueryData: Don't acquire locks unless requested
Until now LUInstanceQueryData always acquired locks for the instance(s)and nodes involved. In combination with long-running operations thisprevented the use of “gnt-instance info”, even with the “--static”...
gnt-instance migrate: Adding --allow-failover option
TLMigrateInstance: Merge failover code, allow fallback
As the code for failover for checking is almost identical it's an easytask to switch it over to the TLMigrateInstance. This allows us tofallback to failover if migrate fails prereq check for some reason....
Increase the lock timeouts before we block-acquire
This has been observed to cause problems on real clusters via thefollowing mechanism:
- a long job (e.g. a replace-disks) is keeping an exclusive lock on an instance- the watcher starts and submits its query instances opcode which...
utils: Add function generating regex for DNS name globbing
The intent of this function is to be able to provide a globbing operatoror query filters. One should be able to say, for example, something tothe effect of “gnt-instance shutdown '*.site'”.
Also rename a variable in MatchNameComponent....
Verify file consistency using centrally computed list
Until now “gnt-cluster verify” (LUClusterVerify) would compute its ownlist of files to check for consistency. This list was not complete andcertain inconsistencies were missed.
With this patch the code is changed to use the list of files used by...
cmdlib: Factorize computation of ancillary files
… and change the logic in _RedistributeAncillaryFiles. The virtuallysame list of files will be used to verify the files' consistency.
qlang: Remove OP_GLOB operator
It'll be implemented using OP_REGEXP by the parser.
query: Add implementation of regex match operator
So far this operator was not implemented. This patch adds an additionalvalue preparation function to the function table for binary operators,used to compile the regular expression. Unittests are included....