Use floppy disk and a second CDROM on KVM
Hi all, this patch will add 3 new KVM parameters and a new option.
New Parameters:
- floppy_image_path = "" -> Specify the floppy image to load as floppy disk.
- cdrom2_image_path = "" -> Specify a second cdrom image to load on...
cmdlib: Factorize lock releasing
There will be more lock releasing with upcoming changes, so this will centralize the logic behind it (what locks to keep, which variables to update, etc.).
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Merge branch 'devel-2.4'
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
TLReplaceDisks: Use implicit loop for dictionary
Release unneeded locks while replacing disks
If an iallocator is used, “gnt-instance replace-disks” would acquire the locks of all nodes (only the allocator will decide which node to use). Unfortunately the unneeded locks were not released during the operation,...
locking: Export “list_owned” from lock manager
This is analogous to “is_owned” and will be used for assertions.
gnt-instance: Fix typo in error message
The iallocator parameter is “-I”, not “-i”.
mlock: fail gracefully if libc.so.6 cannot be loaded
This allows noded to continue instead of blowing up if the libc major number changes.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
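A minimal sketch of the graceful-failure idea, not the actual Ganeti code: the library load is wrapped so that a failure is logged and reported to the caller instead of raising. The function name `try_load_libc` and the log text are hypothetical.

```python
import ctypes
import logging


def try_load_libc(name="libc.so.6"):
    """Return a ctypes handle for the C library, or None if loading fails."""
    try:
        return ctypes.cdll.LoadLibrary(name)
    except OSError as err:
        # The library could not be loaded (e.g. the soname changed);
        # warn and let the caller continue without memory locking.
        logging.warning("Cannot load %s, skipping memory locking: %s", name, err)
        return None
```

A caller would check for `None` and skip the mlock step rather than abort startup.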
cmdlib: Drop SSH runner from LU base class
It is no longer used.
cmdlib.py: fix indentation in _VerifyNode
Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
TLMigrateInstance: Fix confusing text
Commit d5cafd31 changed this error message, swapping the text parts in the process.
LUInstanceRename: Amend comment about lock
Also add an assertion.
iallocator: Relocation nodes must be in same group
Quoting from iallocator.rst: “[…] ``relocate`` request is used when an existing instance needs to be moved within its node group […]”.
Fix 'unused import' lint error
Sorry!
SetEtcHostsEntry: maintain existing ordering
Currently RemoveEtcHostsEntry keeps the ordering, but SetEtcHostsEntry does not, as it will always write the new entry at the end of the file. I personally dislike this as it "uglifies" my custom host files, so this patch makes it update the record in place, so to say, instead of...
Convert utils.nodesetup to utils.WriteFile(data=…)
It makes no sense to iteratively write the new etc/hosts file, as we can pre-compute the desired contents (neither the old nor the new versions are safe against concurrent changes anyway).
Signed-off-by: Iustin Pop <iustin@google.com>...
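As an illustration of pre-computing the contents while keeping the existing ordering, here is a small sketch. It is not the Ganeti implementation; the function name `set_hosts_entry` and the tab-separated layout are assumptions. The returned string would then be written in a single call.

```python
def set_hosts_entry(content, ip, hostname, aliases=()):
    """Return hosts-file text with the entry for `hostname` set to `ip`.

    An existing line naming `hostname` is replaced in place; if there is
    none, the new entry is appended at the end.
    """
    new_line = "\t".join([ip, hostname] + list(aliases))
    out = []
    written = False
    for line in content.splitlines():
        # Ignore trailing comments when looking for the host name
        fields = line.split("#", 1)[0].split()
        if len(fields) > 1 and hostname in fields[1:]:
            if not written:
                out.append(new_line)
                written = True
            continue  # drop any further duplicate entries
        out.append(line)
    if not written:
        out.append(new_line)
    return "\n".join(out) + "\n"
```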
Allow creating the DRBD metadev in a different VG
This is a simple change to allow specifying a different VG for the meta device during the creation of instances and addition of disks via gnt-instance modify.
Make _GenerateDRBD8Branch accept different VG names
This is a small change to make this function take a list of VG names, instead of a single one.
Fix WriteFile with unicode data
Unicode is fun, indeed:
>>> len(buffer("abc"))
3
>>> len(buffer(u"abc"))
12
So we can't pass unicode data to buffer(), as the result will be to write the in-memory (usually UTF-32) representation to disk.
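The session above is Python 2, where buffer() exists. The same idea in a Python 3 sketch: encode text explicitly before handing it to a byte-oriented write. The helper name `to_bytes` and the choice of UTF-8 are assumptions, not taken from the patch.

```python
def to_bytes(data, encoding="utf-8"):
    """Encode text explicitly; pass byte strings through unchanged."""
    if isinstance(data, str):
        return data.encode(encoding)
    return data


def write_data(path, data):
    """Write text or bytes to `path`; return the number of bytes written."""
    payload = to_bytes(data)
    with open(path, "wb") as f:
        f.write(payload)
    return len(payload)
```

With an explicit encoding, the length on disk depends on the chosen encoding rather than on the interpreter's internal string representation.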
Fix for multiple VGs - PlainToDrbd and replace-disks
Converting an instance from 'plain' to 'drbd'. The old code would create the drbd volumes in the default VG and then the renames would fail. This fix pulls the plain VG names from the existing volumes and...
Replace disks: keep the meta device in the same VG
This patch enhances the multi-VG support in replace disks, by keeping the meta device in the same VG, as opposed to moving it to the data device VG (note that we don't have a way to create the meta in a different VG in the first place, but at least we correctly handle a...
Fix punctuation in an error message
IIRC we don't use punctuation at the end of error messages.
Prevent readding of the master node
This breaks Ganeti in multiple ways. If we don't make the check in gnt-node itself, then bootstrap.SetupNodeDaemon will restart the master daemon, making the operation fail:
node1# gnt-node add --readd node1
Cannot communicate with the master daemon....
Improve error messages in cluster verify/OS
A few issues in the clarity of the error messages are fixed:
- "ERROR: node node3: OS API version lenny-image": no preposition between the parameter type and the OS name, changed to "for lenny-image"
- "API version lenny-image differs from reference node node1: 10, 5...
Fix potential data-loss in utils.WriteFile
os.write can do incomplete writes, as long as at least some bytes have been written (like write(2)):
>>> os.write(fd, " " * 1300)
1300
>>> os.write(fd, " " * 1300)...
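A common way to guard against short writes is to loop until every byte has been accepted. This is a sketch of that technique, not the code from the patch; the name `write_all` is hypothetical.

```python
import os


def write_all(fd, data):
    """Write all of `data` (bytes) to `fd`, retrying after short writes."""
    view = memoryview(data)
    offset = 0
    while offset < len(view):
        # os.write returns how many bytes were actually written,
        # which may be fewer than requested
        offset += os.write(fd, view[offset:])
    return offset
```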
RAPI: Add support for tagging node groups
gnt-group: Add commands for tagging groups
masterd: Add support for tagging node groups
cli: Fix wrong argument kind for groups
TLMigrateInstance: remove 10s sleeps
TLMigrateInstance._ExecMigration contains two 10-second sleeps between individual migration steps.
Apart from prolonging the migration duration by 20s, the second sleep causes FinalizeMigration to be called 10 seconds after the real...
Fix typo in LUGroupAssignNodes
gnt-instance info: automatically request locking
Commit dae661a4 added support for controlling the locking, but it didn't modify the gnt-instance info code, which leads to this command always showing:
Wed Apr 20 04:10:48 2011 - WARNING: Non-static data requested, locks...
Fix master IP activation in failover with no-voting
Thanks to net.for.hub@gmail.com for reporting this. The logic in masterd.CheckMasterd did an early return in case of no_voting, hence skipping the master IP activation. We just change the ifs to not return but simply continue through the function....
disk wiping: fix bug in chunk size computation
The current wipe_chunk_size computation is doing min(int_value, float_value). For small disks (below 10GiB), the actual formula will result in the float value being chosen. This results in very interesting behaviour:...
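To illustrate the kind of fix this implies, a sketch that always yields an integer chunk size. The constant names and values (a 1024 MiB cap and a 10% share) are assumptions chosen to match the "below 10GiB" remark; they are not quoted from the patch.

```python
MAX_WIPE_CHUNK = 1024        # MiB; assumed upper bound per wipe step
MIN_WIPE_CHUNK_PERCENT = 10  # assumed share of the disk wiped per step


def wipe_chunk_size(disk_size_mib):
    """Return the wipe chunk size in MiB, always as an integer."""
    by_percent = disk_size_mib * MIN_WIPE_CHUNK_PERCENT / 100.0
    # min() may pick the float; convert so callers never see a fraction
    return int(min(MAX_WIPE_CHUNK, by_percent))
```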
gnt-group list: Query filter support
gnt-node list: Query filter support
Update manpage, quote field names.
gnt-instance list: Query filter support
Fix bug in watcher
If “utils.RunParts” were to raise an exception, a log message was written and the code continued to run. Due to the exception the “results” variable would not be defined.
Also change the code to log a backtrace (getting an exception is rather...
cli: Add support for parsing query filters
cli: Add option to force names to be treated as filter
opcodes: Change parameter type definition for query filter
The old definition wouldn't accept integers.
cli: Error reporting for query filter parsing
qlang: Add function to distinguish filters from names
qlang: Add parser for query filter language
With this parser, command line utilities will be able to provide filters through query2 in a simplistic language. Example filters:
name "node3.example.com" master or (name "node4.example.com") be/memory == 128 and name =~ /^web/i...
Add instance query field for OS parameters
These were not available as a query field before. Update unittests and description text for the other “..params” fields.
Release locks before wiping disks during instance creation
Ganeti 2.3 introduced an optional feature to overwrite an instance's disks on creation. Unfortunately the code kept all locks while doing the wipe, slowing down the creation of multiple instances in parallel....
Fix shared_file_storage_dir on upgrades
If the cluster was upgraded from 2.4 or earlier, this key won't exist (it's only set to a correct value on cluster init), so we need to properly set it to a null string (disabled).
Prevent ssconf values from having non-string values
For whatever reason, my test cluster managed to acquire shared_file_storage_dir with a None value, instead of empty string. This is not flagged in masterd itself, but the node daemon will fail in writing the value to disk, as it calls len() on the...
cli: Replace hardcoded strings with constants
utils.WriteFile: Close file before renaming
Issue 154 (http://code.google.com/p/ganeti/issues/detail?id=154) reported an “Operation not supported” error when writing instance exports to a mounted CIFS filesystem. Experimentation showed the error to only occur when using rename(2) on an opened file. Various references...
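The ordering described here (close first, rename afterwards) can be sketched as follows. This is an illustration under assumptions, not the utils.WriteFile code; the function name `write_file_closed_before_rename` is hypothetical.

```python
import os
import tempfile


def write_file_closed_before_rename(path, data):
    """Write `data` (bytes) to a temporary file, close it, then rename it."""
    dir_name = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # The file is closed at this point, so rename(2) never sees
        # an open file descriptor for the source
        os.rename(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```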
Nicer formatting for group query error
Before this patch the message would look like “Some groups do not exist: [u'foo', u'bar']”, now it's “Some groups do not exist: foo, bar”.
Merge branch 'stable-2.4' into devel-2.4
LUInstanceQueryData: Don't acquire locks unless requested
Until now LUInstanceQueryData always acquired locks for the instance(s) and nodes involved. In combination with long-running operations this prevented the use of “gnt-instance info”, even with the “--static”...
gnt-instance migrate: Adding --allow-failover option
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
TLMigrateInstance: Merge failover code, allow fallback
As the failover code for checking is almost identical, it's an easy task to switch it over to TLMigrateInstance. This allows us to fall back to failover if migrate fails the prereq check for some reason....
Increase the lock timeouts before we block-acquire
This has been observed to cause problems on real clusters via the following mechanism:
- a long job (e.g. a replace-disks) is keeping an exclusive lock on an instance
- the watcher starts and submits its query instances opcode which...
utils: Add function generating regex for DNS name globbing
The intent of this function is to be able to provide a globbing operator for query filters. One should be able to say, for example, something to the effect of “gnt-instance shutdown '*.site'”.
Also rename a variable in MatchNameComponent....
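One way such a glob-to-regex conversion could look is sketched below. The exact semantics are assumptions made for illustration: `*` matches any run of characters within one DNS label, `?` matches a single such character, and everything else is literal. The function name `dns_glob_to_regex` is hypothetical.

```python
import re


def dns_glob_to_regex(pattern):
    """Translate a DNS name glob into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^.]*")   # any run of characters inside one label
        elif ch == "?":
            parts.append("[^.]")    # exactly one character inside a label
        else:
            parts.append(re.escape(ch))
    return "^" + "".join(parts) + "$"
```

Under these assumptions, `'*.site'` matches `web1.site` but not `a.b.site`, because the wildcard does not cross a dot.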
Verify file consistency using centrally computed list
Until now “gnt-cluster verify” (LUClusterVerify) would compute its own list of files to check for consistency. This list was not complete and certain inconsistencies were missed.
With this patch the code is changed to use the list of files used by...
cmdlib: Factorize computation of ancillary files
… and change the logic in _RedistributeAncillaryFiles. The virtually same list of files will be used to verify the files' consistency.
qlang: Remove OP_GLOB operator
It'll be implemented using OP_REGEXP by the parser.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
query: Add implementation of regex match operator
So far this operator was not implemented. This patch adds an additional value preparation function to the function table for binary operators, used to compile the regular expression. Unittests are included....
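The idea of a function table with an optional value preparation step can be sketched like this. The table layout and the names `BINARY_OPS` and `compile_filter` are hypothetical and only illustrate the technique; they are not the query module's API.

```python
import operator
import re

# operator -> (value preparation function or None, comparison function)
BINARY_OPS = {
    "==": (None, operator.eq),
    "=~": (re.compile, lambda value, rx: rx.search(value) is not None),
}


def compile_filter(op, field, value):
    """Build a predicate for `field`, preparing `value` only once."""
    prepare, compare = BINARY_OPS[op]
    if prepare is not None:
        # e.g. compile the regular expression up front, not per item
        value = prepare(value)
    return lambda item: compare(item[field], value)
```

Preparing the value when the filter is built means the regular expression is compiled once, however many items are checked afterwards.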
cmdlib: Fix mistake made in commit 75c7520f0
Commit 75c7520f0 used the wrong constant. I double-checked all other changes made in the commit.
cmdlib: Replace hardcoded values with constants
daemon.py: move startup log message before prep_fn
Before this, the output in the rapi daemon log was:
2011-04-04 03:09:51,026: ganeti-rapi pid=17447 INFO Reading users file at /var/lib/ganeti/rapi/users
2011-04-04 03:09:51,027: ganeti-rapi pid=17447 INFO ganeti-rapi daemon...
Display the actual memory values in N+1 failures
This changes the display from:
Mon Apr 4 02:29:46 2011 * Verifying N+1 Memory redundancy
Mon Apr 4 02:29:46 2011 - ERROR: node node2: not enough memory to accomodate instance failovers should node node1 fail...
RAPI: Convert instance shutdown to the new FillOpCode
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
ssh.VerifyNodeHostname: remove the quiet flag
This is not needed for this function, and can interfere with debugging of ssh failures.
Add a simple wrapper over utils.Retry
The new wrapper makes moving legacy code to utils.Retry or adding retries in existing code simpler.
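For readers unfamiliar with the pattern, a generic stand-in for such a wrapper is sketched below. It does not reproduce the signature of utils.Retry or of the wrapper added by this commit; the name `simple_retry` and all parameters are assumptions.

```python
import time


def simple_retry(expected, fn, delay, timeout, args=(),
                 sleep=time.sleep, clock=time.monotonic):
    """Call fn(*args) until it returns `expected` or `timeout` seconds pass.

    The last result is returned either way, so legacy callers can keep
    checking the return value as before.
    """
    deadline = clock() + timeout
    while True:
        result = fn(*args)
        if result == expected or clock() >= deadline:
            return result
        sleep(delay)
```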
Automatically enable hail if enabled and found
Expose whether htools was enabled to Python code
This exports whether htools was enabled at configure-time, and adds a constant for our reference iallocator.
test.ganeti.process_unittest: Fix race condition
There was a race condition on heavily loaded test systems that caused the timeout unittests to fail randomly, as the signal handler was not yet set up when the timeout had already hit.
Therefore we introduce a workaround to wait until a program reached a...
RAPI client: Remove support for version 0 instance creation requests
RAPI server: Drop support for instance creation format 0
Ganeti 2.1.3, released in June 2010, added support for a new, extensible instance creation request format, called version 1. This patch removes support for the old and undocumented version 0 format....
Improved GanetiRapiClient docstrings
- Added @rtype and/or @return where missing
- Fixed @param for Query() filter_ parameter (colon was missing)
Signed-off-by: Simeon Miteff <simeon.miteff@gmail.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Relax instance ERROR on admin_down on offline node
This fixes an issue where a stopped instance is reported as ERROR in cluster verify if it lives on an offline node. As the instance is down, this shouldn't happen.
Signed-off-by: René Nussbaumer <rn@google.com>...
Implement submitting jobs from logical units
The design details can be seen in the design document (doc/design-lu-generated-jobs.rst).
watcher: improve logging a bit
Add some debug logging to detail why we don't run some steps.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Fix output for “gnt-job info”
If the result of an opcode was a non-empty dictionary, it would be impossible to differentiate between input and result:
Input fields: […] debug_level: 0 fields: cluster_name,master_node,volume_group_name jobs: [[True, u'37922'], [True, u'37923'], [True, u'37924']]...
Rewrite of ensure-dirs in python
I provided unittests to test the important pieces of the infrastructure. The one remaining function (RecursiveEnsure) is not easy to unittest but also not critical if it fails to operate correctly.
Add opcode summary to SubmitManyJobs errors
Requested-by: Iustin Pop <iustin@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
RAPI client: Tidy and test WaitForJobCompletion
- Use constants
- Don't sleep if no delay is given
- Mark function as deprecated: it uses polling instead of waiting for changes (but the latter needs authentication); it can still be used
- Add unittests...
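A polling loop along the lines of the points above might look like this sketch. It is not the RAPI client's method: the status is fetched through a caller-supplied function, and the constant names, status strings and parameters are assumptions for illustration.

```python
import time

# Assumed job status strings, named as constants instead of literals
JOB_STATUS_SUCCESS = "success"
JOB_STATUS_ERROR = "error"
JOB_STATUS_CANCELED = "canceled"
JOB_STATUS_FINALIZED = frozenset(
    [JOB_STATUS_SUCCESS, JOB_STATUS_ERROR, JOB_STATUS_CANCELED])


def wait_for_job_completion(get_status, period=5, retries=-1,
                            sleep=time.sleep):
    """Poll get_status() until the job is finalized.

    Returns True on success, False on error/cancel or when `retries`
    polls have been used up (a negative value means no limit).
    """
    while retries != 0:
        status = get_status()
        if status == JOB_STATUS_SUCCESS:
            return True
        if status in JOB_STATUS_FINALIZED:
            return False
        if period:
            sleep(period)  # skipped entirely when no delay is given
        if retries > 0:
            retries -= 1
    return False
```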
RAPI client: Add job status constants
RAPI client: Job IDs are strings
Split BuildHooksEnv of LUs
Commit dd7f677623 added another call to BuildHooksEnv to provide post-phase status variables. Since BuildHooksEnv also built the node lists, that meant they have to be built twice. First a rather strict check was used, but it turned out to be more tricky. Commit b423c51336...
RAPI client: fix epydoc formatting
Add a helper function to the RAPI client
This adds a new method WaitForJobCompletion that can be used for clients who are not interested in the entire job log, just in its completion status.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>...
Remove restrictive hook node list check
Commit dd7f67762 added a restrictive check for the node lists returned by BuildHooksEnv, leading to errors with some LUs, one of which was fixed in commit 0dfa2c227. As it turns out, other LUs have similar issues, some not easy to fix. This patch disables the restrictive check...
watcher: Fix misleading usage output
When “ganeti-watcher” is called with an argument, it would hint at a non-existing “-f” parameter. With this patch the separate usage string is no longer necessary.
Fix hook node list when adding node
This broke QA (and everyone trying to add a node) by complaining about different node lists.
Clarify --force-join parameter message
This isn't only used during cluster merge.
Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
hooks: Provide variables with post-opcode values
When a hook is called, it is provided with a number of variables describing the status of the instance/node/etc. before the operation. Some opcodes provide extra variables to see modified values from hooks,...
HooksMaster: Add more assertions for variable names
Also replace explicit loop with dict.update.
mcpu: Tidy HooksMaster a bit
- Dictionary indentation
- Add empty lines for readability
- Simplify conditional code
cmdlib: Factorize running post-phase hook
locking: Fix race condition in lock monitor
In some rare cases it can happen that a lock is re-created very soon after deletion, while the old instance hasn't been destructed yet. In such a case the code would detect a duplicate name and raise an exception....
qlang: Remove unused import
RAPI: Add support for querying resources
- Access is only permitted for authenticated clients (queries can return sensitive data)
- Filters can be specified when sending a PUT request
- Updates RAPI client, documentation and tests
Add support for query resources in RAPI URIs