Remove last use of utils.RunCmd from the watcher
The watcher has one last use of ganeti commands as opposed to sendingrequests via luxi. The patch changes this to use the cli functions.
The patch also has two other changes: - fix the docstring for OpVerifyDisks (found out while converting...
Add cluster options from ssconf to configuration
ssconf will become write-only from ganeti-masterd's point of view,therefore all settings in there need to go into the main configurationfile.
Reviewed-by: iustinp
Move instantiation of config into bootstrap.py
Future patches will add even more variables to the cluster config.Adding more parameters wouldn't make the function easier to use andit doesn't make sense to pass them to another function, as it'sonly done once in bootstrap.py on cluster initialization....
Change the results from cli.PollJob
Curently PollJob accepts a generic job, but will return (historyartifact) only the first opcode result. This is wrong, as it doesn'tallow polling of a job with multiple results.
Its only caller (for now) is also changed, so no functional changes...
Enhance the job-related timestamps
This patch adds start, stop, and received timestamp for jobs (and allowsquerying of them), and allows querying of the opcode timestamps.
Reviewed-by: imsnah
Abstract the timestamp formatting into cli.py
Currently we format the timestamp inside the gnt-job info function. Wewill need this more times in the future, so move it to cli.py as aseparate, exported function.
Add opcode execution log in job info
This patch adds the job execution log in “gnt-job info” and also allowsits selection in “gnt-job list” (however here it's not very useful asit's not easy to parse). It does this by adding a new field in the queryjob call, named ‘oplog’....
Move a hardcoded constant to constants.py
For now we only use the ‘C’ protocol so we can put it in constants.pyinstead of hardcoding it.
Enable the use of shared secrets
This patch enables the use of the shared secrets for DRBD8 disks, using(hardcoded in constants.py) the md5 digest algorithm.
For making this more flexible, either we implement a cluster parameter(once the new model is in place), or we can make it ./configure-time...
Extend DRBD disks with shared secret attribute
This patch, which is similar to r1679 (Extend DRBD disks with minorsattribute), extends the logical and physical id of the DRBD disks with ashared secret attribute. This is generated at disk creation time and...
Implement job summary in gnt-job list
It is not currently possibly to show a summary of the job in the outputof “gnt-job list”. The closes is listing the whole opcode(s), but thatis too verbose. Also, the default output (id, status) is not veryuseful, unless one looks for (and knows about) an exact job ID....
Nicely sort the job list
Unless we decide to change the job identifiers to integer, we should atleast sort the list returned by _GetJobIDsUnlocked.
Move the pseudo-secret generation to utils.py
The bootstrap code needs a pseudo-secret and this is currently generatedinside the InitGanetiServerSetup function. Since more users will needthis, move it to utils.py
Reviewed-by: ultrotter
Fix a bug related to static minors
When the node does not yet have any minors allocated, the first minor(0) will not be entered in the ConfigWriter._temporary_drbds structure.This does not happen for our current usage, since we always ask for twominors (so the next call will not match this case), but it will be...
Add checks for tcp/udp port collisions
In case the config file is manually modified, or in case of bugs, thetcp/udp ports could be reused, which will create various problems(instances not able to start, or drbd disks not able to communicate).
This patch extends the ConfigWriter.VerifyConfig() method (which is used...
Update the cluster serial_no on certain operations
This patch adds update of the cluster serial number for: - add/remove node (as the cluster's node list is changed) - add/remove/rename instance (as the cluster's instance list is changed) - change the volume group name...
Allow listing of the serial_no via gnt-* list
This patch adds listing of the serial_no attribute in gnt-instance andgnt-node list, and updates to the manpages to reflect the change.
Initialize and update the serial_no on objects
This patch add initialization of the serial_no on instance and nodes,and update of the field whenever an object is updated in the genericcase, via ConfigWriter.Update(obj) and in the specific case ofinstances' state being modified manually....
Switch the global serial_no to the top object
Currently the serial_no that is incremented every time the configurationfile is written is located on the 'cluster' object in the configurationstructure. However, this is wrong as the cluster serial_no should be...
Add serial_no attributes to objects
This patch adds the ‘serial_no’ attribute to the other top-level objects(the configuration object itself, the nodes and the instances).
Replace a cfg.AddInstance with UpdateInstance
This seems to be the last (deprecated) use of AddInstance in order toupdate an instance.
The patch also removes a whitespace-at-eol case.
Fix iallocator name
port forward of patch from revision 1690 with following message:
Patch on revision 1686 used the wrong field: ial.name, which is the instancename and not the iallocator name. self.op.iallocator is the right field.
Sorry for this inconvenience....
Fix a broken format string
This patch fixes a broken format string. It's expecting 3 parameters, but onlygets 2. This change will add the missing parameter. This is a forward-portof the fix in Ganeti 1.2
Switch config.py to logging
A couple of more modules are using the obsolete logger functions, configbeing one of them.
Switch to static minors for DRBD
With some todos remaining, this patch switches the DRBD devices to usethe passed minors, and the cmdlib code (add instance and replace disks)to request and assign minors to the DRBD disks.
Todos: - look at the disk RPC calls to see which can be optimized away, since...
Implement config support for drbd static minors
This patch adds support for allocating static minors.
Like for the LVM uuids, we add a new cache for the temporarily allocatedrequests, and the users of the new methods must manually clear thecache. If this doesn't happen, at worst we lose some minors....
Fix disk replace secondary with static minors
The code in 'updating instance configuration' section of the replacedisks with change secondary node was setting a wrong new logical_id forthe drbd devices (only set the new node, not the new minor). The patch...
Extend DRBD disks with minors attribute
This patch converts the DRBD disks to contain also a minor (per eachnode) attribute. This minor is not yet used and is always initializedwith None, so the patch does not have any real-world impact - except forautomatically upgrading config files (it adds the minors as None, None)....
Apply filter properly in LUQuery{Nodes, Instances}
Currently when not locking all nodes/instances are returned, regardlessif the user asked only for some of them. With this patch we return tothe previous behaviour: - if no names are specified return info on all current ones...
Remove auto_balance from burnin/cmdlib
There is no such feature in trunk yet.
Add utils.ReadFile function
It abstracts exception handling and is like a complement toutils.WriteFile.
GetAllInstancesInfo, change internal iterator name
GetAllInstancesInfo used "node" as an iterator name. Change it toinstance to make it less confusing.
Parallelize Tag operations
For now we lock the instance/node for adding/deleting tags from it, butwe could probably in the future do without, with more support from theconfig for atomic operations.
Parallelize LUSetClusterParams (and add a FIXME)
Parallelize LURemoveExport
Parallelize LURemoveInstance
Using the new add/remove infrastructure this becomes pretty easy! :)
Parallelize LUCreateInstance
Finally, instance create on different node, without iallocator, can runin parallel. Iallocator usage still needs all nodes to be locked,unfortunately. As a bonus most checks which could have been moved toExpandNames, before any locking is done....
Implement adding/removal of locks by declaration
With this patch LUs can declare locks to be added when they start and/orremoved after they finish. For now locks can only be added in theacquired state, and removed if owned, and added locks default to be...
LockSet: forbid add() on a partially owned set
This patch bans add() on a half-acquired set. This behavior waspreviously possible, but created a deadlock if someone tried to acquirethe set-lock in the meantime, and thus is now forbidden. ThetestAddRemove unit test is fixed for this new behavior, and includes a...
Fix typo in a locking.py comment
Use is_owned to determine whether to unlock
Now that is_owned is public we don't need to play games at the end of anLU. If we're still owning anything we just release it.
Add GanetiLockManager.is_owned function
This is a public version of the private function we already had.We don't just change the previous version because it had lots of usersin the library itself and in the testing code.
Fix LockSet._names() to work with the set-lock
If the set-lock is acquired, currently, the _names function will fail ona double acquire of a non-recursive lock. This patch fixes the behavior,and some lines of code added to the testAcquireSetLock test check that...
jqueue: Add common RPC error handling function
We didn't decide yet what exactly it should do with failed nodes.
Remove locking of instances in certain queries
This patch is similar to the node patch (rev 1650). We disable lockingof instance (and nodes) if we only query static information.
Add an atomic ConfigWrite.GetAllInstanceInfo()
In order to be able to query instance without locking them, we need thesame atomic query of multiple instances as for nodes.
Add ConfigWriter._UnlockedGetInstanceList/Info()
This patch splits the GetInstanceInfo and GetInstanceList methods intotwo parts, one locked one _Unlocked similar to the way nodes arequeried.
Rewrite the 'only submit job' handling in scripts
The "sys.exit(0)" was not nice as you couldn't differentiate it fromother exit codes. We change this to a specially defined exception forthis, so that multi-opcode commands can handle this nicely.
Optimize the OpQueryNodes for names only
Currently, OpQueryNodes is locking all nodes (in shared mode), whichwill also block the special case of querying only for the node names(this is needed for gnt-cluster command, for example). There is nological requirement to not give the administrator enough power if she/he...
Add a way to export all node information at once
The patch adds a new function to export all node information at once(i.e. atomically with respect to the configuration lock).
Never remove job queue lock in node daemon
Otherwise, corruption could occur in some corner cases. E.g. whenLeaveNode is running in a child and is in the process of removingqueue files, the main process gets killed, started again and getsa request to update the queue. This is rather extreme corner case,...
Export backend.GetMasterInfo over the rpc layer
We create a multi-node call so that querying all nodes for agreementwill be fast.
Change backend._GetMasterInfo to return more data
The _GetMasterInfo() function needs to export the master name too to beuseful in master safety checks. This patch makes it a public (no _)function and adds a third element in the return tuple. Its callers are...
Parallelize LUQueryInstanceData
Parallelize LUVerify{Cluster,Disks}
These are two easy querying LUs which require shared access to allnodes/instances.
Parallelize LUReplaceDisks
This is the most complex parallelization so far. We have to lock oneinstance (and its nodes) plus one more node if doing a remote replace,or all nodes if doing a remote replace with iallocator.
_LockInstancesNodes: support append mode
This will be used to lock the instance's nodes in addition to some more.
Processor: remove ChainOpCode
This function was incompatible with the new locking system, and itsusage has been removed from the code. For now LUs share code by callingcommon module-private functions in cmdlib.py, in the future they willuse tasklets (when those will be implemented)....
Parallelize LU{A,Dea}ctivateInstanceDisks
Now that they are not used in other opcodes by chaining,this can easily be done.
LUReplaceDisks: remove use of ChainOpCode
The calls to OpActivateInstanceDisks and OpDeactivateInstanceDisks hasbeen replaced by _StartInstanceDisks and _SafeShutdownInstanceDisksrespectively. This is the last usage of ChainOpCode.
Create new _SafeShutdownInstanceDisks function
This new function checks whether an instance is running, before shuttingdown its disks. This is what the Exec() of LUDeactivateInstanceDisksdid, so that is replaced by a call to this function.
Fix a typo in LogicalUnit.ExpandNames docstring
s/locking.LEVEL_INSTANCES/locking.LEVEL_INSTANCE/
Use constants.LOCKS_REPLACE instead of hardcoding
This constant replaces what we used to write in recalculate_locks, andrepresents the lock recalculation mode. It lives in constants.py becauseit's used only in cmdlib, and thus doesn't deal with the locking library...
Fix LUReplaceDisks with iallocator
self._RunAllocator() sets self.op.remote_node, but doesn't return thenew remote node. If we set it to the return value of the function webasically reset it to None, and iallocator is never run.
Fix LUGrowDisk
The rpc library returns a list, not a tuple, so we'll accept both.
Fix iallocator run
Parallelize LUExportInstance
Unfortunately for the first version we need to lock all nodes. The patchdiscusses why this is and discuss ways to improve this in the future.
Parallelize LUGrowDisk
LURebootInstance: lock only primary when possible
When rebooting an instance and we're not changing it's disks status (allthe cases except in a "full" reboot) we can lock just its primary node.
Add primary_only flag to _LockInstancesNodes
As the name says when the flag is on (the default is off) only theprimary nodes are locked, as opposed to all of them.
utils.FileLock: Implement timeout
The timeout can be used in ganeti-noded to be more robust againstdeadlocks.
Add locking.ALL_SET constant and use it
Rather than specifying None in needed_locks every time, with a nicecomment saying to read what we mean rather than what we write, and thatNone actually means All, in our magic world, we'll hide this secretunder the ALL_SET constant in the locking module, which has value, you...
utils.SplitTime: More rounding fixes
SplitTime didn't round the same on different platforms. This patch changesit to use microseconds and not care about rounding.
Prevent mistakes using _GetWantedNodes
All the users of _GetWantedNodes have been converted to be concurrentLUs, and thus cannot call this function with an empty list of nodesanymore. This patch makes this restriction a part of the functionitself. This prevents mistakes in new concurrent LUs, and creates more...
Paralleliza LUQueryNodeVolumes and LUQueryExports
Parallelize LUDiagnoseOS
LUQueryExports: make 'node' field mandatory
It turns out this fields was already mandatory. If it hadn't beed valid,in fact, a value of None would have been passed to _GetWantedNodes whichwould have thrown an exception.
s/Chain(OpQueryExports)/rpc.call_export_list(...)/
Parallel opcodes are not (yet?) supported for chaining. Turns outthough that chaining is used only four times in the code, and twice it'sfor querying exports. But what's the need to chain the full opcode, when...
Fix wrong indentation in LUQueryNodes
Merge r1607 from branches/ganeti/ganeti-1.2
Use a default vnc_bind_address if None is specified
merge r1568 from branches/ganeti/ganeti-1.2
Add more fields to gnt-instance list
merge r1548 from branches/ganeti/ganeti-1.2
Fix wrong wording of instance rename error message.
merge r1541 from branches/ganeti/ganeti-1.2
more information for VNC console port
merge r1540 from branches/ganeti/ganeti-1.2
Allow access to HVM serial console
merge r1539 from branches/ganeti/ganeti-1.2
Display VNC console port in gnt-instance info.
merge r1538 from branches/ganeti/ganeti-1.2
Check HVM device type on instance modify as well.
Check memory size before setting it
With this change when a user asks for a new memory size for an instance,the number is checked instead of just applied. The operation fails onlyif the instance would not be able to restart on its primary node, butgenerates warnings should it be impossible to failover the instance or...
Pass the force param to SetInstanceParms
It was already allowed in gnt-instance modify, but ignored.It will be used to force skipping parameter checks.
This is a forward-port from branches/ganeti-1.2
Original-Reviewed-by: imsnahReviewed-by: iustinp
Merge r1536 from branches/ganeti/ganeti-1.2
Add HVM device type flags 2/3
utils.SplitTime: Fix rounding of milliseconds
Reported by Iustin.
It used to return this:
utils.SplitTime(1234.999999999999)
(1234, 1000)
while it should've returned this:
(1235, 0)
merge r1535 from branches/ganeti/ganeti-1.2
Add HVM device type flags 1/4
Make WaitForJobChanges deal with long jobs
This patch alters the WaitForJobChanges luxi-RPC call to have aconfigurable timeout, so that the call behaves nicely with long jobsthat have no update.
We do this by adding a timeout parameter in the RPC call, and returning...
merge r997 from branches/ganeti/ganeti-1.2
Fix gnt-instance modify for HVM parameters
This patch makes gnt-instance modify work again for the advancedHVM parameters after it was broken by other changes.
Fix error message when masterd is not listening
Fix issue when acquiring empty lock sets
By design if an empty list of locks is acquired from a set, no locks areacquired, and thus release() cannot be called on the set. On the otherhand if None is passed instead of the list, the whole set is acquired,...
jqueue: Replace normal cache dict with weakref dict
A job should only exist once in memory. After the cache is cleaned,there can still be references to a job somewhere else. If thereare multiple instances, one can get updated while a function iswaiting for changes on another instance. By using...
jqueue: Keep timestamp of opcode start and end
jqueue: Reset run_op_idx after job is done
It can be confusing otherwise.
Fix a small typo in a constant
Seems noone ran a burnin lately :)
Reviwed-by: amischenko,ultrotter
Make sure that client programs get all messages
This is a large patch, but I can't figure out how to split it withoutbreaking stuff. The old way of getting messages by always getting thelast one didn't bring all messages to the client if they were added...