Guido Trotter [Thu, 11 Sep 2008 09:43:25 +0000 (09:43 +0000)]
Add GanetiLockManager.is_owned function
This is a public version of the private function we already had.
We don't just change the previous version because it had lots of users
in the library itself and in the testing code.
Reviewed-by: imsnah
Guido Trotter [Thu, 11 Sep 2008 09:43:04 +0000 (09:43 +0000)]
Fix LockSet._names() to work with the set-lock
If the set-lock is acquired, currently, the _names function will fail on
a double acquire of a non-recursive lock. This patch fixes the behavior,
and some lines of code added to the testAcquireSetLock test check that
this and other functions behave properly.
Reviewed-by: imsnah
Iustin Pop [Thu, 11 Sep 2008 08:25:33 +0000 (08:25 +0000)]
Add gnt-instance (start|stop) --submit
Finish the --submit changes with these two, which (because they are
multi-opcode commands) require special handling.
Reviewed-by: ultrotter
Michael Hanselmann [Wed, 10 Sep 2008 17:46:47 +0000 (17:46 +0000)]
jqueue: Add common RPC error handling function
We haven't decided yet what exactly it should do with failed nodes.
Reviewed-by: ultrotter
Iustin Pop [Wed, 10 Sep 2008 17:07:17 +0000 (17:07 +0000)]
Remove locking of instances in certain queries
This patch is similar to the node patch (rev 1650). We disable locking
of instances (and nodes) if we only query static information.
Reviewed-by: ultrotter
Iustin Pop [Wed, 10 Sep 2008 17:07:03 +0000 (17:07 +0000)]
Add an atomic ConfigWriter.GetAllInstanceInfo()
In order to be able to query instances without locking them, we need the
same atomic query of multiple instances as for nodes.
Reviewed-by: ultrotter
Iustin Pop [Wed, 10 Sep 2008 17:06:50 +0000 (17:06 +0000)]
Add ConfigWriter._UnlockedGetInstanceList/Info()
This patch splits the GetInstanceInfo and GetInstanceList methods into
two parts, one locked and one _Unlocked, similar to the way nodes are
queried.
Reviewed-by: ultrotter
Iustin Pop [Wed, 10 Sep 2008 17:06:39 +0000 (17:06 +0000)]
Do not use jobs in gnt-instance _ExpandNames()
In the gnt-instance script, _ExpandNames() uses jobs to query instance
names. This is not optimal, so we change it to use queries.
Reviewed-by: ultrotter
Iustin Pop [Wed, 10 Sep 2008 17:06:27 +0000 (17:06 +0000)]
Implement "--submit" on gnt-instance
This patch adds support for the “--submit” parameter in the gnt-instance
script, for the commands where it makes sense.
Reviewed-by: ultrotter
Iustin Pop [Wed, 10 Sep 2008 15:43:27 +0000 (15:43 +0000)]
Rewrite the 'only submit job' handling in scripts
The "sys.exit(0)" was not nice as you couldn't differentiate it from
other exit codes. We change this to a specially defined exception for
this, so that multi-opcode commands can handle this nicely.
Reviewed-by: imsnah
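A minimal sketch of the idea described above, not the actual Ganeti code: a dedicated exception replaces sys.exit(0), so a multi-opcode command can tell "job was only submitted" apart from a real exit. The names JobSubmittedException, submit_or_run and run_command are assumptions made for illustration.

```python
class JobSubmittedException(Exception):
    """Hypothetical exception carrying the ID of a job that was only submitted."""

    def __init__(self, job_id):
        Exception.__init__(self, job_id)
        self.job_id = job_id


def submit_or_run(job_id, submit_only):
    """Hypothetical helper: stop after submission when submit_only is set."""
    if submit_only:
        # Unlike sys.exit(0), this can be caught and inspected by the caller.
        raise JobSubmittedException(job_id)
    return "result of job %s" % job_id


def run_command(job_ids, submit_only):
    """A multi-opcode command handles the exception once per job and carries on."""
    submitted = []
    results = []
    for job_id in job_ids:
        try:
            results.append(submit_or_run(job_id, submit_only))
        except JobSubmittedException as err:
            submitted.append(err.job_id)
    return submitted, results
```

With sys.exit(0) the loop above would end after the first job; with an exception every job in the command gets submitted.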
Iustin Pop [Wed, 10 Sep 2008 11:03:00 +0000 (11:03 +0000)]
Optimize the OpQueryNodes for names only
Currently, OpQueryNodes is locking all nodes (in shared mode), which
will also block the special case of querying only for the node names
(this is needed for gnt-cluster command, for example). There is no
logical requirement to not give the administrator enough power if she/he
knows what to do, so it would be logical that querying the node names
works without a lock.
The patch changes the LUQueryNodes.ExpandNames to only request locking
when the selected fields contain dynamic attributes, and a small change
in the Exec() method to use the new, atomic query method from
ConfigWriter.
Reviewed-by: ultrotter
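The decision described above can be sketched as a set difference between the selected fields and the fields known to be static. The field names and the helper needs_node_locks are assumptions for illustration; the real split between static and dynamic fields lives in cmdlib.

```python
# Assumed example of fields answerable from the configuration alone.
STATIC_FIELDS = frozenset(["name", "pip", "sip"])


def needs_node_locks(selected_fields):
    """Return True if any selected field is dynamic and so requires locking."""
    return bool(set(selected_fields) - STATIC_FIELDS)
```

A names-only query such as needs_node_locks(["name"]) then runs without locks, while any query that includes a live value (for example a hypothetical "mfree" field) still takes them.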
Iustin Pop [Wed, 10 Sep 2008 11:02:45 +0000 (11:02 +0000)]
Add a way to export all node information at once
The patch adds a new function to export all node information at once
(i.e. atomically with respect to the configuration lock).
Reviewed-by: ultrotter
Michael Hanselmann [Tue, 9 Sep 2008 12:57:15 +0000 (12:57 +0000)]
ganeti-noded: Add constant for queue lock timeout
Reviewed-by: iustinp
Michael Hanselmann [Tue, 9 Sep 2008 12:47:16 +0000 (12:47 +0000)]
Never remove job queue lock in node daemon
Otherwise, corruption could occur in some corner cases. E.g. when
LeaveNode is running in a child and is in the process of removing
queue files, the main process gets killed, started again and gets
a request to update the queue. This is a rather extreme corner case,
but we should opt for safety.
Reviewed-by: iustinp
Iustin Pop [Tue, 9 Sep 2008 12:25:01 +0000 (12:25 +0000)]
Implement master startup safety check
This is an initial version of the master startup checks. It's a very
rudimentary change, however in normal usage (an old master was started,
the rest of the cluster is functioning normally) it will succeed in
preventing wrong startups.
Reviewed-by: imsnah
Iustin Pop [Tue, 9 Sep 2008 12:24:50 +0000 (12:24 +0000)]
Export backend.GetMasterInfo over the rpc layer
We create a multi-node call so that querying all nodes for agreement
will be fast.
Reviewed-by: imsnah
Iustin Pop [Tue, 9 Sep 2008 12:24:12 +0000 (12:24 +0000)]
Change backend._GetMasterInfo to return more data
The _GetMasterInfo() function needs to export the master name too to be
useful in master safety checks. This patch makes it a public (no _)
function and adds a third element in the return tuple. Its callers are
modified too.
Reviewed-by: imsnah
Guido Trotter [Tue, 9 Sep 2008 10:42:16 +0000 (10:42 +0000)]
Parallelize LUQueryInstanceData
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:42:03 +0000 (10:42 +0000)]
Parallelize LUVerify{Cluster,Disks}
These are two easy querying LUs which require shared access to all
nodes/instances.
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:41:50 +0000 (10:41 +0000)]
Parallelize LUReplaceDisks
This is the most complex parallelization so far. We have to lock one
instance (and its nodes) plus one more node if doing a remote replace,
or all nodes if doing a remote replace with iallocator.
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:41:38 +0000 (10:41 +0000)]
_LockInstancesNodes: support append mode
This will be used to lock the instance's nodes in addition to some more.
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:41:25 +0000 (10:41 +0000)]
Processor: remove ChainOpCode
This function was incompatible with the new locking system, and its
usage has been removed from the code. For now LUs share code by calling
common module-private functions in cmdlib.py, in the future they will
use tasklets (once those are implemented).
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:41:10 +0000 (10:41 +0000)]
Parallelize LU{A,Dea}ctivateInstanceDisks
Now that they are not used in other opcodes by chaining,
this can easily be done.
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:40:58 +0000 (10:40 +0000)]
LUReplaceDisks: remove use of ChainOpCode
The calls to OpActivateInstanceDisks and OpDeactivateInstanceDisks have
been replaced by _StartInstanceDisks and _SafeShutdownInstanceDisks
respectively. This is the last usage of ChainOpCode.
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:40:39 +0000 (10:40 +0000)]
Create new _SafeShutdownInstanceDisks function
This new function checks whether an instance is running, before shutting
down its disks. This is what the Exec() of LUDeactivateInstanceDisks
did, so that is replaced by a call to this function.
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:40:27 +0000 (10:40 +0000)]
Fix a typo in LogicalUnit.ExpandNames docstring
s/locking.LEVEL_INSTANCES/locking.LEVEL_INSTANCE/
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 10:40:09 +0000 (10:40 +0000)]
Use constants.LOCKS_REPLACE instead of hardcoding
This constant replaces what we used to write in recalculate_locks, and
represents the lock recalculation mode. It lives in constants.py because
it's used only in cmdlib, and thus doesn't deal with the locking library
by itself.
Reviewed-by: iustinp
Guido Trotter [Tue, 9 Sep 2008 09:39:25 +0000 (09:39 +0000)]
Fix LUReplaceDisks with iallocator
self._RunAllocator() sets self.op.remote_node, but doesn't return the
new remote node. If we set it to the return value of the function we
basically reset it to None, and iallocator is never run.
Reviewed-by: imsnah
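The bug described above is easy to reproduce in isolation: a method that stores its result in an attribute and returns nothing yields None, so assigning its return value back to that attribute erases the result. The class and node name below are made up for illustration and are not the real LU code.

```python
class Op(object):
    """Stand-in for an opcode with a remote_node attribute."""
    remote_node = None


class ReplaceDisksSketch(object):
    def __init__(self):
        self.op = Op()

    def _run_allocator(self):
        # Sets the attribute as a side effect and implicitly returns None.
        self.op.remote_node = "node3.example.com"

    def buggy(self):
        # The allocator's choice is overwritten with the None return value.
        self.op.remote_node = self._run_allocator()
        return self.op.remote_node

    def fixed(self):
        # Call for the side effect only and keep what the allocator stored.
        self._run_allocator()
        return self.op.remote_node
```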
Michael Hanselmann [Tue, 9 Sep 2008 09:01:53 +0000 (09:01 +0000)]
Use lock timeout for queue updates in ganeti-noded
This helps to prevent complete deadlocks.
Reviewed-by: iustinp
Guido Trotter [Mon, 8 Sep 2008 15:54:03 +0000 (15:54 +0000)]
Fix LUGrowDisk
The rpc library returns a list, not a tuple, so we'll accept both.
Reviewed-by: iustinp
Guido Trotter [Mon, 8 Sep 2008 15:53:16 +0000 (15:53 +0000)]
Fix iallocator run
The rpc library returns a list, not a tuple, so we'll accept both.
Reviewed-by: iustinp
Guido Trotter [Mon, 8 Sep 2008 15:53:01 +0000 (15:53 +0000)]
OpVerifyDisks returns a list, not a tuple
Fixing the check in gnt-cluster, or gnt-cluster verify-disks is broken.
Since the version in 1.2 used to return a tuple we'll accept both.
Reviewed-by: iustinp
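The three fixes above share one pattern: a check written for tuples has to accept lists as well. A sketch of such a check follows; the helper name is_valid_result and the length argument are assumptions, not the actual code in cmdlib or gnt-cluster.

```python
def is_valid_result(result, expected_len):
    """Accept both tuples (as in 1.2) and lists (as returned via the rpc layer)."""
    return isinstance(result, (tuple, list)) and len(result) == expected_len
```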
Guido Trotter [Mon, 8 Sep 2008 13:44:24 +0000 (13:44 +0000)]
Parallelize LUExportInstance
Unfortunately for the first version we need to lock all nodes. The patch
discusses why this is and ways to improve it in the future.
Reviewed-by: iustinp
Guido Trotter [Mon, 8 Sep 2008 13:44:10 +0000 (13:44 +0000)]
Parallelize LUGrowDisk
Reviewed-by: iustinp
Guido Trotter [Mon, 8 Sep 2008 13:43:56 +0000 (13:43 +0000)]
LURebootInstance: lock only primary when possible
When rebooting an instance without changing its disks' status (all
cases except a "full" reboot), we can lock just its primary node.
Reviewed-by: iustinp
Guido Trotter [Mon, 8 Sep 2008 13:43:43 +0000 (13:43 +0000)]
Add primary_only flag to _LockInstancesNodes
As the name says, when the flag is on (the default is off), only the
primary nodes are locked, as opposed to all of them.
Reviewed-by: iustinp
Michael Hanselmann [Fri, 5 Sep 2008 15:38:54 +0000 (15:38 +0000)]
utils.FileLock: Implement timeout
The timeout can be used in ganeti-noded to be more robust against
deadlocks.
Reviewed-by: iustinp
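One common way to give a file lock a timeout is to request it in non-blocking mode and retry until a deadline passes. The sketch below shows that technique with fcntl.flock; it is not the utils.FileLock implementation, and the names LockTimeout, lock_with_timeout and demo are assumptions.

```python
import errno
import fcntl
import os
import tempfile
import time


class LockTimeout(Exception):
    """Hypothetical error raised when the lock is not obtained in time."""


def lock_with_timeout(fd, timeout, poll_interval=0.05):
    """Try to take an exclusive flock on fd, giving up after timeout seconds."""
    deadline = time.time() + timeout
    while True:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return
        except (IOError, OSError) as err:
            # EAGAIN/EACCES mean "held by someone else"; anything else is fatal.
            if err.errno not in (errno.EAGAIN, errno.EACCES):
                raise
        if time.time() >= deadline:
            raise LockTimeout("could not lock within %s seconds" % timeout)
        time.sleep(poll_interval)


def demo():
    """Lock one file through two descriptors; the second attempt times out."""
    handle, path = tempfile.mkstemp()
    other = os.open(path, os.O_RDWR)
    try:
        lock_with_timeout(handle, timeout=1.0)
        try:
            lock_with_timeout(other, timeout=0.2)
        except LockTimeout:
            return True
        return False
    finally:
        os.close(other)
        os.close(handle)
        os.unlink(path)
```

A caller such as a daemon can then report an error instead of hanging forever when another process never releases the lock.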
Michael Hanselmann [Fri, 5 Sep 2008 13:49:50 +0000 (13:49 +0000)]
Add lock documentation for job queue and ganeti-noded
Also change title formatting to match client-api.txt.
Reviewed-by: iustinp
Michael Hanselmann [Fri, 5 Sep 2008 12:29:58 +0000 (12:29 +0000)]
noded: Get job queue lock while purging queue content
Only one process should modify the queue at the same time.
Reviewed-by: iustinp
Michael Hanselmann [Fri, 5 Sep 2008 12:19:17 +0000 (12:19 +0000)]
QA: Remove dry run mode
It didn't work as planned because some commands depend on the return
value or output of some operations.
Reviewed-by: iustinp
Guido Trotter [Fri, 5 Sep 2008 11:00:36 +0000 (11:00 +0000)]
Add locking.ALL_SET constant and use it
Rather than specifying None in needed_locks every time, with a nice
comment saying to read what we mean rather than what we write, and that
None actually means All, in our magic world, we'll hide this secret
under the ALL_SET constant in the locking module, which has value, you
guessed it, None. After that we'll substitute all usage in cmdlib.
Some comments and examples have been fixed as well.
Reviewed-by: iustinp
Michael Hanselmann [Fri, 5 Sep 2008 10:57:25 +0000 (10:57 +0000)]
utils.SplitTime: More rounding fixes
SplitTime didn't round the same on different platforms. This patch changes
it to use microseconds and not care about rounding.
Reviewed-by: iustinp
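A sketch of the approach described above, assuming the function returns a (seconds, microseconds) pair: the value is converted to an integer number of microseconds first, and divmod then guarantees that the fractional part stays within range. The name split_time is used to mark this as an illustration rather than the actual utils.SplitTime code.

```python
def split_time(value):
    """Split a float timestamp into whole seconds and microseconds."""
    seconds, microseconds = divmod(int(value * 1000000), 1000000)
    # divmod guarantees 0 <= microseconds < 1000000, so no carry is needed.
    return (seconds, microseconds)
```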
Iustin Pop [Fri, 5 Sep 2008 10:40:15 +0000 (10:40 +0000)]
Remove bom-byte
This is not nice, removing it :)
Please use 'set nobomb' in your vi init file.
Reviewed-by: ultrotter
Guido Trotter [Thu, 4 Sep 2008 15:12:51 +0000 (15:12 +0000)]
Prevent mistakes using _GetWantedNodes
All the users of _GetWantedNodes have been converted to be concurrent
LUs, and thus cannot call this function with an empty list of nodes
anymore. This patch makes this restriction a part of the function
itself. This prevents mistakes in new concurrent LUs, and creates more
work for new non-concurrent LUs, which we shouldn't add anyway.
Reviewed-by: iustinp
Guido Trotter [Thu, 4 Sep 2008 15:12:38 +0000 (15:12 +0000)]
Parallelize LUQueryNodeVolumes and LUQueryExports
Reviewed-by: iustinp
Guido Trotter [Thu, 4 Sep 2008 15:12:25 +0000 (15:12 +0000)]
Parallelize LUDiagnoseOS
Reviewed-by: iustinp
Guido Trotter [Thu, 4 Sep 2008 15:12:10 +0000 (15:12 +0000)]
LUQueryExports: make 'node' field mandatory
It turns out this field was already mandatory. If it hadn't been valid,
in fact, a value of None would have been passed to _GetWantedNodes which
would have thrown an exception.
Reviewed-by: iustinp
Guido Trotter [Thu, 4 Sep 2008 15:11:58 +0000 (15:11 +0000)]
s/Chain(OpQueryExports)/rpc.call_export_list(...)/
Parallel opcodes are not (yet?) supported for chaining. Turns out
though that chaining is used only four times in the code, and twice it's
for querying exports. But what's the need to chain the full opcode, when
the simple call_export_list rpc is going to provide the same result?
Reviewed-by: iustinp
Guido Trotter [Thu, 4 Sep 2008 15:11:43 +0000 (15:11 +0000)]
Fix wrong indentation in LUQueryNodes
Reviewed-by: iustinp
Alexander Schreiber [Thu, 4 Sep 2008 14:53:34 +0000 (14:53 +0000)]
Merge r1607 from branches/ganeti/ganeti-1.2
Use a default vnc_bind_address if None is specified
Reviewed-by: iustinp
Alexander Schreiber [Tue, 2 Sep 2008 16:23:18 +0000 (16:23 +0000)]
merge r1569 from branches/ganeti/ganeti-1.2
Implement more options for gnt-backup import
Reviewed-by: ultrotter
Alexander Schreiber [Tue, 2 Sep 2008 12:57:32 +0000 (12:57 +0000)]
merge r1568 from branches/ganeti/ganeti-1.2
Add more fields to gnt-instance list
Reviewed-by: imsnah
Alexander Schreiber [Tue, 2 Sep 2008 12:15:59 +0000 (12:15 +0000)]
merge r1548 from branches/ganeti/ganeti-1.2
Fix wrong wording of instance rename error message.
Reviewed-by: imsnah
Alexander Schreiber [Tue, 2 Sep 2008 12:12:56 +0000 (12:12 +0000)]
merge r1547 from branches/ganeti/ganeti-1.2
Document behaviour of gnt-instance console for HVM
Reviewed-by: imsnah
Alexander Schreiber [Tue, 2 Sep 2008 11:46:07 +0000 (11:46 +0000)]
merge r1542, r1543, r1573 from branches/ganeti/ganeti-1.2
Implement interactive instance OS reinstall.
Reviewed-by: ultrotter
Alexander Schreiber [Tue, 2 Sep 2008 09:09:25 +0000 (09:09 +0000)]
merge r1541 from branches/ganeti/ganeti-1.2
more information for VNC console port
Reviewed-by: ultrotter
Alexander Schreiber [Tue, 2 Sep 2008 08:42:11 +0000 (08:42 +0000)]
merge r1540 from branches/ganeti/ganeti-1.2
Allow access to HVM serial console
Reviewed-by: imsnah
Alexander Schreiber [Mon, 1 Sep 2008 16:05:32 +0000 (16:05 +0000)]
merge r1539 from branches/ganeti/ganeti-1.2
Display VNC console port in gnt-instance info.
Reviewed-by: iustinp
Alexander Schreiber [Mon, 1 Sep 2008 14:12:54 +0000 (14:12 +0000)]
merge r1538 from branches/ganeti/ganeti-1.2
Check HVM device type on instance modify as well.
Reviewed-by: imsnah
Guido Trotter [Mon, 1 Sep 2008 11:37:21 +0000 (11:37 +0000)]
Check memory size before setting it
With this change when a user asks for a new memory size for an instance,
the number is checked instead of just applied. The operation fails only
if the instance would not be able to restart on its primary node, but
generates warnings should it be impossible to failover the instance or
should the computation be impossible due to nodes being unreachable.
This is a forward-port from branches/ganeti-1.2
Original-Reviewed-by: iustinp
Reviewed-by: iustinp
Guido Trotter [Mon, 1 Sep 2008 11:37:06 +0000 (11:37 +0000)]
Pass the force param to SetInstanceParms
It was already allowed in gnt-instance modify, but ignored.
It will be used to force skipping parameter checks.
This is a forward-port from branches/ganeti-1.2
Original-Reviewed-by: imsnah
Reviewed-by: iustinp
Alexander Schreiber [Fri, 29 Aug 2008 16:57:54 +0000 (16:57 +0000)]
Merge r1534 from branches/ganeti/ganeti-1.2
Add HVM device type flag 4/4
Reviewed-by: ultrotter
Alexander Schreiber [Fri, 29 Aug 2008 16:30:56 +0000 (16:30 +0000)]
Merge r1537 from branches/ganeti/ganeti-1.2
Add HVM device type flags 3/4
Reviewed-by: ultrotter
Alexander Schreiber [Fri, 29 Aug 2008 16:17:21 +0000 (16:17 +0000)]
Merge r1536 from branches/ganeti/ganeti-1.2
Add HVM device type flags 2/4
Reviewed-by: ultrotter
Michael Hanselmann [Fri, 29 Aug 2008 15:04:59 +0000 (15:04 +0000)]
utils.SplitTime: Fix rounding of milliseconds
Reported by Iustin.
It used to return this:
>>> utils.SplitTime(1234.999999999999)
(1234, 1000)
while it should've returned this:
>>> utils.SplitTime(1234.999999999999)
(1235, 0)
Reviewed-by: ultrotter
Alexander Schreiber [Fri, 29 Aug 2008 15:01:11 +0000 (15:01 +0000)]
merge r1535 from branches/ganeti/ganeti-1.2
Add HVM device type flags 1/4
Reviewed-by: ultrotter
Alexander Schreiber [Fri, 29 Aug 2008 14:41:36 +0000 (14:41 +0000)]
Merge r1296 from branches/ganeti/ganeti-1.2
doc fix: Describe default values for HVM instance options & cleanup.
Reviewed-by: iustinp
Alexander Schreiber [Fri, 29 Aug 2008 13:57:56 +0000 (13:57 +0000)]
Merge r1295 from branches/ganeti/ganeti-1.2
Clarify cluster IP requirement.
Reviewed-by: iustinp
Iustin Pop [Fri, 29 Aug 2008 13:42:23 +0000 (13:42 +0000)]
Make WaitForJobChanges deal with long jobs
This patch alters the WaitForJobChanges luxi-RPC call to have a
configurable timeout, so that the call behaves nicely with long jobs
that have no update.
We do this by adding a timeout parameter in the RPC call, and returning
a special constant when the timeout is reached without an update. The
luxi client will repeatedly call the WaitForJobChanges until it gets a
real change. The timeout is hardcoded as half the RWTO value.
The patch also removes an unused variable (new_state) from the
WaitForJobChanges method.
Reviewed-by: imsnah,ultrotter
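The client side of this scheme can be sketched as a loop that keeps calling until the answer is something other than the "no change" constant. The sentinel value, the RWTO figure and the function names below are assumptions for illustration; only the overall shape follows the description above.

```python
NO_CHANGE = "nochange"   # hypothetical sentinel returned on timeout
RWTO = 60                # assumed read/write timeout in seconds
WAIT_TIMEOUT = RWTO / 2  # the log says the timeout is half the RWTO value


def wait_for_job_changes(poll, timeout=WAIT_TIMEOUT):
    """Call poll(timeout) repeatedly until it reports a real change."""
    while True:
        result = poll(timeout)
        if result != NO_CHANGE:
            return result


def make_fake_poll(responses):
    """Build a stand-in for the RPC call that replays canned responses."""
    calls = []

    def poll(timeout):
        calls.append(timeout)
        return responses[len(calls) - 1]

    return poll, calls
```

Because each individual call returns well before the transport timeout, a job that stays silent for a long time no longer looks like a dead connection.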
Alexander Schreiber [Fri, 29 Aug 2008 12:47:55 +0000 (12:47 +0000)]
merge r997 from branches/ganeti/ganeti-1.2
Fix gnt-instance modify for HVM parameters
This patch makes gnt-instance modify work again for the advanced
HVM parameters after it was broken by other changes.
Reviewed-by: ultrotter
Guido Trotter [Fri, 29 Aug 2008 12:45:33 +0000 (12:45 +0000)]
Add doc/locking.txt, documenting locking order
Reviewed-by: imsnah
Michael Hanselmann [Thu, 28 Aug 2008 15:35:55 +0000 (15:35 +0000)]
Fix error message when masterd is not listening
Reported by Iustin.
Reviewed-by: iustinp
Guido Trotter [Thu, 28 Aug 2008 10:29:46 +0000 (10:29 +0000)]
Fix issue when acquiring empty lock sets
By design if an empty list of locks is acquired from a set, no locks are
acquired, and thus release() cannot be called on the set. On the other
hand if None is passed instead of the list, the whole set is acquired,
and must later be released. When acquiring whole empty sets, a release
must happen too, because the set-lock is acquired.
Since we used to overwrite the required locks (needed_locks) with the
acquired ones, we weren't able to distinguish the two cases (empty list
of locks required, and all locks required, but an empty list returned
because the set is empty). Valid solutions include:
(1) forbidding the acquire of empty lists of locks
(2) skipping the acquire/release on empty lists of locks
(3) separating the to-acquire and the acquired list
This patch implements the third approach, and thus LUs will find
acquired locks in the acquired_locks dict, rather than in needed_locks.
The LUs which used this feature before have been updated. This makes it
easier because it doesn't force LUs to do more checks on corner cases,
which are easily forgettable (1) and allows more flexibility if we want
LUs to release (part-of) the locks (which is still a possibly scary
operation, but anyway). This easily combines with (2) should we choose
to implement it.
Reviewed-by: imsnah
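The ambiguity described above can be shown with a small model. When the whole (empty) set is requested and when an empty list is requested, the acquired list is empty in both cases, yet only the first needs a release. Keeping the request and the result in separate dicts preserves that difference. The acquire function and the release_needed dict are illustrative assumptions; ALL_SET mirrors the None-valued constant mentioned elsewhere in this log.

```python
ALL_SET = None  # "all locks at this level", spelled None as in the locking module


def acquire(existing_names, needed):
    """Model of acquiring from a lock set: returns (acquired, must_release)."""
    if needed is ALL_SET:
        # The set-lock is taken even if the set is empty, so release is due.
        return sorted(existing_names), True
    acquired = sorted(set(needed))
    # An empty request acquires nothing and must not be released.
    return acquired, bool(acquired)


needed_locks = {"node": ALL_SET, "instance": []}
acquired_locks = {}
release_needed = {}
for level, needed in needed_locks.items():
    # Both levels are empty sets in this example.
    acquired_locks[level], release_needed[level] = acquire(set(), needed)
```

Had acquired_locks been written back into needed_locks, both entries would read [] and the information that the node level must be released would be lost.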
Michael Hanselmann [Wed, 27 Aug 2008 14:52:34 +0000 (14:52 +0000)]
jqueue: Replace normal cache dict with weakref dict
A job should only exist once in memory. After the cache is cleaned,
there can still be references to a job somewhere else. If there
are multiple instances, one can get updated while a function is
waiting for changes on another instance. By using
weakref.WeakValueDictionary, which automatically removes instances as
soon as there are no strong references to it anymore, we can solve
this problem.
Reviewed-by: iustinp
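A small sketch of the caching behaviour described above, using weakref.WeakValueDictionary from the standard library. The Job and JobCache classes are simplified stand-ins, not the jqueue code.

```python
import weakref


class Job(object):
    """Simplified job object; the real one is loaded from the queue on disk."""

    def __init__(self, job_id):
        self.id = job_id


class JobCache(object):
    def __init__(self):
        # Entries vanish once no strong reference to the job remains.
        self._memcache = weakref.WeakValueDictionary()

    def get(self, job_id):
        job = self._memcache.get(job_id)
        if job is None:
            job = Job(job_id)  # stands in for loading the job from disk
            self._memcache[job_id] = job
        return job

    def cached_ids(self):
        return sorted(self._memcache.keys())
```

As long as any caller still holds a job, every get() for that ID returns the same object, so an update is visible to a function waiting on it.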
Michael Hanselmann [Wed, 27 Aug 2008 14:52:16 +0000 (14:52 +0000)]
jqueue: Keep timestamp of opcode start and end
Reviewed-by: ultrotter
Michael Hanselmann [Wed, 27 Aug 2008 14:48:16 +0000 (14:48 +0000)]
jqueue: Reset run_op_idx after job is done
It can be confusing otherwise.
Reviewed-by: ultrotter
Iustin Pop [Wed, 27 Aug 2008 10:05:34 +0000 (10:05 +0000)]
Another burnin fix
This is a result of the log timestamp changes.
Reviewed-by: imsnah
Iustin Pop [Wed, 27 Aug 2008 09:55:12 +0000 (09:55 +0000)]
Fix a small typo in a constant
Seems no one ran a burnin lately :)
Reviewed-by: amischenko,ultrotter
Michael Hanselmann [Wed, 27 Aug 2008 08:34:17 +0000 (08:34 +0000)]
Make sure that client programs get all messages
This is a large patch, but I can't figure out how to split it without
breaking stuff. The old way of getting messages by always getting the
last one didn't bring all messages to the client if they were added
too fast, thereby making commands like “gnt-cluster verify” less than
useful. These changes now introduce some sort of serial number per
log entry to keep track of which messages a client already received. They
also remove the log lock per opcode to make reading log entries thread
safe.
Reviewed-by: ultrotter
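The serial-number idea can be sketched as follows: every log entry gets an increasing number, and the client asks for everything newer than the last serial it saw, so no message is skipped even if several arrive between two polls. The OpLog class and its method names are assumptions for illustration.

```python
class OpLog(object):
    """Illustrative per-opcode log with a serial number on every entry."""

    def __init__(self):
        self._serial = 0
        self._entries = []

    def append(self, message):
        self._serial += 1
        self._entries.append((self._serial, message))

    def entries_after(self, last_serial):
        """Return all entries the client has not yet received."""
        return [entry for entry in self._entries if entry[0] > last_serial]
```

A client that remembers the highest serial it has printed can poll at any rate and still receive each message exactly once.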
Michael Hanselmann [Tue, 26 Aug 2008 15:53:22 +0000 (15:53 +0000)]
QA: Use pseudo-tty via SSH
This gives continuous output instead of it being buffered.
Reviewed-by: ultrotter
Michael Hanselmann [Tue, 26 Aug 2008 15:44:34 +0000 (15:44 +0000)]
Add simple lock debug output
Currently it can only be enabled by modifying utils.py, but we can
add a command line parameter later if needed.
Reviewed-by: schreiberal
Michael Hanselmann [Mon, 25 Aug 2008 14:57:05 +0000 (14:57 +0000)]
Use python2.4 when developing
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 25 Aug 2008 14:56:47 +0000 (14:56 +0000)]
Remove references to YAML
I forgot to remove these when converting the QA configuration from YAML
to JSON.
Reviewed-by: ultrotter
Michael Hanselmann [Tue, 19 Aug 2008 12:17:18 +0000 (12:17 +0000)]
Add vim modeline to qa-sample.json
Vim doesn't recognize the format automatically.
Reviewed-by: ultrotter
Guido Trotter [Mon, 18 Aug 2008 12:51:58 +0000 (12:51 +0000)]
Parallelize LUQueryNodes
As for LUQueryInstances the first version just acquires a shared lock on all
nodes. In the future further optimizations are possible, as outlined by
comments in the code.
Reviewed-by: imsnah
Guido Trotter [Mon, 18 Aug 2008 12:51:35 +0000 (12:51 +0000)]
Parallelize LUQueryInstances
This first version acquires a shared lock on all requested instances and
their nodes. In the future it can be improved by acquiring fewer locks if
no dynamic fields have been asked, and/or by locking just primary nodes.
Reviewed-by: imsnah
Guido Trotter [Mon, 18 Aug 2008 12:51:13 +0000 (12:51 +0000)]
A few more locking unit tests
A few more tests written while bug-hunting. One of them shows a real
issue, at last. :)
Reviewed-by: imsnah
Guido Trotter [Mon, 18 Aug 2008 12:50:41 +0000 (12:50 +0000)]
Add lock-all-through-GLM unit test
I was hunting for a bug in my code and thought the culprit was in the
locking library, so I added a test to check. Unfortunately turns out it
wasn't. :( Committing the test anyway, while still trying to figure out
what's wrong...
Reviewed-by: imsnah
Guido Trotter [Mon, 18 Aug 2008 12:50:22 +0000 (12:50 +0000)]
LockSet: allow lists with duplicate values
If a list with a duplicate value is passed to a lockset what the code
now does is to try to acquire the lock twice, generating a
double-acquire exception in the SharedLock code. This is definitely an
issue. In order to solve it we can either forbid double values in a list
or just delete the duplicates. In this patch we go for the latter
solution, removing any duplicate values when creating the acquire_list.
Reviewed-by: imsnah
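Removing duplicates before acquiring can be sketched in one line. The function name build_acquire_list is hypothetical, and sorting the result is an assumption made here so that the order is deterministic.

```python
def build_acquire_list(names):
    """Drop duplicate lock names so that no lock is acquired twice."""
    return sorted(set(names))
```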
Guido Trotter [Mon, 18 Aug 2008 12:49:59 +0000 (12:49 +0000)]
Processor: lock all levels even if one is missing
If a locking level wasn't specified locking used to stop. This means
that if one, for example, didn't specify anything at the LEVEL_INSTANCE
level, no locks at the LEVEL_NODE level were acquired either. With this
patch we force _LockAndExecLU to be called for all existing levels, and
break the recursion if the level doesn't exist in locking.LEVELS.
Reviewed-by: imsnah
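The recursion described above can be modelled as follows. Every level is visited whether or not the LU asked for locks there, and the recursion ends only when the level index runs past the known levels. The level names, the trace list and the function lock_and_exec are assumptions for illustration, not the Processor code.

```python
LEVELS = ["instance", "node"]  # assumed ordering of locking levels


def lock_and_exec(needed_locks, level_idx, trace):
    """Visit every level in order, then run the LU once all levels are done."""
    if level_idx >= len(LEVELS):
        # No such level: all locking is done, so execute.
        trace.append("exec")
        return
    level = LEVELS[level_idx]
    if level in needed_locks:
        trace.append("acquire %s" % level)
        try:
            lock_and_exec(needed_locks, level_idx + 1, trace)
        finally:
            trace.append("release %s" % level)
    else:
        # Nothing requested here, but deeper levels must still be handled.
        lock_and_exec(needed_locks, level_idx + 1, trace)
```

With the old behaviour, a missing "instance" entry would have stopped the descent and the "node" locks would never have been taken.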
Guido Trotter [Mon, 18 Aug 2008 12:44:22 +0000 (12:44 +0000)]
LURebootInstance: move arg check in ExpandNames
The check for the reboot type can be done without any locks held, so
we'll move it to ExpandNames. Plus, we note in a FIXME that if the
reboot type is not full, we can probably just lock the primary node, and
leave the secondary unlocked.
Reviewed-by: imsnah
Michael Hanselmann [Mon, 18 Aug 2008 11:37:55 +0000 (11:37 +0000)]
QA: Convert configuration from YAML to JSON
We no longer use YAML in Ganeti at all. This patch converts the QA
configuration from YAML to JSON. JSON doesn't support comments and
I had to use a hack with fields starting with '#'.
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 18 Aug 2008 11:37:19 +0000 (11:37 +0000)]
LUVerifyCluster: Return boolean indicating success
Reviewed-by: schreiberal
Michael Hanselmann [Mon, 18 Aug 2008 11:12:06 +0000 (11:12 +0000)]
Use Linux-specific way to name master socket
By using this Linux-specific way we don't have to care about removing the
socket file when quitting or starting (after an unclean shutdown). For a
more detailed description, see the comment in the patch.
Reviewed-by: schreiberal
Michael Hanselmann [Mon, 18 Aug 2008 10:51:42 +0000 (10:51 +0000)]
QA: Try to run more scripts with --version
This patch also sorts the list.
Reviewed-by: schreiberal
Michael Hanselmann [Mon, 18 Aug 2008 10:17:25 +0000 (10:17 +0000)]
QA: Always accept added node's SSH key
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 18 Aug 2008 09:59:00 +0000 (09:59 +0000)]
QA: Do not upload known_hosts file anymore
The cluster no longer keeps individual hosts' SSH keys, but rather
aliases all of them to the cluster name.
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 18 Aug 2008 09:58:11 +0000 (09:58 +0000)]
Copy qa_utils.AssertIn from 1.2 branch
Apparently it was forgotten when importing the remote API QA tests.
Reviewed-by: schreiberal
Michael Hanselmann [Fri, 15 Aug 2008 08:55:09 +0000 (08:55 +0000)]
gnt-node: Add option to always accept peer's SSH key
This option will be used to add nodes to the cluster without
asking the user to confirm the key. Together with key based
authentication this can be used in the QA tests.
Reviewed-by: ultrotter
Michael Hanselmann [Fri, 15 Aug 2008 08:47:02 +0000 (08:47 +0000)]
SshRunner: Add parameter to always accept peer's SSH key
This will be used to add nodes without user interaction, specifically
in QA tests.
Reviewed-by: ultrotter