Iustin Pop [Wed, 9 Jul 2008 10:41:40 +0000 (10:41 +0000)]
Move the master socket in the ganeti run dir
... as it was intended from the beginning, but by mistake left in the
top run dir.
Reviewed-by: ultrotter
Iustin Pop [Wed, 9 Jul 2008 10:41:30 +0000 (10:41 +0000)]
Reduce duplicate Attach() calls in bdev
Currently, the 'public' functions of bdev (FindDevice and
AttachOrAssemble) will call the Attach() method right after class
instantiation.
But the constructor itself calls this function, and therefore we have
duplicate Attach() calls (which are not cheap at all).
The patch introduces a new 'attached' instance attribute that tells if
the last Attach() was successful. The public functions reuse this so
that we only do the minimum required number of calls.
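A minimal sketch of the idea, not the actual bdev code: the class name
FakeBlockDev, its "present" argument and the call counter are
hypothetical; only Attach(), FindDevice and the 'attached' attribute
come from the description above.

```python
class FakeBlockDev:
    """Hypothetical stand-in for a bdev class; counts Attach() calls."""

    attach_calls = 0  # class-wide counter, for illustration only

    def __init__(self, unique_id, present=True):
        self.unique_id = unique_id
        self._present = present
        self.attached = False
        # The constructor already attaches, as described above.
        self.Attach()

    def Attach(self):
        """Pretend to look up the device and remember whether it worked."""
        FakeBlockDev.attach_calls += 1
        self.attached = self._present
        return self.attached


def FindDevice(unique_id, present=True):
    """Return the device if it could be attached, else None.

    Reuses the result stored by the constructor instead of calling
    Attach() a second time.
    """
    device = FakeBlockDev(unique_id, present=present)
    if not device.attached:
        return None
    return device
```

With this shape each lookup costs exactly one Attach() call, whether or
not the device is found.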
Reviewed-by: imsnah
Iustin Pop [Wed, 9 Jul 2008 10:41:21 +0000 (10:41 +0000)]
Convert bdev.py to the logging module
This does not enhance the messages in any way; it just switches to the
new module.
Reviewed-by: imsnah
Iustin Pop [Wed, 9 Jul 2008 10:41:13 +0000 (10:41 +0000)]
Convert utils.py to the logging module
The patch also logs all commands executed from RunCmd when we are at
debug level.
Reviewed-by: imsnah
Iustin Pop [Wed, 9 Jul 2008 10:41:03 +0000 (10:41 +0000)]
Remove the old locking functions
This removes (hopefully) all traces of the old locking functions and
uses.
Reviewed-by: imsnah
Michael Hanselmann [Wed, 9 Jul 2008 10:34:56 +0000 (10:34 +0000)]
Remove old job queue code
Reviewed-by: iustinp
Michael Hanselmann [Wed, 9 Jul 2008 10:34:42 +0000 (10:34 +0000)]
Change masterd/client RPC protocol
- Introduce abstraction class on client side
- Use constants for method names
- Adopt legacy function SubmitOpCode to use it
Reviewed-by: iustinp
Michael Hanselmann [Wed, 9 Jul 2008 10:34:28 +0000 (10:34 +0000)]
Make luxi RPC more flexible
- Use constants for dict entries
- Handle exceptions on server side
- Rename client function to CallMethod to match server side naming
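The three points above could look roughly like the sketch below. The
key names, the METHOD_ECHO constant, handle_message and the transport
callable are assumptions made for illustration; only CallMethod is
named in the commit, and its real signature may differ.

```python
import json

# Hypothetical key and method names; the real constants may differ.
KEY_METHOD = "method"
KEY_ARGS = "args"
KEY_SUCCESS = "success"
KEY_RESULT = "result"
METHOD_ECHO = "Echo"


def handle_message(raw, handlers):
    """Server side: decode a request, dispatch it, report any failure."""
    try:
        request = json.loads(raw)
        method = request[KEY_METHOD]
        args = request[KEY_ARGS]
        result = handlers[method](*args)
        response = {KEY_SUCCESS: True, KEY_RESULT: result}
    except Exception as err:  # answer with an error instead of crashing
        response = {KEY_SUCCESS: False, KEY_RESULT: str(err)}
    return json.dumps(response)


def CallMethod(transport, method, args):
    """Client side: send one request via `transport` and unpack the reply."""
    raw = json.dumps({KEY_METHOD: method, KEY_ARGS: list(args)})
    response = json.loads(transport(raw))
    if not response[KEY_SUCCESS]:
        raise RuntimeError(response[KEY_RESULT])
    return response[KEY_RESULT]
```

Here `transport` is any callable that takes the serialized request and
returns the serialized response, so the sketch can be exercised without
a socket.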
Reviewed-by: iustinp
Michael Hanselmann [Wed, 9 Jul 2008 10:34:07 +0000 (10:34 +0000)]
Instantiate new job queue in master daemon
Reviewed-by: iustinp
Michael Hanselmann [Wed, 9 Jul 2008 10:33:32 +0000 (10:33 +0000)]
Add very simple job queue
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 16:32:28 +0000 (16:32 +0000)]
Add more comment lines to testLockingConstants
This is to discourage even more whoever may think that this requirement
is not really useful and can be lifted, and to at least know where it's
used before trying to break it.
Reviewed-by: imsnah
Guido Trotter [Tue, 8 Jul 2008 16:32:18 +0000 (16:32 +0000)]
Convert LUTestDelay to concurrent usage
In order to do so:
- We set REQ_BGL to False
- We implement ExpandNames
That's it, really.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 16:32:07 +0000 (16:32 +0000)]
Processor: Acquire locks before executing an LU
If we're running in a "new style" LU we may need some locks, as required
by the ExpandNames function, to be able to run. We'll walk up the lock
levels present in the needed_locks dictionary and acquire them, then run
the actual LU. LUs can release some or all the acquired locks, if they
want, before terminating, provided they update their needed_locks
dictionary appropriately, so that we know not to release a level if they
have already done so.
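As a rough illustration of the level walk described above (not the
Processor code itself): the level constants, FakeLockManager and
run_with_locks are hypothetical names, and "updating needed_locks" is
assumed here to mean deleting the level's entry.

```python
# Hypothetical level numbers; levels are acquired in this order.
LEVEL_INSTANCE = 1
LEVEL_NODE = 2
LEVELS = [LEVEL_INSTANCE, LEVEL_NODE]


class FakeLockManager:
    """Records acquire/release calls instead of taking real locks."""

    def __init__(self):
        self.log = []

    def acquire(self, level, names):
        self.log.append(("acquire", level, tuple(names)))

    def release(self, level):
        self.log.append(("release", level))


def run_with_locks(lock_manager, needed_locks, execute):
    """Acquire every level listed in needed_locks, run, then release.

    `execute` may delete an entry from needed_locks to signal that it
    has already released that level itself.
    """
    acquired = []
    try:
        for level in LEVELS:
            if level in needed_locks:
                lock_manager.acquire(level, needed_locks[level])
                acquired.append(level)
        return execute()
    finally:
        # Release in reverse order, skipping levels the LU gave back.
        for level in reversed(acquired):
            if level in needed_locks:
                lock_manager.release(level)
```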
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 16:31:55 +0000 (16:31 +0000)]
LogicalUnit: add ExpandNames function
New concurrent LUs will need to call ExpandNames so that any names
passed in by the user are canonicalized, and can be used by hooks,
locking and other parts of the code. This was done in CheckPrereq
before, but it's now split out, as it's needed for locking, which in
turn CheckPrereq needs. Old LUs can be converted gradually.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 16:31:41 +0000 (16:31 +0000)]
Processor: Move LU execution to its own method
This makes the try...finally code simpler, and helps adding a more
complex locking structure before the actual execution. It also fixes a
concurrency bug caused by the fact that write_count was read before
acquiring the BGL, and thus spurious config update hook runs could have
been triggered. This doesn't solve the issue of running config update
hooks for concurrent LUs.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 16:31:24 +0000 (16:31 +0000)]
Add a new LockSet unittest
This test checks the LockSet behaviour when an empty list is passed.
The current behaviour is expected, but since this is a corner case,
it is safer to keep it under test, so that if we ever need a different
behaviour we can check that everything still works as we expect.
Reviewed-by: imsnah
Michael Hanselmann [Tue, 8 Jul 2008 15:10:31 +0000 (15:10 +0000)]
constants: Add job and opcode status strings
Reviewed-by: iustinp
Michael Hanselmann [Tue, 8 Jul 2008 15:03:50 +0000 (15:03 +0000)]
workerpool: Don't notify if there was no task
Workers have to notify their pool if they finished a task to make
the WorkerPool.Quiesce function work. This is done in the finally:
clause to notify even in case of an exception. However, previously
we notified on each run, even if there was no task, thereby creating
some sort of an endless loop of notifications. In a future patch
we should split the single condition object into several to
produce less spurious notifications.
While we're at this, this patch also adds two new functions to
BaseWorker to query whether it's currently running a task and then
uses one of these functions in the WorkerPool instead of querying
the internal variable directly.
Reviewed-by: iustinp
Iustin Pop [Tue, 8 Jul 2008 14:42:15 +0000 (14:42 +0000)]
Create all SUB_RUN_DIRS in ganeti-noded
Rather than just creating BDEV_CACHE_DIR we loop through the
SUB_RUN_DIRS list and create all its children.
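The loop could be sketched as follows. The helper name ensure_dirs, the
mode and the example paths are assumptions; the sketch treats the list
as full paths and works in a throwaway directory rather than the real
run dir.

```python
import os
import tempfile


def ensure_dirs(paths, mode=0o750):
    """Create every missing directory in `paths`; return those created."""
    created = []
    for path in paths:
        if not os.path.isdir(path):
            os.makedirs(path, mode)
            created.append(path)
    return created


# Usage against a temporary base directory (illustrative paths only).
base = tempfile.mkdtemp()
run_ganeti_dir = os.path.join(base, "ganeti")
sub_run_dirs = [run_ganeti_dir, os.path.join(run_ganeti_dir, "instance-disks")]
first_pass = ensure_dirs(sub_run_dirs)   # creates both directories
second_pass = ensure_dirs(sub_run_dirs)  # nothing left to create
```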
Reviewed-by: iustinp
Iustin Pop [Tue, 8 Jul 2008 14:42:05 +0000 (14:42 +0000)]
Add a top level RUN_GANETI_DIR constant
This patch creates a base RUN_GANETI_DIR and then moves the other run
dir constants to use that (even if just setting BDEV_CACHE_DIR as equal
to it, rather than putting it deeper, for now).
Also we create a constant list of all the subdirs we need in RUN_DIR to
work properly, which we'll use when creating them in ganeti-noded.
Reviewed-by: iustinp
Iustin Pop [Tue, 8 Jul 2008 14:41:46 +0000 (14:41 +0000)]
symlinks: Add DISK_LINKS_DIR constant
The DISK_LINKS_DIR points to the RUN_DIR/ganeti/instance-disks
directory, which will contain symlinks to the instances' disks. These
provide a stable name across all nodes for them, and permit
live-migration to happen.
Unfortunately RUN_DIR/ganeti/instance-disks happens to be below ganeti
1.2's BDEV_CACHE_DIR, which we will need to address at some point
(possibly in 2.0).
Reviewed-by: iustinp
Michael Hanselmann [Tue, 8 Jul 2008 11:16:18 +0000 (11:16 +0000)]
luxi: Use serializer module instead of simplejson
Reviewed-by: iustinp
Michael Hanselmann [Tue, 8 Jul 2008 09:38:16 +0000 (09:38 +0000)]
serializer.DumpJson: Control indentation by parameter
If the simplejson module supports indentation, it's always used. There
are cases where we might not want to use it or enable it only for
debugging purposes, such as in RPC.
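A small sketch of the parameter, using the standard json module as a
stand-in for simplejson (so the "does the module support indentation"
check is not reproduced here). The default value and indent width are
assumptions.

```python
import json


def DumpJson(data, indent=True):
    """Serialize `data` to JSON text, indented only when requested."""
    if indent:
        return json.dumps(data, indent=2)
    return json.dumps(data)
```

An RPC caller would then pass indent=False to keep messages on a single
line, while files meant for human inspection keep the indented form.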
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 09:14:07 +0000 (09:14 +0000)]
Add a missing import to cmdlib
cmdlib uses some constants from locking (ie. locking levels) but doesn't
import it. This patch fixes the issue.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:55:04 +0000 (08:55 +0000)]
Fix an error accessing the cfg
Since the context is passed to LogicalUnit, rather than the cfg, we can
only access the cfg as self.cfg, self.context.cfg, or context.cfg (in
the constructor). cfg is not valid anymore.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:49:51 +0000 (08:49 +0000)]
Add and remove instance/node locks
Whenever we add an instance or node to the cluster (i.e. to the config),
and whenever we remove them, we should add/remove locks as well. In the
future we may want to optimize this so that the configwriter does it, or
it's handled at the context level, but as long as we're adding/removing
instances and nodes with the BGL held it doesn't matter too much.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:49:41 +0000 (08:49 +0000)]
Pass context to LUs
Rather than passing a ConfigWriter to the LUs we'll pass the whole
context, from which a ConfigWriter can be extracted, but we can also
access the GanetiLockManager. This also fixes the places where a FakeLU
is created.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:49:29 +0000 (08:49 +0000)]
mocks: create a FakeContext object
This will be passed to FakeLUs
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:49:19 +0000 (08:49 +0000)]
Fix a typo in LUTestDelay docstring
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:41:04 +0000 (08:41 +0000)]
Locking: remove LEVEL_CONFIG lockset
Since the ConfigWriter now handles its own locking it's not necessary to
have a specific level for the config in the Locking Manager anymore.
This patch thus removes it, and all the unittest calls that used it, or
depended on it being present.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:40:51 +0000 (08:40 +0000)]
ConfigWriter: synchronize access
Since we share the ConfigWriter we need somehow to make sure that
accessing it is properly synchronized. We'll do it using the
locking.ssynchronized decorator and a module-private shared lock.
This patch also renames a few functions, which were called inside the
ConfigWriter, to a private version _UnlockedFunctionName, and exports
the synchronized public ones. The internal callers, which are already
synchronized, are then changed to use the _Unlocked version, to prevent
double locking.
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:40:42 +0000 (08:40 +0000)]
Locking: add ssynchronized decorator
This patch creates a new decorator function ssynchronized in the locking
library, which takes as input a SharedLock, and synchronizes access to
the decorated functions using it. The usual SharedLock semantics apply,
so it's possible to call more than one synchronized function at the same
time, when the lock is acquired in shared mode, and still protect
against exclusive access.
The patch also adds a few unit tests to check the decorator's basic
functionality, and to provide an example of how to use it.
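To make the semantics concrete, here is a self-contained sketch. The
SharedLock below is a deliberately tiny stand-in, not the one from the
locking library, and the decorator's "shared" argument is an assumption
about how the mode is selected.

```python
import functools
import threading


class SharedLock:
    """Very small shared/exclusive lock, for illustration only."""

    def __init__(self):
        self._cond = threading.Condition()
        self._sharers = 0
        self._exclusive = False

    def acquire(self, shared=0):
        with self._cond:
            if shared:
                # Sharers only wait for an exclusive holder.
                while self._exclusive:
                    self._cond.wait()
                self._sharers += 1
            else:
                # An exclusive holder waits for everybody.
                while self._exclusive or self._sharers:
                    self._cond.wait()
                self._exclusive = True

    def release(self):
        with self._cond:
            if self._exclusive:
                self._exclusive = False
            else:
                self._sharers -= 1
            self._cond.notify_all()


def ssynchronized(lock, shared=0):
    """Decorator factory: run the function while holding `lock`."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            lock.acquire(shared=shared)
            try:
                return fn(*args, **kwargs)
            finally:
                lock.release()
        return wrapper
    return decorator
```

Functions decorated in shared mode can run at the same time (or call
each other), while a function decorated in exclusive mode waits until
it is the only holder.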
Reviewed-by: iustinp
Guido Trotter [Tue, 8 Jul 2008 08:40:29 +0000 (08:40 +0000)]
ConfigWriter: remove _ReleaseLock
Remove empty function _ReleaseLock and all its calls. Since we only
have one configwriter per cluster the locking needs to cover all the
data in the object, and not just the file contents. Locking in
ConfigWriter will be handled using the ganeti locking library.
Reviewed-by: iustinp
Iustin Pop [Fri, 4 Jul 2008 16:01:56 +0000 (16:01 +0000)]
Fix some issues with the watcher
This patch fixes two bugs:
- the state file is not saved because we use the method for checking
for updated data
- in two places 'Error' was used instead of 'Exception', which breaks
error handling
Additionally:
- the unused 're' import has been removed
- a variable named 'id' which collides with a builtin function has
been renamed
Note that comparing the serialized forms might create false negatives
(due to the dicts being reordered) but that will just cause an extra
write of the file, which is sub-optimal but harmless.
Reviewed-by: ultrotter
Michael Hanselmann [Fri, 4 Jul 2008 15:34:46 +0000 (15:34 +0000)]
Add generic worker pool implementation
Reviewed-by: ultrotter
Iustin Pop [Thu, 3 Jul 2008 12:06:46 +0000 (12:06 +0000)]
Reuse the luxi client in cli.SubmitOpCode
By mistake, we don't reuse the luxi client. As such, we open and close
the connection at each poll cycle and spam the server logs.
Reviewed-by: ultrotter
Iustin Pop [Thu, 3 Jul 2008 12:06:35 +0000 (12:06 +0000)]
Add custom logging setup for daemons
It's better for daemons if:
- they log only to one log file
- the log level is included
- for debug runs, the filename/line number is included
This patch moves the custom formatter from the watcher to the logging
module and generalizes it; then it changes the master daemon to use this
function instead of the generic logging (which might be deprecated
anyway in the future).
Reviewed-by: imsnah
Iustin Pop [Thu, 3 Jul 2008 12:06:20 +0000 (12:06 +0000)]
Remove custom locking code from gnt-instance
The gnt-instance script doesn't run in the same process anymore, so we
can't and don't have to unlock.
Reviewed-by: ultrotter
Michael Hanselmann [Wed, 2 Jul 2008 11:58:39 +0000 (11:58 +0000)]
ganeti-masterd: Remove unused locking code
Reviewed-by: iustinp, ultrotter
Michael Hanselmann [Wed, 2 Jul 2008 11:58:20 +0000 (11:58 +0000)]
ganeti-masterd: Use logging module
Reviewed-by: ultrotter, iustinp
Guido Trotter [Tue, 1 Jul 2008 12:28:59 +0000 (12:28 +0000)]
Context: s/GLM/glm/
Make the GanetiLockManager instance of GanetiContext lowercase
Reviewed-by: imsnah
Michael Hanselmann [Tue, 1 Jul 2008 12:13:32 +0000 (12:13 +0000)]
Set locale when using docbook programs
At least docbook2man inserts a date formatted using the current
locale into its output.
Reviewed-by: iustinp
Iustin Pop [Tue, 1 Jul 2008 11:55:24 +0000 (11:55 +0000)]
Update .gitignore
Reviewed-by: imsnah
Iustin Pop [Tue, 1 Jul 2008 11:44:37 +0000 (11:44 +0000)]
Add a FirstFree function to utils.py
This function will return the first unused integer based on a list of
used integers (e.g. [0, 1, 3] will return 2).
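A possible shape for such a function is sketched below. The "base"
parameter is an assumption, and so is the behaviour when the list has
no gap (this sketch returns the next integer after the last used one;
the real function may behave differently).

```python
def FirstFree(used, base=0):
    """Return the smallest integer >= base that is not in `used`."""
    taken = set(used)
    candidate = base
    while candidate in taken:
        candidate += 1
    return candidate
```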
Reviewed-by: imsnah
Guido Trotter [Tue, 1 Jul 2008 10:43:49 +0000 (10:43 +0000)]
Increase the thread size to 5
Now that we use the locking library to make sure running opcodes cannot
step on each other's toes we can have a bigger thread size, and
potentially process many opcodes in a parallel manner.
Reviewed-by: iustinp
Guido Trotter [Tue, 1 Jul 2008 10:43:42 +0000 (10:43 +0000)]
Processor: acquire the BGL for LUs requiring it
If a LU requires the BGL (all LUs do, right now, by default) we'll
acquire it in the Processor before starting them. For LUs that don't
we'll still acquire it, but in a shared fashion, so that they cannot run
together with LUs that do.
We'll also note down whether we own the BGL exclusively, and if we don't
and we try to chain a LU that does, we'll fail.
More work will need to be done, of course, to convert LUs not to require
the BGL, but this basic infrastructure should guarantee the coexistence
of the old and new world for the time being.
Reviewed-by: iustinp
Guido Trotter [Tue, 1 Jul 2008 10:43:32 +0000 (10:43 +0000)]
Processor: pass context in and use it.
The processor used to create a new ConfigWriter when it was initialized.
We now have one in the context, so we'll just recycle it. First of all
we'll pass the context in when creating a new Processor object, then
we'll just use context.cfg, which is guaranteed to be initialized, wherever
we used self.cfg, and stop checking whether the config is already
initialized or not.
In the future the Processor will be able to use the context also to
acquire the BGL for LUs that require it, and to push the context down to
LUs that don't in order for them to manage their own locking.
Reviewed-by: iustinp
Guido Trotter [Tue, 1 Jul 2008 10:43:21 +0000 (10:43 +0000)]
Add REQ_BGL LogicalUnit run requirement
When logical units have REQ_BGL set (it is currently the default) they
need to be the only ganeti operation run on the cluster, and we'll
guarantee it at the master daemon level. Currently only one thread is
running at a time, so this requirement is never broken.
Reviewed-by: iustinp
Guido Trotter [Tue, 1 Jul 2008 10:43:11 +0000 (10:43 +0000)]
Burnin doesn't need a Processor
In 2.0 burnin submits job to the master daemon, so it doesn't need to
create an internal Processor anymore. Even if the processor is not used
anywhere in the burnin code it was still initialized as a leftover of
how burnin used to work. Fixing this.
Reviewed-by: iustinp
Iustin Pop [Tue, 1 Jul 2008 09:48:56 +0000 (09:48 +0000)]
Implement “gnt-job list -o +...”
This adds the same “-o +...” functionality in gnt-job as in the node and
instance scripts.
Reviewed-by: imsnah
Guido Trotter [Mon, 30 Jun 2008 16:11:48 +0000 (16:11 +0000)]
Fix sstore handling in Processor
- no need to keep the sstore as an object member, remove it
- don't reinitialize sstore only if self.cfg is None
This is not an issue, as the Processor is recycled for every opcode,
but in general we know that (a) we might need a different type of
sstore for different opcodes and (b) initializing them is cheap
- recreate sstore when chaining opcodes
Without this fix chaining an opcode which requires a writable sstore
to one which doesn't would fail. This doesn't happen today, but it's
better to fix it anyway
These changes are possible because nowadays all opcodes already require
a working cluster/configuration.
Reviewed-by: iustinp
Guido Trotter [Mon, 30 Jun 2008 16:11:01 +0000 (16:11 +0000)]
Remove duplicate code in hooks unittests
All the tests there used to create a cfg, a sstore, an opcode and a LU.
Put all the duplicate code in the setUp function.
Reviewed-by: iustinp
Guido Trotter [Mon, 30 Jun 2008 12:37:48 +0000 (12:37 +0000)]
ganeti-masterd: init and distribute common context
This patch creates a new GanetiContext class, which is used to hold
context common to all ganeti worker threads. As for the
GanetiLockingManager class it is paramount that there is only one such
class throughout the execution of Ganeti, so the class checks for that,
and also forbids its own modification after it's been initialized. The
context for now contains a ConfigWriter and a GanetiLockingManager and
is created by the daemon and propagated to PoolWorker(s) and
JobRunner(s).
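One way to get both properties (a single instance, and no modification
after initialization) is sketched here. The attribute names cfg and glm
follow later commits in this log; the error types and the mechanism are
assumptions, not a copy of the real class.

```python
class GanetiContext:
    """Sketch of a create-once, read-only context object."""

    _instance = None

    def __init__(self, cfg, glm):
        if GanetiContext._instance is not None:
            raise RuntimeError("the context has already been created")
        # Bypass our own __setattr__ while initializing.
        object.__setattr__(self, "cfg", cfg)
        object.__setattr__(self, "glm", glm)
        GanetiContext._instance = self

    def __setattr__(self, name, value):
        """Refuse any modification once the context exists."""
        raise AttributeError("context attributes cannot be modified")
```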
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:28:50 +0000 (14:28 +0000)]
AddNode: move the initial setup to bootstrap
From the master node we can't start ssh and connect to the remote node,
nor can we do it from ganeti-noded as this ssh session will possibly ask
for key confirmation and password. So the code to copy the ganeti-noded
password and SSL key has been moved to bootstrap.py, and it's called by
gnt-node before the AddNode opcode.
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:28:37 +0000 (14:28 +0000)]
AddNode: Check for node existence
In the "new world" we'll need to setup ganeti-noded via ssh on the node
before calling the AddNode opcode. Before doing it we'll check that the
node is not already in the cluster, if --readd was not passed. This
guarantees we're not going to restart ganeti-noded on a running node.
This patch also incidentally fixes a non-style-guide conformant
docstring.
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:28:27 +0000 (14:28 +0000)]
LUAddNode: use node-verify to check node hostname
As we can't use ssh.VerifyNodeHostname directly, we'll set up a mini
node-verify to do checking between the master and the new node. In the
future networking checks, or more nodes, can be added as well.
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:28:17 +0000 (14:28 +0000)]
LUAddNode: use self.sstore, not a local ss
Since we're inside a LU we have access to self.sstore.
No need to use ss, whose separate instantiation will disappear in a few
patches! ;)
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:28:06 +0000 (14:28 +0000)]
LUAddNode: upload files via rpc, not scp
We used to scp all the ssconf files, and the vnc password file to the
new node. With this patch we use the upload_file rpc, specifying just
the new node as a destination. All the files previously copied by scp
are already allowed by the backend.
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:27:56 +0000 (14:27 +0000)]
Allow VNC_PASSWORD_FILE to be rpc-uploaded
What could possibly go wrong?
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:27:46 +0000 (14:27 +0000)]
Change fping to TcpPing in two LUs
Two LUs are using RunCmd to call fping, in order to check for an IP
presence on the network. Substituting it with TcpPing will get rid of
it, which makes it not break in the new world order, where the master
cannot fork.
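For readers unfamiliar with the approach, a TCP-based check needs no
external command at all. The sketch below is a simplified assumption of
what such a helper does: the signature is illustrative, and it counts
only a completed connection as success, which may not match the real
TcpPing in every case.

```python
import socket


def TcpPing(target, port, timeout=10):
    """Return True if a TCP connection to target:port can be opened."""
    try:
        conn = socket.create_connection((target, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True
```

Because this only uses the socket module, it works from a process that
is not allowed to fork.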
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:27:37 +0000 (14:27 +0000)]
raise QuitGanetiException in LeaveCluster
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:27:27 +0000 (14:27 +0000)]
ganeti-noded: Fix handling of QuitGanetiException
- s/GanetiQuitException/QuitGanetiException/
- Look for the arguments in err.args, not err itself
Reviewed-by: iustinp
Guido Trotter [Fri, 27 Jun 2008 14:27:18 +0000 (14:27 +0000)]
Simplify QuitGanetiException instantiation
Rather than packing all the arguments in a tuple, let's pass them
plainly. The superclass won't complain.
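The difference is easiest to see next to the err.args lookup from the
ganeti-noded fix. In this sketch the meaning of the two arguments (an
error flag and a result or message) is an assumption, and leave_cluster
and handle are hypothetical helpers.

```python
class QuitGanetiException(Exception):
    """Ask the daemon to stop after answering the current request."""


def leave_cluster():
    # Arguments are passed plainly rather than packed into a tuple.
    raise QuitGanetiException(False, "Shutdown scheduled")


def handle(request_fn):
    """Return (success, payload, should_quit) for one request."""
    try:
        return True, request_fn(), False
    except QuitGanetiException as err:
        # The positional arguments are available as err.args.
        error_flag, payload = err.args
        return (not error_flag), payload, True
```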
Reviewed-by: iustinp
Michael Hanselmann [Fri, 27 Jun 2008 09:02:22 +0000 (09:02 +0000)]
logger: Set formatter for stderr
Having a timestamp on log messages is very useful. The default
format string doesn't include a timestamp.
Reviewed-by: ultrotter
Guido Trotter [Thu, 26 Jun 2008 14:42:44 +0000 (14:42 +0000)]
When removing a node don't ssh to it
Even in 1.2 this behaviour is broken, as the rpc call will remove the
ssh keys before we get a chance to log in. Now the rpc takes care of
shutting down the node daemon as well, so we definitely can avoid this.
This makes the LURemoveNode operation work again with the threaded
master daemon.
Reviewed-by: iustinp
Guido Trotter [Thu, 26 Jun 2008 14:42:24 +0000 (14:42 +0000)]
ganeti-noded: quit on QuitGanetiException
According to the usage documented in the QuitGanetiException docstring,
if we receive such an exception we'll set the global _EXIT_GANETI_NODED
variable to True, and then return either a valid value or an error
message to the user. This will be the last request we serve, though,
because the main loop will be interrupted and the daemon will terminate.
Reviewed-by: iustinp
Guido Trotter [Thu, 26 Jun 2008 14:42:14 +0000 (14:42 +0000)]
Add errors.QuitGanetiException
This exception does not signal an error but serves the purpose of making
the ganeti daemon shut down after handling a request. Currently it will
be used by ganeti-noded but in the future ganeti-masterd might make use
of it as well. Its usage is documented in the docstring.
Reviewed-by: iustinp
Guido Trotter [Thu, 26 Jun 2008 14:42:05 +0000 (14:42 +0000)]
ganeti-noded: serve not quite forever
Rather than calling httpd.serve_forever() in ganeti-noded we'll call
httpd.handle_request() but just while a global variable, which we'll
call _EXIT_GANETI_NODED, remains false.
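The loop has roughly this shape. FakeHttpd is a hypothetical stand-in
so the sketch runs without opening a socket; a real server object would
provide handle_request() itself, and something else (a request handler)
would flip the flag.

```python
_EXIT_GANETI_NODED = False


class FakeHttpd:
    """Stand-in server: each handle_request() call serves one item."""

    def __init__(self, requests):
        self._requests = list(requests)
        self.served = []

    def handle_request(self):
        global _EXIT_GANETI_NODED
        request = self._requests.pop(0)
        self.served.append(request)
        if request == "quit":
            # In the daemon this would be set while handling the request.
            _EXIT_GANETI_NODED = True


def serve(httpd):
    """Serve one request at a time until the exit flag is set."""
    while not _EXIT_GANETI_NODED:
        httpd.handle_request()
```

The request that sets the flag is still answered; the loop simply does
not pick up another one afterwards.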
Reviewed-by: iustinp
Guido Trotter [Thu, 26 Jun 2008 14:41:56 +0000 (14:41 +0000)]
Add missing empty line in SshKeyError's docstring
Reviewed-by: iustinp
Guido Trotter [Thu, 26 Jun 2008 14:41:47 +0000 (14:41 +0000)]
Remove spurious check during LUAddNode
There is no point in checking whether the cluster VNC password file
exists as a prerequisite for AddNode, considering the check happens on
the master node, not the target one. Removing this check.
Reviewed-by: iustinp
Guido Trotter [Thu, 26 Jun 2008 14:41:38 +0000 (14:41 +0000)]
Improve LURemoveNode BuildHooksEnv docstring
Reviewed-by: iustinp
Michael Hanselmann [Thu, 26 Jun 2008 09:41:36 +0000 (09:41 +0000)]
devel/upload: Add --no-restart option
If --no-restart is passed to devel/upload, it'll not run
"/etc/init.d/ganeti restart" (which kills processes), making
debugging on a terminal a bit easier.
Reviewed-by: iustinp, ultrotter
Michael Hanselmann [Wed, 25 Jun 2008 08:07:19 +0000 (08:07 +0000)]
Cleanup old DRBD 0.7.x code
Apparently there were still some leftovers. While removing an instance,
I got the message "unhandled exception 'module' object has no attribute
'LD_MD_R1'".
Reviewed-by: iustinp
Iustin Pop [Wed, 25 Jun 2008 06:45:19 +0000 (06:45 +0000)]
Cleanup LV status computation
Currently, when seeing if a LV is degraded or not (i.e. virtual volume),
we first attach to the device (which does an lvdisplay), then do a lvs
in order to display the lv_attr. This generates two external commands to
do (almost) the same thing.
This patch changes the Attach() method for LVs to call lvs and display
both the major/minor (needed for attach) and the lv_status (needed for
GetSyncStatus). Thus, later in GetSyncStatus, we don't need to run lvs
again, and instead just return the value computed in Attach().
Reviewed-by: imsnah
Iustin Pop [Tue, 24 Jun 2008 14:30:54 +0000 (14:30 +0000)]
Add a .gitignore file
This makes it easier to setup new git repositories, and makes it more
likely all people have the same ignore rules.
Reviewed-by: imsnah
Michael Hanselmann [Mon, 23 Jun 2008 17:22:06 +0000 (17:22 +0000)]
Add unittests for ganeti.serializer
Reviewed-by: iustinp
Michael Hanselmann [Mon, 23 Jun 2008 17:21:40 +0000 (17:21 +0000)]
Remove lib/Makefile.libcommon
Reviewed-by: iustinp
Iustin Pop [Mon, 23 Jun 2008 16:55:22 +0000 (16:55 +0000)]
Fix gnt-cluster “command” and “copyfile”
Since the disabling of forking in the master daemon, the two ssh-based
subcommands were not working anymore. However, there is no need at all
for the commands to be run from the master daemon (permissions to read
the cluster private ssh key notwithstanding), they can be run directly
from the command line utilities.
The patch removes the two opcodes OpRunClusterCommand and
OpClusterCopyFile (and their associated LUs) and changes the code in
‘gnt-cluster’ to query the list of nodes and run the SshRunner directly
over the list. As such, all forking is done from the gnt-cluster script,
and the commands are working again.
Reviewed-by: imsnah
Guido Trotter [Mon, 23 Jun 2008 15:00:16 +0000 (15:00 +0000)]
Handle any exception in ganeti-masterd
If an uncaught exception is thrown currently it destroys the calling
thread. This patch changes the behaviour to failing the current job,
logging a message, but trying to keep the daemon up.
Reviewed-by: imsnah
Michael Hanselmann [Mon, 23 Jun 2008 13:39:17 +0000 (13:39 +0000)]
cfgupgrade: Implement upgrading to Ganeti 2.0 configuration
Reviewed-by: iustinp
Michael Hanselmann [Mon, 23 Jun 2008 13:15:18 +0000 (13:15 +0000)]
Makefile.am: Don't create "--" directory
Automake automatically appends "--" to @mkdir_p@. In case you have
a directory named "--" in your source tree, you can remove it using
the command "rm -rf -- --".
Reviewed-by: iustinp
Michael Hanselmann [Mon, 23 Jun 2008 13:00:45 +0000 (13:00 +0000)]
objects: Remove config_version from cluster configuration
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 23 Jun 2008 12:53:15 +0000 (12:53 +0000)]
cfgupgrade: Add main() function
Reviewed-by: iustinp
Michael Hanselmann [Mon, 23 Jun 2008 12:53:04 +0000 (12:53 +0000)]
cfgupgrade: Add logging module
Reviewed-by: iustinp
Guido Trotter [Mon, 23 Jun 2008 12:50:15 +0000 (12:50 +0000)]
Fix the zombie process unittest
The failure occurs because under high load the parent gets to run before the
child has the chance to os._exit(), and therefore it is still running
when the parent does the check.
The fix removes the chance of this happening by waiting to receive a SIGCHLD
(but not calling wait()) before trying to test the pid.
Reviewed-by: imsnah
Michael Hanselmann [Mon, 23 Jun 2008 11:30:54 +0000 (11:30 +0000)]
Bump version to 2.0.0~alpha0
We decided to bump the major number to 2 a few weeks ago due to the huge number
of changes going into it.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 23 Jun 2008 11:11:42 +0000 (11:11 +0000)]
Add functions to calculate version number to constants.py
In cfgupgrade, we need to extract parts of and build new version numbers.
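As an illustration of what "extract parts of and build" can mean, the
sketch below packs a version into one integer. The function names and
the encoding (two decimal digits for the minor number, four for the
revision) are assumptions, not taken from constants.py.

```python
def BuildVersion(major, minor, revision):
    """Pack (major, minor, revision) into a single integer."""
    assert 0 <= minor < 100
    assert 0 <= revision < 10000
    return major * 1000000 + minor * 10000 + revision


def SplitVersion(version):
    """Inverse of BuildVersion: return (major, minor, revision)."""
    major, remainder = divmod(version, 1000000)
    minor, revision = divmod(remainder, 10000)
    return major, minor, revision
```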
Reviewed-by: iustinp
Michael Hanselmann [Mon, 23 Jun 2008 09:52:52 +0000 (09:52 +0000)]
utils.WriteFile: Remove optional check_abspath parameter
cfgupgrade will not work with relative paths at all, but rather get them
from constants.py.
Reviewed-by: iustinp
Iustin Pop [Sun, 22 Jun 2008 10:57:52 +0000 (10:57 +0000)]
Add a ‘tags’ field to instance and node listing
Currently there isn't any easy way to list all nodes or instances and
their tags; you have to query each node in turn, or list all the tags
via something like “gnt-cluster search-tags '.*'”. Of course, this is
not optimal.
The patch adds a new field to “gnt-instance list” and “gnt-node list”
called ‘tags’, that will list the tags of the object in comma-separated
form. This field will be empty if there are no tags (when using a
separator this output can still be parsed by other scripts).
At opcode level, there is a new field called ‘tags’ that returns a
(python) list of the object tags.
Reviewed-by: ultrotter
Iustin Pop [Sat, 21 Jun 2008 18:49:14 +0000 (18:49 +0000)]
Implement handling of luxi errors in cli.py
Currently the generic handling of ganeti errors in cli.py (GenericMain
and FormatError) only handles the core ganeti errors, and not the client
protocol errors (which live in a separate hierarchy).
This patch adds handling of luxi errors too, and also adds another luxi
error for the case when the master is not running. This gives us a nice:
    gnta1:~# gnt-node list
    Cannot communicate with the master daemon.
    Is it running and listening on '/var/run/ganeti-master.sock'?
error message instead of a traceback.
Reviewed-by: amishchenko
Iustin Pop [Sat, 21 Jun 2008 11:27:22 +0000 (11:27 +0000)]
Remove twisted checks from configure.ac
Currently we don't use twisted, so we remove the twisted checks from the
configure stage.
Reviewed-by: amishchenko
Iustin Pop [Fri, 20 Jun 2008 11:04:27 +0000 (11:04 +0000)]
Add a rpc call for BlockDev.Close()
This patch adds rpc layer calls (in rpc.py and the equivalent in
ganeti-noded) to close a list of block devices, and the wrapper in
backend.py that takes a list of Disk objects, identifies them and
returns correctly formatted results.
The reason why this very basic call was missing until now from the rpc
layer is that we usually don't care about device closes (though we
should, and will do so in the future) as only drbd has a meaningful
Close() operation; right now we directly do Shutdown().
The patch is clean enough that it's actually independent of the live
migration implementation.
Reviewed-by: imsnah
Michael Hanselmann [Thu, 19 Jun 2008 14:06:28 +0000 (14:06 +0000)]
Check for docbook2{man,pdf,html}
docbook2{man,pdf,html} are mandatory. "configure" aborts if one
of them isn't found.
Reviewed-by: iustinp
Iustin Pop [Thu, 19 Jun 2008 13:37:08 +0000 (13:37 +0000)]
Small typo in gnt-instance manpage
Reviewed-by: manuel.franceschini
Michael Hanselmann [Thu, 19 Jun 2008 12:56:17 +0000 (12:56 +0000)]
Use a single Makefile.am instead of many
This change allows us to use cleaner dependencies between
directories. The build system is basically rewritten in large parts
and may contain bugs.
Reviewed-by: iustinp
Iustin Pop [Wed, 18 Jun 2008 15:09:08 +0000 (15:09 +0000)]
Fix bdev unittest when run under distcheck
The path to the filename for drbd8 proc data is not correctly computed
when using distcheck. The patch duplicates it from the other drbd tests.
Reviewed-by: ultrotter
Iustin Pop [Wed, 18 Jun 2008 15:08:53 +0000 (15:08 +0000)]
Rework the DRBD8 device status computation
Currently, we compute the status of a drbd8 device in GetSyncStatus and
return only the values that we need (and fit in the framework of
GetSyncStatus). However, the full status details are useful (and needed)
in other places, so the patch attempts to improve this situation.
We abstract the status of a device into a separate class, that
knows how to parse contents from /proc/drbd and set easily accessible
attributes. We then simplify the GetSyncStatus to use this and return
the values that it needs, and add a separate method that returns the
full status object.
The move to a separate class cleans up the old sync-progress
computation in GetSyncStatus a little, but it still relies on many
regexes.
The patch also adds unittests for a few statuses, and modifies one
BaseDRBD call to accept a custom filename instead of '/proc/drbd' to
ease unittests.
Reviewed-by: imsnah
Michael Hanselmann [Wed, 18 Jun 2008 12:32:23 +0000 (12:32 +0000)]
ganeti-watcher: Replace custom exceptions with ganeti.error.*
Reviewed-by: iustinp
Michael Hanselmann [Wed, 18 Jun 2008 12:31:53 +0000 (12:31 +0000)]
ganeti-watcher: Don't write file if data didn't change
This is the safest way to detect changes and the amount of data
is small, so keeping a copy around is cheap enough.
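The approach amounts to keeping the last serialized form and comparing
before writing. This is a sketch only: the JSON format, the Set and
Save methods and the file name are assumptions, and it runs against a
temporary directory.

```python
import json
import os
import tempfile


class WatcherState:
    """Sketch: write the state file only when the data changed."""

    def __init__(self, path):
        self._path = path
        self._data = {}
        self._orig_serialized = None  # copy of what is on disk

    def Set(self, key, value):
        self._data[key] = value

    def Save(self):
        """Write the file if needed; return True when a write happened."""
        serialized = json.dumps(self._data)
        if serialized == self._orig_serialized:
            return False
        with open(self._path, "w") as handle:
            handle.write(serialized)
        self._orig_serialized = serialized
        return True


# Usage against a throwaway file.
state_path = os.path.join(tempfile.mkdtemp(), "watcher.data")
state = WatcherState(state_path)
state.Set("instance1", 1)
first_save = state.Save()   # data is new, so the file is written
second_save = state.Save()  # nothing changed, so no write
```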
Reviewed-by: iustinp
Michael Hanselmann [Wed, 18 Jun 2008 12:31:34 +0000 (12:31 +0000)]
ganeti-watcher: Rename WatcherState.data to WatcherState._data
Cleanup: _data is private and should not be modified from outside
of this class.
Reviewed-by: iustinp