root / lib @ b9bddb6b

# Date Author Comment
b9bddb6b 10/10/2008 12:52 pm Iustin Pop

Cleanup in cmdlib for standalone function calls

This patch is a cleanup of the standalone functions in cmdlib. Many of
them took as argument a ConfigWriter instance, but some also took other
parameters from the lu (e.g. proc), and in the future, if we want to...

7b3a8fb5 10/10/2008 12:51 pm Iustin Pop

Small random fixes

Indentation in bootstrap was wrong and some names in were not

Reviewed-by: imsnah

4b2f38dd 10/09/2008 02:29 pm Iustin Pop

Move instance hypervisor check to ExpandNames

This check can be done earlier, in ExpandNames, and is needed here for
the hypervisor parameter check.

Reviewed-by: ultrotter

e49099a4 10/08/2008 08:31 pm Alexander Schreiber

Update scripts and qa config for changed hypervisor names.

Reviewed-by: ultrotter

00cd937c 10/08/2008 05:31 pm Iustin Pop

Sanitize the hypervisor names

Since in 2.0 the user will possibly have more interaction with the
hypervisor names, we sanitize them by removing the version numbers
(the version can be a prerequisite for the ganeti installation, we
shouldn't document it in variable names)....

02f99608 10/08/2008 04:04 pm Oleksiy Mishchenko

Fix for gnt-cluster init.

Reviewed-by: iustinp

e69d05fd 10/08/2008 01:36 pm Iustin Pop

Move the hypervisor attribute to the instances

This (big) patch moves the hypervisor type from the cluster to the
instance level; the cluster attribute remains as the default hypervisor,
and will be renamed accordingly in a next patch. The cluster also gains...

9f0e6b37 10/07/2008 02:39 pm Iustin Pop

rpc.call_instance_migrate: pass the whole instance

Currently the call_instance_migrate call only passes the instance name;
we need to pass the whole object for the hypervisor_type changes (all
the other individual instance rpc calls already pass the instance...

e92376d7 10/07/2008 11:03 am Iustin Pop

Implement job 'waiting' status

Background: when we have multiple jobs in the queue (more than just a
few), many of the jobs (up to the number of threads) will be in state
'running', although many of them could be actually blocked, waiting for
some locks. This is not good, as one cannot easily see what is...

07cd723a 10/06/2008 07:42 pm Iustin Pop

Implement job auto-archiving

This patch adds a new luxi call that implements auto-archiving of jobs
older than a certain age (or -1 for all completed jobs), and the gnt-job
command that makes use of this (with 'all' for -1).

Reviewed-by: imsnah

2241e2b9 10/06/2008 06:59 pm Iustin Pop

Add a simple timespec parsing function

This function will be used for auto-archiving jobs via the command line.
The function is pretty simple: we only support up to weeks, since months
and higher are not 'precise' entities, and dealing with them would
require us to start using calendar functions....
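A minimal sketch of what such a timespec parser could look like. The function name `parse_timespec`, the suffix letters, and the error handling are assumptions made for illustration; only the "seconds up to weeks, no months" limit comes from the commit message.

```python
# Hypothetical sketch: the name, suffixes and error type are assumptions,
# not the function added by the commit.
_SUFFIX_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 60 * 60,
    "d": 60 * 60 * 24,
    "w": 60 * 60 * 24 * 7,  # weeks are the largest supported unit
}


def parse_timespec(value):
    """Convert a string such as '30', '2h' or '1w' to a number of seconds."""
    value = str(value).strip()
    if not value:
        raise ValueError("empty timespec")
    multiplier = 1
    if value[-1] in _SUFFIX_SECONDS:
        multiplier = _SUFFIX_SECONDS[value[-1]]
        value = value[:-1]
    # Months and years are rejected here because they have no fixed length.
    if not value.isdigit():
        raise ValueError("invalid timespec number: %r" % value)
    return int(value) * multiplier
```

Keeping every unit a fixed number of seconds means no calendar functions are needed.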

62c9ec92 10/06/2008 06:58 pm Iustin Pop

change to get cluster name from master

Currently there are three functions in backend that need the cluster name
in order to instantiate an SshRunner. The patch changes these to get the
cluster name from the master in the rpc call; once the multi-hypervisor...

3d3a04bc 10/06/2008 06:56 pm Iustin Pop

Disable re-reading of config file

Since the objects read from the config file are passed to the various
threads, it's unsafe to re-read the config file (and throw away
ConfigWriter._config_data). As such, we disable the re-reading of the
file (since now the master is the owner of the file, it makes no sense to...

e0ec0ff6 10/06/2008 04:40 pm Iustin Pop

Fix gnt-job list with empty timestamps

In case the job object doesn't have a timestamp (which is a separate
issue), the listing should not break. We fix this by changing the
FormatTimestamp function itself to return '?' in case the timestamp
doesn't look good (note that it still can break if non-integers are...
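The behaviour described above can be sketched roughly as follows. The function name `format_timestamp`, the `(seconds, microseconds)` pair layout, and the output format are assumptions for illustration, not the actual Ganeti code.

```python
import time


def format_timestamp(ts):
    """Format a (seconds, microseconds) pair, or return '?' if it looks wrong.

    Hypothetical sketch: the pair layout and output format are assumptions.
    """
    # A missing or malformed timestamp must not break the listing.
    if not isinstance(ts, (tuple, list)) or len(ts) != 2:
        return "?"
    sec, usec = ts
    # Non-numeric members would still raise here, as the message notes.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(sec)) + ".%06d" % usec
```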

1daae384 10/06/2008 04:29 pm Iustin Pop

Increase the number of threads to 25

Since our locks are not gathered nicely, we can have jobs that are
actually blocking on locks (parallel burnin shows this), so at least we
need to increase the number of threads above the usual number of jobs we
could have in such a case....

6b0469d2 10/06/2008 04:16 pm Iustin Pop

Fix SshRunner breakage from the changed API

More places actually use the SshRunner than just the gnt-cluster

Reviewed-by: ultrotter

56bece1f 10/06/2008 02:48 pm Iustin Pop

Change SshRunner usage

Currently the SshRunner uses a SimpleConfigReader instance, however this
is not best. We change it to use the cluster name directly (and its
constructor now takes this as parameter, instead of SCR), and its
callers are changed to pass the name directly....

06dc5b44 10/05/2008 12:16 pm Iustin Pop

Fix ssconf.GetMasterAndMyself

The ssconf migration left this out.

Reviewed-by: imsnah,ultrotter

c259ce64 10/01/2008 08:37 pm Michael Hanselmann

Get rid of ssconf

Remove leftovers from ssconf.

Reviewed-by: iustinp

0b38cf6e 10/01/2008 08:37 pm Michael Hanselmann

Don't pass sstore to LUs anymore

sstore is no longer used in LUs.

Reviewed-by: iustinp

d23ef431 10/01/2008 08:35 pm Michael Hanselmann


Replace ssconf with configuration.

Reviewed-by: iustinp

d6a02168 10/01/2008 08:35 pm Michael Hanselmann


Replacing ssconf with configuration. Cluster rename is broken and stays
that way.

Reviewed-by: iustinp

7688d0d3 10/01/2008 08:35 pm Michael Hanselmann


Get rid of ssconf and convert to configuration instead.

Reviewed-by: iustinp

eb1328a9 10/01/2008 08:34 pm Michael Hanselmann


Replacing ssconf with utility functions.

Reviewed-by: iustinp

3707f851 10/01/2008 08:34 pm Michael Hanselmann

Convert hypervisor

Replacing ssconf with configuration.

Reviewed-by: iustinp

437138c9 10/01/2008 08:34 pm Michael Hanselmann


Replacing ssconf with configuration.

Reviewed-by: iustinp

5b263ed7 10/01/2008 08:34 pm Michael Hanselmann


The configuration version is now again in the configuration file.

Reviewed-by: iustinp

c657dcc9 10/01/2008 08:34 pm Michael Hanselmann


Replacing ssconf with simpleconfig.

Reviewed-by: iustinp

ae5849b5 10/01/2008 08:33 pm Michael Hanselmann

Add new query to get cluster config values

This can be used to retrieve certain cluster config values from
within clients.

OpDumpClusterConfig was not used anywhere, hence I'm just reusing
it. The way ConfigWriter.DumpConfig returned the configuration
was not thread-safe, anyway (no deepcopy)....

4a8b186a 10/01/2008 08:33 pm Michael Hanselmann

Move functions from elsewhere

These functions will be used to access config values instead of using

Reviewed-by: iustinp

856c67e1 10/01/2008 08:33 pm Michael Hanselmann

Add simple configuration reader/writer classes

This will be used to read the configuration file in the node daemon.
The write functionality is needed for master failover.

Reviewed-by: iustinp

5188ab37 10/01/2008 12:27 pm Iustin Pop

Remove last use of utils.RunCmd from the watcher

The watcher has one last use of ganeti commands as opposed to sending
requests via luxi. The patch changes this to use the cli functions.

The patch also has two other changes:
- fix the docstring for OpVerifyDisks (found out while converting...

f6bd6e98 10/01/2008 12:03 pm Michael Hanselmann

Add cluster options from ssconf to configuration

ssconf will become write-only from ganeti-masterd's point of view,
therefore all settings in there need to go into the main configuration

Reviewed-by: iustinp

b9eeeb02 10/01/2008 12:03 pm Michael Hanselmann

Move instantiation of config into

Future patches will add even more variables to the cluster config.
Adding more parameters wouldn't make the function easier to use and
it doesn't make sense to pass them to another function, as it's
only done once in on cluster initialization....

53c04d04 10/01/2008 11:29 am Iustin Pop

Change the results from cli.PollJob

Currently PollJob accepts a generic job, but will return (history
artifact) only the first opcode result. This is wrong, as it doesn't
allow polling of a job with multiple results.

Its only caller (for now) is also changed, so no functional changes...

c56ec146 09/30/2008 03:04 pm Iustin Pop

Enhance the job-related timestamps

This patch adds start, stop, and received timestamp for jobs (and allows
querying of them), and allows querying of the opcode timestamps.

Reviewed-by: imsnah

3386e7a9 09/30/2008 12:36 pm Iustin Pop

Abstract the timestamp formatting into

Currently we format the timestamp inside the gnt-job info function. We
will need this more times in the future, so move it to as a
separate, exported function.

Reviewed-by: imsnah

5b23c34c 09/29/2008 06:38 pm Iustin Pop

Add opcode execution log in job info

This patch adds the job execution log in “gnt-job info” and also allows
its selection in “gnt-job list” (however here it's not very useful as
it's not easy to parse). It does this by adding a new field in the query
job call, named ‘oplog’....

3c03759a 09/29/2008 04:15 pm Iustin Pop

Move a hardcoded constant to

For now we only use the ‘C’ protocol so we can put it in
instead of hardcoding it.

Reviewed-by: imsnah

2899d9de 09/29/2008 04:15 pm Iustin Pop

Enable the use of shared secrets

This patch enables the use of the shared secrets for DRBD8 disks, using
(hardcoded in the md5 digest algorithm.

For making this more flexible, either we implement a cluster parameter
(once the new model is in place), or we can make it ./configure-time...

f9518d38 09/29/2008 04:15 pm Iustin Pop

Extend DRBD disks with shared secret attribute

This patch, which is similar to r1679 (Extend DRBD disks with minors
attribute), extends the logical and physical id of the DRBD disks with a
shared secret attribute. This is generated at disk creation time and...

60dd1473 09/29/2008 04:09 pm Iustin Pop

Implement job summary in gnt-job list

It is not currently possible to show a summary of the job in the output
of “gnt-job list”. The closest is listing the whole opcode(s), but that
is too verbose. Also, the default output (id, status) is not very
useful, unless one looks for (and knows about) an exact job ID....

3b87986e 09/29/2008 04:08 pm Iustin Pop

Nicely sort the job list

Unless we decide to change the job identifiers to integer, we should at
least sort the list returned by _GetJobIDsUnlocked.

Reviewed-by: imsnah
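To illustrate why sorting matters while job identifiers are not integers, here is a small sketch. It assumes the identifiers are decimal strings; the helper name `sort_job_ids` is hypothetical and is not the sorting routine used by the patch.

```python
def sort_job_ids(job_ids):
    """Sort string job IDs by numeric value (assumes decimal strings)."""
    return sorted(job_ids, key=int)


ids = ["10", "9", "100"]
lexicographic = sorted(ids)   # ['10', '100', '9']: plain string order
numeric = sort_job_ids(ids)   # ['9', '10', '100']: order a user expects
```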

33081d90 09/28/2008 05:44 pm Iustin Pop

Move the pseudo-secret generation to

The bootstrap code needs a pseudo-secret and this is currently generated
inside the InitGanetiServerSetup function. Since more users will need
this, move it to

Reviewed-by: ultrotter

d48663e4 09/28/2008 05:43 pm Iustin Pop

Fix a bug related to static minors

When the node does not yet have any minors allocated, the first minor
(0) will not be entered in the ConfigWriter._temporary_drbds structure.
This does not happen for our current usage, since we always ask for two
minors (so the next call will not match this case), but it will be...

48ce9fd9 09/27/2008 11:45 pm Iustin Pop

Add checks for tcp/udp port collisions

In case the config file is manually modified, or in case of bugs, the
tcp/udp ports could be reused, which will create various problems
(instances not able to start, or drbd disks not able to communicate).

This patch extends the ConfigWriter.VerifyConfig() method (which is used...
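A rough sketch of how a port collision check could work. The function name, the `(port, owner)` input shape, and the message text are assumptions for illustration; the real check lives in ConfigWriter.VerifyConfig() and may differ.

```python
def find_port_collisions(port_users):
    """Return one message per port claimed by more than one object.

    Hypothetical sketch: port_users is an iterable of (port, owner) pairs.
    """
    owners_by_port = {}
    for port, owner in port_users:
        owners_by_port.setdefault(port, []).append(owner)
    return [
        "port %s is used by multiple objects: %s" % (port, ", ".join(owners))
        for port, owners in sorted(owners_by_port.items())
        if len(owners) > 1
    ]
```

For example, an instance and a DRBD disk that both claim port 11000 would produce a single message naming both owners.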

b9f72b4e 09/27/2008 06:58 pm Iustin Pop

Update the cluster serial_no on certain operations

This patch adds update of the cluster serial number for:
- add/remove node (as the cluster's node list is changed)
- add/remove/rename instance (as the cluster's instance list is changed)
- change the volume group name...

38d7239a 09/27/2008 06:58 pm Iustin Pop

Allow listing of the serial_no via gnt-* list

This patch adds listing of the serial_no attribute in gnt-instance and
gnt-node list, and updates to the manpages to reflect the change.

Reviewed-by: ultrotter

b989e85d 09/27/2008 06:58 pm Iustin Pop

Initialize and update the serial_no on objects

This patch adds initialization of the serial_no on instances and nodes,
and update of the field whenever an object is updated in the generic
case, via ConfigWriter.Update(obj) and in the specific case of
instances' state being modified manually....

9d38c6e1 09/27/2008 06:58 pm Iustin Pop

Switch the global serial_no to the top object

Currently the serial_no that is incremented every time the configuration
file is written is located on the 'cluster' object in the configuration
structure. However, this is wrong as the cluster serial_no should be...

be1fa613 09/27/2008 06:58 pm Iustin Pop

Add serial_no attributes to objects

This patch adds the ‘serial_no’ attribute to the other top-level objects
(the configuration object itself, the nodes and the instances).

Reviewed-by: ultrotter

97abc79f 09/27/2008 06:57 pm Iustin Pop

Replace a cfg.AddInstance with UpdateInstance

This seems to be the last (deprecated) use of AddInstance in order to
update an instance.

The patch also removes a whitespace-at-eol case.

Reviewed-by: ultrotter

1ce4bbe3 09/25/2008 12:40 pm René Nussbaumer

Fix iallocator name

port forward of patch from revision 1690 with following message:

Patch on revision 1686 used the wrong field:, which is the instance
name and not the iallocator name. self.op.iallocator is the right field.

Sorry for this inconvenience....

207a6c74 09/25/2008 11:42 am René Nussbaumer

Fix a broken format string

This patch fixes a broken format string. It's expecting 3 parameters, but only
gets 2. This change will add the missing parameter. This is a forward-port
of the fix in Ganeti 1.2

Reviewed-by: imsnah
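The class of bug described above can be reproduced in a few lines. The template text and values below are made up for illustration and are not taken from the patched code.

```python
# Illustrative values only; the real format string is not shown in the log.
template = "%s: expected %s, got %s"  # three placeholders

try:
    # Only two values are supplied, so formatting fails at run time.
    message = template % ("disk/0", "4096")
except TypeError as err:
    broken = str(err)

# Supplying the missing third parameter fixes the call.
fixed = template % ("disk/0", "4096", "2048")
```

Because the error only appears when the line is executed, such bugs tend to hide in rarely used error paths.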

74a48621 09/24/2008 04:43 pm Iustin Pop

Switch to logging

A couple more modules are using the obsolete logger functions, config
being one of them.

Reviewed-by: imsnah

a1578d63 09/23/2008 03:10 pm Iustin Pop

Switch to static minors for DRBD

With some todos remaining, this patch switches the DRBD devices to use
the passed minors, and the cmdlib code (add instance and replace disks)
to request and assign minors to the DRBD disks.

- look at the disk RPC calls to see which can be optimized away, since...

a81c53c9 09/23/2008 03:10 pm Iustin Pop

Implement config support for drbd static minors

This patch adds support for allocating static minors.

Like for the LVM uuids, we add a new cache for the temporarily allocated
requests, and the users of the new methods must manually clear the
cache. If this doesn't happen, at worst we lose some minors....

468b46f9 09/23/2008 03:10 pm Iustin Pop

Fix disk replace secondary with static minors

The code in 'updating instance configuration' section of the replace
disks with change secondary node was setting a wrong new logical_id for
the drbd devices (only set the new node, not the new minor). The patch...

ffa1c0dc 09/22/2008 02:32 pm Iustin Pop

Extend DRBD disks with minors attribute

This patch converts the DRBD disks to contain also a minor (per each
node) attribute. This minor is not yet used and is always initialized
with None, so the patch does not have any real-world impact - except for
automatically upgrading config files (it adds the minors as None, None)....

3fa93523 09/18/2008 02:13 pm Guido Trotter

Apply filter properly in LUQuery{Nodes, Instances}

Currently, when not locking, all nodes/instances are returned, regardless
of whether the user asked only for some of them. With this patch we return to
the previous behaviour:
- if no names are specified return info on all current ones...

c2c2a903 09/18/2008 02:12 pm Guido Trotter

Remove auto_balance from burnin/cmdlib

There is no such feature in trunk yet.

Reviewed-by: iustinp

ca0aa6d0 09/17/2008 07:07 pm Michael Hanselmann

Add utils.ReadFile function

It abstracts exception handling and is like a complement to

Reviewed-by: iustinp

64d3bd52 09/11/2008 08:45 pm Guido Trotter

GetAllInstancesInfo, change internal iterator name

GetAllInstancesInfo used "node" as an iterator name. Change it to
instance to make it less confusing.

Reviewed-by: iustinp

8646adce 09/11/2008 08:45 pm Guido Trotter

Parallelize Tag operations

For now we lock the instance/node for adding/deleting tags from it, but
we could probably in the future do without, with more support from the
config for atomic operations.

Reviewed-by: iustinp

c53279cf 09/11/2008 08:44 pm Guido Trotter

Parallelize LUSetClusterParams (and add a FIXME)

Reviewed-by: imsnah

3656b3af 09/11/2008 12:44 pm Guido Trotter

Parallelize LURemoveExport

Reviewed-by: imsnah

cf472233 09/11/2008 12:44 pm Guido Trotter

Parallelize LURemoveInstance

Using the new add/remove infrastructure this becomes pretty easy! :)

Reviewed-by: imsnah

7baf741d 09/11/2008 12:44 pm Guido Trotter

Parallelize LUCreateInstance

Finally, instance creation on different nodes, without iallocator, can run
in parallel. Iallocator usage still needs all nodes to be locked,
unfortunately. As a bonus most checks which could have been moved to
ExpandNames, before any locking is done....

ca2a79e1 09/11/2008 12:44 pm Guido Trotter

Implement adding/removal of locks by declaration

With this patch LUs can declare locks to be added when they start and/or
removed after they finish. For now locks can only be added in the
acquired state, and removed if owned, and added locks default to be...

d2aff862 09/11/2008 12:44 pm Guido Trotter

LockSet: forbid add() on a partially owned set

This patch bans add() on a half-acquired set. This behavior was
previously possible, but created a deadlock if someone tried to acquire
the set-lock in the meantime, and thus is now forbidden. The
testAddRemove unit test is fixed for this new behavior, and includes a...

ab62526c 09/11/2008 12:43 pm Guido Trotter

Fix typo in a comment

Reviewed-by: imsnah

80ee04a4 09/11/2008 12:43 pm Guido Trotter

Use is_owned to determine whether to unlock

Now that is_owned is public we don't need to play games at the end of an
LU. If we're still owning anything we just release it.

Reviewed-by: imsnah

d4f4b3e7 09/11/2008 12:43 pm Guido Trotter

Add GanetiLockManager.is_owned function

This is a public version of the private function we already had.
We don't just change the previous version because it had lots of users
in the library itself and in the testing code.

Reviewed-by: imsnah

d4803c24 09/11/2008 12:43 pm Guido Trotter

Fix LockSet._names() to work with the set-lock

If the set-lock is acquired, currently, the _names function will fail on
a double acquire of a non-recursive lock. This patch fixes the behavior,
and some lines of code added to the testAcquireSetLock test check that...

e74798c1 09/10/2008 08:46 pm Michael Hanselmann

jqueue: Add common RPC error handling function

We haven't decided yet what exactly it should do with failed nodes.

Reviewed-by: ultrotter

57a2fb91 09/10/2008 08:07 pm Iustin Pop

Remove locking of instances in certain queries

This patch is similar to the node patch (rev 1650). We disable locking
of instances (and nodes) if we only query static information.

Reviewed-by: ultrotter

0b2de758 09/10/2008 08:07 pm Iustin Pop

Add an atomic ConfigWriter.GetAllInstanceInfo()

In order to be able to query instances without locking them, we need the
same atomic query of multiple instances as for nodes.

Reviewed-by: ultrotter

94bbfece 09/10/2008 08:06 pm Iustin Pop

Add ConfigWriter._UnlockedGetInstanceList/Info()

This patch splits the GetInstanceInfo and GetInstanceList methods into
two parts, one locked and one _Unlocked, similar to the way nodes are

Reviewed-by: ultrotter

e9d741b6 09/10/2008 06:43 pm Iustin Pop

Rewrite the 'only submit job' handling in scripts

The "sys.exit(0)" was not nice as you couldn't differentiate it from
other exit codes. We change this to a specially defined exception for
this, so that multi-opcode commands can handle this nicely.

Reviewed-by: imsnah
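A minimal sketch of the idea: signal "job submitted, not waited for" with a dedicated exception instead of `sys.exit(0)`, so a command issuing several opcodes can catch it and continue. The names `JobSubmitted`, `submit_or_run` and `run_commands` are hypothetical.

```python
class JobSubmitted(Exception):
    """Hypothetical exception: the job was only submitted, not executed."""

    def __init__(self, job_id):
        super().__init__(job_id)
        self.job_id = job_id


def submit_or_run(submit_only, job_id):
    """Either raise JobSubmitted or return a (fake) job result."""
    if submit_only:
        raise JobSubmitted(job_id)
    return "result of job %s" % job_id


def run_commands(job_ids, submit_only):
    """Handle several jobs; a submit-only job does not abort the loop."""
    submitted, results = [], []
    for job_id in job_ids:
        try:
            results.append(submit_or_run(submit_only, job_id))
        except JobSubmitted as err:
            submitted.append(err.job_id)
    return submitted, results
```

With `sys.exit(0)` the first submission would end the process, and the caller could not tell that case apart from any other successful exit.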

c8d8b4c8 09/10/2008 02:03 pm Iustin Pop

Optimize the OpQueryNodes for names only

Currently, OpQueryNodes is locking all nodes (in shared mode), which
will also block the special case of querying only for the node names
(this is needed for gnt-cluster command, for example). There is no
logical requirement to not give the administrator enough power if she/he...

d65e5776 09/10/2008 02:02 pm Iustin Pop

Add a way to export all node information at once

The patch adds a new function to export all node information at once
(i.e. atomically with respect to the configuration lock).

Reviewed-by: ultrotter

1bc59f76 09/09/2008 03:47 pm Michael Hanselmann

Never remove job queue lock in node daemon

Otherwise, corruption could occur in some corner cases. E.g. when
LeaveNode is running in a child and is in the process of removing
queue files, the main process gets killed, started again and gets
a request to update the queue. This is a rather extreme corner case,...

4e071d3b 09/09/2008 03:24 pm Iustin Pop

Export backend.GetMasterInfo over the rpc layer

We create a multi-node call so that querying all nodes for agreement
will be fast.

Reviewed-by: imsnah

bd1e4562 09/09/2008 03:24 pm Iustin Pop

Change backend._GetMasterInfo to return more data

The _GetMasterInfo() function needs to export the master name too to be
useful in master safety checks. This patch makes it a public (no _)
function and adds a third element in the return tuple. Its callers are...

a987fa48 09/09/2008 01:42 pm Guido Trotter

Parallelize LUQueryInstanceData

Reviewed-by: iustinp

d4b9d97f 09/09/2008 01:42 pm Guido Trotter

Parallelize LUVerify{Cluster,Disks}

These are two easy querying LUs which require shared access to all

Reviewed-by: iustinp

efd990e4 09/09/2008 01:41 pm Guido Trotter

Parallelize LUReplaceDisks

This is the most complex parallelization so far. We have to lock one
instance (and its nodes) plus one more node if doing a remote replace,
or all nodes if doing a remote replace with iallocator.

Reviewed-by: iustinp

9513b6ab 09/09/2008 01:41 pm Guido Trotter

_LockInstancesNodes: support append mode

This will be used to lock the instance's nodes in addition to some more.

Reviewed-by: iustinp

b2751b57 09/09/2008 01:41 pm Guido Trotter

Processor: remove ChainOpCode

This function was incompatible with the new locking system, and its
usage has been removed from the code. For now LUs share code by calling
common module-private functions in, in the future they will
use tasklets (when those will be implemented)....

f22a8ba3 09/09/2008 01:41 pm Guido Trotter

Parallelize LU{A,Dea}ctivateInstanceDisks

Now that they are not used in other opcodes by chaining,
this can easily be done.

Reviewed-by: iustinp

023e3296 09/09/2008 01:40 pm Guido Trotter

LUReplaceDisks: remove use of ChainOpCode

The calls to OpActivateInstanceDisks and OpDeactivateInstanceDisks have
been replaced by _StartInstanceDisks and _SafeShutdownInstanceDisks
respectively. This is the last usage of ChainOpCode.

Reviewed-by: iustinp

155d6c75 09/09/2008 01:40 pm Guido Trotter

Create new _SafeShutdownInstanceDisks function

This new function checks whether an instance is running, before shutting
down its disks. This is what the Exec() of LUDeactivateInstanceDisks
did, so that is replaced by a call to this function.

Reviewed-by: iustinp

3a5d7305 09/09/2008 01:40 pm Guido Trotter

Fix a typo in LogicalUnit.ExpandNames docstring


Reviewed-by: iustinp

f6d9a522 09/09/2008 01:40 pm Guido Trotter

Use constants.LOCKS_REPLACE instead of hardcoding

This constant replaces what we used to write in recalculate_locks, and
represents the lock recalculation mode. It lives in because
it's used only in cmdlib, and thus doesn't deal with the locking library...

de8c7666 09/09/2008 12:39 pm Guido Trotter

Fix LUReplaceDisks with iallocator

self._RunAllocator() sets self.op.remote_node, but doesn't return the
new remote node. If we set it to the return value of the function we
basically reset it to None, and iallocator is never run.

Reviewed-by: imsnah
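The bug pattern can be shown with a toy class. The class name, attribute layout, and node name below are assumptions for illustration; only the "method sets the attribute but returns nothing" behaviour comes from the commit message.

```python
class ReplaceDisks:
    """Toy stand-in for the logical unit; not the real LUReplaceDisks."""

    def __init__(self):
        self.remote_node = None

    def _run_allocator(self):
        # Stores the result as a side effect and implicitly returns None.
        self.remote_node = "node3.example.com"


buggy = ReplaceDisks()
# Assigning the return value overwrites the freshly chosen node with None.
buggy.remote_node = buggy._run_allocator()

fixed = ReplaceDisks()
# Calling the method for its side effect keeps the chosen node.
fixed._run_allocator()
```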

86de84dd 09/08/2008 06:54 pm Guido Trotter

Fix LUGrowDisk

The rpc library returns a list, not a tuple, so we'll accept both.

Reviewed-by: iustinp
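One common reason a tuple arrives as a list is serialization. The sketch below uses a JSON round trip as an assumed stand-in for the rpc transport; the helper `is_valid_result` and the two-element result shape are hypothetical.

```python
import json

# A tuple sent through JSON comes back as a list (assumed transport).
sent = (True, "disk grown")
received = json.loads(json.dumps(sent))


def is_valid_result(result):
    """Accept both lists and tuples, as the fix above does."""
    return isinstance(result, (list, tuple)) and len(result) == 2
```

A check written as `isinstance(result, tuple)` would reject `received` even though its contents are correct.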

43f5ea7a 09/08/2008 06:53 pm Guido Trotter

Fix iallocator run

The rpc library returns a list, not a tuple, so we'll accept both.

Reviewed-by: iustinp

6657590e 09/08/2008 04:44 pm Guido Trotter

Parallelize LUExportInstance

Unfortunately for the first version we need to lock all nodes. The patch
discusses why this is and ways to improve this in the future.

Reviewed-by: iustinp

31e63dbf 09/08/2008 04:44 pm Guido Trotter

Parallelize LUGrowDisk

Reviewed-by: iustinp

849da276 09/08/2008 04:43 pm Guido Trotter

LURebootInstance: lock only primary when possible

When rebooting an instance and we're not changing its disks' status (all
the cases except in a "full" reboot) we can lock just its primary node.

Reviewed-by: iustinp