
root / lib @ 34ca3914

# Date Author Comment
34ca3914 08/18/2008 03:50 pm Guido Trotter

LockSet: allow lists with duplicate values

If a list with a duplicate value is passed to a lockset what the code
now does is to try to acquire the lock twice, generating a
double-acquire exception in the SharedLock code. This is definitely an
issue. In order to solve it we can either forbid double values in a list...
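
A minimal sketch of the deduplicating approach the commit title points to, not the actual LockSet code: the requested names are collapsed into a sorted set before any lock is taken, so each lock is acquired only once. `ToyLockSet` and `unique_lock_names` are hypothetical names used only for illustration.

```python
def unique_lock_names(names):
    """Return the requested names without duplicates, in a stable order."""
    return sorted(set(names))


class ToyLockSet:
    """Hypothetical stand-in for a lock set; it only tracks ownership."""

    def __init__(self):
        self.owned = set()

    def acquire(self, names):
        for name in unique_lock_names(names):
            if name in self.owned:
                # Mirrors the double-acquire error described in the message.
                raise RuntimeError("double acquire of %r" % name)
            self.owned.add(name)
        return set(self.owned)


locks = ToyLockSet()
# The duplicate "node2" no longer triggers a double acquire.
print(sorted(locks.acquire(["node2", "node1", "node2"])))
```

Sorting the names also gives every caller the same acquisition order, which is a common way to avoid lock-ordering deadlocks.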

8a2941c4 08/18/2008 03:49 pm Guido Trotter

Processor: lock all levels even if one is missing

If a locking level wasn't specified locking used to stop. This means
that if one, for example, didn't specify anything at the LEVEL_INSTANCE
level, no locks at the LEVEL_NODE level were acquired either. With this...

0fcc5db3 08/18/2008 03:44 pm Guido Trotter

LURebootInstance: move arg check in ExpandNames

The check for the reboot type can be done without any locks held, so
we'll move it to ExpandNames. Plus, we note in a FIXME that if the
reboot type is not full, we can probably just lock the primary node, and...

34290825 08/18/2008 02:37 pm Michael Hanselmann

LUVerifyCluster: Return boolean indicating success

Reviewed-by: schreiberal

9894ece7 08/18/2008 02:12 pm Michael Hanselmann

Use Linux-specific way to name master socket

By using this Linux-specific way we don't have to care about removing the
socket file when quitting or starting (after an unclean shutdown). For a
more detailed description, see the comment in the patch.

Reviewed-by: schreiberal
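
One Linux-specific mechanism with the property described (no socket file to clean up) is the abstract socket namespace. Whether the patch uses exactly this mechanism is an assumption here; the socket name and the helper `bind_abstract_socket` are hypothetical.

```python
import socket


def bind_abstract_socket(name):
    """Bind a listening Unix socket in the Linux abstract namespace."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # A leading NUL byte selects the abstract namespace (Linux only).
    # No file is created, so nothing is left behind after a crash.
    sock.bind("\0" + name)
    sock.listen(1)
    return sock


server = bind_abstract_socket("ganeti-master-sketch")
server.close()  # the name is released as soon as the socket is closed
```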

c4b6c29c 08/15/2008 11:55 am Michael Hanselmann

gnt-node: Add option to always accept peer's SSH key

This option will be used to add nodes to the cluster without
asking the user to confirm the key. Together with key based
authentication this can be used in the QA tests.

Reviewed-by: ultrotter

652d6694 08/15/2008 11:47 am Michael Hanselmann

SshRunner: Add parameter to always accept peer's SSH key

This will be used to add nodes without user interaction, specifically
in QA tests.

Reviewed-by: ultrotter

f6d9f4c3 08/15/2008 11:44 am Michael Hanselmann

Move SSH option building into a function

I'm going to add another option and it would make maintaining
them in constants even more complicated.

Reviewed-by: ultrotter

54ab6aec 08/15/2008 11:44 am Michael Hanselmann

SshRunner.Run: Pass all arguments to BuildCmd

This patch changes SshRunner.Run to pass all arguments to
SshRunner.BuildCmd. They had the same arguments before
and should stay that way. This change makes it easier
to add new or change existing arguments.
...

4f0afaf5 08/14/2008 01:27 pm Guido Trotter

Pass hypervisor type to the OS scripts

It's handy to make the os scripts know which hypervisor the instance is
going to run under. In order not to change the os API we pass this
information in the environment, where the os scripts can access it if
they're hypervisor-aware....

2557ff82 08/14/2008 01:26 pm Guido Trotter

RunCmd: add optional environment overriding

If the user passes an env dict to RunCmd we'll override the environment
passed to the to-be-executed command with the values in the dict. This
allows us to pass arbitrary environment values to commands we run.
...
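
A rough sketch of the env override, assuming the dict is merged on top of the inherited environment (the message does not say whether it merges or replaces). `run_cmd` and the `HYPERVISOR` variable name are illustrative, not the real RunCmd signature or the variable the OS scripts receive.

```python
import os
import subprocess
import sys


def run_cmd(cmd, env=None):
    """Run cmd and return (exit code, stdout, stderr)."""
    cmd_env = os.environ.copy()
    if env is not None:
        # Values from the caller's dict win over inherited ones.
        cmd_env.update(env)
    proc = subprocess.run(cmd, env=cmd_env, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


code, out, _ = run_cmd(
    [sys.executable, "-c", "import os; print(os.environ['HYPERVISOR'])"],
    env={"HYPERVISOR": "kvm"},  # hypothetical variable name
)
print(code, out.strip())
```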

d47d3d38 08/13/2008 07:41 pm Guido Trotter

KVM Hypervisor Cleanup

- Remove a few experimental code lines left as comments
- Rework first disks' boot=on addition, which was calculated twice
- Remove an empty line
- Remove reference to hvm_pae which doesn't apply to kvm

Reviewed-by: imsnah

eb58f9b1 08/13/2008 05:25 pm Guido Trotter

Add KVM hypervisor code

ht_kvm.py contains the code for ganeti to work under kvm.
This patch also modifies Makefile.am to ship that file, and
lib/hypervisor/__init__.py to import it, and add kvm to the
hypervisors map.

Reviewed-by: imsnah

550e49b9 08/13/2008 05:25 pm Guido Trotter

constants: add HT_KVM

Add a new hypervisor type, HT_KVM, to constants, and register it in the
HYPER_TYPES set.

Reviewed-by: imsnah

7e2c5b9e 08/13/2008 05:24 pm Guido Trotter

Add --with-kvm-path configure option

This allows configuring a different path to the kvm binary. By default
/usr/bin/kvm is used, which is the one found in debian and ubuntu.

Reviewed-by: imsnah

a5f723a2 08/13/2008 05:24 pm Guido Trotter

FakeHypervisor: fix a function signature

StartInstance takes 'block_devices', not 'force' as its third argument.
Even if this is not used in the fake hypervisor it's better to have the
correct argument name to avoid confusion.

Reviewed-by: imsnah

e326d4e5 08/13/2008 05:23 pm Guido Trotter

Convert RunCmd to an epydoc docstring

Reviewed-by: imsnah

51144e33 08/13/2008 03:55 pm Michael Hanselmann

Fix adding pristine nodes

If a node hasn't been part of the cluster before being added it'll not
have the cluster's SSH key. This patch makes sure to accept those by
not aliasing the machine name to the cluster name.

Reviewed-by: ultrotter

f56377a3 08/12/2008 08:00 pm Michael Hanselmann

Fix race locking issue in noded

Noded didn't release the job queue lock after initialising it. This
patch makes sure to unlock once the work is done.

Reviewed-by: ultrotter

853e7f3d 08/11/2008 07:28 pm Michael Hanselmann

cli: Use new RPC call instead of polling

This means commands will not take at least one second anymore.

Reviewed-by: ultrotter

dfe57c22 08/11/2008 07:27 pm Michael Hanselmann

Add RPC call to wait for job changes

This way clients can react faster to status or message changes and
don't have to poll anymore.

Reviewed-by: ultrotter

d5e317ba 08/11/2008 07:27 pm Michael Hanselmann

jqueue: Change log message time format

See the comment in the patch.

Reviewed-by: ultrotter

739be818 08/11/2008 07:26 pm Michael Hanselmann

Add functions to split time into tuple and merge it back

These will be used for job logs.

Reviewed-by: ultrotter
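
The message does not say what the tuple contains. The sketch below assumes a (seconds, microseconds) pair; `split_time` and `merge_time` are hypothetical names.

```python
def split_time(value):
    """Split a float timestamp into (seconds, microseconds)."""
    total_us = int(round(value * 1000000))
    return divmod(total_us, 1000000)


def merge_time(timetuple):
    """Merge a (seconds, microseconds) tuple back into a float."""
    seconds, microseconds = timetuple
    return seconds + microseconds / 1000000.0


print(split_time(1.5))
print(merge_time((1, 500000)))
```

Storing two integers instead of a float keeps the value exact when it is serialized as text.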

32f93223 08/08/2008 02:29 pm Michael Hanselmann

Add query function for exports

Reviewed-by: iustinp

24fc781f 08/08/2008 02:23 pm Michael Hanselmann

Don't always remove queue lock when queue is purged

The lock should only be removed if ganeti-noded is going to quit.
Otherwise it needs to be kept to prevent another process from creating
it again while we're still holding the (removed) lock. This is due to...

76ab5558 08/08/2008 02:22 pm Michael Hanselmann

backend: Add optional exclusion list to _CleanDirectory

The code cleaning the queue will make use of it.

Reviewed-by: iustinp

abc1f2ce 08/08/2008 02:21 pm Michael Hanselmann

jqueue: Move archived jobs on all nodes

Otherwise one might have archived jobs back in the list after a master
failover.

Reviewed-by: iustinp

af5ebcb1 08/08/2008 02:21 pm Michael Hanselmann

noded: Add RPC function to rename job queue files

This will be used to archive jobs.

Reviewed-by: iustinp

dc31eae3 08/08/2008 02:21 pm Michael Hanselmann

backend: Add function to check whether file is in queue dir

Another function will need to check whether its parameters
are job queue files.

Reviewed-by: iustinp

0a7bed64 08/08/2008 02:19 pm Michael Hanselmann

Two small style fixes

Reviewed-by: iustinp

5d6fb8eb 08/08/2008 01:03 pm Michael Hanselmann

jstore: Change to not always require a lock

This way we can do locking when both noded and masterd are running
on the same machine, the latter holding an exclusive lock on the
queue.

Reviewed-by: iustinp

aa65ed72 08/08/2008 01:02 pm Michael Hanselmann

Log only unexpected errors in utils.FileLock

Otherwise users might be confused by errors in log files.

Reviewed-by: iustinp

553f1c1d 08/08/2008 01:02 pm Michael Hanselmann

Disallow uploading job queue files through upload_file

The job queue is now updated through its own RPC functions.

Reviewed-by: iustinp

9f774ee8 08/08/2008 01:01 pm Michael Hanselmann

jqueue: Use new job queue RPC functions

Reviewed-by: iustinp

ca52cdeb 08/08/2008 01:01 pm Michael Hanselmann

Add job queue RPC functions

jobqueue_update: Uploads a job queue file's content to a node. The
most common operation is to upload something that we already have
in a string. Unlike in the upload_file function, the file is not
read again when distributing changes, but content has to be passed...

3956cee1 08/08/2008 01:00 pm Michael Hanselmann

Move function cleaning directory to module level

JobQueuePurge() will be used by an RPC function.

Reviewed-by: iustinp

281606c1 08/07/2008 12:07 pm Michael Hanselmann

Fix cli.PollJob

feedback_fn wasn't passed to it.

Reviewed-by: iustinp

d8470559 08/06/2008 05:56 pm Michael Hanselmann

Implement {Add,Readd,Remove}Node in GanetiContext

By doing this we've a central place which coordinates what needs to be
done when adding or removing nodes. Another patch will add calls into
the job queue.

Two log messages move to config.py.

When removing a node, node_leave_cluster is now called after it has...

d2e03a33 08/06/2008 04:36 pm Michael Hanselmann

jqueue: Implement {Add,Remove}Node

These functions will be used to notify the queue about newly added
or removed nodes.

Reviewed-by: iustinp

4c848b18 08/06/2008 04:35 pm Michael Hanselmann

jqueue: Don't pass the list of nodes to SubmitJob anymore

The job queue now maintains its own list and is updated when
nodes are added or removed from the cluster.

Reviewed-by: iustinp

8e00939c 08/06/2008 04:35 pm Michael Hanselmann

Maintain node list in job queue

The code makes sure not to include the master in the list.

Reviewed-by: iustinp

f78346f5 08/06/2008 02:27 pm Michael Hanselmann

Clean job queue directories when leaving cluster

Old job files shouldn't be left on nodes removed from a cluster.

Reviewed-by: iustinp

02f7fe54 08/06/2008 11:26 am Michael Hanselmann

Implement query for nodes

Reviewed-by: iustinp

ee6c7b94 08/06/2008 11:25 am Michael Hanselmann

Implement query for instances

Queries don't create jobs and are more efficient. Log messages
are not yet stored anywhere.

Reviewed-by: iustinp

23752136 08/05/2008 01:33 pm Michael Hanselmann

jqueue: Replicate jobs to all nodes

Newly added nodes are not yet taken care of. Queue locking on
non-master nodes is not yet correct.

Reviewed-by: iustinp

04ab05ce 08/04/2008 03:27 pm Michael Hanselmann

jqueue: Use new jstore module

Reviewed-by: iustinp

8b537bb0 08/04/2008 03:27 pm Michael Hanselmann

jstore: Add queue helper functions

This will be used to move common code out of jqueue.

Reviewed-by: iustinp

94428652 08/04/2008 12:47 pm Iustin Pop

Implement job submission for scripts

This patch adds the infrastructure for executing a job in background,
instead of foreground, via a new “--submit” option. The behaviour is
that the job ID is printed and the script will immediately exit.

The patch also converts gnt-node list to this model (yes, this will be a...

db37da70 07/31/2008 06:03 pm Michael Hanselmann

jqueue: Move assert into decorator

This reduces code duplication. A later patch will modify the job queue
a bit more and will need a change of this assert. The assertion is
also removed from all class-internal functions.

Reviewed-by: iustinp

0a1e74d9 07/31/2008 05:52 pm Iustin Pop

Split cli.SubmitOpCode in two parts

The current SubmitOpCode function is not flexible enough to be used for
submitters that don't want to wait for the job to finish.

The patch splits this in two, a SendJob function and a PollJob one, and
the old SubmitOpCode becomes a wrapper. Note that the new SendJob takes...

afee8008 07/31/2008 05:42 pm Michael Hanselmann

Allow job queue files to be uploaded through ganeti-noded

This is needed for job queue replication.

Reviewed-by: iustinp

a87b4824 07/31/2008 05:33 pm Michael Hanselmann

Add FileLock utility class

This class is a wrapper around fcntl.flock and abstracts opening and
closing the lockfile. It'll be used for the job queue.

(The patch also removes a duplicate import of tempfile into the unittest)

Reviewed-by: iustinp
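
A minimal sketch of such a wrapper, assuming a Unix system. The method names are guesses and not the actual utils.FileLock interface; only `fcntl.flock` itself is taken from the message.

```python
import fcntl
import os
import tempfile


class FileLock:
    """Sketch of a lock object that opens and closes its own lockfile."""

    def __init__(self, filename):
        self.filename = filename
        self.fd = open(filename, "w")

    def _flock(self, flag, blocking):
        if not blocking:
            # With LOCK_NB a busy lock raises OSError instead of waiting.
            flag |= fcntl.LOCK_NB
        fcntl.flock(self.fd, flag)

    def exclusive(self, blocking=True):
        self._flock(fcntl.LOCK_EX, blocking)

    def shared(self, blocking=True):
        self._flock(fcntl.LOCK_SH, blocking)

    def unlock(self):
        fcntl.flock(self.fd, fcntl.LOCK_UN)

    def close(self):
        if self.fd is not None:
            self.fd.close()  # closing the file also drops the lock
            self.fd = None


lock = FileLock(os.path.join(tempfile.mkdtemp(), "queue.lock"))
lock.exclusive(blocking=False)
lock.unlock()
lock.close()
```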

5bdce580 07/31/2008 05:33 pm Michael Hanselmann

jqueue: Store context in job queue instead of worker pool

The job queue will need access to the configuration, which is provided
through the context object, to get a list of nodes.

Reviewed-by: iustinp

15fd9fd5 07/31/2008 03:58 pm Oleksiy Mishchenko

RAPI Implement DELETE for tags

Reviewed-by: imsnah

441e7cfd 07/31/2008 12:06 pm Oleksiy Mishchenko

First write operation (add tag) for Ganeti RAPI

Add instance tag handling, improved error logging.
...oh, yes adopt instance listing for RAPI2!

Reviewed-by: iustinp

140aa4a8 07/30/2008 06:58 pm Iustin Pop

Fix cluster destroy

With the recent startup/shutdown changes (and with the master daemon in
place), the cluster destroy needs some fixing.

This patch moves the finalization of the destroy out from cmdlib into
bootstrap, so we can nicely shutdown the rapi and master daemons....

97efde45 07/30/2008 06:49 pm Guido Trotter

Xen: remove two end-of-line semicolons

It's python, isn't it?

Reviewed-by: iustinp

b3f1cf6f 07/30/2008 06:17 pm Iustin Pop

Fix cluster init

With the recent changes, I forgot the extra parameter to this rpc call.
Also the rpc call needs to be done after we setup the config data, for
the master daemon to be able to start, so we move it after all other
init steps.

Reviewed-by: ultrotter

b33e986b 07/30/2008 06:06 pm Iustin Pop

Make gnt-* commands fail nicely on non-masters

This patch adds a check that we are on the master after failing to
connect to the socket, and logs the master name nicely.

Reviewed-by: ultrotter

c9e5c064 07/30/2008 06:04 pm Guido Trotter

Parallelize LUFailoverInstance

Reviewed-by: iustinp

64381ad7 07/30/2008 06:04 pm Guido Trotter

ChainOpCode is still BGL-only

Prevent mistakes with an assert.

Reviewed-by: iustinp

8161a646 07/30/2008 06:00 pm Iustin Pop

Fix a misuse of exc_info in logging.info

This is my fault, sorry.

Reviewed-by: imsnah

38206f3c 07/30/2008 05:04 pm Iustin Pop

Fix pylint-detected issues

This is mostly:
- whitespace fix (space at EOL in some files, not all, broken
indentation, etc)
- variable names overriding others (one is a real bug in there)
- too-long-lines
- cleanup of most unused imports (not all)...

3b9e6a30 07/30/2008 04:27 pm Iustin Pop

Fix some errors detected by pylint

Reviewed-by: imsnah

59f187eb 07/30/2008 03:32 pm Iustin Pop

Unify SetupDaemon/SetupLogging

The 'old-style' info, error, debug logs do not make much sense. This
patch unifies the SetupLogging and SetupDaemon functions. As a result,
all the commands log to a 'commands.log' file.

The patch also changes the log setup to keep going if there's an error...

9936bd63 07/30/2008 03:29 pm Iustin Pop

Simplify the log constants and add another one

The patch changes the log constants by moving the slash to the end of
the log dir instead of at the beginning of each log file name.

It also adds a new LOG_COMMANDS constant (to be used in a next patch).
...

e873317a 07/30/2008 02:31 pm Guido Trotter

Parallelize {Startup,Shutdown,Reboot}Instance

Reviewed-by: iustinp

4e0b4d2d 07/30/2008 02:30 pm Guido Trotter

Parallelize LUReinstallInstance

self.recalculate_locks[locking.LEVEL_NODE] could have any value and
everything would work anyway. We'll use the string 'replace' by
convention because in the future we might want an 'append' mode.

Reviewed-by: iustinp

c4a2fee1 07/30/2008 02:30 pm Guido Trotter

LogicalUnit._LockInstancesNodes helper function

This function is used to lock instances' primary and secondary nodes
after locking instances themselves.

Reviewed-by: iustinp

3977a4c1 07/30/2008 02:30 pm Guido Trotter

Make sharing locks possible

LUs can declare which locks they need by populating the
self.needed_locks dictionary, but those locks are always acquired as
exclusive. Make it possible to acquire shared locks as well, by
declaring a particular level as shared in the self.share_locks...

fb8dcb62 07/30/2008 02:29 pm Guido Trotter

Add LogicalUnit.DeclareLocks

This additional LogicalUnit function is optional to implement, but lets
you change your locking needs for one level just before locking it, but
after the previous levels have been already locked. It is useful for
example to calculate what nodes to lock after locking an instance....
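
A toy model of the idea, not the real LogicalUnit or Processor code: the level constants, the `INSTANCE_NODES` table, `ToyInstanceLU` and `lock_all_levels` are all hypothetical. It shows node locks being computed only once the instance level has been handled.

```python
# Hypothetical level numbering: instances are locked before nodes.
LEVEL_INSTANCE = 0
LEVEL_NODE = 1
LEVELS = [LEVEL_INSTANCE, LEVEL_NODE]

# Toy configuration: instance name -> (primary node, secondary nodes).
INSTANCE_NODES = {"inst1": ("node1", ["node2"])}


class ToyInstanceLU:
    def __init__(self, instance_name):
        self.instance_name = instance_name
        # Only the instance is known up front.
        self.needed_locks = {LEVEL_INSTANCE: [instance_name]}

    def DeclareLocks(self, level):
        # Called just before `level` is locked; earlier levels are held.
        if level == LEVEL_NODE:
            primary, secondaries = INSTANCE_NODES[self.instance_name]
            self.needed_locks[LEVEL_NODE] = [primary] + secondaries


def lock_all_levels(lu):
    """Walk the levels in order and record what would be locked."""
    acquired = []
    for level in LEVELS:
        lu.DeclareLocks(level)
        for name in lu.needed_locks.get(level, []):
            acquired.append((level, name))
    return acquired


print(lock_all_levels(ToyInstanceLU("inst1")))
```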

74b5913f 07/30/2008 02:29 pm Guido Trotter

LURenameInstance, add/remove relevant locks

LURenameInstance forgot to remove the old lock name and add the new one,
making it impossible for parallel LUs to act on the instance (without a
master daemon restart). This also fixes burnin+rename with the
parallelization of {Start,Stop}Instance....

85f03e0d 07/30/2008 01:02 pm Michael Hanselmann

Rewrite job queue

We found several issues in the old job queue implementation. It had race
conditions, deadlocks and other deficiencies.

Short summary:
- _QueuedOpCode and _QueuedJob are now more or less data structures with a few
utility functions. __Setup is gone....

c0a8eb9e 07/30/2008 11:56 am Michael Hanselmann

workerpool: Log when waiting for a thread

Reviewed-by: iustinp

b1b6ea87 07/30/2008 11:43 am Iustin Pop

Rework master startup/shutdown/failover

This (big) patch reworks the master startup/shutdown and fixes the
master failover.

What does the patch do?

For master start/stop:
- remove the old ganeti-master script and its associated man page
- moves the ip start/stop directly into the backend.(Start|Stop)Master...

53beffbb 07/30/2008 11:34 am Iustin Pop

Expose utils.DaemonPidFileName

Since we need to compute this from outside utils.py, we change this to a
public function.

Reviewed-by: ultrotter

5675cd1f 07/30/2008 11:33 am Iustin Pop

Implement checking for the master role in rapi

This patch moves the CheckMaster function from ganeti-masterd to ssconf
(most logical place, it cannot go in utils since we would have recursive
imports between ssconf and utils) and changes ganeti-rapi to also call...

1c65840b 07/30/2008 11:32 am Iustin Pop

Add a new parameter to backend.(Start|Stop)Master

This patch adds a new, unused for now, parameter to the start and stop
master operations in backend. The idea behind it is that we need to be
able to control whether the IP (de)activation is coupled with daemon...

6aff91f6 07/29/2008 05:07 pm Michael Hanselmann

Log thread name when debug output is enabled

Reviewed-by: iustinp

8090e19f 07/29/2008 05:07 pm Michael Hanselmann

jqueue: Fix error logging

The passed parameters were not correct.

Reviewed-by: iustinp, ultrotter

bff2ddc5 07/29/2008 01:42 pm Iustin Pop

Fix constants typo

Reviewed-by: imsnah

99e88451 07/29/2008 12:06 pm Iustin Pop

Use constants for the pid file stems

Reviewed-by: imsnah

b2a1f511 07/29/2008 11:49 am Iustin Pop

Add a KillProcess function

We cannot depend on all environments to have a start-stop-daemon or
similar tool. We instead implement a KillProcess function that behaves
similarly to “start-stop-daemon --retry”.

Note that the attached unittest can hang in foreground if the child...

d9f311d7 07/29/2008 11:49 am Iustin Pop

Change IsPidFileAlive into ReadPidFile

We already have a function to test if a PID is alive, so it makes more
sense to use function composition than to force calling (since we need to
read PIDs from files in other places too). Now IsProcessAlive returns
False for PIDs <= 0, since this is the error return from ReadPidFile....
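
The composition can be sketched as follows. This is a simplified stand-in: the liveness test uses `os.kill(pid, 0)`, which may differ from what utils.py does, and returning 0 on error is an assumption consistent with the "<= 0" rule in the message.

```python
import os


def read_pid_file(pidfile):
    """Return the PID stored in pidfile, or 0 if it cannot be read."""
    try:
        with open(pidfile) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return 0


def is_process_alive(pid):
    """Check whether pid refers to a running process."""
    if pid <= 0:
        # Covers the error value returned by read_pid_file.
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# The old combined check becomes a composition of the two functions.
print(is_process_alive(read_pid_file("/nonexistent/daemon.pid")))
```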

3cd62121 07/28/2008 01:17 pm Michael Hanselmann

Move ganeti-rapi core code to daemon

All other daemons have their main code in themselves and not in a module.
This patch does the same to ganeti-rapi by moving the code from
lib/rapi/RESTHTTPServer.py to daemons/ganeti-rapi.

Reviewed-by: iustinp

e2ae9123 07/28/2008 01:16 pm Michael Hanselmann

Replace httperror module with ganeti.http

The generic HTTP server doesn't know about httperror based exceptions
and would treat them as unknown exceptions, thereby not doing the right
thing with HTTP errors.

Reviewed-by: iustinp

188c5e0a 07/28/2008 01:13 pm Michael Hanselmann

Implement job canceling on server side

Locking is not completely right due to a deadlock when the job calls
UpdateJob after changing its status.

Reviewed-by: ultrotter

533bb4b1 07/28/2008 12:16 pm Michael Hanselmann

Fix exception class name in utils.WritePidFile

Reviewed-by: iustinp

4cb1d919 07/28/2008 12:16 pm Michael Hanselmann

Add “canceled” status for opcodes

Reviewed-by: ultrotter

fae737ac 07/25/2008 03:47 pm Michael Hanselmann

Move code extracting job ID into function

It might come in handy at some point and makes the code a bit easier
to read.

Reviewed-by: iustinp

5d414478 07/25/2008 03:32 pm Oleksiy Mishchenko

Convert set to a list in LUGetTags

The set triggers exception on a list-tags command and RAPI calls for tags
since it is not serializable by JSON.

Reviewed-by: iustinp
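
The underlying behaviour is easy to reproduce with the standard `json` module (the serializer Ganeti used at the time may differ; the tag values are made up):

```python
import json

tags = {"web", "production"}

try:
    json.dumps(tags)
except TypeError as err:
    # Sets have no JSON representation.
    print("set is rejected:", err)

# Converting to a list first works; sorting makes the output stable.
print(json.dumps(sorted(tags)))  # prints ["production", "web"]
```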

a0638838 07/24/2008 07:34 pm Oleksiy Mishchenko

Switch RAPI to ganeti.http module

Reviewed-by: imsnah

c609f802 07/24/2008 02:32 pm Michael Hanselmann

Implement job archiving on the server side

So far no error reporting to the client is done. Clients don't get
notified if a job doesn't exist or couldn't be archived because of
its current status.

The internal cache is always cleaned when the preconditions didn't...

0cb94105 07/24/2008 02:32 pm Michael Hanselmann

Add directory for archived jobs

Reviewed-by: iustinp

ce594241 07/23/2008 07:56 pm Michael Hanselmann

Move code formatting job ID into a base class

A later patch will add a memory based job storage class, hence this
code is going into a separate class. It also changes the number format
to always use at least 10 digits, allowing up to 9'999'999'999 jobs to...
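
A minimal sketch of a formatter that pads to at least 10 digits; `format_job_id` is a hypothetical name, not the method in the base class.

```python
def format_job_id(job_id):
    """Format a job ID as a decimal string with at least 10 digits."""
    if job_id < 0:
        raise ValueError("job ID must not be negative")
    return "%010d" % job_id


print(format_job_id(42))          # prints 0000000042
print(format_job_id(9999999999))  # prints 9999999999
```

One general property of zero-padded IDs is that sorting them as strings matches their numeric order, as long as they stay within the padded width.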

b330ac0b 07/23/2008 05:23 pm Guido Trotter

Add utils.{Write,Remove}PidFile

WritePidFile is a helper function that writes the current pid in a
pidfile within the ganeti run directory. RemovePidFile tries to delete
it.

Reviewed-by: iustinp

fee80e90 07/23/2008 05:23 pm Guido Trotter

Add utils.IsPidFileAlive function

This helper function reads a pid from a file containing it and checks
whether it refers to a live process.

Reviewed-by: iustinp

04e1bfaf 07/23/2008 05:23 pm Guido Trotter

Invert nodes/instances locking order

An implementation mistake from the original design caused nodes to be
locked before instances, rather than after. This patch inverts the level
numbering, changing also the relevant unittests and the recursive
locking function starting point....

51ee2f49 07/23/2008 05:16 pm Oleksiy Mishchenko

Generalization of bulk output mapping

Reviewed-by: iustinp

21cc1fbd 07/23/2008 04:30 pm Michael Hanselmann

Rename JobStorage to DiskJobStorage

Reviewed-by: iustinp