root / lib @ d65e5776

# Date Author Comment
d65e5776 09/10/2008 02:02 pm Iustin Pop

Add a way to export all node information at once

The patch adds a new function to export all node information at once
(i.e. atomically with respect to the configuration lock).

Reviewed-by: ultrotter

1bc59f76 09/09/2008 03:47 pm Michael Hanselmann

Never remove job queue lock in node daemon

Otherwise, corruption could occur in some corner cases. E.g. when
LeaveNode is running in a child and is in the process of removing
queue files, the main process gets killed, started again and gets
a request to update the queue. This is a rather extreme corner case,...

4e071d3b 09/09/2008 03:24 pm Iustin Pop

Export backend.GetMasterInfo over the rpc layer

We create a multi-node call so that querying all nodes for agreement
will be fast.

Reviewed-by: imsnah

bd1e4562 09/09/2008 03:24 pm Iustin Pop

Change backend._GetMasterInfo to return more data

The _GetMasterInfo() function needs to export the master name too to be
useful in master safety checks. This patch makes it a public (no _)
function and adds a third element in the return tuple. Its callers are...

a987fa48 09/09/2008 01:42 pm Guido Trotter

Parallelize LUQueryInstanceData

Reviewed-by: iustinp

d4b9d97f 09/09/2008 01:42 pm Guido Trotter

Parallelize LUVerify{Cluster,Disks}

These are two easy querying LUs which require shared access to all
nodes/instances.

Reviewed-by: iustinp

efd990e4 09/09/2008 01:41 pm Guido Trotter

Parallelize LUReplaceDisks

This is the most complex parallelization so far. We have to lock one
instance (and its nodes) plus one more node if doing a remote replace,
or all nodes if doing a remote replace with iallocator.

Reviewed-by: iustinp

9513b6ab 09/09/2008 01:41 pm Guido Trotter

_LockInstancesNodes: support append mode

This will be used to lock the instance's nodes in addition to some more.

Reviewed-by: iustinp

b2751b57 09/09/2008 01:41 pm Guido Trotter

Processor: remove ChainOpCode

This function was incompatible with the new locking system, and its
usage has been removed from the code. For now LUs share code by calling
common module-private functions in cmdlib.py, in the future they will
use tasklets (when those will be implemented)....

f22a8ba3 09/09/2008 01:41 pm Guido Trotter

Parallelize LU{A,Dea}ctivateInstanceDisks

Now that they are not used in other opcodes by chaining,
this can easily be done.

Reviewed-by: iustinp

023e3296 09/09/2008 01:40 pm Guido Trotter

LUReplaceDisks: remove use of ChainOpCode

The calls to OpActivateInstanceDisks and OpDeactivateInstanceDisks have
been replaced by _StartInstanceDisks and _SafeShutdownInstanceDisks
respectively. This is the last usage of ChainOpCode.

Reviewed-by: iustinp

155d6c75 09/09/2008 01:40 pm Guido Trotter

Create new _SafeShutdownInstanceDisks function

This new function checks whether an instance is running, before shutting
down its disks. This is what the Exec() of LUDeactivateInstanceDisks
did, so that is replaced by a call to this function.

Reviewed-by: iustinp

3a5d7305 09/09/2008 01:40 pm Guido Trotter

Fix a typo in LogicalUnit.ExpandNames docstring

s/locking.LEVEL_INSTANCES/locking.LEVEL_INSTANCE/

Reviewed-by: iustinp

f6d9a522 09/09/2008 01:40 pm Guido Trotter

Use constants.LOCKS_REPLACE instead of hardcoding

This constant replaces what we used to write in recalculate_locks, and
represents the lock recalculation mode. It lives in constants.py because
it's used only in cmdlib, and thus doesn't deal with the locking library...

de8c7666 09/09/2008 12:39 pm Guido Trotter

Fix LUReplaceDisks with iallocator

self._RunAllocator() sets self.op.remote_node, but doesn't return the
new remote node. If we set it to the return value of the function we
basically reset it to None, and iallocator is never run.

Reviewed-by: imsnah
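
The bug described above can be illustrated with a minimal sketch. The class, the method names and the node name below are hypothetical stand-ins, not the code in cmdlib.py; the only point is that assigning the result of a method that returns nothing overwrites the value the method has just set.

```python
class Op:
    """Hypothetical stand-in for an opcode with a remote_node field."""
    remote_node = None


class ReplaceDisksSketch:
    """Illustrative only; not the real LUReplaceDisks."""

    def __init__(self):
        self.op = Op()

    def _run_allocator(self):
        # Sets the attribute as a side effect and implicitly returns None.
        self.op.remote_node = "node3.example.com"

    def check_buggy(self):
        # The return value is None, so this undoes the allocator's choice.
        self.op.remote_node = self._run_allocator()

    def check_fixed(self):
        # Call only for the side effect and leave the attribute alone.
        self._run_allocator()
```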

86de84dd 09/08/2008 06:54 pm Guido Trotter

Fix LUGrowDisk

The rpc library returns a list, not a tuple, so we'll accept both.

Reviewed-by: iustinp

43f5ea7a 09/08/2008 06:53 pm Guido Trotter

Fix iallocator run

The rpc library returns a list, not a tuple, so we'll accept both.

Reviewed-by: iustinp
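
A minimal sketch of the kind of check this fix and the LUGrowDisk fix describe. The helper name and the two-element (status, payload) shape are assumptions made for illustration; the messages only say that both lists and tuples are accepted.

```python
def check_rpc_pair(result):
    """Return (status, payload) if result is a 2-element list or tuple.

    Hypothetical helper: the name and the (status, payload) shape are
    assumptions, not taken from the patch.
    """
    # Accept both types, since a serializing RPC layer may hand back a
    # list where the remote function returned a tuple.
    if not isinstance(result, (list, tuple)) or len(result) != 2:
        raise ValueError("unexpected RPC result: %r" % (result,))
    status, payload = result
    return status, payload
```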

6657590e 09/08/2008 04:44 pm Guido Trotter

Parallelize LUExportInstance

Unfortunately, for the first version we need to lock all nodes. The patch
discusses why this is and ways to improve it in the future.

Reviewed-by: iustinp

31e63dbf 09/08/2008 04:44 pm Guido Trotter

Parallelize LUGrowDisk

Reviewed-by: iustinp

849da276 09/08/2008 04:43 pm Guido Trotter

LURebootInstance: lock only primary when possible

When rebooting an instance without changing its disks' status (all
the cases except a "full" reboot) we can lock just its primary node.

Reviewed-by: iustinp

a82ce292 09/08/2008 04:43 pm Guido Trotter

Add primary_only flag to _LockInstancesNodes

As the name says, when the flag is on (the default is off) only the
primary nodes are locked, as opposed to all of them.

Reviewed-by: iustinp
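
The effect of such a flag can be sketched as follows. This is not the real _LockInstancesNodes; the function name and the dict layout for instances (primary_node, secondary_nodes) are assumptions made for illustration.

```python
def nodes_to_lock(instances, primary_only=False):
    """Compute the node names to lock for a list of instances.

    Hypothetical helper: each instance is assumed to be a dict with a
    "primary_node" string and a "secondary_nodes" list.
    """
    wanted = set()
    for inst in instances:
        wanted.add(inst["primary_node"])
        if not primary_only:
            # Default behaviour: lock secondaries as well.
            wanted.update(inst["secondary_nodes"])
    # Sorted so that locks are always requested in a stable order.
    return sorted(wanted)
```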

aa74b828 09/05/2008 06:38 pm Michael Hanselmann

utils.FileLock: Implement timeout

The timeout can be used in ganeti-noded to be more robust against
deadlocks.

Reviewed-by: iustinp
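
One common way to add a timeout to a file lock is to poll a non-blocking flock() until a deadline passes. The sketch below shows that technique only; the function name, the polling interval and the use of flock are assumptions and may differ from what utils.FileLock does.

```python
import errno
import fcntl
import time


def lock_with_timeout(fd, timeout, poll_interval=0.05):
    """Try to take an exclusive lock on fd; give up after timeout seconds.

    Hypothetical helper, returns True on success and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes flock fail immediately instead of blocking.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError as err:
            # Only "lock is held by someone else" is expected here.
            if err.errno not in (errno.EAGAIN, errno.EACCES):
                raise
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller that gets False back can log the failure and abort the request instead of waiting forever.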

e310b019 09/05/2008 02:00 pm Guido Trotter

Add locking.ALL_SET constant and use it

Rather than specifying None in needed_locks every time, with a nice
comment saying to read what we mean rather than what we write, and that
None actually means All, in our magic world, we'll hide this secret
under the ALL_SET constant in the locking module, which has value, you...

45bc5e4a 09/05/2008 01:57 pm Michael Hanselmann

utils.SplitTime: More rounding fixes

SplitTime didn't round the same on different platforms. This patch changes
it to use microseconds and not care about rounding.

Reviewed-by: iustinp
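
A sketch of splitting a timestamp into whole seconds and microseconds with integer arithmetic, which is one way to avoid platform-dependent rounding. The lower-case function names are illustrative, and this is not claimed to be the code in utils.py.

```python
def split_time(value):
    """Split a float timestamp into (seconds, microseconds)."""
    # Truncate to whole microseconds once, then split with integer
    # arithmetic so the microsecond part can never reach 1000000.
    seconds, microseconds = divmod(int(value * 1000000), 1000000)
    return seconds, microseconds


def merge_time(parts):
    """Merge (seconds, microseconds) back into a float timestamp."""
    seconds, microseconds = parts
    return float(seconds) + microseconds / 1000000.0
```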

ea47808a 09/04/2008 06:12 pm Guido Trotter

Prevent mistakes using _GetWantedNodes

All the users of _GetWantedNodes have been converted to be concurrent
LUs, and thus cannot call this function with an empty list of nodes
anymore. This patch makes this restriction a part of the function
itself. This prevents mistakes in new concurrent LUs, and creates more...

21a15682 09/04/2008 06:12 pm Guido Trotter

Parallelize LUQueryNodeVolumes and LUQueryExports

Reviewed-by: iustinp

6bf01bbb 09/04/2008 06:12 pm Guido Trotter

Parallelize LUDiagnoseOS

Reviewed-by: iustinp

895ecd9c 09/04/2008 06:12 pm Guido Trotter

LUQueryExports: make 'node' field mandatory

It turns out this field was already mandatory. If it hadn't been valid,
in fact, a value of None would have been passed to _GetWantedNodes which
would have thrown an exception.

Reviewed-by: iustinp

204f2086 09/04/2008 06:11 pm Guido Trotter

s/Chain(OpQueryExports)/rpc.call_export_list(...)/

Parallel opcodes are not (yet?) supported for chaining. Turns out
though that chaining is used only four times in the code, and twice it's
for querying exports. But what's the need to chain the full opcode, when...

b91a34a5 09/04/2008 06:11 pm Guido Trotter

Fix wrong indentation in LUQueryNodes

Reviewed-by: iustinp

d0c11cf7 09/04/2008 05:53 pm Alexander Schreiber

Merge r1607 from branches/ganeti/ganeti-1.2

Use a default vnc_bind_address if None is specified

Reviewed-by: iustinp

3fb1e1c5 09/02/2008 03:57 pm Alexander Schreiber

merge r1568 from branches/ganeti/ganeti-1.2

Add more fields to gnt-instance list

Reviewed-by: imsnah

6291574d 09/02/2008 03:15 pm Alexander Schreiber

merge r1548 from branches/ganeti/ganeti-1.2

Fix wrong wording of instance rename error message.

Reviewed-by: imsnah

a4273aba 09/02/2008 12:09 pm Alexander Schreiber

merge r1541 from branches/ganeti/ganeti-1.2

more information for VNC console port

Reviewed-by: ultrotter

04c4330c 09/02/2008 11:42 am Alexander Schreiber

merge r1540 from branches/ganeti/ganeti-1.2

Allow access to HVM serial console

Reviewed-by: imsnah

34b6ab97 09/01/2008 07:05 pm Alexander Schreiber

merge r1539 from branches/ganeti/ganeti-1.2

Display VNC console port in gnt-instance info.

Reviewed-by: iustinp

5bc84f33 09/01/2008 05:12 pm Alexander Schreiber

merge r1538 from branches/ganeti/ganeti-1.2

Check HVM device type on instance modify as well.

Reviewed-by: imsnah

cfefe007 09/01/2008 02:37 pm Guido Trotter

Check memory size before setting it

With this change when a user asks for a new memory size for an instance,
the number is checked instead of just applied. The operation fails only
if the instance would not be able to restart on its primary node, but
generates warnings should it be impossible to failover the instance or...

4300c4b6 09/01/2008 02:37 pm Guido Trotter

Pass the force param to SetInstanceParms

It was already allowed in gnt-instance modify, but ignored.
It will be used to force skipping parameter checks.

This is a forward-port from branches/ganeti-1.2

Original-Reviewed-by: imsnah
Reviewed-by: iustinp

5397e0b7 08/29/2008 07:17 pm Alexander Schreiber

Merge r1536 from branches/ganeti/ganeti-1.2

Add HVM device type flags 2/3

Reviewed-by: ultrotter

b77ba978 08/29/2008 06:04 pm Michael Hanselmann

utils.SplitTime: Fix rounding of milliseconds

Reported by Iustin.

It used to return this:

utils.SplitTime(1234.999999999999)

(1234, 1000)

while it should've returned this:

utils.SplitTime(1234.999999999999)

(1235, 0)

Reviewed-by: ultrotter

b894f5a8 08/29/2008 06:01 pm Alexander Schreiber

merge r1535 from branches/ganeti/ganeti-1.2

Add HVM device type flags 1/4

Reviewed-by: ultrotter

5c735209 08/29/2008 04:42 pm Iustin Pop

Make WaitForJobChanges deal with long jobs

This patch alters the WaitForJobChanges luxi-RPC call to have a
configurable timeout, so that the call behaves nicely with long jobs
that have no update.

We do this by adding a timeout parameter in the RPC call, and returning...

3fc175f0 08/29/2008 03:47 pm Alexander Schreiber

merge r997 from branches/ganeti/ganeti-1.2

Fix gnt-instance modify for HVM parameters

This patch makes gnt-instance modify work again for the advanced
HVM parameters after it was broken by other changes.

Reviewed-by: ultrotter

082c5adb 08/28/2008 06:35 pm Michael Hanselmann

Fix error message when masterd is not listening

Reported by Iustin.

Reviewed-by: iustinp

6683bba2 08/28/2008 01:29 pm Guido Trotter

Fix issue when acquiring empty lock sets

By design if an empty list of locks is acquired from a set, no locks are
acquired, and thus release() cannot be called on the set. On the other
hand if None is passed instead of the list, the whole set is acquired,...

5685c1a5 08/27/2008 05:52 pm Michael Hanselmann

jqueue: Replace normal cache dict with weakref dict

A job should only exist once in memory. After the cache is cleaned,
there can still be references to a job somewhere else. If there
are multiple instances, one can get updated while a function is
waiting for changes on another instance. By using...
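
The idea of keeping at most one in-memory object per job can be sketched with weakref.WeakValueDictionary from the standard library. The Job and JobCache classes and the loader callback are hypothetical; the real jqueue code is not shown in this log.

```python
import weakref


class Job:
    """Hypothetical minimal job object."""

    def __init__(self, job_id):
        self.id = job_id


class JobCache:
    """Cache that never keeps a job alive on its own."""

    def __init__(self, loader):
        self._loader = loader
        # Entries vanish automatically once no one else references the job.
        self._jobs = weakref.WeakValueDictionary()

    def get(self, job_id):
        job = self._jobs.get(job_id)
        if job is None:
            # Not in memory anywhere, so it is safe to load a new object.
            job = self._loader(job_id)
            self._jobs[job_id] = job
        return job
```

As long as any caller still holds a job, get() returns that same object, so two parts of the program cannot end up watching different copies.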

70552c46 08/27/2008 05:52 pm Michael Hanselmann

jqueue: Keep timestamp of opcode start and end

Reviewed-by: ultrotter

65548ed5 08/27/2008 05:48 pm Michael Hanselmann

jqueue: Reset run_op_idx after job is done

It can be confusing otherwise.

Reviewed-by: ultrotter

6abe9194 08/27/2008 12:55 pm Iustin Pop

Fix a small typo in a constant

Seems no one ran a burnin lately :)

Reviewed-by: amischenko,ultrotter

6c5a7090 08/27/2008 11:34 am Michael Hanselmann

Make sure that client programs get all messages

This is a large patch, but I can't figure out how to split it without
breaking stuff. The old way of getting messages by always getting the
last one didn't bring all messages to the client if they were added...

e67bd559 08/26/2008 06:44 pm Michael Hanselmann

Add simple lock debug output

Currently it can only be enabled by modifying utils.py, but we can
add a command line parameter later if needed.

Reviewed-by: schreiberal

35705d8f 08/18/2008 03:51 pm Guido Trotter

Parallelize LUQueryNodes

As for LUQueryInstances the first version just acquires a shared lock on all
nodes. In the future further optimizations are possible, as outlined by
comments in the code.

Reviewed-by: imsnah

7eb9d8f7 08/18/2008 03:51 pm Guido Trotter

Parallelize LUQueryInstances

This first version acquires a shared lock on all requested instances and
their nodes. In the future it can be improved by acquiring fewer locks if
no dynamic fields have been asked, and/or by locking just primary nodes.

Reviewed-by: imsnah

34ca3914 08/18/2008 03:50 pm Guido Trotter

LockSet: allow lists with duplicate values

If a list with a duplicate value is passed to a lockset what the code
now does is to try to acquire the lock twice, generating a
double-acquire exception in the SharedLock code. This is definitely an
issue. In order to solve it we can either forbid double values in a list...
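
The message is truncated before it names the chosen solution. As an illustration of the option implied by the title (allowing duplicates), a lock set could normalize the requested names before acquiring anything. The helper below is a hypothetical sketch, not the LockSet code.

```python
def normalize_lock_names(names):
    """Drop duplicate names and return them in a stable order.

    Hypothetical helper: removing duplicates means each lock is acquired
    once, and sorting gives every caller the same acquisition order.
    """
    return sorted(set(names))
```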

8a2941c4 08/18/2008 03:49 pm Guido Trotter

Processor: lock all levels even if one is missing

If a locking level wasn't specified locking used to stop. This means
that if one, for example, didn't specify anything at the LEVEL_INSTANCE
level, no locks at the LEVEL_NODE level were acquired either. With this...

0fcc5db3 08/18/2008 03:44 pm Guido Trotter

LURebootInstance: move arg check in ExpandNames

The check for the reboot type can be done without any locks held, so
we'll move it to ExpandNames. Plus, we note in a FIXME that if the
reboot type is not full, we can probably just lock the primary node, and...

34290825 08/18/2008 02:37 pm Michael Hanselmann

LUVerifyCluster: Return boolean indicating success

Reviewed-by: schreiberal

9894ece7 08/18/2008 02:12 pm Michael Hanselmann

Use Linux-specific way to name master socket

By using this Linux-specific way we don't have to care about removing the
socket file when quitting or starting (after an unclean shutdown). For a
more detailed description, see the comment in the patch.

Reviewed-by: schreiberal
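
Assuming the Linux-specific naming refers to the abstract socket namespace, the sketch below shows why no file has to be removed: an address starting with a NUL byte creates no filesystem entry and disappears when the socket is closed. The function name and socket name are illustrative.

```python
import socket


def bind_abstract_socket(name):
    """Bind a listening Unix socket in the Linux abstract namespace.

    Hypothetical helper; works on Linux only.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # The leading NUL byte selects the abstract namespace.
    sock.bind("\0" + name)
    sock.listen(1)
    return sock
```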

c4b6c29c 08/15/2008 11:55 am Michael Hanselmann

gnt-node: Add option to always accept peer's SSH key

This option will be used to add nodes to the cluster without
asking the user to confirm the key. Together with key based
authentication this can be used in the QA tests.

Reviewed-by: ultrotter

652d6694 08/15/2008 11:47 am Michael Hanselmann

SshRunner: Add parameter to always accept peer's SSH key

This will be used to add nodes without user interaction, specifically
in QA tests.

Reviewed-by: ultrotter

f6d9f4c3 08/15/2008 11:44 am Michael Hanselmann

Move SSH option building into a function

I'm going to add another option and it would make maintaining
them in constants even more complicated.

Reviewed-by: ultrotter

54ab6aec 08/15/2008 11:44 am Michael Hanselmann

SshRunner.Run: Pass all arguments to BuildCmd

This patch changes SshRunner.Run to pass all arguments to
SshRunner.BuildCmd. They had the same arguments before
and should stay that way. This change makes it easier
to add new or change existing arguments.
...

4f0afaf5 08/14/2008 01:27 pm Guido Trotter

Pass hypervisor type to the OS scripts

It's handy to make the os scripts know which hypervisor the instance is
going to run under. In order not to change the os API we pass this
information in the environment, where the os scripts can access it if
they're hypervisor-aware....

2557ff82 08/14/2008 01:26 pm Guido Trotter

RunCmd: add optional environment overriding

If the user passes an env dict to RunCmd we'll override the environment
passed to the to-be-executed command with the values in the dict. This
allows us to pass arbitrary environment values to commands we run.
...
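
A sketch of environment overriding with the standard subprocess module. It assumes the values in the dict are layered on top of the current environment; the function name, its return value and that merging behaviour are assumptions, not a description of utils.RunCmd.

```python
import os
import subprocess


def run_cmd(cmd, env=None):
    """Run cmd (a list of arguments) and return (exit code, stdout, stderr).

    Hypothetical helper: entries in env override or extend the
    environment inherited from the current process.
    """
    cmd_env = os.environ.copy()
    if env is not None:
        cmd_env.update(env)
    proc = subprocess.run(cmd, env=cmd_env,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.returncode, proc.stdout, proc.stderr
```

A caller could pass something like env={"HYPERVISOR": "kvm"} to make a value visible to a script; the variable name here is only an example.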

d47d3d38 08/13/2008 07:41 pm Guido Trotter

KVM Hypervisor Cleanup

- Remove a few experimental code lines left as comments
- Rework first disks' boot=on addition, which was calculated twice
- Remove an empty line
- Remove reference to hvm_pae which doesn't apply to kvm

Reviewed-by: imsnah

eb58f9b1 08/13/2008 05:25 pm Guido Trotter

Add KVM hypervisor code

ht_kvm.py contains the code for ganeti to work under kvm.
This patch also modifies Makefile.am to ship that file, and
lib/hypervisor/__init__.py to import it, and add kvm to the
hypervisors map.

Reviewed-by: imsnah

550e49b9 08/13/2008 05:25 pm Guido Trotter

constants: add HT_KVM

Add a new hypervisor type, HT_KVM, to constants, and register it in the
HYPER_TYPES set.

Reviewed-by: imsnah

7e2c5b9e 08/13/2008 05:24 pm Guido Trotter

Add --with-kvm-path configure option

This allows configuring a different path to the kvm binary. By default
/usr/bin/kvm is used, which is the one found in debian and ubuntu.

Reviewed-by: imsnah

a5f723a2 08/13/2008 05:24 pm Guido Trotter

FakeHypervisor: fix a function signature

StartInstance takes 'block_devices', not 'force' as its third argument.
Even if this is not used in the fake hypervisor it's better to have the
correct argument name to avoid confusion.

Reviewed-by: imsnah

e326d4e5 08/13/2008 05:23 pm Guido Trotter

Convert RunCmd to an epydoc docstring

Reviewed-by: imsnah

51144e33 08/13/2008 03:55 pm Michael Hanselmann

Fix adding pristine nodes

If a node hasn't been part of the cluster before being added it'll not
have the cluster's SSH key. This patch makes sure to accept those by
not aliasing the machine name to the cluster name.

Reviewed-by: ultrotter

f56377a3 08/12/2008 08:00 pm Michael Hanselmann

Fix race locking issue in noded

Noded didn't release the job queue lock after initialising it. This
patch makes sure to unlock once the work is done.

Reviewed-by: ultrotter

853e7f3d 08/11/2008 07:28 pm Michael Hanselmann

cli: Use new RPC call instead of polling

This means commands will not take at least one second anymore.

Reviewed-by: ultrotter

dfe57c22 08/11/2008 07:27 pm Michael Hanselmann

Add RPC call to wait for job changes

This way clients can react faster to status or message changes and
don't have to poll anymore.

Reviewed-by: ultrotter

d5e317ba 08/11/2008 07:27 pm Michael Hanselmann

jqueue: Change log message time format

See the comment in the patch.

Reviewed-by: ultrotter

739be818 08/11/2008 07:26 pm Michael Hanselmann

Add functions to split time into tuple and merge it back

These will be used for job logs.

Reviewed-by: ultrotter

32f93223 08/08/2008 02:29 pm Michael Hanselmann

Add query function for exports

Reviewed-by: iustinp

24fc781f 08/08/2008 02:23 pm Michael Hanselmann

Don't always remove queue lock when queue is purged

The lock should only be removed if ganeti-noded is going to quit.
Otherwise it needs to be kept to prevent another process from creating
it again while we're still holding the (removed) lock. This is due to...

76ab5558 08/08/2008 02:22 pm Michael Hanselmann

backend: Add optional exclusion list to _CleanDirectory

The code cleaning the queue will make use of it.

Reviewed-by: iustinp

abc1f2ce 08/08/2008 02:21 pm Michael Hanselmann

jqueue: Move archived jobs on all nodes

Otherwise one might have archived jobs back in the list after a master
failover.

Reviewed-by: iustinp

af5ebcb1 08/08/2008 02:21 pm Michael Hanselmann

noded: Add RPC function to rename job queue files

This will be used to archive jobs.

Reviewed-by: iustinp

dc31eae3 08/08/2008 02:21 pm Michael Hanselmann

backend: Add function to check whether file is in queue dir

Another function will need to check whether its parameters
are job queue files.

Reviewed-by: iustinp

0a7bed64 08/08/2008 02:19 pm Michael Hanselmann

Two small style fixes

Reviewed-by: iustinp

5d6fb8eb 08/08/2008 01:03 pm Michael Hanselmann

jstore: Change to not always require a lock

This way we can do locking when both noded and masterd are running
on the same machine, the latter holding an exclusive lock on the
queue.

Reviewed-by: iustinp

aa65ed72 08/08/2008 01:02 pm Michael Hanselmann

Log only unexpected errors in utils.FileLock

Otherwise users might be confused by errors in log files.

Reviewed-by: iustinp

553f1c1d 08/08/2008 01:02 pm Michael Hanselmann

Disallow uploading job queue files through upload_file

The job queue is now updated through its own RPC functions.

Reviewed-by: iustinp

9f774ee8 08/08/2008 01:01 pm Michael Hanselmann

jqueue: Use new job queue RPC functions

Reviewed-by: iustinp

ca52cdeb 08/08/2008 01:01 pm Michael Hanselmann

Add job queue RPC functions

jobqueue_update: Uploads a job queue file's content to a node. The
most common operation is to upload something that we already have
in a string. Unlike in the upload_file function, the file is not
read again when distributing changes, but content has to be passed...

3956cee1 08/08/2008 01:00 pm Michael Hanselmann

Move function cleaning directory to module level

JobQueuePurge() will be used by an RPC function.

Reviewed-by: iustinp

281606c1 08/07/2008 12:07 pm Michael Hanselmann

Fix cli.PollJob

feedback_fn wasn't passed to it.

Reviewed-by: iustinp

d8470559 08/06/2008 05:56 pm Michael Hanselmann

Implement {Add,Readd,Remove}Node in GanetiContext

By doing this we've a central place which coordinates what needs to be
done when adding or removing nodes. Another patch will add calls into
the job queue.

Two log messages move to config.py.

When removing a node, node_leave_cluster is now called after it has...

d2e03a33 08/06/2008 04:36 pm Michael Hanselmann

jqueue: Implement {Add,Remove}Node

These functions will be used to notify the queue about newly added
or removed nodes.

Reviewed-by: iustinp

4c848b18 08/06/2008 04:35 pm Michael Hanselmann

jqueue: Don't pass the list of nodes to SubmitJob anymore

The job queue now maintains its own list and is updated when
nodes are added or removed from the cluster.

Reviewed-by: iustinp

8e00939c 08/06/2008 04:35 pm Michael Hanselmann

Maintain node list in job queue

The code makes sure not to include the master in the list.

Reviewed-by: iustinp

f78346f5 08/06/2008 02:27 pm Michael Hanselmann

Clean job queue directories when leaving cluster

Old job files shouldn't be left on nodes removed from a cluster.

Reviewed-by: iustinp

02f7fe54 08/06/2008 11:26 am Michael Hanselmann

Implement query for nodes

Reviewed-by: iustinp

ee6c7b94 08/06/2008 11:25 am Michael Hanselmann

Implement query for instances

Queries don't create jobs and are more efficient. Log messages
are not yet stored anywhere.

Reviewed-by: iustinp

23752136 08/05/2008 01:33 pm Michael Hanselmann

jqueue: Replicate jobs to all nodes

Newly added nodes are not yet taken care of. Queue locking on
non-master nodes is not yet correct.

Reviewed-by: iustinp

04ab05ce 08/04/2008 03:27 pm Michael Hanselmann

jqueue: Use new jstore module

Reviewed-by: iustinp