Statistics
| Branch: | Tag: | Revision:

root / daemons @ 07813a9e

# Date Author Comment
07813a9e 02/24/2009 05:25 pm Iustin Pop

Remove the extra_args parameter in instance start

This patch removes the extra_args parameter and instead switches the
instance to the HV_KERNEL_ARGS hypervisor option.

This is a big change, but it's a needed cleanup, this extra parameter on
all RPC calls is not generic and we also need to have a persistent value...

3448aa22 02/16/2009 01:08 pm Iustin Pop

watcher: fix checking of boot IDs

The recent change (commit 2151) to the watcher to make it handle offline
nodes also saves the offline attribute to the state file, but this is
not needed and also breaks the checking of the boot ID. This patch
simply removes it, restoring the correct behaviour....

f07521e5 02/16/2009 01:08 pm Iustin Pop

watcher: autoarchive old jobs

This patch adds auto-archiving of jobs older than 6 hours to the
watcher.

Reviewed-by: imsnah

6e99c5a0 02/13/2009 05:54 pm Iustin Pop

RAPI: fixes related to write mode

This patch fixes many small issues related to write functions:
- update documentations w.r.t. how to add users
- update the instance add function for latest API
- add instance delete
- fix addition of tags
- update some error messages...

1f8588f6 02/13/2009 01:38 pm Iustin Pop

RAPI: format error messages as JSON

This patch changes the format of the HTTP error messages from text/html, which
is hard to parse from RAPI clients, to JSON which can be automatically parsed.

The error message is an object, which contains always three keys:...

77e1d753 02/13/2009 01:38 pm Iustin Pop

Make RAPI return 502/504 errors for luxi errors

This changes the RAPI error codes for luxi errors; a timeout error is
now reported properly as 504, while any other luxi error is reported as
502.

It would be good to convert even more errors into proper return codes in...

6bb258a7 02/13/2009 01:37 pm Iustin Pop

Fix ganeti-rapi startup with missing certificate

This patch displays a nicer error message compared to the default
stacktrace.

Reviewed-by: imsnah

5de4474d 02/12/2009 09:32 am Iustin Pop

master daemon: allow skipping the voting process

This patch introduces a 'force' mode for the master daemon startup where
the voting process is not done, but the user has to confirm manually the
startup (before forking, of course).

Reviewed-by: imsnah

1fae010f 02/12/2009 09:30 am Iustin Pop

Switch the instance_shutdown rpc to (status, data)

This patch changes the return type from this RPC call to include status
information and renames the backend method to match the RPC call name.

The patch is a little bigger than the reboot one, since this call is...

489fcbe9 02/12/2009 09:30 am Iustin Pop

Switch the instance_reboot rpc to (status, data)

This small patch changes the return type from this RPC call to include
status information and renames the backend method to match the RPC call
name.

Reviewed-by: ultrotter

f2ffd244 02/11/2009 12:20 pm Guido Trotter

ganeti-noded: Create LOCK_DIR if missing

We need this directory for locks, so if for any reason it's not there
we'll create it. The permissions are the standard /var/lock permissions.

Reviewed-by: iustinp

821d1bd1 02/09/2009 04:03 pm Iustin Pop

Uniformize some function names in backend.py

Currently, the names of the functions in backend.py that are actually
RPC procedures and are called from ganeti-noded are not corresponding to
the RPC names. This makes it hard to actually see which functions are...

23829f6f 02/09/2009 04:03 pm Iustin Pop

rpc.call_blockdev_find: convert to (status, data)

This patch converts the call_blockdev_find - which searches for block
devices and returns their status - to the (status, data) format. We also
modify the backend function name to match the rpc call.

Reviewed-by: ultrotter

1268d6fd 02/09/2009 12:31 pm Iustin Pop

Fix handling OS errors in AddOSToInstance

This patch fixes the error handling in the add OS to instance function
with regard to invalid OSes. Previously, we didn't handle any such
errors, with the end result that the user would have to look in the node
daemon log....

2ed6a7d6 02/05/2009 04:09 pm Iustin Pop

rapi: fix SSL mode and use SSL by default

This patch fixes the SSL mode (by actually constructing SSL parameters
from the command line options) and enables SSL by default; the old “-S”
option which enabled SSL is now changed to “--no-ssl”. The certificate...

85414b69 02/04/2009 05:11 pm Iustin Pop

rapi: fix authentication and queries

For queries, we don't want to require authentication. We fix this by adding an
override GetAuthRealm in the rapi daemon.

We also fix a method name.

Reviewed-by: imsnah

66baeccc 02/04/2009 05:11 pm Iustin Pop

Add one new luxi query: cluster info

This is the last query that RAPI executes via opcodes and is purely
static (config values only). As such, we can convert it safely to a
query instead of job.

Reviewed-by: imsnah

ec79568d 02/04/2009 12:30 pm Iustin Pop

Implement lockless query operations

This patch adds the framework for, and enables lockless OpQueryInstances. This
means that instances will be shown in ERROR_up or ERROR_down state, even though
this is not an error (but just an in-progress job).

The framework is implemented as follows:...

c979d253 01/21/2009 04:12 pm Iustin Pop

Fix some more pylint errors

Two are real errors (invalid names) and one is style error (overriding
name from outer scope).

Reviewed-by: ultrotter

6906a9d8 01/21/2009 11:54 am Guido Trotter

Add calls in the intra-node migration protocol

Currently the hypervisor is expected to do all the migration from the
source side. With this patch we also add the option of passing some
information to the target side, and starting some operation there.

As a bonus, a function to cleanup any started operation is included....

d21d09d6 01/20/2009 07:19 pm Iustin Pop

Update the logging output of job processing

(this is related to the master daemon log)

Currently it's not possible to follow (in the non-debug runs) the
logical execution thread of jobs. This is due to the fact that we don't
log the thread name (so we lose the association of log messages to jobs)...

6b93ec9d 01/13/2009 05:20 pm Iustin Pop

Forward-port DrbdNetReconfig

This is a modified forward-port of DrbdNetReconfig and their associated
RPCs. In Ganeti 2.0, these functions will be used for two things:
- live migration (as in 1.2)
- and for other network reconfiguration tasks, since DRBD8.Attach()...

4bffa7f7 01/13/2009 10:04 am Iustin Pop

Small typo in ganeti-watcher

Reviewed-by: imsnah

7d88772a 01/09/2009 02:52 pm Iustin Pop

Rework the daemonization sequence

The current fork+close fds sequence has deficiencies which are hard to
work around:
- logging can start logging before we fork (e.g. if we need to emit
messages related to master checking), and thus use FDs which we...

56e7640c 01/08/2009 04:16 pm Iustin Pop

Add an instance_migratable rpc call

This is a forward-port of commit 1194 on the 1.2 branch:

This call will check whether an instance is up on its primary, and that
it has been started with symlinks. We currently have no on-secondary
checks, nor any hypervisor specific call....
b2e7666a 01/07/2009 07:02 pm Iustin Pop

Pass instance name to rpc call blockdev_close

This is an extract of commit 1166 on the 1.2 branch (Add a rpc call for
drbd network reconfiguration), but only the blockdev_close part.

The patch changes the blockdev_close call to take the instance so that...

e09fdcfa 01/06/2009 11:57 am Iustin Pop

Fix some pylint-detected issues

Two bad indentation cases and a missing variable.

Reviewed-by: imsnah

b5b67ef9 12/19/2008 02:58 pm Michael Hanselmann

ganeti-rapi: Implement HTTP authentication

Passwords are stored in "$localstatedir/lib/ganeti/rapi_users". User
options specify the access permissions of a user (see docstring for
ganeti.http.ReadPasswordFile), for which only "write" is supported
to grant write access. Every other user has read-only access....

7e9760c3 12/19/2008 02:58 pm Michael Hanselmann

ganeti-rapi: Introduce per-request context

This will be used to evaluate access permissions to resources.

Reviewed-by: amishchenko

dd875d32 12/18/2008 06:39 pm Michael Hanselmann

Job queue: Allow more than one file rename per RPC call

Reviewed-by: ultrotter

f8ad5591 12/18/2008 06:38 pm Michael Hanselmann

Prevent RPC timeout on auto-archiving jobs

With a large job queue, auto-archiving jobs can take a very long time,
causing timeouts on the luxi RPC layer. With this change, auto-
archive returns after half of the RPC timeout has passed. The user
will see how many jobs are left unchecked....

c41eea6e 12/11/2008 07:13 pm Iustin Pop

Fix epydoc format warnings

This patch should fix all outstanding epydoc parsing errors; as such, we
switch epydoc into verbose mode so that any new errors will be visible.

Reviewed-by: imsnah

56aa9fd5 12/05/2008 05:01 am Iustin Pop

Cleanup the config file on demotion from candidate

This patch adds a simple rpc which makes a backup of the config file and
then removes it. This is done so that cluster verify doesn't complain
immediately after demoting a node.

Reviewed-by: imsnah

cbfc4681 12/05/2008 04:58 am Iustin Pop

watcher: handle offline nodes better

This patch changes the LUQueryInstances to show a different state for
offline nodes and also modifies the watcher to understand the offline
state in its checks.

Reviewed-by: ultrotter

bc2929fc 12/04/2008 05:24 pm Michael Hanselmann

ganeti-rapi: Convert to new HTTP server

Reviewed-by: amishchenko

19205c39 12/04/2008 05:23 pm Michael Hanselmann

ganeti-noded: Migrate to new HTTP server

Reviewed-by: amishchenko

84f2756e 12/04/2008 05:23 pm Michael Hanselmann

Rename all HTTP classes to camel case

It should be consistent.

Reviewed-by: amishchenko

bbe19c17 12/02/2008 07:07 am Iustin Pop

Fix master failover

The ssconf files were not updated by the master failover. We need to
push them, and since we already have RPC initialized, we can use the
standard ConfigWriter to do so - this will take care of both the config
file and the ssconf files....

1cb8d376 11/26/2008 06:49 pm Guido Trotter

ganeti-masterd: create RUN_GANETI_DIR as well

Since we're not sure ganeti-noded has started yet, we need to create
RUN_GANETI_DIR before SOCKET_DIR as well, with the proper permissions.

Reviewed-by: imsnah

817a030d 11/26/2008 06:49 pm Guido Trotter

convert run dir mode to constant

ganeti-noded used to create all directories under /var/run with an
hard-coded mode. convert it to a constant.

Reviewed-by: imsnah

227647ac 11/25/2008 07:11 pm Guido Trotter

Move the MASTER_SOCKET to SOCKET_DIR

Before it was in the abstract linux namespace, where unfortunately we
couldn't easily check from python the credentials of the connecting
clients. Now we also have to remove the file on exit and when starting.

Reviewed-by: imsnah

d823660a 11/25/2008 07:11 pm Guido Trotter

ganeti-masterd: create SOCKET_DIR

If SOCKET_DIR doesn't exist we create it in the master daemon, before
trying to put a socket inside it.

Reviewed-by: imsnah

03d1dba2 11/25/2008 02:37 pm Michael Hanselmann

Pass ssconf values from master to node

Instead of parsing the configuration on the node, we pass the ssconf
values from the master.

Reviewed-by: iustinp

eafd8762 11/21/2008 12:47 pm Michael Hanselmann

Use SSL for master/node RPC

This patch enables SSL between masterd and noded.

Reviewed-by: iustinp

ec17d09c 11/21/2008 12:46 pm Michael Hanselmann

Get rid of node daemon password

With the new SSL client certificate stuff it's no longer needed.

Reviewed-by: iustinp

15486fa7 11/21/2008 12:46 pm Michael Hanselmann

ganeti-masterd: Remove PID file at the end

Removing the PID file should be the last thing done. This patch makes
sure it's also removed when master.server_cleanup() throws an exception.

Also initialize logging only after writing the PID file.

Reviewed-by: iustinp

4331f6cd 11/21/2008 12:45 pm Michael Hanselmann

Reuse HTTP client pool for RPC

ganeti-masterd: Add initialization and shutdown of RPC pool. It needs
to be shutdown before forking.

ganeti.cli: Add decorator function to initialize and shutdown RPC pool.

ganeti.rpc: Add functions to initialize and shutdown RPC pool. Throw...

6ddc95ec 11/21/2008 12:42 pm Michael Hanselmann

Add RPC call to update ssconf files

Reviewed-by: iustinp

8adbffaa 11/11/2008 12:58 pm Iustin Pop

Abstract runtime creation of dirs into a function

Currently the dir creation in ganeti-noded is in the main function. This
is not nice: we move it into a separate function and also add creation
of the OS_LOG_DIR (with different permissions, but in the same way)....

23e46494 10/24/2008 02:31 pm Michael Hanselmann

Document HttpServer.__init__

At the same time, simplify the interface a bit by not using a tuple.

Reviewed-by: killerfoxi, ultrotter

74c47259 10/23/2008 05:19 pm Iustin Pop

Export the disk index in the import/export scripts

We want to export the disk index as some OSes will only want to export
the first disk (or the second one, etc.), even if we have multiple
disks.

The patch also updates the backend.ExportSnapshot docstring....

6c0af70e 10/22/2008 05:09 pm Guido Trotter

Convert ImportOSIntoInstance to OS API 10

- Change ImportOSIntoInstance not to get any "os_disk" and "swap_disk"
arguments but to accept multiple target images to import, and to
return a list of booleans with the result of each import
- Change the relevant rpc call and the only caller to conform...

7a8f64da 10/21/2008 11:32 pm Oleksiy Mishchenko

Pass request headers in to RAPI handlers.

Reviewed-by: iustinp

99aabbed 10/20/2008 05:47 pm Iustin Pop

Convert the job queue rpcs to address-based

The two main multi-node job queue RPC calls (jobqueue_update,
jobqueue_rename) are converted to address-based calls, in order to speed
up queue changes. For this, we need to change the _nodes attribute on
the jobqueue to be a dict {name: ip}, instead of a set....

82d9caef 10/20/2008 03:50 pm Iustin Pop

Remove the logger.py module

Since now we use only one function from the logger module
(SetupLogging), we move it to utils.py (which is already imported by all
users of this function), and we remove the module.

Reviewed-by: imsnah

d15a9ad3 10/17/2008 05:37 pm Guido Trotter

Cleanup os_add/rename rpc for OS API 10

- remove now unused osdev and swapdev arguments from backend, noded,
rpc, cmdlib
- convert docstrings to epydoc

Reviewed-by: iustinp

713faea6 10/17/2008 04:06 pm Oleksiy Mishchenko

ETag passing support.

Reviewed-by: imsnah

16a8967d 10/16/2008 07:54 pm Michael Hanselmann

rapi: Convert to new HTTP server class

Requests are no longer logged to a separate file.

Reviewed-by: amishchenko

d7cdb55d 10/16/2008 02:36 pm Iustin Pop

Improvements to the master startup checks

In order to account for future improvements to master failover, we move
the actual data gathering capabilities from ganeti-masterd into
bootstrap.py, and we leave only the verification into masterd.

The verification procedure is then changed to retry multiple times (up...

3ccafd0e 10/16/2008 11:37 am Iustin Pop

Add an interface for the drain flag changes/query

This adds the set/reset in the jqueue and luxi modules, and a way to
query it in OpQueryConfigValues, and also the comand line interface for
it:
$ gnt-cluster queue info
The drain flag is unset
$ gnt-cluster queue drain...

5d672980 10/15/2008 04:13 pm Iustin Pop

Add a rpc call for changing the drain flag

A new multi-node call is added that sets/resets the drain flag.

Reviewed-by: imsnah

6797ec29 10/15/2008 01:51 pm Iustin Pop

Implement transport of ganeti errors across luxi

This patch adds a generic method to identify the ganeti error given its
class name, and implements this across the luxi protocol.

Reviewed-by: imsnah

a2f92677 10/15/2008 11:22 am Michael Hanselmann

rapi: Whitespace fixes

Reviewed-by: ultrotter

6217e295 10/14/2008 01:19 pm Iustin Pop

Export the hypervisor.ValidateParameters over RPC

The newly-added node-specific ValidateParams hypervisor method is
exported over RPC, using the semi-standard (success, message) return
value. Multi-node call, so that we call on both primary and secondary at...

16ad1a83 10/13/2008 04:49 pm Iustin Pop

Fix a few rpc-related errors

This fixes:
- whitespace change, double lines between methods
- duplication of call_upload_file, introduced by mistake in rev 1795
and which went undetected because of the many changes in that ref
(only diff -b shows it clearly)...

caad16e2 10/12/2008 11:40 pm Iustin Pop

Abstract checking own address into a function

Currently, we check if we have a given ip address (i.e. it's alive on
one of our interfaces) but manually calling TcpPing(source=localhost).
This works, but having it spread all over the code makes it hard to...

cc28af80 10/10/2008 07:00 pm Michael Hanselmann

Convert ganeti-noded to new HTTP server class

Reviewed-by: iustinp

72737a7f 10/10/2008 12:55 pm Iustin Pop

Convert rpc module to RpcRunner

This big patch changes the call model used in internode-rpc from
standalong function calls in the rpc module to via a RpcRunner class,
that holds all the methods. This can be used in the future to enable
smarter processing in the RPC layer itself (some quick examples are not...

e69d05fd 10/08/2008 01:36 pm Iustin Pop

Move the hypervisor attribute to the instances

This (big) patch moves the hypervisor type from the cluster to the
instance level; the cluster attribute remains as the default hypervisor,
and will be renamed accordingly in a next patch. The cluster also gains...

9f0e6b37 10/07/2008 02:39 pm Iustin Pop

rpc.call_instance_migrate: pass the whole instance

Currently the call_instance_migrate call only passes the instance name;
we need to pass the whole object for the hypervisor_type changes (all
the other individual instance rpc calls already pass the instance...

e92376d7 10/07/2008 11:03 am Iustin Pop

Implement job 'waiting' status

Background: when we have multiple jobs in the queue (more than just a
few), many of the jobs (up to the number of threads) will be in state
'running', although many of them could be actually blocked, waiting for
some locks. This is not good, as one cannot easily see what is...

07cd723a 10/06/2008 07:42 pm Iustin Pop

Implement job auto-archiving

This patch adds a new luxi call that implements auto-archiving of jobs
older than a certain age (or -1 for all completed jobs), and the gnt-job
command that makes use of this (with 'all' for -1).

Reviewed-by: imsnah

62c9ec92 10/06/2008 06:58 pm Iustin Pop

backend.py change to get cluster name from master

Currently there are three function in backend that need the cluster name
in order to instantiate an SshRunner. The patch changes these to get the
cluster name from the master in the rpc call; once the multi-hypervisor...

a42872ff 10/01/2008 08:36 pm Michael Hanselmann

Convert ganeti-master

Use simpleconfig instead of ssconf.

Reviewed-by: iustinp

2859b87b 10/01/2008 08:36 pm Michael Hanselmann

Convert ganeti-watcher

Use RPC calls instead of ssconf.

Reviewed-by: iustinp

8594f271 10/01/2008 08:36 pm Michael Hanselmann

Convert ganeti-noded

Replace ssconf with utility functions.

Reviewed-by: iustinp

ae5849b5 10/01/2008 08:33 pm Michael Hanselmann

Add new query to get cluster config values

This can be used to retrieve certain cluster config values from
within clients.

OpDumpClusterConfig was not used anywhere, hence I'm just reusing
it. The way ConfigWriter.DumpConfig returned the configuration
was not thread-safe, anyway (no deepcopy)....

37b77b18 10/01/2008 12:27 pm Iustin Pop

Fix the watcher with down nodes

The watcher didn't handle the down nodes, fix this by ignoring (in
secondary node reboot checks) any node that doesn't return a boot id.

Reviewed-by: imsnah

b7309a0d 10/01/2008 12:27 pm Iustin Pop

Fix the watcher not restarting instance bug

The watcher was using conflicting attributes of the instance:
- it queried the admin_/oper_state, which are booleans
- but it compared those to the status (which is a text field)

The code was changed to query the aggregated 'status' field, as that...

5188ab37 10/01/2008 12:27 pm Iustin Pop

Remove last use of utils.RunCmd from the watcher

The watcher has one last use of ganeti commands as opposed to sending
requests via luxi. The patch changes this to use the cli functions.

The patch also has two other changes:
- fix the docstring for OpVerifyDisks (found out while converting...

8785cb30 09/09/2008 03:57 pm Michael Hanselmann

ganeti-noded: Add constant for queue lock timeout

Reviewed-by: iustinp

36205981 09/09/2008 03:25 pm Iustin Pop

Implement master startup safety check

This is an initial version of the master startup checks. It's a very
rudimentary change, however in normal usage (an old master was started,
the rest of the cluster is functioning normally) it will succeed in
preventing wrong startups....

4e071d3b 09/09/2008 03:24 pm Iustin Pop

Export backend.GetMasterInfo over the rpc layer

We create a multi-node call so that querying all nodes for agreement
will be fast.

Reviewed-by: imsnah

506cff12 09/09/2008 12:01 pm Michael Hanselmann

Use lock timeout for queue updates in ganeti-noded

This helps to prevent complete deadlocks.

Reviewed-by: iustinp

f1f3f45c 09/05/2008 03:29 pm Michael Hanselmann

noded: Get job queue lock while purging queue content

Only one process should modify the queue at the same time.

Reviewed-by: iustinp

5c735209 08/29/2008 04:42 pm Iustin Pop

Make WaitForJobChanges deal with long jobs

This patch alters the WaitForJobChanges luxi-RPC call to have a
configurable timeout, so that the call behaves nicely with long jobs
that have no update.

We do this by adding a timeout parameter in the RPC call, and returning...

6c5a7090 08/27/2008 11:34 am Michael Hanselmann

Make sure that client programs get all messages

This is a large patch, but I can't figure out how to split it without
breaking stuff. The old way of getting messages by always getting the
last one didn't bring all messages to the client if they were added...

9894ece7 08/18/2008 02:12 pm Michael Hanselmann

Use Linux-specific way to name master socket

By using this Linux-specific way we don't have to care about removing the
socket file when quitting or starting (after an unclean shutdown). For a
more detailed description, see the comment in the patch.

Reviewed-by: schreiberal

dfe57c22 08/11/2008 07:27 pm Michael Hanselmann

Add RPC call to wait for job changes

This way clients can react faster to status or message changes and
don't have to poll anymore.

Reviewed-by: ultrotter

32f93223 08/08/2008 02:29 pm Michael Hanselmann

Add query function for exports

Reviewed-by: iustinp

af5ebcb1 08/08/2008 02:21 pm Michael Hanselmann

noded: Add RPC function to rename job queue files

This will be used to archive jobs.

Reviewed-by: iustinp

7f30777b 08/08/2008 02:20 pm Michael Hanselmann

noded: Add decorator for job queue lock

The lock will also be needed by another function.

Reviewed-by: iustinp

25d6d12a 08/08/2008 01:03 pm Michael Hanselmann

Implement queue locking in node daemon

Reviewed-by: iustinp

aa9075c5 08/08/2008 01:02 pm Michael Hanselmann

More logging for errors during noded RPC calls

Reviewed-by: iustinp

ca52cdeb 08/08/2008 01:01 pm Michael Hanselmann

Add job queue RPC functions

jobqueue_update: Uploads a job queue file's content to a node. The
most common operation is to upload something that we already have
in a string. Unlike in the upload_file function, the file is not
read again when distributing changes, but content has to be passed...

e125c67c 08/07/2008 04:03 pm Michael Hanselmann

Use API instead of command line utilities in watcher

Reviewed-by: iustinp

c36176cc 08/06/2008 05:56 pm Michael Hanselmann

Notify job queue about added/removed nodes

The job queue maintains its own node list and must be notified
when nodes are added/removed.

Reviewed-by: iustinp

d8470559 08/06/2008 05:56 pm Michael Hanselmann

Implement {Add,Readd,Remove}Node in GanetiContext

By doing this we've a central place which coordinates what needs to be
done when adding or removing nodes. Another patch will add calls into
the job queue.

Two log messages move to config.py.

When removing a node, node_leave_cluster is now called after it has...

4c848b18 08/06/2008 04:35 pm Michael Hanselmann

jqueue: Don't pass the list of nodes to SubmitJob anymore

The job queue now maintains its own list and is updated when
nodes are added or removed from the cluster.

Reviewed-by: iustinp

9113300d 08/06/2008 04:35 pm Michael Hanselmann

masterd: Move job queue into context object

The job queue must be called from cmdlib when adding or removing
nodes to the cluster. Moving it to the context objects makes
this possible.

Reviewed-by: iustinp