root / daemons @ 82d9caef

# Date Author Comment
82d9caef 10/20/2008 03:50 pm Iustin Pop

Remove the logger.py module

Since we now use only one function from the logger module
(SetupLogging), we move it to utils.py (which is already imported by all
users of this function), and we remove the module.

Reviewed-by: imsnah

d15a9ad3 10/17/2008 05:37 pm Guido Trotter

Cleanup os_add/rename rpc for OS API 10

- remove now unused osdev and swapdev arguments from backend, noded,
rpc, cmdlib
- convert docstrings to epydoc

Reviewed-by: iustinp

713faea6 10/17/2008 04:06 pm Oleksiy Mishchenko

ETag passing support.

Reviewed-by: imsnah

16a8967d 10/16/2008 07:54 pm Michael Hanselmann

rapi: Convert to new HTTP server class

Requests are no longer logged to a separate file.

Reviewed-by: amishchenko

d7cdb55d 10/16/2008 02:36 pm Iustin Pop

Improvements to the master startup checks

In order to account for future improvements to master failover, we move
the actual data gathering capabilities from ganeti-masterd into
bootstrap.py, and we leave only the verification into masterd.

The verification procedure is then changed to retry multiple times (up...

3ccafd0e 10/16/2008 11:37 am Iustin Pop

Add an interface for the drain flag changes/query

This adds the set/reset in the jqueue and luxi modules, and a way to
query it in OpQueryConfigValues, and also the command line interface for
it:
$ gnt-cluster queue info
The drain flag is unset
$ gnt-cluster queue drain...

5d672980 10/15/2008 04:13 pm Iustin Pop

Add a rpc call for changing the drain flag

A new multi-node call is added that sets/resets the drain flag.

Reviewed-by: imsnah

6797ec29 10/15/2008 01:51 pm Iustin Pop

Implement transport of ganeti errors across luxi

This patch adds a generic method to identify the ganeti error given its
class name, and implements this across the luxi protocol.

Reviewed-by: imsnah
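
The idea of identifying an error by its class name can be sketched roughly as follows. This is only an illustration, not the code from the patch: the exception classes, `encode_error` and `decode_error` are hypothetical names, and the (name, args) payload shape is an assumption.

```python
class GenericError(Exception):
    """Hypothetical base class standing in for the ganeti error hierarchy."""


class PrereqError(GenericError):
    """Hypothetical example subclass."""


class LockError(GenericError):
    """Hypothetical example subclass."""


def encode_error(err):
    """Turn an exception into a serializable (class name, args) pair."""
    return (err.__class__.__name__, list(err.args))


def _known_errors():
    """Map class names to classes by walking the subclass tree."""
    found = {}
    stack = [GenericError]
    while stack:
        cls = stack.pop()
        found[cls.__name__] = cls
        stack.extend(cls.__subclasses__())
    return found


def decode_error(payload):
    """Rebuild the exception on the receiving side.

    Unknown names fall back to the base class (an assumption of this sketch).
    """
    name, args = payload
    cls = _known_errors().get(name, GenericError)
    return cls(*args)
```

Looking the class up by name keeps the wire format free of pickled objects; the receiving side only ever instantiates classes from its own error hierarchy.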

a2f92677 10/15/2008 11:22 am Michael Hanselmann

rapi: Whitespace fixes

Reviewed-by: ultrotter

6217e295 10/14/2008 01:19 pm Iustin Pop

Export the hypervisor.ValidateParameters over RPC

The newly-added node-specific ValidateParams hypervisor method is
exported over RPC, using the semi-standard (success, message) return
value. Multi-node call, so that we call on both primary and secondary at...

16ad1a83 10/13/2008 04:49 pm Iustin Pop

Fix a few rpc-related errors

This fixes:
- whitespace change, double lines between methods
- duplication of call_upload_file, introduced by mistake in rev 1795
and which went undetected because of the many changes in that ref
(only diff -b shows it clearly)...

caad16e2 10/12/2008 11:40 pm Iustin Pop

Abstract checking own address into a function

Currently, we check if we have a given ip address (i.e. it's alive on
one of our interfaces) by manually calling TcpPing(source=localhost).
This works, but having it spread all over the code makes it hard to...

cc28af80 10/10/2008 07:00 pm Michael Hanselmann

Convert ganeti-noded to new HTTP server class

Reviewed-by: iustinp

72737a7f 10/10/2008 12:55 pm Iustin Pop

Convert rpc module to RpcRunner

This big patch changes the call model used in internode-rpc from
standalone function calls in the rpc module to calls via an RpcRunner class,
that holds all the methods. This can be used in the future to enable
smarter processing in the RPC layer itself (some quick examples are not...

e69d05fd 10/08/2008 01:36 pm Iustin Pop

Move the hypervisor attribute to the instances

This (big) patch moves the hypervisor type from the cluster to the
instance level; the cluster attribute remains as the default hypervisor,
and will be renamed accordingly in a next patch. The cluster also gains...

9f0e6b37 10/07/2008 02:39 pm Iustin Pop

rpc.call_instance_migrate: pass the whole instance

Currently the call_instance_migrate call only passes the instance name;
we need to pass the whole object for the hypervisor_type changes (all
the other individual instance rpc calls already pass the instance...

e92376d7 10/07/2008 11:03 am Iustin Pop

Implement job 'waiting' status

Background: when we have multiple jobs in the queue (more than just a
few), many of the jobs (up to the number of threads) will be in state
'running', although many of them could be actually blocked, waiting for
some locks. This is not good, as one cannot easily see what is...

07cd723a 10/06/2008 07:42 pm Iustin Pop

Implement job auto-archiving

This patch adds a new luxi call that implements auto-archiving of jobs
older than a certain age (or -1 for all completed jobs), and the gnt-job
command that makes use of this (with 'all' for -1).

Reviewed-by: imsnah

62c9ec92 10/06/2008 06:58 pm Iustin Pop

backend.py change to get cluster name from master

Currently there are three functions in backend that need the cluster name
in order to instantiate an SshRunner. The patch changes these to get the
cluster name from the master in the rpc call; once the multi-hypervisor...

a42872ff 10/01/2008 08:36 pm Michael Hanselmann

Convert ganeti-master

Use simpleconfig instead of ssconf.

Reviewed-by: iustinp

2859b87b 10/01/2008 08:36 pm Michael Hanselmann

Convert ganeti-watcher

Use RPC calls instead of ssconf.

Reviewed-by: iustinp

8594f271 10/01/2008 08:36 pm Michael Hanselmann

Convert ganeti-noded

Replace ssconf with utility functions.

Reviewed-by: iustinp

ae5849b5 10/01/2008 08:33 pm Michael Hanselmann

Add new query to get cluster config values

This can be used to retrieve certain cluster config values from
within clients.

OpDumpClusterConfig was not used anywhere, hence I'm just reusing
it. The way ConfigWriter.DumpConfig returned the configuration
was not thread-safe, anyway (no deepcopy)....

37b77b18 10/01/2008 12:27 pm Iustin Pop

Fix the watcher with down nodes

The watcher didn't handle the down nodes, fix this by ignoring (in
secondary node reboot checks) any node that doesn't return a boot id.

Reviewed-by: imsnah

b7309a0d 10/01/2008 12:27 pm Iustin Pop

Fix the watcher not restarting instance bug

The watcher was using conflicting attributes of the instance:
- it queried the admin_/oper_state, which are booleans
- but it compared those to the status (which is a text field)

The code was changed to query the aggregated 'status' field, as that...

5188ab37 10/01/2008 12:27 pm Iustin Pop

Remove last use of utils.RunCmd from the watcher

The watcher has one last use of ganeti commands as opposed to sending
requests via luxi. The patch changes this to use the cli functions.

The patch also has two other changes:
- fix the docstring for OpVerifyDisks (found out while converting...

8785cb30 09/09/2008 03:57 pm Michael Hanselmann

ganeti-noded: Add constant for queue lock timeout

Reviewed-by: iustinp

36205981 09/09/2008 03:25 pm Iustin Pop

Implement master startup safety check

This is an initial version of the master startup checks. It's a very
rudimentary change, however in normal usage (an old master was started,
the rest of the cluster is functioning normally) it will succeed in
preventing wrong startups....

4e071d3b 09/09/2008 03:24 pm Iustin Pop

Export backend.GetMasterInfo over the rpc layer

We create a multi-node call so that querying all nodes for agreement
will be fast.

Reviewed-by: imsnah

506cff12 09/09/2008 12:01 pm Michael Hanselmann

Use lock timeout for queue updates in ganeti-noded

This helps to prevent complete deadlocks.

Reviewed-by: iustinp
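
A lock acquisition with a timeout might look like the sketch below. It is an assumption-laden illustration: `threading.Lock` stands in for whatever lock ganeti-noded really uses, and `QUEUE_LOCK_TIMEOUT`, `LockTimeout` and `run_with_lock` are hypothetical names with a made-up timeout value.

```python
import threading

QUEUE_LOCK_TIMEOUT = 10.0  # seconds; hypothetical value for this sketch


class LockTimeout(Exception):
    """Raised when the lock cannot be acquired in time (hypothetical)."""


def run_with_lock(lock, fn, timeout=QUEUE_LOCK_TIMEOUT):
    """Run fn() while holding lock, giving up after `timeout` seconds."""
    if not lock.acquire(timeout=timeout):
        # Failing fast here is what avoids waiting forever on a stuck holder.
        raise LockTimeout("queue lock not acquired within %ss" % timeout)
    try:
        return fn()
    finally:
        lock.release()
```

With a timeout, a stuck lock holder turns into an error for the caller instead of a request that hangs forever.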

f1f3f45c 09/05/2008 03:29 pm Michael Hanselmann

noded: Get job queue lock while purging queue content

Only one process should modify the queue at the same time.

Reviewed-by: iustinp

5c735209 08/29/2008 04:42 pm Iustin Pop

Make WaitForJobChanges deal with long jobs

This patch alters the WaitForJobChanges luxi-RPC call to have a
configurable timeout, so that the call behaves nicely with long jobs
that have no update.

We do this by adding a timeout parameter in the RPC call, and returning...

6c5a7090 08/27/2008 11:34 am Michael Hanselmann

Make sure that client programs get all messages

This is a large patch, but I can't figure out how to split it without
breaking stuff. The old way of getting messages by always getting the
last one didn't bring all messages to the client if they were added...

9894ece7 08/18/2008 02:12 pm Michael Hanselmann

Use Linux-specific way to name master socket

By using this Linux-specific way we don't have to care about removing the
socket file when quitting or starting (after an unclean shutdown). For a
more detailed description, see the comment in the patch.

Reviewed-by: schreiberal
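
One Linux-only mechanism with this property is the abstract socket namespace; the entry above does not say whether this is exactly what the patch uses, so the sketch below is an assumption. The socket name and `bind_abstract` are hypothetical.

```python
import socket


def bind_abstract(name):
    """Bind a listening Unix socket in the Linux abstract namespace."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # A leading NUL byte selects the abstract namespace (Linux only):
    # no file is created, and the name disappears when the socket is closed.
    sock.bind(b"\0" + name.encode("ascii"))
    sock.listen(1)
    return sock


# Example usage with a made-up name: bind, close, and bind again without
# any cleanup step in between.
first = bind_abstract("example-master-socket")
bound_name = first.getsockname()
first.close()
second = bind_abstract("example-master-socket")
second.close()
```

Because nothing is left on the filesystem, a restart after an unclean shutdown can bind the same name again without first removing a stale file.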

dfe57c22 08/11/2008 07:27 pm Michael Hanselmann

Add RPC call to wait for job changes

This way clients can react faster to status or message changes and
don't have to poll anymore.

Reviewed-by: ultrotter

32f93223 08/08/2008 02:29 pm Michael Hanselmann

Add query function for exports

Reviewed-by: iustinp

af5ebcb1 08/08/2008 02:21 pm Michael Hanselmann

noded: Add RPC function to rename job queue files

This will be used to archive jobs.

Reviewed-by: iustinp

7f30777b 08/08/2008 02:20 pm Michael Hanselmann

noded: Add decorator for job queue lock

The lock will also be needed by another function.

Reviewed-by: iustinp

25d6d12a 08/08/2008 01:03 pm Michael Hanselmann

Implement queue locking in node daemon

Reviewed-by: iustinp

aa9075c5 08/08/2008 01:02 pm Michael Hanselmann

More logging for errors during noded RPC calls

Reviewed-by: iustinp

ca52cdeb 08/08/2008 01:01 pm Michael Hanselmann

Add job queue RPC functions

jobqueue_update: Uploads a job queue file's content to a node. The
most common operation is to upload something that we already have
in a string. Unlike in the upload_file function, the file is not
read again when distributing changes, but content has to be passed...

e125c67c 08/07/2008 04:03 pm Michael Hanselmann

Use API instead of command line utilities in watcher

Reviewed-by: iustinp

c36176cc 08/06/2008 05:56 pm Michael Hanselmann

Notify job queue about added/removed nodes

The job queue maintains its own node list and must be notified
when nodes are added/removed.

Reviewed-by: iustinp

d8470559 08/06/2008 05:56 pm Michael Hanselmann

Implement {Add,Readd,Remove}Node in GanetiContext

By doing this we've a central place which coordinates what needs to be
done when adding or removing nodes. Another patch will add calls into
the job queue.

Two log messages move to config.py.

When removing a node, node_leave_cluster is now called after it has...

4c848b18 08/06/2008 04:35 pm Michael Hanselmann

jqueue: Don't pass the list of nodes to SubmitJob anymore

The job queue now maintains its own list and is updated when
nodes are added or removed from the cluster.

Reviewed-by: iustinp

9113300d 08/06/2008 04:35 pm Michael Hanselmann

masterd: Move job queue into context object

The job queue must be called from cmdlib when adding or removing
nodes to the cluster. Moving it to the context objects makes
this possible.

Reviewed-by: iustinp

02f7fe54 08/06/2008 11:26 am Michael Hanselmann

Implement query for nodes

Reviewed-by: iustinp

ee6c7b94 08/06/2008 11:25 am Michael Hanselmann

Implement query for instances

Queries don't create jobs and are more efficient. Log messages
are not yet stored anywhere.

Reviewed-by: iustinp

441e7cfd 07/31/2008 12:06 pm Oleksiy Mishchenko

First write operation (add tag) for Ganeti RAPI

Add instance tag handling, improved error logging.
...oh, yes adopt instance listing for RAPI2!

Reviewed-by: iustinp

59f187eb 07/30/2008 03:32 pm Iustin Pop

Unify SetupDaemon/SetupLogging

The 'old-style' info, error, debug logs do not make much sense. This
patch unifies the SetupLogging and SetupDaemon functions. As a result,
all the commands log to a 'commands.log' file.

The patch also changes the log setup to keep going if there's an error...

b1b6ea87 07/30/2008 11:43 am Iustin Pop

Rework master startup/shutdown/failover

This (big) patch reworks the master startup/shutdown and fixes the
master failover.

What does the patch do?

For master start/stop:
- remove the old ganeti-master script and its associated man page
- moves the ip start/stop directly into the backend.(Start|Stop)Master...

5675cd1f 07/30/2008 11:33 am Iustin Pop

Implement checking for the master role in rapi

This patch moves the CheckMaster function from ganeti-masterd to ssconf
(most logical place, it cannot go in utils since we would have recursive
imports between ssconf and utils) and changes ganeti-rapi to also call...

1c65840b 07/30/2008 11:32 am Iustin Pop

Add a new parameter to backend.(Start|Stop)Master

This patch adds a new, unused for now, parameter to the start and stop
master operations in backend. The idea behind it is that we need to be
able to control whether the IP (de)activation is coupled with daemon...

99e88451 07/29/2008 12:06 pm Iustin Pop

Use constants for the pid file stems

Reviewed-by: imsnah

f71245a0 07/29/2008 11:48 am Iustin Pop

Make the rapi daemon create a pidfile

This is needed for controlling it cleanly with start-stop daemon.

Reviewed-by: ultrotter

cfe3c70f 07/28/2008 01:17 pm Michael Hanselmann

Implement signal handling in ganeti-rapi

Reviewed-by: iustinp

3cd62121 07/28/2008 01:17 pm Michael Hanselmann

Move ganeti-rapi core code to daemon

All other daemons have their main code in themselves and not in a module.
This patch does the same to ganeti-rapi by moving the code from
lib/rapi/RESTHTTPServer.py to daemons/ganeti-rapi.

Reviewed-by: iustinp

3a2c7775 07/24/2008 02:32 pm Michael Hanselmann

Fix RPC parameters for {Cancel,Archive}Job

They aren't tuples on the client side.

Reviewed-by: iustinp

8feda3ad 07/23/2008 05:23 pm Guido Trotter

ganeti-masterd: write and remove pidfile

Reviewed-by: iustinp

73d927a2 07/23/2008 05:23 pm Guido Trotter

ganeti-noded: write and remove pid file

Reviewed-by: iustinp

c3f0a12f 07/23/2008 01:06 pm Iustin Pop

Distribute the queue serial file after each update

This patch adds distribution of the queue serial file after each write
to it (but before a new job is created and written with that ID, and
before a response is returned, so we should be safe from crashes in...

84b58db2 07/21/2008 06:32 pm Michael Hanselmann

Handle signals in node daemon

This also fixes a TODO added by ultrotter by killing the parent
process when QuitGanetiException is raised.

Reviewed-by: ultrotter

610bc9ee 07/21/2008 06:32 pm Michael Hanselmann

Use new signal handler class in master daemon

Reviewed-by: ultrotter

8075ce7e 07/16/2008 03:17 pm Oleksiy Mishchenko

Breathe life into RAPI for trunk

Reviewed-by: imsnah

761ce945 07/16/2008 12:48 pm Guido Trotter

Fork ganeti-noded

Create a new ForkingHTTPServer in ganeti-noded by deriving both from
NodeDaemonHttpServer and ForkingMixin. This will allow us to process
concurrent requests.

Reviewed-by: imsnah

36088c4c 07/14/2008 06:52 pm Michael Hanselmann

Fix previous patch using workerpool in masterd

The function to stop a worker pool is TerminateWorkers(), not Shutdown().

Reviewed-by: iustinp

23e50d39 07/14/2008 06:42 pm Michael Hanselmann

Use workerpool in master daemon

Reusing threads instead of starting one for each request is more efficient.

Reviewed-by: iustinp

1df6506c 07/11/2008 03:20 pm Michael Hanselmann

Use new HTTP server classes in ganeti-noded

Reviewed-by: iustinp

8c229cc7 07/11/2008 12:47 pm Oleksiy Mishchenko

Initial copy of RAPI filebase to the trunk

Reviewed-by: iustinp

0ed468d3 07/10/2008 03:48 pm Michael Hanselmann

Remove more old job queue code

Apparently I forgot to remove this code when removing the rest.

Reviewed-by: iustinp

eb0f0ce0 07/10/2008 03:38 pm Michael Hanselmann

Move watcher's LockFile function to utils

Reviewed-by: iustinp

ff5fac04 07/09/2008 05:46 pm Iustin Pop

Fix double-logging in daemons

Currently, in debug mode, both the logfile handler and the stderr
handler will log debug messages. Since the stderr is redirected to the
same logfile (to catch non-logged errors), it means log entries are
doubled.

The patch adds an extra parameter to the logger.SetupDaemon() function...

c89189b1 07/09/2008 05:43 pm Iustin Pop

ganeti-noded logging improvements

The patch adds some more logging to the node daemon:

- log methods at the beginning, not only at the end
- log method parameters (they are very verbose, but useful)

A separate change is to initialize the global variable in the global...

d4fa5c23 07/09/2008 01:41 pm Iustin Pop

Remove the old locking functions

This removes (hopefully) all traces of the old locking functions and
uses.

Reviewed-by: imsnah

2467e0d3 07/09/2008 01:34 pm Michael Hanselmann

Remove old job queue code

Reviewed-by: iustinp

0bbe448c 07/09/2008 01:34 pm Michael Hanselmann

Change masterd/client RPC protocol

- Introduce abstraction class on client side
- Use constants for method names
- Adopt legacy function SubmitOpCode to use it

Reviewed-by: iustinp

3d8548c4 07/09/2008 01:34 pm Michael Hanselmann

Make luxi RPC more flexible

- Use constants for dict entries
- Handle exceptions on server side
- Rename client function to CallMethod to match server side naming

Reviewed-by: iustinp

50a3fbb2 07/09/2008 01:34 pm Michael Hanselmann

Instantiate new job queue in master daemon

Reviewed-by: iustinp

195c7f91 07/08/2008 05:42 pm Iustin Pop

Create all SUB_RUN_DIRS in ganeti-noded

Rather than just creating BDEV_CACHE_DIR we loop through the
SUB_RUN_DIRS list and create all its children.

Reviewed-by: iustinp

26517d45 07/04/2008 07:01 pm Iustin Pop

Fix some issues with the watcher

This patch fixes two bugs:
- the state file is not saved because we use the method for checking
for updated data
- in two places 'Error' was used instead of 'Exception', which breaks
error handling

Additionally:...

3b316acb 07/03/2008 03:06 pm Iustin Pop

Add custom logging setup for daemons

It's better for daemons if:
- they log only to one log file
- the log level is included
- for debug runs, the filename/line number is included

This patch moves the custom formatter from the watcher to the logging...

cc2bea8b 07/02/2008 02:58 pm Michael Hanselmann

ganeti-masterd: Remove unused locking code

Reviewed-by: iustinp, ultrotter

96cb3986 07/02/2008 02:58 pm Michael Hanselmann

ganeti-masterd: Use logging module

Reviewed-by: ultrotter, iustinp

984f7c32 07/01/2008 03:28 pm Guido Trotter

Context: s/GLM/glm/

Make the GanetiLockManager instance of GanetiContext lowercase

Reviewed-by: imsnah

a478cd7e 07/01/2008 01:43 pm Guido Trotter

Increase the thread size to 5

Now that we use the locking library to make sure running opcodes cannot
step on each other's toes we can have a bigger thread size, and
potentially process many opcodes in a parallel manner.

Reviewed-by: iustinp

1c901d13 07/01/2008 01:43 pm Guido Trotter

Processor: pass context in and use it.

The processor used to create a new ConfigWriter when it was initialized.
We now have one in the context, so we'll just recycle it. First of all
we'll pass the context in when creating a new Processor object, then
we'll just use context.cfg, which is guaranteed to be initialized, wherever...

39dcf2ef 06/30/2008 03:37 pm Guido Trotter

ganeti-masterd: init and distribute common context

This patch creates a new GanetiContext class, which is used to hold
context common to all ganeti worker threads. As for the
GanetiLockingManager class it is paramount that there is only one such
class throughout the execution of Ganeti, so the class checks for that,...

c3d7f69b 06/27/2008 05:27 pm Guido Trotter

ganeti-noded: Fix handling of QuitGanetiException

- s/GanetiQuitException/QuitGanetiException/
- Look for the arguments in err.args, not err itself

Reviewed-by: iustinp

9ae49f27 06/26/2008 05:42 pm Guido Trotter

ganeti-noded: quit on QuitGanetiException

According to the usage documented in the QuitGanetiException docstring,
if we receive such an exception we'll set the global _EXIT_GANETI_NODED
variable to True, and then return either a valid value or an error
message to the user. This will be the last request we serve, though,...

3b3db8fd 06/26/2008 05:42 pm Guido Trotter

ganeti-noded: serve not quite forever

Rather than calling httpd.serve_forever() in ganeti-noded we'll call
httpd.handle_request() but just while a global variable, which we'll
call _EXIT_GANETI_NODED, remains false.

Reviewed-by: iustinp

0db7ac4d 06/23/2008 06:00 pm Guido Trotter

Handle any exception in ganeti-masterd

If an uncaught exception is thrown currently it destroys the calling
thread. This patch changes the behaviour to failing the current job,
logging a message, but trying to keep the daemon up.

Reviewed-by: imsnah

d61cbe76 06/20/2008 02:04 pm Iustin Pop

Add a rpc call for BlockDev.Close()

This patch adds rpc layer calls (in rpc.py and the equivalent in
ganeti-noded) to close a list of block devices, and the wrapper in
backend.py that takes a list of Disk objects, identifies them and
returns correctly formatted results....

e8230860 06/19/2008 03:56 pm Michael Hanselmann

Use a single Makefile.am instead of many

This change allows us to use cleaner dependencies between
directories. The build system is basically rewritten in large parts
and may contain bugs.

Reviewed-by: iustinp

7bca53e4 06/18/2008 03:32 pm Michael Hanselmann

ganeti-watcher: Replace custom exceptions with ganeti.error.*

Reviewed-by: iustinp

2fb96d39 06/18/2008 03:31 pm Michael Hanselmann

ganeti-watcher: Don't write file if data didn't change

This is the safest way to detect changes and the amount of data
is small, so keeping a copy around is cheap enough.

Reviewed-by: iustinp
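
Keeping a serialized copy around to detect changes might look like the sketch below. The `StateFile` class, its JSON encoding and the file name are assumptions made for illustration, not the watcher's actual implementation.

```python
import json
import os
import tempfile


class StateFile:
    """Writes its data to disk only when the serialized form has changed."""

    def __init__(self, path):
        self._path = path
        self._data = {}
        self._saved = None  # serialized copy of what was last written

    def set(self, key, value):
        self._data[key] = value

    def save(self):
        """Return True if the file was written, False if nothing changed."""
        serialized = json.dumps(self._data, sort_keys=True)
        if serialized == self._saved:
            return False
        with open(self._path, "w") as handle:
            handle.write(serialized)
        self._saved = serialized
        return True


# Example usage against a throwaway path.
state = StateFile(os.path.join(tempfile.mkdtemp(), "watcher.data"))
state.set("node1", "boot-id-1")
first_save = state.save()    # data changed, file is written
second_save = state.save()   # nothing changed, write is skipped
```

Comparing serialized strings sidesteps tracking which individual fields were modified, at the cost of holding one extra copy of a small blob in memory.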

b76f660d 06/18/2008 03:31 pm Michael Hanselmann

ganeti-watcher: Rename WatcherState.data to WatcherState._data

Cleanup: _data is private and should not be modified from outside
of this class.

Reviewed-by: iustinp

1b052f42 06/18/2008 03:31 pm Michael Hanselmann

Don't log SystemExit exception in ganeti-watcher

Reviewed-by: iustinp

fc428e32 06/18/2008 03:31 pm Michael Hanselmann

Replace watcher state file atomically

- Lock it before renaming
- Code cleanup; close() automatically unlocks it

Reviewed-by: iustinp
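
An atomic replacement along these lines could be sketched as below. The details are assumptions: the entry does not say which file is locked or how the temporary file is named, so `write_state_atomically` is a hypothetical helper that locks the temporary file, renames it over the target, and relies on close() to drop the lock.

```python
import fcntl
import os
import tempfile


def write_state_atomically(path, content):
    """Write content to a temp file, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so that the rename
    # stays on one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            fcntl.flock(tmp, fcntl.LOCK_EX)  # lock before renaming
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
            os.rename(tmp_path, path)
        # Leaving the with block closes the file, which releases the lock.
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


# Example usage against a throwaway directory.
example_path = os.path.join(tempfile.mkdtemp(), "watcher.data")
write_state_atomically(example_path, "first")
write_state_atomically(example_path, "second")
```

Readers see either the old file or the new one, never a partially written state file.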

78f3bd30 06/18/2008 03:30 pm Michael Hanselmann

Write ganeti-watcher status file even if something failed

Reviewed-by: iustinp

67fe61c4 06/18/2008 03:29 pm Michael Hanselmann

Use ganeti.serializer module in ganeti-watcher

Reviewed-by: ultrotter