Guido Trotter [Fri, 7 Aug 2009 10:07:37 +0000 (11:07 +0100)]
serializer.DumpSignedJson
Don't indent the final message.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 6 Aug 2009 17:15:15 +0000 (18:15 +0100)]
constants: confd node roles
confd will return the node role as an integer, which represents one of
the mutually exclusive roles a node can be in.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 6 Aug 2009 17:08:34 +0000 (18:08 +0100)]
constants: confd query types
Initially confd will support only two queries:
CONFD_REQ_NODE_ROLE_BYNAME
Given a node name, return its role.
CONFD_REQ_NODE_PIP_BY_INSTANCE_IP
Given an instance ip, return its node primary ip.
This rather weird query is the basis for ganeti nbma lookup.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 7 Aug 2009 10:05:04 +0000 (11:05 +0100)]
design-2.1: detail confd wire protocol
Until now it was being kept too vague, so here we give some real
examples of how things are going to be.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 3 Aug 2009 15:09:28 +0000 (16:09 +0100)]
SimpleConfigReader.Reload()
Rather than initializing the config statically at class creation time,
we load it every time Reload() is called.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
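The load-on-Reload() idea can be sketched roughly as follows. This is a minimal illustration, not the actual SimpleConfigReader: the class name, the Get() accessor, and the choice to keep the old data and return False on a failed load are all assumptions made for the example.

```python
import json


class ReloadableConfigReader(object):
  """Hypothetical reader that loads its file on every Reload() call."""

  def __init__(self, file_name):
    self._file_name = file_name
    self._config_data = None
    # Nothing is loaded statically at class creation; the first load
    # happens through the same path as every later one.
    self.Reload()

  def Reload(self):
    """Re-read the file; returns True on success, False otherwise."""
    try:
      with open(self._file_name) as handle:
        new_data = json.load(handle)
    except (IOError, ValueError):
      # Unreadable file or invalid JSON: keep the previous data (assumption).
      return False
    self._config_data = new_data
    return True

  def Get(self, key):
    """Return a top-level value, or None if nothing is loaded."""
    return (self._config_data or {}).get(key)
```

A caller would invoke Reload() whenever it learns that the file changed, for example from a file-change notification.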
Guido Trotter [Wed, 5 Aug 2009 12:30:51 +0000 (13:30 +0100)]
Confd{Request,Reply} objects
These objects are used to store confd queries and replies.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 6 Aug 2009 13:44:58 +0000 (14:44 +0100)]
Serializer, remove salt_verifier functionality
The salt needs to be returned anyway, so we don't need another key for
the sender to recognize which request an answer belongs to; all that
infrastructure is therefore useless. :(
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 6 Aug 2009 09:45:01 +0000 (10:45 +0100)]
pyinotify: configure checks and documentation
After
74d519e3b91845a17ae095eb7d58dd9e3d1303e8 Ganeti depends on
pyinotify. Updating the documentation accordingly and checking for its
presence at configure time.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 6 Aug 2009 09:18:41 +0000 (10:18 +0100)]
asyncnotifier.AsyncNotifier
AsyncNotifier is a special asyncore class that delivers inotify events
asynchronously.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 17 Jul 2009 12:52:31 +0000 (14:52 +0200)]
SimpleConfigReader: Handle errors when loading
Handling both IOErrors and ValueErrors (thrown by the simplejson loader)
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 14 Jul 2009 10:35:04 +0000 (12:35 +0200)]
ssconf.CheckMasterCandidate
This function checks that the current node is a master candidate, and
terminates otherwise. It will be used upon ganeti-confd startup.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 6 Aug 2009 11:26:14 +0000 (13:26 +0200)]
Convert ldisk_degraded to tri-state value
This allows us to report “uncertain” states (LDS_UNKNOWN) for cases
where the code can't easily detect or report what's wrong with a
block device.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 6 Aug 2009 11:28:16 +0000 (13:28 +0200)]
Add constants for local disk status
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 6 Aug 2009 11:27:27 +0000 (13:27 +0200)]
Handle None result from BlockdevFind
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 5 Aug 2009 11:21:27 +0000 (13:21 +0200)]
objects.BlockDevStatus: Remove ToLegacyStatus
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Luca Bigliardi [Wed, 5 Aug 2009 16:31:38 +0000 (17:31 +0100)]
Add master candidates IPs information to ssconf
This will be used when querying confd, in order not to rely on DNS being
available.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Wed, 5 Aug 2009 16:39:41 +0000 (17:39 +0100)]
TestParameterNames: also check nic parameters
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Wed, 5 Aug 2009 12:46:43 +0000 (13:46 +0100)]
ConfigObject.ToDict() only export non-None values
The method is changed to a normal loop, to avoid calling getattr()
twice. Also __getstate__ is changed to just use ToDict() by default.
This should also make __getstate__ work for objects which have to
override the ToDict function because they contain other objects.
__setstate__ is probably still broken in this case, but so it was
before, and it's not used inside our code, so I'll pretend not to have
noticed, as there is no "nice" way to fix it, without overriding it all
over the place :(
Some unittests are added as a bonus, to make sure we behave well.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
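A rough sketch of the ToDict() behaviour described above, assuming slot-based objects. The class names and the Nic fields are hypothetical and chosen only for the example; this is not the real ConfigObject.

```python
class SketchConfigObject(object):
  """Hypothetical base class exporting only non-None slot values."""
  __slots__ = []

  def __init__(self, **kwargs):
    for key, value in kwargs.items():
      setattr(self, key, value)

  def ToDict(self):
    result = {}
    # A plain loop, so getattr() is called once per attribute.
    for name in self.__slots__:
      value = getattr(self, name, None)
      if value is not None:
        result[name] = value
    return result

  def __getstate__(self):
    # Defaults to ToDict(), so subclasses overriding ToDict() are covered.
    return self.ToDict()


class SketchNic(SketchConfigObject):
  """Hypothetical object with three optional fields."""
  __slots__ = ["mac", "ip", "bridge"]
```

With this sketch, SketchNic(mac="aa:00:00:00:00:01", ip=None).ToDict() contains only the "mac" key.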
Luca Bigliardi [Wed, 5 Aug 2009 13:44:49 +0000 (14:44 +0100)]
Add nodes IPs information to ssconf
Having a list of primary/secondary IPs of all the nodes in ssconf can be useful
for scripts/hooks which need to automatically configure network properties for
the whole cluster (e.g.: ipsec/netfilter rules) without relying on a
working DNS.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Wed, 5 Aug 2009 10:17:17 +0000 (11:17 +0100)]
serializer: fix a few docstrings
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 4 Aug 2009 16:22:27 +0000 (18:22 +0200)]
Use objects for blockdev_getmirrorstatus RPC call result
This patch changes the return type for backend.BlockdevGetmirrorstatus from
a list of tuples to a list of objects.BlockDevStatus instances.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 4 Aug 2009 16:21:13 +0000 (18:21 +0200)]
Use object for blockdev_find RPC call result
This patch changes the return type for backend.BlockdevFind to an object
(objects.BlockDevStatus). Previously a tuple was used, and adding more
values to a tuple causes a lot of work. Converting the result to an object with
properties will make this a bit simpler.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 4 Aug 2009 11:31:45 +0000 (13:31 +0200)]
gnt-node physical-volumes: Add storage type parameter
This way the user can also show storage types other than lvm-pv.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 4 Aug 2009 11:30:32 +0000 (13:30 +0200)]
cmdlib: Fix parameters for storage.FileStorage
It wants a list of directories, not a string.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 14:44:21 +0000 (16:44 +0200)]
Add “gnt-node modify-volume” command
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 14:38:14 +0000 (16:38 +0200)]
cmdlib: Add opcode to modify storage unit fields
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 14:37:29 +0000 (16:37 +0200)]
Add RPC calls to modify storage fields
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 4 Aug 2009 09:43:03 +0000 (11:43 +0200)]
storage: Add function to modify fields
This allows the “allocatable” flag on LVM PVs to be changed.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 4 Aug 2009 08:52:10 +0000 (10:52 +0200)]
Merge commit 'origin/branch-2.1' into feature/containers
* commit 'origin/branch-2.1': (66 commits)
Add automated disk repair changes to design doc
Add review script
Implement “gnt-node physical-volumes” command
Add new opcode to list physical volumes
storage: Use constants.py instead of local constants
storage: Fix semantics for directory size
Add “gnt-job watch” command
jqueue: Fix error when WaitForJobChange gets invalid ID
jqueue: Update message for cancelling running job
cmdlib: Change tasklet logging to debug level
rapi: Add /2/nodes/[node_name]/migrate resource
gnt-node: Use new opcode to migrate node
cmdlib: Add new opcode to migrate node
rapi: Add default parameter to _checkIntVariable
cmdlib: Add logging for tasklets
cmdlib: Fix tasklets handling if no tasklets are added
rapi: Add /2/[node_name]/evacuate resource
Add information about storage units framework
Add RPC calls for storage unit list
Add first implementation of generic storage unit framework
...
Michael Hanselmann [Mon, 3 Aug 2009 13:22:05 +0000 (15:22 +0200)]
Add automated disk repair changes to design doc
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 3 Aug 2009 13:24:31 +0000 (15:24 +0200)]
Add review script
I've been using this script for a while to update commits before
pushing them to the main repository. It copies all commits in a
range to another branch using git cherry-pick and starts an editor
to modify the Reviewed-by: line(s) for each commit. The script is
certainly not perfect, but it does the job.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 29 Jul 2009 16:01:44 +0000 (18:01 +0200)]
Implement “gnt-node physical-volumes” command
This command can be used to list all physical volumes on nodes.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 29 Jul 2009 16:01:06 +0000 (18:01 +0200)]
Add new opcode to list physical volumes
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 29 Jul 2009 13:57:20 +0000 (15:57 +0200)]
storage: Use constants.py instead of local constants
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 29 Jul 2009 13:52:00 +0000 (15:52 +0200)]
storage: Fix semantics for directory size
The actual directory size is "used" space, not the total space on
the filesystem.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 3 Aug 2009 09:25:06 +0000 (11:25 +0200)]
Merge branch 'next' into branch-2.1
* next:
Add “gnt-job watch” command
jqueue: Fix error when WaitForJobChange gets invalid ID
jqueue: Update message for cancelling running job
Michael Hanselmann [Fri, 31 Jul 2009 14:49:26 +0000 (16:49 +0200)]
Add “gnt-job watch” command
This command can be used to follow the output of a job. It's useful
together with the --submit parameter for other commands.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 31 Jul 2009 14:48:27 +0000 (16:48 +0200)]
jqueue: Fix error when WaitForJobChange gets invalid ID
When JobQueue.WaitForJobChange gets an invalid or no longer existing job ID it
tries to return job_info and log_entries, both of which aren't defined yet.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 31 Jul 2009 12:55:45 +0000 (14:55 +0200)]
jqueue: Update message for cancelling running job
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 31 Jul 2009 11:34:12 +0000 (13:34 +0200)]
cmdlib: Change tasklet logging to debug level
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 16:02:33 +0000 (18:02 +0200)]
rapi: Add /2/nodes/[node_name]/migrate resource
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 15:53:10 +0000 (17:53 +0200)]
gnt-node: Use new opcode to migrate node
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 15:52:49 +0000 (17:52 +0200)]
cmdlib: Add new opcode to migrate node
It migrates all primary instances from the node to their secondaries.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 16:00:31 +0000 (18:00 +0200)]
rapi: Add default parameter to _checkIntVariable
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 15:25:50 +0000 (17:25 +0200)]
cmdlib: Add logging for tasklets
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 15:02:46 +0000 (17:02 +0200)]
cmdlib: Fix tasklets handling if no tasklets are added
If no tasklets are added, self.tasklets evaluates to None. The LU base
class will throw an exception because it thinks the derived class doesn't
implement the right methods.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 10:46:23 +0000 (12:46 +0200)]
rapi: Add /2/[node_name]/evacuate resource
This can be used to evacuate a node.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 30 Jul 2009 09:40:02 +0000 (11:40 +0200)]
Add information about storage units framework
This updates the 2.1 design document with storage units framework information.
Signed-off-by: Iustin Pop <iustin@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Wed, 29 Jul 2009 12:24:59 +0000 (14:24 +0200)]
Add RPC calls for storage unit list
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 29 Jul 2009 10:49:53 +0000 (12:49 +0200)]
Add first implementation of generic storage unit framework
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 28 Jul 2009 17:29:47 +0000 (19:29 +0200)]
utils: Add functions to calc directory size and free space on filesystem
These will be used by the new storage unit framework.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
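The two helpers could look roughly like this. It is a sketch using only the standard library: the function names are hypothetical, and returning plain bytes is an assumption (the real functions may use other units).

```python
import os


def DirectorySizeBytes(path):
  """Sum the sizes of all files below path, in bytes."""
  total = 0
  for (curpath, _, files) in os.walk(path):
    for filename in files:
      # lstat() so that symlinks are counted themselves, not their targets.
      total += os.lstat(os.path.join(curpath, filename)).st_size
  return total


def FreeSpaceBytes(path):
  """Space available to unprivileged users on path's filesystem, in bytes."""
  st = os.statvfs(path)
  return st.f_bavail * st.f_frsize
```

Note that the directory size computed this way is the "used" space of that directory, which is a different quantity from the total size of the filesystem it lives on.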
Michael Hanselmann [Fri, 24 Jul 2009 13:31:34 +0000 (15:31 +0200)]
Build HTML from Ganeti 2.1 design
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Sat, 25 Jul 2009 18:04:03 +0000 (20:04 +0200)]
Collapse SSL key checking/overriding for daemons
Signed-off-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Thu, 23 Jul 2009 15:46:37 +0000 (16:46 +0100)]
Collapse daemon's main function
With three ganeti daemons, and one or two more coming, the daemon's main
function started becoming too much cut&pasted code. Collapsing most of
it in a daemon.GenericMain function. Some more code could be collapsed
between the two http-based daemons, but since the new daemons won't be
http-based we won't do it right now.
As a bonus, an option for overriding the network port on the command
line is added for all network-based daemons.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Thu, 23 Jul 2009 16:23:06 +0000 (17:23 +0100)]
Remove <DAEMON>_PID constants
The <DAEMON>_PID constants were created to reference a daemon pid file,
but actually contain a daemon's name, because the various functions that
work with pidfiles abstract the filename from the daemon name
themselves. Removing the constants and using the actual daemon name
constants in their place.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 23 Jul 2009 16:12:43 +0000 (17:12 +0100)]
Slightly abstract the daemon logfile lookup
The original LOG_<DAEMON_NAME> constants for daemon logfiles are gone.
In their place there is a DAEMONS_LOGFILES dict, indexed by daemon name.
This is a minor change aimed at unifying most of the code in the
daemons' main() functions, which are very similar to one another.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 23 Jul 2009 13:11:54 +0000 (14:11 +0100)]
Move rapi to GetDaemonPort
Currently rapi is the only daemon which accepts a port option, rather
than querying its own port from services, and falling back to the
default if not found. Changing this to conform to what other daemons do.
Also update the ganeti-rapi(8) manpage
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Wed, 22 Jul 2009 16:57:26 +0000 (17:57 +0100)]
Change GetNodeDaemonPort to GetDaemonPort in utils
GetNodeDaemonPort is used to lookup the node daemon port in the services
file, and if not found to return the default one. We make it a generic
function, which accepts the daemon name in input, so that it can be used
by confd as well, to lookup its own udp port.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
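The lookup-with-fallback logic can be sketched as below. This is not the real utils.GetDaemonPort: the signature, the default_ports argument, and the proto parameter are assumptions for the example. socket.getservbyname() is the standard library call that consults the services database.

```python
import socket


def GetDaemonPortSketch(daemon_name, default_ports, proto="tcp"):
  """Look up daemon_name in the services database, else use the default.

  default_ports is a hypothetical dict mapping daemon names to ports.
  """
  try:
    return socket.getservbyname(daemon_name, proto)
  except OSError:
    # No entry in the services file: fall back to the built-in default.
    return default_ports[daemon_name]
```

A UDP-based daemon would pass proto="udp" so that the matching services entry is used.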
Guido Trotter [Fri, 24 Jul 2009 12:01:49 +0000 (14:01 +0200)]
Merge branch 'next' into branch-2.1
* next:
lvmstrap: Change diskinfo to use GenerateTable
Get rid of constants.RAPI_ENABLE
Remove references to utils.debug
ganeti-rapi, replace hardcoded exit value
Add the bind-address option to ganeti-rapi
noded: Abstract hard-coded sys.exit value
Add an example "ethers" hook
burnin: move batch init/commit into a decorator
burnin: move instance alive checks to a decorator
burnin: Implement retryable operations
Ignore vim swap files
burnin: fix removal errors hiding real errors
Stephen Shirley [Thu, 23 Jul 2009 17:14:20 +0000 (19:14 +0200)]
lvmstrap: Change diskinfo to use GenerateTable
This way the produced table is formatted nicely.
Signed-off-by: Stephen Shirley <diamond@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 23 Jul 2009 13:41:02 +0000 (14:41 +0100)]
Get rid of constants.RAPI_ENABLE
This constant is unused, except in qa. Removing it since it's always True.
This patch also removes the unused qa_rapi.PrintRemoteAPIWarning
function, and removes a comment about temporary constants "until we have
cluster parameters".
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Thu, 23 Jul 2009 08:43:57 +0000 (10:43 +0200)]
cmdlib: Add __init__ to Tasklet class
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Thu, 23 Jul 2009 08:58:33 +0000 (09:58 +0100)]
Remove references to utils.debug
Various modules set it to True when called in debugging mode, but the
utils module supports no such global.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 23 Jul 2009 07:55:53 +0000 (08:55 +0100)]
ganeti-rapi, replace hardcoded exit value
substitute exit(1) with exit(constants.EXIT_FAILURE).
Also fix a wrongly indented line.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 23 Jul 2009 07:48:14 +0000 (08:48 +0100)]
Add the bind-address option to ganeti-rapi
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 21 Jul 2009 17:24:47 +0000 (19:24 +0200)]
cmdlib: Move LUMigrateInstance functionality to tasklet
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 21 Jul 2009 12:28:28 +0000 (14:28 +0200)]
gnt-node: Use new opcode to evacuate nodes
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 22 Jul 2009 17:31:15 +0000 (19:31 +0200)]
Add new opcode to evacuate nodes
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 21 Jul 2009 16:17:20 +0000 (18:17 +0200)]
cmdlib: Convert _DiskReplacer to tasklet
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 21 Jul 2009 15:45:45 +0000 (17:45 +0200)]
cmdlib: Function to get all secondary instances on a certain node
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 22 Jul 2009 17:07:54 +0000 (18:07 +0100)]
noded: Abstract hard-coded sys.exit value
On machines without the ssl file noded exits with '5'.
Changing this to constants.EXIT_NOTCLUSTER.
Also utils.GetNodeDaemonPort hasn't raised errors.ConfigurationError for
a while, so removing that try/except block.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Wed, 22 Jul 2009 13:57:09 +0000 (15:57 +0200)]
cmdlib: Add tasklet support to logical unit base class
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Wed, 22 Jul 2009 11:01:20 +0000 (13:01 +0200)]
cmdlib: Add tasklet base class
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 21 Jul 2009 12:53:40 +0000 (13:53 +0100)]
Add an example "ethers" hook
This hook can be used to update /etc/ethers with instance's mac
addresses. A dhcp server on the nodes can then serve to the instances
their correct address. (This has been tested with dnsmasq's dhcp
implementation)
Signed-off-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Thu, 16 Jul 2009 14:27:17 +0000 (16:27 +0200)]
ganeti-confd design doc
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 21 Jul 2009 09:55:19 +0000 (11:55 +0200)]
burnin: move batch init/commit into a decorator
Many burnin steps initialize the batch queue at the beginning and commit
it at the end of their operation. This patch moves this code to a
decorator, in order to reduce redundant code.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
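A decorator of this kind might look roughly like the following. All names here (BatchStep, SketchBurner, StartBatch, CommitQueue) are hypothetical and only illustrate the pattern; they are not taken from burnin itself.

```python
import functools


def BatchStep(fn):
  """Initialize the batch queue before fn and commit it afterwards."""
  @functools.wraps(fn)
  def wrapper(self, *args, **kwargs):
    self.StartBatch()
    result = fn(self, *args, **kwargs)
    self.CommitQueue()
    return result
  return wrapper


class SketchBurner(object):
  """Hypothetical stand-in for the burnin class."""

  def __init__(self):
    self.queued = []
    self.committed = []

  def StartBatch(self):
    self.queued = []

  def CommitQueue(self):
    self.committed.extend(self.queued)
    self.queued = []

  @BatchStep
  def CreateInstances(self, names):
    # The step only queues work; init and commit come from the decorator.
    for name in names:
      self.queued.append("create %s" % name)
```

Each step then contains only its own logic, and the init/commit boilerplate lives in one place.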
Iustin Pop [Tue, 21 Jul 2009 09:41:12 +0000 (11:41 +0200)]
burnin: move instance alive checks to a decorator
Many burnin steps do a manual check of instance aliveness, via duplicate
code. This patch moves this code to a decorator.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 21 Jul 2009 08:53:27 +0000 (10:53 +0200)]
burnin: Implement retryable operations
Some burnin steps are idempotent: e.g. reinstalling an instance (from
burnin's p.o.v.) can be done multiple times without any side-effects that
would affect later burnin steps. As such, failing the whole burnin
process due to a reinstall failure is undesirable.
This patch modifies burnin by marking each opcode (in case of individual
execution) and job set retryable or not. Retryable actions will be
retried up to a number of times, after which we give up and return
failure.
One side-effect is that in case of full-failure in retryable job sets we
lose the original exception (but we do log its string format), so we
have a little bit less information in this case.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
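The retry behaviour described above can be sketched as follows. The function name, its parameters, and the use of RuntimeError are assumptions for the example. As in the description, the final failure is a generic error and the original exception survives only as a logged string.

```python
def RunWithRetries(fn, retries, log):
  """Call fn(); on failure, retry up to `retries` more times.

  log is any callable taking one string (hypothetical logging hook).
  """
  attempt = 0
  while True:
    try:
      return fn()
    except Exception as err:
      if attempt >= retries:
        # Out of retries: log the original error text and give up.
        log("giving up after %d attempts: %s" % (attempt + 1, err))
        raise RuntimeError("operation failed: %s" % err)
      attempt += 1
      log("retry %d/%d after error: %s" % (attempt, retries, err))
```

Only operations known to be idempotent should be wrapped this way; anything else would be called once and allowed to fail immediately.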
Guido Trotter [Thu, 16 Jul 2009 15:30:44 +0000 (17:30 +0200)]
Generate a shared HMAC key at cluster init time
This key is shared on all nodes (via cmdlib._RedistributeAncillaryFiles)
and will be used for HMAC authentication of confd messages.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
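For illustration, HMAC authentication with a shared key could be sketched with the standard library as below. The key length, the SHA-256 digest, and the way the salt is prepended to the message are assumptions for this example, not details taken from the patch.

```python
import hashlib
import hmac
import os


def GenerateHmacKey(length=32):
  """Return a random key as a hex string (length in bytes is an assumption)."""
  return os.urandom(length).hex()


def SignMessage(key, salt, message):
  """Return the hex HMAC of salt + message under the shared key."""
  data = (salt + message).encode("utf-8")
  return hmac.new(key.encode("ascii"), data, hashlib.sha256).hexdigest()


def VerifyMessage(key, salt, message, signature):
  """Check a signature using a constant-time comparison."""
  expected = SignMessage(key, salt, message)
  return hmac.compare_digest(expected, signature)
```

Because every node holds the same key, any node can verify a message signed by any other node.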
Michael Hanselmann [Mon, 20 Jul 2009 15:49:55 +0000 (17:49 +0200)]
Fix unittests broken by commit
2bb5c9115f
File "../test/ganeti.hooks_unittest.py", line 239, in setUp
self.lu = FakeLU(FakeProc(), self.op, self.context, None)
File "…/ganeti/cmdlib.py", line 92, in __init__
self.LogStep = processor.LogStep
AttributeError: FakeProc instance has no attribute 'LogStep'
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 20 Jul 2009 15:38:01 +0000 (17:38 +0200)]
cmdlib: Move code doing disk replacements into separate class
This class will be used for a new opcode to evacuate nodes.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 14 Jul 2009 13:23:36 +0000 (15:23 +0200)]
cmdlib: Pass config and rpc objects directly to IAllocator
Before IAllocator would access them using “self.lu.cfg” and “self.lu.rpc”.
It shouldn't know about the internals of the LU.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 20 Jul 2009 11:26:39 +0000 (13:26 +0200)]
Ignore vim swap files
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Mon, 20 Jul 2009 10:29:55 +0000 (12:29 +0200)]
Fix backend import errors from GetHypervisorClass
The merge of commit 360b0dc into branch-2.1 broke import of backend,
since it uses hypervisor.GetHypervisor() which returns an instance of
the hypervisor. Some of the hypervisors create directories at init time,
thus the import of backend failed due to this chain if it's not done on a
(proper) ganeti node, such as during unittest time.
This patch adds in hypervisor a GetHypervisorClass() function, which
returns the class not the instance of the hypervisor, and uses that in
_BuildUploadFiles(). The existing GetHypervisor is then changed to use
this function.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
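The class-versus-instance distinction can be sketched like this. FakeHypervisor, its ANCILLARY_FILES attribute, the registry dict, and the counter are hypothetical; the counter merely stands in for init-time side effects such as creating directories.

```python
class FakeHypervisor(object):
  """Hypothetical hypervisor whose constructor has side effects."""
  ANCILLARY_FILES = ["/hypothetical/path/to/file"]
  instances_created = 0

  def __init__(self):
    # Stands in for work like creating directories at init time.
    FakeHypervisor.instances_created += 1


_HYPERVISOR_MAP = {"fake": FakeHypervisor}


def GetHypervisorClass(ht_kind):
  """Return the hypervisor class without instantiating it."""
  if ht_kind not in _HYPERVISOR_MAP:
    raise KeyError("Unknown hypervisor type '%s'" % ht_kind)
  return _HYPERVISOR_MAP[ht_kind]


def GetHypervisor(ht_kind):
  """Return a hypervisor instance, built on top of the class lookup."""
  return GetHypervisorClass(ht_kind)()
```

Code that needs only class-level data at import time can use the class lookup and avoid triggering the constructor.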
Iustin Pop [Sun, 19 Jul 2009 18:34:08 +0000 (20:34 +0200)]
burnin: fix removal errors hiding real errors
A long-standing bug in burnin makes errors during the removal phase
(e.g. because an import has failed, or because the initial creation has
failed) hide the original error.
This patch suppresses removal errors if we are already in ‘has_err’
mode, and otherwise it displays them normally.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jul 2009 18:26:48 +0000 (20:26 +0200)]
Merge branch 'next' into branch-2.1
Conflicts:
lib/backend.py: non-trivial conflict but easy to solve
Iustin Pop [Sun, 19 Jul 2009 13:27:12 +0000 (15:27 +0200)]
backend: Only build once the list of upload files
The list of upload files is built currently at every UploadFile() call.
This patch moves it to a separate variable which is initialized only
once.
This won't make much difference but I regard it as cleanup.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jul 2009 16:47:09 +0000 (18:47 +0200)]
Merge commit 'origin/next' into branch-2.1
Conflicts:
lib/cli.py: trivial extra empty line
Iustin Pop [Wed, 10 Jun 2009 15:37:51 +0000 (17:37 +0200)]
Fix gnt-instance reinstall
Commit
55efe6dabe48e5c37dc1ff6099e0bb8afde7a468 "Convert instance
reinstall to multi instance model" actually broke instance reinstall for
single-instance cases. This one-liner fixes it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit
b6e243ab010d1df2b6c211b9edc9fe1978e52391)
Iustin Pop [Sun, 19 Jul 2009 14:40:57 +0000 (16:40 +0200)]
Fix a couple of epydoc warnings
It seems epydoc needs fully-qualified references, and doesn't deal with
relative ones (not even in the current module) if there are any
ambiguities.
There are other epydoc warnings, in the rapi docstrings, but those are
left as-is as they're removed in 2.1.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jul 2009 01:45:45 +0000 (03:45 +0200)]
job queue: fix loss of finalized opcode result
Currently, unclean master daemon shutdown overwrites all of a job's
opcode status and result with error/None. This is incorrect, since
any already finished opcode(s) should have their status and result
preserved, and only not-yet-processed opcodes should be marked as
‘error’. Cancelling jobs between opcodes does the same (but this is not
allowed currently by the code, so it's not as important as unclean
shutdown).
This patch adds a new _QueuedJob function that only overwrites the
status and result of non-finalized opcodes, which is then used in job queue
init and in the cancel job functions. The patch also adds some comments
and a new set of constants in constants.py highlighting the finalized vs.
non-finalized opcode statuses.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
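The intended behaviour, where finished opcodes keep their status and result and only the remaining ones are overwritten, can be sketched as follows. The status strings, the contents of the finalized set, and the class and function names are assumptions for the example.

```python
OP_STATUS_QUEUED = "queued"
OP_STATUS_RUNNING = "running"
OP_STATUS_SUCCESS = "success"
OP_STATUS_ERROR = "error"
OP_STATUS_CANCELED = "canceled"

# Statuses after which an opcode is never touched again (assumed set).
OPS_FINALIZED = frozenset([OP_STATUS_SUCCESS, OP_STATUS_ERROR,
                           OP_STATUS_CANCELED])


class SketchOpCode(object):
  """Hypothetical queued opcode holding only status and result."""
  __slots__ = ["status", "result"]

  def __init__(self, status, result=None):
    self.status = status
    self.result = result


def MarkUnfinishedOps(ops, status, result):
  """Overwrite status/result only for opcodes that are not finalized."""
  for op in ops:
    if op.status in OPS_FINALIZED:
      continue
    op.status = status
    op.result = result
```

After an unclean shutdown, a job whose first opcode succeeded would keep that result, and only the remaining opcodes would be marked as errors.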
Iustin Pop [Sat, 18 Jul 2009 23:51:04 +0000 (01:51 +0200)]
Switch gnt-debug submit-job to JobExecutor
Currently gnt-debug submits jobs individually, but in 2.1 JobExecutor
uses the optimized SubmitManyJobs luxi call and as such should be used
whenever multiple jobs need to be submitted.
This patch converts gnt-debug submit-job to use it and also removes an
extra empty line in the JobExecutor class.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 22 May 2009 12:27:46 +0000 (14:27 +0200)]
Convert instance reinstall to multi instance model
This patch converts ‘gnt-instance reinstall’ from single-instance to
multi-instance model; since this is dangerous, it's required to pass
“--force --force-multiple” to skip the confirmation.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit
55efe6dabe48e5c37dc1ff6099e0bb8afde7a468)
Iustin Pop [Fri, 22 May 2009 11:01:35 +0000 (13:01 +0200)]
gnt-instance batch-create: use the job executor
This small patch changed the batch create functionality to use the job
executor instead of single-job submits.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit
d4dd4b74a786cd0f31e5fc530f140aaf438c68e7)
Iustin Pop [Fri, 22 May 2009 10:25:31 +0000 (12:25 +0200)]
Modify cli.JobExecutor to use SubmitManyJobs
This patch changes the generic "multiple job executor" to use the many
jobs submit model, which automatically makes all its users use the new
model.
This makes, for example, startup/shutdown of a full cluster much more
logical (all the submitted job IDs are visible fast, and then waiting
for them proceeds normally).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit
23b4b983afc9b9e81d558f06e4e0cde53703e575)
Iustin Pop [Thu, 21 May 2009 16:02:42 +0000 (18:02 +0200)]
Add a luxi call for multi-job submit
As a workaround for the job submit timeouts that we have, this patch
adds a new luxi call for multi-job submit; the advantage is that all the
jobs are added to the queue first, and only afterwards can the workers start
processing them.
This is definitely faster than per-job submit, where the submission of
new jobs competes with the workers processing jobs.
On a pure no-op OpDelay opcode (not on master, not on nodes), we have:
- 100 jobs:
- individual: submit time ~21s, processing time ~21s
- multiple: submit time 7-9s, processing time ~22s
- 250 jobs:
- individual: submit time ~56s, processing time ~57s
run 2: ~54s ~55s
- multiple: submit time ~20s, processing time ~51s
run 2: ~17s ~52s
which shows that we indeed gain on the client side, and maybe even on
the total processing time for a high number of jobs. For just 10 or so I
expect the difference to be just noise.
This will probably require increasing the timeout a little when
submitting too many jobs - 250 jobs at ~20 seconds is close to the
current rw timeout of 60s.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit
2971c9132b8b798178921a389b18d893edec06fb)
Iustin Pop [Sun, 19 Jul 2009 02:12:11 +0000 (04:12 +0200)]
job queue: fix interrupted job processing
If a job with more than one opcode is being processed, and the master
daemon crashes between two opcodes, we have the first N opcodes marked
successful, and the rest marked as queued. This means that the overall
job status is queued, and thus on master daemon restart it will be
resent for completion.
However, the RunTask() function in jqueue.py doesn't deal with
partially-completed jobs. This patch makes it simply skip the opcodes
that have already completed successfully.
An alternative option would be to not mark partially-completed jobs as
QUEUED but instead RUNNING, which would result in aborting of the job at
restart time.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jul 2009 02:01:16 +0000 (04:01 +0200)]
Fix an error path in job queue worker's RunTask
In case the job fails, we try to set the job's run_op_idx to -1.
However, this is a wrong variable, which wasn't detected until the
__slots__ addition. The correct variable is run_op_index.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
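Why __slots__ exposes this kind of typo can be shown with a small sketch. The class and helper names are hypothetical; only run_op_index and run_op_idx come from the description above.

```python
class SlottedJob(object):
  """Hypothetical job object restricted to declared attributes."""
  __slots__ = ["run_op_index"]

  def __init__(self):
    self.run_op_index = 0


class PlainJob(object):
  """Same object without __slots__, so any attribute can be created."""

  def __init__(self):
    self.run_op_index = 0


def TrySetMisspelled(job):
  """Assign to the misspelled name; report whether it was accepted."""
  try:
    job.run_op_idx = -1
  except AttributeError:
    return False
  return True
```

On the plain object the misspelled assignment silently creates a new attribute and leaves run_op_index unchanged. On the slotted object it raises AttributeError, which is how the bug became visible.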
Iustin Pop [Sun, 19 Jul 2009 02:58:23 +0000 (04:58 +0200)]
Merge commit 'origin/branch-2.1' into feature/containers
Iustin Pop [Fri, 17 Jul 2009 15:16:51 +0000 (17:16 +0200)]
Add __slots__ on objects in jqueue
Adding slots to _QueuedOpCode decreases memory usage (of these objects)
by roughly four times. It is a lesser change for _QueuedJobs.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>