Guido Trotter [Mon, 8 Jun 2009 10:32:51 +0000 (11:32 +0100)]
Cluster: add nicparams, and update them on upgrade
This also migrates the default bridge from the cluster object to the nic
params, at load time. Since we don't support changing the default bridge
after cluster init, this is ok for now. In the future we'll make
gnt-cluster init --bridge do the right thing, after the nic parameter
implementation is finished.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 8 Jun 2009 11:36:14 +0000 (12:36 +0100)]
Add NIC.CheckParameterSyntax
This function will be used to check the NIC parameters for validity.
Unittests are included.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 5 Jun 2009 14:38:38 +0000 (15:38 +0100)]
nic parameters: constants
Introducing the constants used for implementing nic parameters in
Ganeti, according to the 2.1 design.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 8 Jun 2009 10:27:26 +0000 (11:27 +0100)]
Abstract Param upgrade from cluster.UpgradeConfig
A new UpgradeGroupedParams is used to upgrade all the profiles for one
parameter filling in the default values, or creating the whole dict
anew, should it be missing. This is used only for beparams, currently,
but will be used at least for nicparams and diskparams as well.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
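The grouped-parameter upgrade described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual Ganeti code: the lower-case function names and the literal value of PP_DEFAULT are assumptions made for this example.

```python
import copy

# Name of the default profile; the literal value is an assumption.
PP_DEFAULT = "default"


def fill_dict(defaults, custom):
    """Return a copy of defaults, overridden by the entries in custom."""
    result = copy.deepcopy(defaults)
    result.update(custom)
    return result


def upgrade_grouped_params(target, defaults):
    """Fill every profile in target with defaults.

    If the whole dict is missing, create it anew with a single default
    profile holding the default values.
    """
    if target is None:
        return {PP_DEFAULT: copy.deepcopy(defaults)}
    return {name: fill_dict(defaults, params)
            for name, params in target.items()}
```

Values already present in a profile win over the defaults, so an upgrade only ever adds keys and never overwrites a setting chosen by the administrator.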
Guido Trotter [Mon, 8 Jun 2009 09:58:59 +0000 (10:58 +0100)]
Change BEGR_DEFAULT to PP_DEFAULT
This way the same constant can represent the default profile also for
nic, disk and OS parameters.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 5 Jun 2009 14:57:22 +0000 (15:57 +0100)]
Move FillDict at module level
This way it can be also used by scripts and other object types.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 8 Jun 2009 13:23:52 +0000 (14:23 +0100)]
Fix a typo in InitCluster
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Mon, 8 Jun 2009 15:48:48 +0000 (17:48 +0200)]
Convert call_blockdev_removechildren to new result
This patch converts blockdev_removechildren to new result type and
slightly changes a message in addchildren to match this (paired)
function.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 8 Jun 2009 15:38:31 +0000 (17:38 +0200)]
Convert call_blockdev_addchildren to new result
This patch converts the blockdev_addchildren rpc call to the new result
format.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 8 Jun 2009 15:30:44 +0000 (17:30 +0200)]
Convert rpc call_blockdev_rename to (status, data)
This small patch converts the call_blockdev_rename to the new result
type.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 8 Jun 2009 15:20:56 +0000 (17:20 +0200)]
A small makefile rule to create a TAGS file
This helps emacs users ☺
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 4 Jun 2009 12:48:35 +0000 (13:48 +0100)]
2.1 design: non bridged instances support
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 27 May 2009 15:23:44 +0000 (16:23 +0100)]
2.1 design: disk/net parameters
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 28 May 2009 09:25:36 +0000 (10:25 +0100)]
Upgrade be/hv params with default values
From time to time we're adding new be or hv parameters. With this patch
missing parameters get set to the default value when loading the cluster
object. This patch version also considers the case when hv/be params
don't exist at all, and fixes a broken unit test triggered in that
case.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 28 May 2009 09:12:58 +0000 (10:12 +0100)]
Add cluster-init --no-etc-hosts parameter
If --no-etc-hosts is passed in at cluster init time we set a new
parameter in the cluster's object to false, and avoid adding nodes to
the hosts file. The UpgradeConfig function is used to set the value to
True, when upgrading from an old configuration version.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 28 May 2009 09:10:11 +0000 (10:10 +0100)]
objects: add configuration upgrade system
Add a very basic configuration update mechanism to objects.
An object can define the UpgradeConfig method, which will be called at
init time, and use it to fill in missing defaults in the configuration.
In the future we may want to make it more complex, for example adding
the config version, but for now a basic solution will do.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
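A minimal sketch of such an upgrade hook, under stated assumptions: the class layout, the from_dict loader and the modify_etc_hosts attribute name are hypothetical and only illustrate the idea of filling in missing defaults when an object is created from stored configuration.

```python
class ConfigObject:
    """Minimal stand-in for a serializable configuration object."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    def upgrade_config(self):
        """Hook for subclasses; the base class has nothing to fill in."""

    @classmethod
    def from_dict(cls, data):
        # Build the object, then let it fill in whatever is missing.
        obj = cls(**data)
        obj.upgrade_config()
        return obj


class Cluster(ConfigObject):
    def upgrade_config(self):
        # Old configurations do not carry this flag; default it to True.
        if getattr(self, "modify_etc_hosts", None) is None:
            self.modify_etc_hosts = True
```

With this shape, Cluster.from_dict({"cluster_name": "c1"}) yields an object whose modify_etc_hosts is True, while an explicitly stored False is left alone.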
Guido Trotter [Thu, 28 May 2009 10:25:02 +0000 (11:25 +0100)]
Convert UploadFile (and its callers) to new rpc
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 28 May 2009 10:05:22 +0000 (11:05 +0100)]
UploadFile: allow ancillary files
Currently UploadFile is restricted to a static set of files, and thus
gnt-cluster redist-conf (silently) fails to upload all config files.
With this patch we add the new static files we distribute, and all
hypervisor-provided ancillary files.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 26 May 2009 17:41:19 +0000 (19:41 +0200)]
Add a node powercycle command
This (somewhat big) patch adds support for remotely rebooting the nodes
via whatever support the hypervisor has for such a concept.
For KVM/fake (and containers in the future) this just uses sysrq plus a
‘reboot’ call if the sysrq method failed. For Xen, it first tries the
above, and then Xen-hypervisor reboot (we first try sysrq since that
just requires opening a file handle, whereas xen reboot means launching
an external utility).
The user interface is:
# gnt-node powercycle node5
Are you sure you want to hard powercycle node node5?
y/[n]/?: y
Reboot scheduled in 5 seconds
The node reboots hopefully after sending the reply. In case the clock is
broken, “time.sleep(5)” might take ages (but then I suspect SSL
negotiation wouldn't work).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 26 May 2009 15:00:16 +0000 (17:00 +0200)]
Add a new CONFIRM_OPT option to cli.py
Today we are not very consistent as to what ‘--force’ represents:
sometimes confirmation, sometimes forcing a possible dangerous option,
etc.
This patch adds a new ‘--yes’ option that should be used for all simple
confirmations of the kind “yes, I really want to remove the instance”.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Fri, 22 May 2009 14:35:46 +0000 (15:35 +0100)]
IsNormAbsPath and users, use "normalized" term
We used to refer to normalized paths as "normal" which might be
confusing. This fixes the syntax in all current IsNormAbsPath users and
in the docstring.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 22 May 2009 12:48:27 +0000 (13:48 +0100)]
Hypervisors: make absolute path checking strict
Use the new utils.IsNormAbsPath function, rather than just os.path.isabs.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 22 May 2009 12:35:30 +0000 (13:35 +0100)]
Add utils.IsNormAbsPath function
Currently most of the time we check for absolute path, but that doesn't
protect us from some invalid paths. In some places we should be more
strict, and this function should help us do so.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
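One plausible way to express the stricter check from the commit above is to require that the path is absolute and that normalizing it changes nothing. The function name below is a hypothetical lower-case stand-in, and the behaviour shown assumes a POSIX system.

```python
import os.path


def is_norm_abs_path(path):
    """Return True only for absolute paths that are already normalized."""
    return os.path.isabs(path) and os.path.normpath(path) == path
```

A plain os.path.isabs check accepts "/etc/../ganeti" or "/etc/ganeti/"; the combined check rejects both, because normalization would rewrite them.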
Iustin Pop [Fri, 22 May 2009 12:27:46 +0000 (14:27 +0200)]
Convert instance reinstall to multi instance model
This patch converts ‘gnt-instance reinstall’ from single-instance to
multi-instance model; since this is dangerous, it's required to pass
“--force --force-multiple” to skip the confirmation.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 22 May 2009 11:01:35 +0000 (13:01 +0200)]
gnt-instance batch-create: use the job executor
This small patch changed the batch create functionality to use the job
executor instead of single-job submits.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 22 May 2009 10:25:31 +0000 (12:25 +0200)]
Modify cli.JobExecutor to use SubmitManyJobs
This patch changes the generic "multiple job executor" to use the many
jobs submit model, which automatically makes all its users use the new
model.
This makes, for example, startup/shutdown of a full cluster much more
logical (all the submitted job IDs are visible fast, and then waiting
for them proceeds normally).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Thu, 21 May 2009 15:23:55 +0000 (16:23 +0100)]
KVM: add the network script to the ancillary files
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 21 May 2009 15:15:50 +0000 (16:15 +0100)]
Xen: add ancillary files
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 20 May 2009 13:02:28 +0000 (14:02 +0100)]
_RedistributeAncillaryFiles: add hypervisor files
Each hypervisor can declare additional files to be shipped to all nodes.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 20 May 2009 11:18:32 +0000 (12:18 +0100)]
_RedistributeAncillaryFiles function
This function is shared between AddNode and RedistributeConfig, and used
to redistribute additional files which are inherently part of the
cluster configuration.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 13 May 2009 10:33:57 +0000 (11:33 +0100)]
Remove the HTS_COPY_VNC_PASSWORD constant/feature
Currently just for xen-hvm we copy the vnc password on node-add. This
will be changed for 2.1 with a more advanced gnt-cluster redist-conf
functionality which is going to be used by node-add as well.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 21 May 2009 15:21:27 +0000 (16:21 +0100)]
KVM: replace hardcoded network script path
Currently the kvm automatic network script can be overridden by a
user-supplied /etc/ganeti/kvm-vif-bridge script. We keep this
functionality but move the hardcoded path to a constant, dependent also
on SYSCONFDIR.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Thu, 21 May 2009 16:02:42 +0000 (18:02 +0200)]
Add a luxi call for multi-job submit
As a workaround for the job submit timeouts that we have, this patch
adds a new luxi call for multi-job submit; the advantage is that all the
jobs are added in the queue and only afterwards can the workers start
processing them.
This is definitely faster than per-job submit, where the submission of
new jobs competes with the workers processing jobs.
On a pure no-op OpDelay opcode (not on master, not on nodes), we have:
- 100 jobs:
- individual: submit time ~21s, processing time ~21s
- multiple: submit time 7-9s, processing time ~22s
- 250 jobs:
- individual: submit time ~56s, processing time ~57s
run 2: ~54s ~55s
- multiple: submit time ~20s, processing time ~51s
run 2: ~17s ~52s
which shows that we indeed gain on the client side, and maybe even on
the total processing time for a high number of jobs. For just 10 or so I
expect the difference to be just noise.
This will probably require increasing the timeout a little when
submitting too many jobs - 250 jobs at ~20 seconds is close to the
current rw timeout of 60s.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 20 May 2009 16:19:44 +0000 (18:19 +0200)]
Doc fixes for RAPI
After moving the documentation from the .py files to .rst, we had some
cleanups to do.
This fixes the formatting of the comments, improves them a little, and
removes deprecated info (DOC_URI) from the python source.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Wed, 20 May 2009 13:16:16 +0000 (14:16 +0100)]
Merge branch 'master' into branch-2.1
Iustin Pop [Tue, 19 May 2009 13:01:17 +0000 (15:01 +0200)]
Release 2.0rc5
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 18 May 2009 19:15:41 +0000 (21:15 +0200)]
Move to data-based hvparam checks instead of code
Currently the hypervisor parameters are checked using hard-coded snippets in
each hypervisor. However, most parameter checks fall into three cases:
- file check
- directory check
- string value in a set
And the remaining ones are checked using simple functions.
This patch moves to a declarative-style for these parameter checks; in
hv_base we add the necessary infrastructure for these checks, and the
above common cases.
This translates into complete removal of the Check/Verify functions for
the Xen hypervisors, and a drastic reduction for the KVM one (which has
inter-parameter dependencies and thus can't use a simple table).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 19 May 2009 13:43:54 +0000 (15:43 +0200)]
Merge commit 'origin/next' into branch-2.1
* commit 'origin/next': (25 commits)
Move more hypervisor strings into constants
Add -H/-B startup parameters to gnt-instance
call_instance_start: add optional hv/be parameters
Fix gnt-job list argument handling
Instance reinstall: don't mix up errors
Don't check memory at startup if instance is up
gnt-cluster modify: fix --no-lvm-storage
LUSetClusterParams: improve volume group removal
gnt-cluster info: show more cluster parameters
LUQueryClusterInfo: return a few more fields
Add the new DRBD test files to the Makefile
Remove some superfluous imports
Make Python interpreter selectable for test scripts
Pass optional arguments to the daemons
ganeti.initd: include defaults file, if present
Fix ;; indentation in the main initd loop
Avoid DeprecationWarning on Python >= 2.6
ganeti-noded: add bind address option
Fix compatibility with DRBD 8.3
Fix compatibility with DRBD 8.2
...
Iustin Pop [Mon, 18 May 2009 19:15:41 +0000 (21:15 +0200)]
Move more hypervisor strings into constants
This patch adds constants for the mouse and boot order strings; while
there are still some issues remaining, we're trying to clean up hardcoded
strings from the hypervisors.
Since the formatting of frozensets is currently wrong, we also add a
utility function for this and change all the error messages to use it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 19 May 2009 11:23:31 +0000 (13:23 +0200)]
watcher: try to restart the master if down
Bugs in either our code or in associated libraries can bring the master daemon
down, and this (due to the 2.0 architecture) stops all work on the cluster.
Since the watcher already does periodic checks on the cluster, we modify
it to try to start the master automatically in case of failures to
connect. This will be tried only once per cycle.
Also, in this case, we modify the code so that the watcher status file
is not updated - its timestamp will thus reflect the time of last
successful connection to the master.
Side note: the except errors.ConfigurationError part could be cleaned
up, since in 2.0 we don't usually get that directly, and if we do it's
an error and we shouldn't touch the file anyway; but that is not a rc5
change.
Signed-off-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 19 May 2009 09:35:56 +0000 (11:35 +0200)]
IAllocator: export total disk size for instances
This patch adds for current instance a ‘disk_space_total’ key, similar
to the key for the new instance in case of new allocations.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 18 May 2009 17:21:44 +0000 (19:21 +0200)]
Add -H/-B startup parameters to gnt-instance
This patch modifies the start instance script, opcode and logical unit
to support temporary startup parameters.
Different from 1.2, where only the kernel arguments were supporting
changes (and thus xen-pvm specific), this version supports changing all
hypervisor and backend parameters (with appropriate checks).
This is much more flexible, and allows for example:
- start with different, temporary kernel
- start with different memory size
Note: in later versions, this should be extended to cover disk
parameters as well (e.g. start with drbd without flushes, start with
drbd in async mode, etc.).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 18 May 2009 16:39:29 +0000 (18:39 +0200)]
call_instance_start: add optional hv/be parameters
This patch modifies the rpc.call_instance_start - the master side - to
take optional hv/be parameters. The noded side is unchanged and
oblivious to the change.
This will allow implementation of single-user capability and such on
startup (temporary, as opposed to permanent).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Mon, 18 May 2009 15:57:25 +0000 (16:57 +0100)]
Fix gnt-job list argument handling
Currently QueryJob returns "None" when a wrong job ID is passed.
Handle this in gnt-job list, by printing an error for each wrong job,
and still giving output for all the jobs which actually do exist.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
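The handling described in the commit above amounts to pairing each requested job ID with its result and reporting the missing ones separately. The sketch below assumes the query returns one entry per requested ID, with None for unknown IDs; the function name and the output format are hypothetical.

```python
def format_job_list(job_ids, results):
    """Split query results into printable rows and per-job errors."""
    lines, errors = [], []
    for job_id, row in zip(job_ids, results):
        if row is None:
            # Unknown job: report it, but keep going with the rest.
            errors.append("Job %s not found" % job_id)
        else:
            lines.append(" ".join(str(field) for field in row))
    return lines, errors
```

Collecting errors instead of aborting on the first one is what lets the command still print every job that does exist.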
Guido Trotter [Fri, 15 May 2009 08:44:13 +0000 (09:44 +0100)]
Instance reinstall: don't mix up errors
If the remote info rpc call fails we can't assume that the instance is
up.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Fri, 15 May 2009 08:42:55 +0000 (09:42 +0100)]
Don't check memory at startup if instance is up
Signed-off-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Thu, 14 May 2009 12:47:49 +0000 (13:47 +0100)]
2.1 design: add VNC console password changes
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 13 May 2009 12:49:11 +0000 (13:49 +0100)]
2.1 design: OS parameters
Initial design for the OS parameter changes proposed for 2.1.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Thu, 14 May 2009 15:00:32 +0000 (17:00 +0200)]
Move HVM's device_model to a hypervisor parameter
This moves yet another hardcoded value to a hypervisor parameter. I
removed the 64/32 difference as it doesn't seem valid to me - it's more
of a local site config rather than arch config.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 14 May 2009 12:58:32 +0000 (14:58 +0200)]
Implement the KERNEL_PATH parameter for xen-hvm
For the xen-hvm hypervisor, the KERNEL_PATH parameter is needed but
today is hardcoded to a constant in the xen hypervisor library (argh!).
This patch moves this to a hypervisor parameter with the default value
being the current hardcoded path. This will allow cluster/instance
customisation based on the installed xen version.
This should fix Debian bug #528618.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Wed, 13 May 2009 11:15:52 +0000 (12:15 +0100)]
2.1 design: propose redistribute config changes
This patch proposes a mini-design to improve redistribute-config and
integrate it better with other logical units.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 12 May 2009 17:34:42 +0000 (18:34 +0100)]
gnt-cluster modify: fix --no-lvm-storage
Currently doing a gnt-cluster modify --no-lvm-storage is silently
ignored, as it passes a None value in vg_name, which is the same as not
modifying that parameter. Explicitly set the passed value to '', so the
non-true not-None value can be evaluated to actually remove a volume
group.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 12 May 2009 17:24:40 +0000 (18:24 +0100)]
LUSetClusterParams: improve volume group removal
Currently LUSetClusterParams will remove the volume group if the vg_name
field passed in is not true, but not None. Setting the target volume
group to False or the empty string, though, is a bad idea because it's
not a boolean value, and at cluster init we set it to None if
--no-lvm-storage is passed. With this fix we handle '' (or any other
non-None false value) as the "unset" value, but actually store None in
the config.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
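The three-way distinction described in the two commits above (not passed, explicitly unset, explicitly set) can be sketched as a small helper. The function name is hypothetical; only the None / false / true semantics come from the commit messages.

```python
def new_vg_name(current, requested):
    """Return the volume group name to store in the configuration."""
    if requested is None:
        # Parameter not passed at all: leave the setting unchanged.
        return current
    if not requested:
        # '' (or any other non-None false value) means "unset";
        # None is what actually gets stored.
        return None
    return requested
```

Storing None rather than '' keeps the modified cluster consistent with one that was initialised with --no-lvm-storage in the first place.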
Guido Trotter [Tue, 12 May 2009 17:08:06 +0000 (18:08 +0100)]
gnt-cluster info: show more cluster parameters
Even if we cannot modify all of them, they are useful information about
the current cluster.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 12 May 2009 17:00:48 +0000 (18:00 +0100)]
LUQueryClusterInfo: return a few more fields
Some fields can be set at cluster init, and perhaps even modified with
SetClusterParams but there's no way to know them. With this patch we
export them in the cluster info query.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 13 May 2009 10:50:39 +0000 (11:50 +0100)]
Specify another type of core changes
If a change modifies the way all/most LUs work it should also be
considered core.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Tue, 12 May 2009 13:34:10 +0000 (14:34 +0100)]
KVM: Abstract runtime file removal in a function
This removes some code which was duplicated in shutdown and migrate.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Sat, 9 May 2009 21:18:08 +0000 (23:18 +0200)]
Some small doc updates
We change some formatting to sphinx-specific, to show how the
documentation can be improved.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sat, 9 May 2009 21:18:07 +0000 (23:18 +0200)]
Move the glossary to a separate file
Currently we have an insignificant glossary at the end of the design-2.0
document. This patch moves it to a separate file with the goal that it
will grow and all files can refer to it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Tue, 12 May 2009 11:11:00 +0000 (12:11 +0100)]
KVMHypervisor: return memory and cpus as integers
Currently the KVM hypervisor returns strings for the memory and cpu
values, while the xen hypervisor returns integers. Make this uniform by
converting the values to integers in KVM as well.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 12 May 2009 11:07:18 +0000 (12:07 +0100)]
LUSetInstanceParam: don't assume memory is integer
LUSetInstanceParam currently assumes that the 'memory' value of a
call_instance_info result is an integer, while the rest of the code
explicitly converts it to int(). Converting it to int works around a
bug which prevents changing the memory allocation of a live instance if
the remote call returns the memory in string format.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Sat, 9 May 2009 21:18:06 +0000 (23:18 +0200)]
Switch the documentation to sphinx
This big patch converts the documentation build system to sphinx
(http://sphinx.pocoo.org/). Since that uses reStructuredText sources
too, there is no change (yet) in the documents themselves, just in the
build system.
As before, the docs are pre built by the maintainer, and the end-user
doesn't need sphinx or other rst tools to build the docs. Note that we
are not distributing PDFs, so building that will require the tools.
The docs will be stored under doc/html and the build system also needs an
extra directory doc/build. These are considered (by automake)
maintainer-related objects and are removed at maintainer-clean time.
The patch also fixes some small issues: add a docpng variable, add
doc/api (also generated by maintainer) in maintainer-clean-local, etc.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sat, 9 May 2009 21:18:05 +0000 (23:18 +0200)]
Convert from auto-generated RAPI docs to static
This patch removes the autogeneration of the RAPI docs from the code
(based on docstrings) and moves the current autogenerated output to
the rapi.rst file.
The reasons behind this are multiple:
- the build system becomes a little more simple (this could have been
achieved also by distributing the built documentation, though)
- it's hard to actually write documentation in docstrings; you have to
fit restructured text inside the docstrings, and this results in
not really nice output
- even by being close to the code, the documentation manages to get
out of sync (not paying attention to docstrings)
This will also help with the move to sphinx.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sat, 9 May 2009 21:48:43 +0000 (23:48 +0200)]
Add the new DRBD test files to the Makefile
These were forgotten in commit
01e2ce3a6e4ca68983f50dedaddd0d0fc7b77026,
and caused “make distcheck” to fail.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 11 May 2009 13:43:34 +0000 (15:43 +0200)]
Fix QA and documentation about no initrd case
In Ganeti 1.2, “none” was used to signify no initrd. In 2.0 we have
changed to “no_” as a prefix (i.e. “-H no_initrd_path”) and thus we
document this in the manpage.
The QA suite is changed accordingly.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 11 May 2009 13:30:14 +0000 (15:30 +0200)]
Remove an unused function
The _TransformPath function is not used anymore in 2.0, let's remove it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Tim Boring [Sun, 10 May 2009 14:27:04 +0000 (10:27 -0400)]
Exporting the instance network_port on the RAPI
Patch for adding network_port to the instance attributes exported by the
RAPI.
[iustin@google.com: slightly changed the formatting]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Tim Boring [Fri, 8 May 2009 18:10:42 +0000 (14:10 -0400)]
Minor patch to rapi documentation
Minor patch to clarify the URL necessary for accessing the RAPI.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Sat, 9 May 2009 20:30:14 +0000 (22:30 +0200)]
Small doc change in README
The version is 2.0, and we don't build PDFs by default, only HTML
files.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Carlos Valiente [Tue, 5 May 2009 13:17:52 +0000 (14:17 +0100)]
Remove some superfluous imports
This is for Python 2.6 compatibility.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Carlos Valiente [Tue, 5 May 2009 15:12:48 +0000 (16:12 +0100)]
Make Python interpreter selectable for test scripts
The Python interpreter used to run the test cases is hard-coded to be
/usr/bin/python. If we use the first one from $PATH instead, it is
much easier to test ganeti with other Python versions.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 5 May 2009 08:45:43 +0000 (10:45 +0200)]
Inform the OS create script of reinstalls
Sometimes reinstalls are slightly different than new installs. For
example certain partitions may need to be preserved across reinstalls.
In order to do that on a per-os basis we pass in the INSTANCE_REINSTALL
variable to inform the create script about when a reinstall is
happening.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 5 May 2009 10:07:06 +0000 (12:07 +0200)]
Add initial 2.1 design doc
This document contains a skeleton for the 2.1 design process.
For now it just has introductory paragraphs and a structure for the
various areas' design, but some sections still don't have a text, as
we're still in the early design phases.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 5 May 2009 13:58:49 +0000 (15:58 +0200)]
Pass optional arguments to the daemons
These can be set in the defaults file, default to no arguments being
passed, and make it easy for local installation to customize the way the
ganeti daemons are called.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin <iustin@google.com>
Guido Trotter [Tue, 5 May 2009 13:53:37 +0000 (15:53 +0200)]
ganeti.initd: include defaults file, if present
In the example init script we'll execute an optional defaults file to
make it easier to add local customizations to the ganeti startup.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin <iustin@google.com>
Guido Trotter [Tue, 5 May 2009 14:07:00 +0000 (16:07 +0200)]
Fix ;; indentation in the main initd loop
Currently two of the ;; ending the case bodies are not indented with
anything. Reindent all of them to the body of the loop, as it's done
somewhere else in the init script.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin <iustin@google.com>
Carlos Valiente [Tue, 5 May 2009 14:43:04 +0000 (15:43 +0100)]
Avoid DeprecationWarning on Python >= 2.6
Python 2.6 complains about module 'sha' being deprecated. It makes
execution of Ganeti commands a bit annoying, and when you run
'ganeti-watcher' in cron jobs, you get a mail message after every
execution.
Tests pass under Python 2.6 and Python 2.4.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 5 May 2009 10:29:10 +0000 (12:29 +0200)]
ganeti-noded: add bind address option
This allows ganeti-noded to bind only on one interface rather than all
the ones on the machine. The default behaviour doesn't change.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 5 May 2009 10:48:50 +0000 (12:48 +0200)]
Fix compatibility with DRBD 8.3
DRBD 8.3 changes two more things compared to 8.2:
- /proc/drbd format changed in multiple ways; the part we're
interested in is the ‘st:’ to ‘ro:’ change (named in the changelog as
“Renamed 'state' to 'role'”)
- “drbdsetup /dev/drbdN show” changed the ‘device’ stanza from:
device "/dev/drbd0";
to:
device minor 0;
This patch fixes both of these and adds data files and unittests for DRBD
8.3.1.
Signed-off-by: Iustin Pop <iustin@google.com>
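A parser that tolerates both formats could look roughly like the sketch below. The function names and regular expressions are assumptions made for illustration; they are not taken from the Ganeti sources.

```python
import re

# Older "drbdsetup show" output:  device "/dev/drbd0";
# DRBD 8.3 output:                device minor 0;
_DEVICE_RE = re.compile(
    r'^\s*device\s+(?:"/dev/drbd(\d+)"|minor\s+(\d+))\s*;', re.MULTILINE)

# /proc/drbd labels the role pair "st:" before 8.3 and "ro:" from 8.3 on.
_ROLE_RE = re.compile(r"\b(?:st|ro):(\w+)/(\w+)")


def parse_device_minor(show_output):
    """Return the DRBD minor as an int, or None if no device stanza is found."""
    match = _DEVICE_RE.search(show_output)
    if match is None:
        return None
    # Exactly one of the two groups is set, depending on the format.
    return int(match.group(1) or match.group(2))


def parse_roles(proc_line):
    """Return (local_role, remote_role) from a /proc/drbd line, or None."""
    match = _ROLE_RE.search(proc_line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Accepting both spellings in one expression keeps a single code path for all supported DRBD versions.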
Karsten Keil [Tue, 14 Apr 2009 15:06:30 +0000 (17:06 +0200)]
Fix compatibility with DRBD 8.2
This patch parses (and discards) the extra ipv4/ipv6 words that newer
DRBD versions add before the actual address.
[iustin@google.com: slightly changed the patch to conform to style
guide, and changed the commit message]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Mon, 4 May 2009 22:03:29 +0000 (00:03 +0200)]
RunCmd: log command line for missing cmd case
In case of missing programs, currently utils.RunCmd doesn't show any
information to help debugging, only 'No such file or directory'. This
patch adds error handling for the ENOENT case such that at least we have
this information in the node daemon logs.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
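The idea can be sketched as follows. This is an illustration only: run_cmd is a hypothetical stand-in, not the real utils.RunCmd, and the log message wording is an assumption.

```python
import errno
import logging
import subprocess


def run_cmd(argv):
    """Run a command given as a list; return (exit_code, stdout, stderr)."""
    try:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError as err:
        if err.errno == errno.ENOENT:
            # Without this, only 'No such file or directory' is reported,
            # with no hint about which command was being run.
            logging.error("Can't execute '%s': not found (%s)",
                          " ".join(argv), err)
        raise
    out, errout = proc.communicate()
    return proc.returncode, out, errout
```

The exception is re-raised unchanged, so callers see the same failure as before; the only difference is the extra log entry.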
Iustin Pop [Mon, 4 May 2009 21:19:58 +0000 (23:19 +0200)]
Abstract Linux node information in hv_base
Currently both hv_fake and hv_kvm implement practically identical code
to get the node information. Since future container-like hypervisors
will also need this functionality, this patch moves it into the base
class (as a separate function) which can then be called from classes
which need this info.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 4 May 2009 14:49:15 +0000 (16:49 +0200)]
Fix argument checking in LUSetClusterParams
This patch fixes two issues with LUSetClusterParams and argument
checking.
First, this LU used the wrong function name (CheckParameters instead of
CheckArguments), which means that no parameter checking was done at all;
this impacted the candidate_pool_size checks (the only one done at this
stage).
Second, int() can raise both ValueError and TypeError, and we should
correctly handle both.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
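The second point is easy to miss: int("abc") raises ValueError, while int(None) raises TypeError. A minimal sketch of a check that handles both follows; the function and exception names are hypothetical, not the ones used in the LU.

```python
class ParameterError(Exception):
    """Hypothetical error type for invalid opcode parameters."""


def check_candidate_pool_size(value):
    """Convert value to int, reporting any conversion failure uniformly."""
    try:
        return int(value)
    except (ValueError, TypeError) as err:
        # ValueError: unparsable string; TypeError: None, list, etc.
        raise ParameterError("Invalid candidate_pool_size value: %s" % err)
```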
Iustin Pop [Sat, 2 May 2009 21:03:07 +0000 (23:03 +0200)]
Small optimisation in utils.WriteFile
Currently we always try to remove the new file, even if the rename
succeeded. This patch tracks the existence of the new file and doesn't
try to remove it if we managed to rename it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
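A minimal sketch of the pattern, assuming a write-to-temporary-then-rename scheme; write_file and the do_remove flag are illustrative names, not the actual utils.WriteFile code.

```python
import os
import tempfile


def write_file(path, data):
    """Atomically replace path with data (bytes) via a temporary file."""
    dir_name, base_name = os.path.split(os.path.abspath(path))
    fd, new_name = tempfile.mkstemp(prefix=base_name + ".new", dir=dir_name)
    do_remove = True  # the temporary file exists until the rename succeeds
    try:
        try:
            os.write(fd, data)
        finally:
            os.close(fd)
        os.rename(new_name, path)
        do_remove = False  # renamed: nothing is left to clean up
    finally:
        if do_remove:
            os.unlink(new_name)
```

On the success path this saves one unlink call that would otherwise fail with ENOENT.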
Iustin Pop [Sat, 2 May 2009 20:17:01 +0000 (22:17 +0200)]
Fix luxi serialization in ganeti-masterd
Currently, lib/luxi.py uses lib/serializer.py for encoding/decoding
messages, but the master daemon uses the simplejson module directly.
This is wrong as any non-trivial change to serializer.py will break the
master daemon.
The patch changes masterd to use exactly the same functions as luxi.py
for encoding/decoding of messages.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 1 May 2009 19:19:59 +0000 (21:19 +0200)]
Allow gnt-debug submit-job to take multiple args
Currently “gnt-debug submit-job” takes a single argument and has
non-trivial startup-costs; in order to exercise the job system, it is
better to be able to submit multiple jobs with a single invocation of
the script.
This patch extends it to take multiple arguments, de-serialize the
opcodes and then submit all of them as fast as possible, in order to
increase pressure on the master daemon.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>
Iustin Pop [Mon, 4 May 2009 11:29:10 +0000 (13:29 +0200)]
Include node name in hypervisor validation errors
The current validation routine just says "failed", without specifying
the node name. This is very confusing, and we should log the node name
too.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>
Iustin Pop [Mon, 4 May 2009 12:48:29 +0000 (14:48 +0200)]
Fix gnt-cluster getmaster on non-master nodes
The current implementation of “gnt-cluster getmaster” doesn't work on
non-master nodes, which is a regression from 1.2. This patch implements
it (again) via ssconf.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>
Iustin Pop [Mon, 27 Apr 2009 10:57:49 +0000 (10:57 +0000)]
Release 2.0rc4
Reviewed-by: ultrotter
Guido Trotter [Fri, 24 Apr 2009 16:13:45 +0000 (16:13 +0000)]
Update gnt-instance(8) for info
Add the --all argument, and reword the basic information a bit.
Reviewed-by: iustinp
Guido Trotter [Fri, 24 Apr 2009 16:13:29 +0000 (16:13 +0000)]
gnt-instance info --all
Don't show all instances info by default, but require --all to be passed
for this time consuming operation.
Reviewed-by: iustinp
Iustin Pop [Fri, 24 Apr 2009 14:36:17 +0000 (14:36 +0000)]
LUDiagnoseOS: change locking and error handling
Since the “list OSes” call is exported via RAPI, this can be used pretty
easily to DOS the master daemon during long jobs.
The implementation of LUDiagnoseOS makes an RPC call to all nodes; we
lock nodes here in order to prevent node removal.
However, after closer examination, the worst case is:
- we get the list of nodes from the config
- another thread removes a node
- our RPC queries reach the removed node
At this point, if ganeti-noded is stopped or doesn't accept our queries,
the RPC call will return failed, and in the current implementation all
OSes will become invalid.
If we change the ‘failed RPC’ handling to ignore such nodes, this allows
us to both remove locking, and to handle transient RPC failures better
(not invalidating all OSes).
This patch does both these things, with a single drawback: in gnt-os
diagnose, the down nodes do not appear at all. I think this is a small
drawback, and the alternative is to add them with status failed; this
works (3-line patch), but then the output of “list” and “diagnose” will
no longer be consistent. As such, my proposal is to not list the nodes.
Reviewed-by: ultrotter
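The changed handling of failed RPC results can be illustrated with the sketch below. It assumes, for illustration only, that each node's result is either a list of OS names or None when the RPC failed, and that an OS counts as valid when every answering node reports it; the real LU works with richer result objects.

```python
def collect_valid_oses(node_results):
    """Return the sorted OS names reported by every node that answered.

    node_results maps node name -> list of OS names, or None if the RPC
    to that node failed (assumed data shape, see above).
    """
    # Skip failed nodes instead of letting them invalidate every OS.
    answered = [set(oses) for oses in node_results.values()
                if oses is not None]
    if not answered:
        return []
    return sorted(set.intersection(*answered))
```

For example, with {"node1": ["debian", "lenny"], "node2": None, "node3": ["debian"]} the failed node2 is simply left out and "debian" remains valid, which matches the drawback noted above: down nodes do not appear in the result at all.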
Iustin Pop [Fri, 24 Apr 2009 08:43:09 +0000 (08:43 +0000)]
Fix verify-disks with broken volume groups
When a remote node returns invalid LVM data, we detect it, but instead
of stopping we continue with the rest of the checks (which require a
valid volume group). This raises an internal error and breaks verify
disks. This seems to have been unchanged for a long while; I don't know
why it surfaced just recently.
Reviewed-by: ultrotter
Iustin Pop [Fri, 24 Apr 2009 08:43:01 +0000 (08:43 +0000)]
Prevent errors in cluster verify when xenvg is broken
When vg_name is not returned at all, we currently abort with an internal
error. This is because we don't catch KeyError.
This patch adds a custom message for this case, and also adds KeyError
to the list of caught exceptions, just for safety.
On the other hand, we could also just remove this piece of code, since
the ["dfree"] value is not used at all.
Reviewed-by: ultrotter
Iustin Pop [Wed, 15 Apr 2009 11:11:12 +0000 (11:11 +0000)]
A bunch of doc and other small fixes
This patch fixes a number of issues reported both externally and
internally:
- missing SGML tags (Issue 54), report and patch by superdupont
- wrong variable used in the init.d script, report and patch by
Karsten Keil <karsten-keil@t-online.de>
- man page for gnt-instance reinstall needs clarification (Issue 56)
- gnt-instance man page missing --disks documentation for
replace-disks
- gnt-node modify help output is unclear about the -C/-D/-O input
format, and the man page doesn't document this command at all
- “gnt-node modify -C yes” for offline or drained nodes had wrong
error message
- “gnt-instance reinstall --select-os” has wrong prompt, we only
accept a number for the OS and not the template name
Reviewed-by: ultrotter
Alexander Schreiber [Tue, 14 Apr 2009 16:42:43 +0000 (16:42 +0000)]
Trivial typo fix in error message
Reviewed-by: iustinp
Iustin Pop [Wed, 8 Apr 2009 12:34:49 +0000 (12:34 +0000)]
Release 2.0rc3
Burnin tests were successful, release rc3.
Reviewed-by: imsnah
Iustin Pop [Tue, 7 Apr 2009 11:53:58 +0000 (11:53 +0000)]
Distribute built documentation
This patch changes the way documentation is built in order to distribute
the generated output in the 'dist' archive, so that the docbook/rst
toolchains are no longer required at build time. This lowers the
requirements for installation and also makes the build time
insignificant.
First, we remove the docbook2pdf rules and variables, since we no longer
build this kind of docs. Furthermore, the rst source files are not
(today) processed via replace_vars_sed, so the whole .in rules for doc/
go away.
Next, we change the ".sgml|.rst -> replace_vars_sed -> .in -> processor
-> final file" processing to ".sgml|.rst -> generator -> .in ->
replace_vars_sed -> final file"; this means we first process the file
using the formatter, with the @VARIABLE@ entries in it, and save the
output as .in; this output we distribute, and on the user side, the
replace_vars_sed will use the new configure flags to transform the
(almost final .in form) to the final form, without needing the
toolchain.
In configure.ac we also change from ERROR to WARN for the documentation
generators, and extra tests in Makefile.am check that the programs have
been found.
This was tested with distcheck and works as expected.
Reviewed-by: ultrotter
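The user-side step of the new pipeline is plain text substitution, which is why no documentation toolchain is needed there. The real build does this with sed from the Makefile; the Python sketch below only illustrates the idea, and the function name and the variable in the usage note are hypothetical.

```python
import re

# Placeholders have the form @VARIABLE@ (upper-case letters, digits, "_").
_VAR_RE = re.compile(r"@([A-Z_][A-Z0-9_]*)@")


def replace_vars(text, variables):
    """Substitute @NAME@ placeholders; unknown names are left untouched."""
    return _VAR_RE.sub(
        lambda match: variables.get(match.group(1), match.group(0)), text)
```

For instance, replace_vars("socket in @LOCALSTATEDIR@/run", {"LOCALSTATEDIR": "/var"}) turns an already formatted .in file fragment into its final form.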
Iustin Pop [Mon, 6 Apr 2009 08:21:30 +0000 (08:21 +0000)]
Disable synchronous (locking) queries
This patch raises an error in the master daemon in case the user
requests a locking query; accordingly, all clients were modified to send
only lockless queries. This is a short-term fix; for a proper fix, the
clients should be modified to submit a job when the user requests a
locking query.
The other approach would be to ignore the flag passed by the client;
this would be worse, as clients wouldn't even get an error.
This has several possible impacts:
- some commands might not have been converted, and would thus fail;
this can be remedied easily
- the consistency of commands is lost; e.g. node failover will not
lock the node *while we get the node info*, so we could miss some
data; this is again related to the atomic operations which are
missing in the current model of query-and-act from gnt-* scripts
Reviewed-by: imsnah, ultrotter