code.grnet.gr Git - ganeti-local/log

Create a new --no-voting option for masterfailover

This allows failing over in certain corner cases, such as a 2 node
cluster with one node down. The man page is also updated to document
this dangerous option and how to recover from this situation.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ganeti-masterd: allow non-interactive --no-voting

This will be used by ganeti-noded to start ganeti-masterd in a
--no-voting masterfailover.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Increase maximum accepted size for a DRBD meta dev

With the change to striped LVs, the actual size of a meta device (which
is small) can be more than we expected (for non-striped LVs). This
patch increases from 160MB to 1GB the accepted size, and updates the
comment with the rationale behind this change.

Note that we do want even meta devices striped, since it can speed up
metadata updates.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>

Cleanup config data when draining nodes

Currently, when draining nodes we reset their master candidate flag, but
we don't instruct them to demote themselves. This leads to “ERROR: file
'/var/lib/ganeti/config.data' should not exist on non master candidates
(and the file is outdated)”.

This patch simply adds a call to node_demote_from_mc in this case.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Fix node readd issues

This patch fixes a few node readd issues.

Currently, the node readd consists of two opcodes:
  - OpSetNodeParams, which resets the offline/drained flags
  - OpAddNode (with readd=True), which reconfigures the node

The problem is that between these two, the configuration is inconsistent
for certain cluster configurations. Thus, this patch removes the first
opcode and modifies LUAddNode to deal with this case too.

The patch also modifies the computation of the intended master_candidate
status, and actually sets the readded node to master candidate if
needed. Previously, we didn't modify the existing node at all.

Finally, the patch modifies the bottom of the Exec() function for this
LU to:
  - trigger a node update, which in turn redistributes the ssconf files
    to all nodes (and thus the new node too)
  - if the new node is not a master candidate, then call the
    node_demote_from_mc RPC so that old master files are cleared

My testing shows this behaves correctly for various cases.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

backend.DemoteFromMC: don't fail for missing files

If the config file is missing when the DemoteFromMC() function is
called, it will raise a ProgrammerError. Instead of changing the
utils.CreateBackup() function, which is called from multiple places, for
now we only change the DemoteFromMC() function to not call it if the file
does not exist (we rely on the master to prevent race conditions here).
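
A minimal sketch of this guard, written in modern Python rather than the
2.x code of the patch. The name demote_from_mc, the ".backup" suffix and
the removal step are illustrative assumptions, not the actual backend API:

```python
import os
import shutil


def demote_from_mc(config_path):
    """Back up and remove the config file, tolerating its absence.

    Returns the backup path, or None if there was nothing to back up.
    """
    if not os.path.isfile(config_path):
        # Missing file: skip the backup helper instead of letting it fail.
        return None
    backup_path = config_path + ".backup"  # hypothetical naming scheme
    shutil.copy2(config_path, backup_path)
    os.remove(config_path)
    return backup_path
```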

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>

Allow GetMasterCandidateStats to ignore some nodes

This patch modifies ConfigWriter.GetMasterCandidateStats to allow it to
ignore some nodes in the calculation, so that we can use it to predict
cluster state without some nodes (which we know we will modify, and thus
we should not rely on their state).
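
The idea can be sketched as follows. This is not the ConfigWriter code:
the function name, the dict-based node records and the "exceptions"
parameter name are assumptions made for illustration:

```python
def master_candidate_stats(nodes, pool_size, exceptions=None):
    """Return (current, maximum possible) master candidate counts.

    nodes maps a node name to a dict with the boolean keys
    'master_candidate', 'offline' and 'drained'; names listed in
    exceptions are left out of both counts.
    """
    skip = set(exceptions or ())
    mc_now = mc_max = 0
    for name, node in nodes.items():
        if name in skip:
            continue
        if not (node["offline"] or node["drained"]):
            mc_max += 1
        if node["master_candidate"]:
            mc_now += 1
    return mc_now, min(mc_max, pool_size)
```

Passing the name of a node that is about to be modified predicts the
candidate counts as if that node were not there.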

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>

Fix error message for extra files on non MC nodes

Currently the message for extraneous files on non master candidates is
confusing, to say the least. This patch hopefully makes it clearer.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>

Fix adjustment of candidates in cluster modify

The code for adjusting the candidate pool size was done after the config
update, and this means we triggered the save of the config file without
fixing the candidate pool, which aborts with an error.

The patch just moves it above. The old comment was valid, but we anyway
save the config file in MaintainCandidatePool, so this should be safe.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add a new node list field

This patch adds a ‘role’ node list field, which shows a one-character
node status. This is a simpler way to see the node status than selecting
all the flags individually.
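
A sketch of how such a one-character role could be derived from the
flags. The letters (M, C, D, O, R) and the precedence order are
assumptions for illustration; the commit message does not list them:

```python
def node_role(is_master, master_candidate, drained, offline):
    """Collapse the individual node flags into a single character."""
    if is_master:
        return "M"  # master
    if master_candidate:
        return "C"  # master candidate
    if drained:
        return "D"  # drained
    if offline:
        return "O"  # offline
    return "R"      # regular node
```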

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix HTTP server library handling of credentials

Currently the http library only checks credentials when authentication
is required. This means that any credentials are accepted on the root
resource, for example, which makes problems hard to diagnose - the
user/pw works for all queries, until one tries to do a modification, at
which point it fails.

This patch changes the PreHandleRequest() function to not ignore
credentials when passed, even if we don't require authentication. This
makes the behavior of RAPI more predictable.
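
The new rule can be sketched like this. The function signature and the
use of PermissionError are illustrative assumptions, not the http
library's actual PreHandleRequest() interface:

```python
def pre_handle_request(auth_required, credentials, check):
    """Decide whether a request may proceed.

    credentials is None or a (user, password) tuple; check is a
    callable validating such a pair. Credentials that were sent are
    always validated, even on resources that do not require them.
    """
    if credentials is None:
        if auth_required:
            raise PermissionError("authentication required")
        return True
    if not check(*credentials):
        raise PermissionError("invalid credentials")
    return True
```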

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix a typo in backend.InstanceReboot docstring

The documentation for the reboot was wrong. This patch fixes it and
updates the docstring with more details.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix handling of 'vcpus' in instance list

Currently running “gnt-instance list -o+vcpus” fails with a cryptic message:
  Unhandled Ganeti error: vcpus

This is due to multiple issues:
  - in some corner cases cmdlib.py raises an errors.ParameterError but
    this is not handled by cli.py
  - LUQueryInstances declares ‘vcpus’ as a supported field, but doesn't handle
    it, so instead of failing with unknown parameter, e.g.:
      Failure: prerequisites not met for this operation:
      Unknown output fields selected: vcpuscd
    it raises the ParameterError message

This patch:
  - adds handling of 'vcpus' to LUQueryInstances
  - adds handling of the ParameterError exception to cli.py
  - changes the 'else: raise errors.ParameterError' in the field handling of
    LUQueryInstances to an assert, since it's a programmer error if we reached
    this step

With this, a future unhandled parameter will show:
  gnt-instance list -o+vcpus
  Unhandled protocol error while talking to the master daemon:
  Caught exception: Declared but unhandled parameter 'vcpus'

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix checking for valid OS in instance create

The current check in LUCreateInstance.CheckPrereq() is wrong - it only checks
if we got an OS, but not if we got a valid OS. This patch fixes it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Show disk size in instance info

The size of the instance's disk was not shown in “gnt-instance info”.
This patch adds it and formats it nicely if possible.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

gnt-cluster(8) fix --backend-parameters opt name

It was mistakenly called --backend

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

LUQueryInstances: fix querying for nic data

Currently we support querying for "mac", "ip" or "bridge", meaning "the
one of the first nic". We are not checking that there is a first nic,
though, and thus could run into errors. This patch fixes it by returning
"None" should there be no such nic, as is done when explicitly asking
for a nic via nic.<field>/<N>

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Specify the object type in two docstrings

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Merge branch 'master' into next

* master:
  Update NEWS and version for 2.0.1 release
  gnt-{instance,backup}(8) --nic is actually --net
  Fix a wrong function name in backend.DrbdAttachNet
  GNT-CLUSTER(8) fix search-tags example

Update NEWS and version for 2.0.1 release

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

gnt-{instance,backup}(8) --nic is actually --net

Fix a typo in the man pages that used the wrong option name.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Fix a wrong function name in backend.DrbdAttachNet

Commit cf8df3f30c2dcd0ab398d835fa9f64d61578a4f7 "bdev: forward-port
ReAttachNet/DisconnectNet" forward-ported 1.2's bdev.DRBD8.ReAttachNet()
to 2.0 while renaming it to AttachNet(), but commit
6b93ec9d798ed53089a06bc0ced58ef1d8a9e4b0 "Forward-port DrbdNetReconfig"
didn't rename all the calls to it and left one ReAttachNet call in
backend.py.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

GNT-CLUSTER(8) fix search-tags example

Reported in issue 59.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Enable striped LVs

This patch enables striped LVs, falling back to non-striped if the
striped creation fails. If the configure-time lvm-stripecount is 1,
this patch becomes a noop (with an insignificant python-level overhead,
but no extra lvm calls).

The effect of this patch is that new instances will get striped LVs
from the start, whereas old instances will have their LVs striped as
soon as replace-disks is run for them.
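
The fallback can be sketched as below. This is a simplification, not the
bdev code: create_lv and the run_lvcreate callable (which stands in for
the real lvcreate invocation) are hypothetical names:

```python
def create_lv(name, size_mb, stripes, run_lvcreate):
    """Try a striped LV first, then fall back to a non-striped one.

    run_lvcreate(name, size_mb, stripes) returns True on success.
    Returns the stripe count that was actually used.
    """
    if stripes > 1 and run_lvcreate(name, size_mb, stripes):
        return stripes
    # With a stripe count of 1 this is the only call that is made.
    if run_lvcreate(name, size_mb, 1):
        return 1
    raise RuntimeError("cannot create logical volume %s" % name)
```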

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add a lvm stripecount configure parameter

This patch adds a configure-time customizable parameter that will be
used to enable striped LVs. The default of the parameter is 3.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add more constants for DRBD and change sync tests

This patch adds constants for the connection status, peer roles and disk
status, and it changes the rules for when the disk is considered as
“resyncing” - previously it was only for syncsource/synctarget, but
there are many other transient statuses which could be misinterpreted as
‘degraded’ (because they were not considered as resyncing, but the disk
is not consistent in these statuses).

Furthermore, cmdlib.py:WaitForSync determines if a device is syncing or
not based on sync_percent being not none. Not all DRBD resync statuses
offer a percent done, so if we are syncing but don't have a sync
percent, we'll report a zero sync percent (and no time estimate).
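
A rough sketch of the rule described above. The set of connection states
treated as resyncing is an assumption chosen for illustration and is not
taken from the patch:

```python
# Assumed subset of DRBD connection states that count as "resyncing".
RESYNC_STATES = frozenset([
    "SyncSource", "SyncTarget",
    "WFBitMapS", "WFBitMapT",
    "PausedSyncS", "PausedSyncT",
])


def sync_status(cstate, percent):
    """Return (is_resyncing, sync_percent) for a connection state.

    percent is None when DRBD reports no progress figure; a resyncing
    device then reports 0.0 so that callers testing for "percent is
    not None" still see it as syncing.
    """
    resyncing = cstate in RESYNC_STATES
    if resyncing and percent is None:
        percent = 0.0
    return resyncing, percent
```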

The patch also removes a few unused variables (is_sync_target,
peer_sync_target, is_resync) whose value doesn't make sense anymore with
the new sync rules.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Merge branch 'master' into next

* master:
  Wait for a while in failed resyncs
  Fix two issues with exports and snapshot errors

Wait for a while in failed resyncs

This patch is an attempt at fixing some very rare occurrences of messages like:
- "There are some degraded disks for this instance", or:
- "Cannot resync disks on node node3.example.com: [True, 100]"

What I believe happens is that drbd has finished syncing, but not all
fields are updated in 'Connected' state; maybe it's in WFBitmap[ST], or
in some other transient state we don't handle well.

The patch will change the _WaitForSync method to recheck up to a
hardcoded number of times if we're finished syncing but we're degraded
(using the same condition as the 'break' clause of the loop).

The downside of this change is that a normal, really-degraded state due
to network or disk failure will cause an extra delay before it aborts.
For this, I'm happy to choose other values.

A better, long term fix is to handle more DRBD states correctly (see the
bdev.DRBD8Status class).
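
The recheck loop can be sketched as follows. The poll callable, the
retry count and the delay are placeholders, not the values or interfaces
used in cmdlib.py:

```python
import time


def wait_for_sync(poll, max_degraded_retries=10, delay=1.0,
                  sleep=time.sleep):
    """Wait until the disks are synced; return True if they are healthy.

    poll() returns a (done, degraded) pair. A "done but degraded"
    result is rechecked a bounded number of times before giving up.
    """
    retries = 0
    while True:
        done, degraded = poll()
        if done and not degraded:
            return True
        if done and degraded:
            if retries >= max_degraded_retries:
                return False
            retries += 1
        sleep(delay)
```

A really degraded device costs max_degraded_retries extra polls before
the function gives up, which is the delay mentioned above.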

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Assemble DRBD using the known size

This patch changes DRBD disk attachment to force the wanted size, as opposed to
letting the device auto-discover its size.

This should make the disks more resilient with regard to small differences in
size (e.g. due to LVM rounding). This still works with regard to disk
growth, but the instance needs to be fully restarted (including disks)
in that case.

This passes a full burnin without problems, but it's still a tricky
change - if the config.data is not synced with the reality, we might
tell DRBD a wrong size. At least this will fail outright (and not
introduce silent errors), as DRBD (per a quick check at the sources)
tracks the size in the meta-dev and also does not allow shrinking
consistent devices.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix two issues with exports and snapshot errors

This patch fixes two issues related to failed snapshots during exports:
  - first, the error messages used disk.logical_id[1], which is a node
    name for DRBD, and it resulted in strange error messages like
    "cannot snapshot block device node1 on node2"
  - second, if snapshotting fails for any disk, rpc.call_finalize_export
    fails as it didn't handle booleans (backend.FinalizeExport does)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Set the size on new DRBDs in replace secondary

Currently the code in cmdlib doesn't set the device size to new DRBD
devices in replace secondary, but we need to do it otherwise it gets
initialized to None.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Change the bdev init signatures

This patch changes all the bdev.BlockDev constructors to take an
additional ‘size’ parameter, all the backend functions that call those
functions to pass it and also changes backend.BlockdevCreate() to not use
the size passed via the rpc call but instead directly disk.size (this is
the only way it's called).

Note that this patch doesn't do anything with this parameter, just
stores it on the blockdev objects.

With the patch, we actually have a more uniform init sequence (before,
create had the parameter, but the other functions did not).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Merge branch 'next'

* next: (34 commits)
  watcher: automatically restart noded/rapi
  watcher: handle full and drained queue cases
  rapi: rework error handling
  Fix backend.OSEnvironment be/hv parameters
  rapi: make tags query not use jobs
  Change failover instance when instance is stopped
  Export more instance information in hooks
  watcher: write the instance status to a file
  Fix the SafeEncoding behaviour
  Move more hypervisor strings into constants
  Add -H/-B startup parameters to gnt-instance
  call_instance_start: add optional hv/be parameters
  Fix gnt-job list argument handling
  Instance reinstall: don't mix up errors
  Don't check memory at startup if instance is up
  gnt-cluster modify: fix --no-lvm-storage
  LUSetClusterParams: improve volume group removal
  gnt-cluster info: show more cluster parameters
  LUQueryClusterInfo: return a few more fields
  Add the new DRBD test files to the Makefile
  ...

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Release 2.0.0 final

This is simply a version bump, no changes from rc5.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

watcher: automatically restart noded/rapi

This patch makes the watcher automatically restart the node and rapi
daemons, if they are not running (as per the PID file).

This is not an exhaustive test; a better one would be TCP connect to the
port, and an even better one a simple protocol ping (e.g. get / for rapi
and a rpc_call_alive for noded), but since we don't know how they've
been started we can't implement it today. rapi would need to write the
SSL/port to a file, and noded something similar, so that we know how to
connect.
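
A PID-file liveness check of this kind might look like the sketch below
(POSIX only). The function name is hypothetical and the watcher's real
helper may differ:

```python
import os


def is_daemon_alive(pid_file):
    """Return True if the PID stored in pid_file names a live process."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable or malformed PID file.
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

As noted above, this only shows that some process holds that PID, not
that the daemon is answering requests.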

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

watcher: handle full and drained queue cases

Currently the watcher is broken when the queue is full, thus not
fulfilling its job as a queue cleaner. It also doesn't handle nicely the
queue drained status.

This patch does a few changes:
  - first archive jobs, and only after submit jobs; this fixes the case
    where the queue is already full and there are jobs suited for
    archiving (but not the case where the jobs are all too young to be
    archived)
  - handle nicely the job queue full and drained cases—instead of
    tracebacks, log such cases nicely
  - reverse the initial value and special cases for update_file; we now
    whitelist instead of blacklist cases, since we have many more
    blacklist cases than vice versa, and we set the flag to True only
    after the run is successful

The last change, especially, is a significant one: now errors during the
watcher run will not update the status file, and thus they won't be lost
again in the logs.
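
The reversed flag logic can be sketched like this. The function and its
two callables are hypothetical and only illustrate the "set to True only
after success" pattern:

```python
import logging


def run_watcher(do_work, touch_status_file):
    """Run one watcher cycle; update the status file only on success."""
    update_file = False  # the default is now "do not update"
    try:
        do_work()
        update_file = True  # reached only if nothing failed
    except Exception:
        logging.exception("Error during watcher run")
    finally:
        if update_file:
            touch_status_file()
    return update_file
```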

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

rapi: rework error handling

Currently the rapi code doesn't have any custom error handling; any
exceptions raised are simply converted into an HTTP 500 error, without
much explanation.

This patch adds a couple of generic SubmitJob/GetClient functions that
handle some errors specially so that they are transformed into HTTP
errors, with more detailed information.

With this patch, the behaviour of rapi when the queue is full or
drained, or when the master is down is more readable.
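
A sketch of such a wrapper. The exception classes are local stand-ins
and the status codes (503, 502) are assumptions for illustration; the
commit message does not say which HTTP errors are used:

```python
class JobQueueFull(Exception):
    pass


class JobQueueDrained(Exception):
    pass


class NoMasterError(Exception):
    pass


class HttpError(Exception):
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code
        self.message = message


def submit_job(submit, ops):
    """Call submit(ops), turning known failures into HTTP errors."""
    try:
        return submit(ops)
    except JobQueueFull:
        raise HttpError(503, "Job queue is full, needs archiving") from None
    except JobQueueDrained:
        raise HttpError(503, "Job queue is drained") from None
    except NoMasterError:
        raise HttpError(502, "Master seems to be unreachable") from None
```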

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix backend.OSEnvironment be/hv parameters

Commit 67fc3042c20f5893abf71a0b4c445c356f9603b9 added some more
variables to be exported to OSEnvironment, but it has two bugs:
  - wrong variable name (env vs. result)
  - in OSEnvironment we don't have the automatic conversion to strings
    that we do in hooks, so we must manually enforce this

With this patch instance creations work again.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

rapi: make tags query not use jobs

Currently the rapi tags query implementation is similar to the command
line one: it submits OpGetTags jobs. This is not good, since this being an
API it can be used a lot and can pollute the job queue with many such
trivial jobs.

This patch converts it to use either queries (for nodes/instances) or
direct read from ssconf (for the cluster case). For ssconf, we added a
function to the ssconf.SimpleStore class for reading the tags.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Change failover instance when instance is stopped

Currently, if the instance is stopped, we still check for enough memory
on the target node. This is a little bit too strict, since in case too
many nodes have failed and one is out of memory, this prevents
fixing the cluster (with the instances down).

We change it to do the memory checks only when the instance will be
started.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Export more instance information in hooks

Currently we miss in hooks the instance's hypervisor, hypervisor
parameters and backend parameters. This forces hooks to query back into
ganeti, which is dangerous due to possible luxi sockets exhaustion.

This patch adds these three as INSTANCE_HYPERVISOR, INSTANCE_HV_*,
INSTANCE_BE_*. The hook environment prefixes all keys with “GANETI”, so
the default settings for a xen-pvm instance would be:

  GANETI_INSTANCE_HV_initrd_path=
  GANETI_INSTANCE_HV_kernel_args=ro
  GANETI_INSTANCE_HV_kernel_path=/boot/vmlinuz-2.6-xenU
  GANETI_INSTANCE_HV_root_path=/dev/sda1

Any dashes in parameter names are changed to underscores, since
variables with dashes are not easy to access from the shell
(alternatively we could deny those via an unittest for constants.py).
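
The key construction described above can be sketched as follows. The
helper name is hypothetical; the prefixes and the dash replacement
follow the text:

```python
def build_hook_env(hypervisor, hvparams, beparams):
    """Build the extra hook variables for one instance."""
    env = {"INSTANCE_HYPERVISOR": hypervisor}
    for prefix, params in (("HV", hvparams), ("BE", beparams)):
        for key, value in params.items():
            # Dashes are awkward in shell variable names.
            name = "INSTANCE_%s_%s" % (prefix, key.replace("-", "_"))
            env[name] = value
    # The hook environment prefixes every key and stringifies values.
    return dict(("GANETI_" + key, str(value)) for key, value in env.items())
```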

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Merge branch 'master' into next

Signed-off-by: Guido Trotter <ultrotter@google.com>

watcher: write the instance status to a file

This patch modifies the watcher to keep on-disk a file with the instance
status; this can be used from outside of ganeti to react to instances
being down (when the watcher cannot restart them).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Release 2.0rc5

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix the SafeEncoding behaviour

Currently we have bad behaviour in SafeEncode:
- binary strings are actually not handled correctly (ahem)
- the encoding is not stable, due to use of string_escape

For this reason, we replace the use of string_escape with part of the
code of string escape (PyString_Repr in Objects/stringobject.c); we
don't escape backslashes or single quotes, since that is what makes it
unstable. Furthermore, we only use the encode('ascii', ...) for unicode
inputs.

The patch also adds unittests for the function that test basic
behaviour.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Move more hypervisor strings into constants

This patch adds constants for the mouse and boot order strings; while
there are still some issues remaining, we're trying to cleanup hardcoded
strings from the hypervisors.

Since the formatting of frozensets is currently wrong, we also add a
utility function for this and change all the error messages to use it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

watcher: try to restart the master if down

Bugs in either our code or in associated libraries can bring the master daemon
down, and this (due to the 2.0 architecture) stops all work on the cluster.

Since the watcher already does periodic checks on the cluster, we modify
it to try to start the master automatically in case of failures to
connect. This will be tried only once per cycle.

Also, in this case, we modify the code so that the watcher status file
is not updated - its timestamp will thus reflect the time of the last
successful connection to the master.

Side note: the except errors.ConfigurationError part could be cleaned
up, since in 2.0 we don't usually get that directly, and if we do it's
an error and we shouldn't touch the file anyway; but that is not an rc5
change.

Signed-off-by: Iustin Pop <iustin@google.com>

IAllocator: export total disk size for instances

This patch adds a ‘disk_space_total’ key for current instances, similar
to the key for the new instance in case of new allocations.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add -H/-B startup parameters to gnt-instance

This patch modifies the start instance script, opcode and logical unit
to support temporary startup parameters.

Unlike 1.2, where only the kernel arguments could be changed (which
made the feature xen-pvm specific), this version supports changing all
hypervisor and backend parameters (with appropriate checks).

This is much more flexible, and allows for example:
- start with different, temporary kernel
- start with different memory size

Note: in later versions, this should be extended to cover disk
parameters as well (e.g. start with drbd without flushes, start with
drbd in async mode, etc.).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

call_instance_start: add optional hv/be parameters

This patch modifies the rpc.call_instance_start - the master side - to
take optional hv/be parameters. The noded side is unchanged and
oblivious to the change.

This will allow implementation of single-user capability and such on
startup (temporary, as opposed to permanent).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix gnt-job list argument handling

Currently QueryJob returns "None" when a wrong job ID is passed.
Handle this in gnt-job list, by printing an error for each wrong job,
and still giving output for all the jobs which actually do exist.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Instance reinstall: don't mix up errors

If the remote info rpc call fails we can't assume that the instance is
up.

Signed-off-by: Guido Trotter <ultrotter@google.com>

Don't check memory at startup if instance is up

Signed-off-by: Guido Trotter <ultrotter@google.com>

gnt-cluster modify: fix --no-lvm-storage

Currently doing a gnt-cluster modify --no-lvm-storage is silently
ignored, as it passes a None value in vg_name, which is the same as not
modifying that parameter. Explicitly set the passed value to '', so the
non-true not-None value can be evaluated to actually remove a volume
group.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

LUSetClusterParams: improve volume group removal

Currently LUSetClusterParams will remove the volume group if the vg_name
field passed in is not true, but not None. Setting the target volume
group to False or the empty string, though, is a bad idea because it's
not a boolean value, and at cluster init we set it to None if
--no-lvm-storage is passed. With this fix we handle '' (or any other
non-None false value) as the "unset" value, but actually store None in
the config.
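
The three cases can be sketched as a small helper. The function name is
hypothetical and the real LU does more validation:

```python
def new_vg_name(current, requested):
    """Return the volume group name to store in the config.

    None means "not modified"; any other false value (such as '')
    means "unset" and is stored as None; anything else is the new name.
    """
    if requested is None:
        return current
    if not requested:
        return None
    return requested
```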

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

gnt-cluster info: show more cluster parameters

Even if we cannot modify all of them, they are useful information about
the current cluster.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

LUQueryClusterInfo: return a few more fields

Some fields can be set at cluster init, and perhaps even modified with
SetClusterParams but there's no way to know them. With this patch we
export them in the cluster info query.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

KVMHypervisor: return memory and cpus as integers

Currently the KVM hypervisor returns strings for the memory and cpu
values, while the xen hypervisor returns integers. Make this uniform by
converting the values to integers in KVM as well.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

LUSetInstanceParam: don't assume memory is integer

LUSetInstanceParam currently assumes that the 'memory' value of a
call_instance_info result is an integer, while the rest of the code
explicitly converts it to int(). Converting it to int works around a
bug which prevents changing the memory allocation of a live instance if
the remote call returns the memory in string format.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Add the new DRBD test files to the Makefile

These were forgotten in commit 01e2ce3a6e4ca68983f50dedaddd0d0fc7b77026,
and caused “make distcheck” to fail.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix QA and documentation about no initrd case

In Ganeti 1.2, “none” was used to signify no initrd. In 2.0 we have
changed to “no_” as a prefix (i.e. “-H no_initrd_path”) and thus we
document this in the manpage.

The QA suite is changed accordingly.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Remove an unused function

The _TransformPath function is not used anymore in 2.0, let's remove it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Exporting the instance network_port on the RAPI

Patch for adding network_port to the instance attributes exported by the
RAPI.

[iustin@google.com: slightly changed the formatting]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Minor patch to rapi documentation

Minor patch to clarify the URL necessary for accessing the RAPI.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Small doc change in README

The version is 2.0, and we don't build PDFs by default, only HTML
files.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Remove some superfluous imports

This is for Python 2.6 compatibility.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Make Python interpreter selectable for test scripts

The Python interpreter used to run the test cases is hard-coded to be
/usr/bin/python. If we use the first one from $PATH instead, it is
much easier to test ganeti with other Python versions.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Pass optional arguments to the daemons

These can be set in the defaults file, default to no arguments being
passed, and make it easy for local installation to customize the way the
ganeti daemons are called.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin <iustin@google.com>

ganeti.initd: include defaults file, if present

In the example init script we'll execute an optional defaults file to
make it easier to add local customizations to the ganeti startup.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin <iustin@google.com>

Fix ;; indentation in the main initd loop

Currently two of the ;; ending the case bodies are not indented with
anything. Reindent all of them to the body of the loop, as it's done
somewhere else in the init script.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin <iustin@google.com>

Avoid DeprecationWarning on Python >= 2.6

Python 2.6 complains about module 'sha' being deprecated. It makes
execution of Ganeti commands a bit annoying, and when you run
'ganeti-watcher' in cron jobs, you get a mail message after every
execution.

Tests pass under Python 2.6 and Python 2.4.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ganeti-noded: add bind address option

This allows ganeti-noded to bind only on one interface rather than all
the ones on the machine. The default behaviour doesn't change.

Signed-off-by: Guido Trotter <ultrotter@google.com>

Fix compatibility with DRBD 8.3

DRBD 8.3 changes two more things compared to 8.2:
  - /proc/drbd format changed in multiple ways; the part we're
    interested in is the ‘st:’ to ‘ro:’ change (named in the changelog
    “Renamed 'state' to 'role'”)
  - “drbdsetup /dev/drbdN show” changed the ‘device’ stanza from:
      device "/dev/drbd0";
    to:
      device                  minor 0;

This patch fixes both of these and adds data files and unittests for DRBD
8.3.1.
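
A sketch of parsing the role field in a way that accepts both spellings.
The regular expression and the sample lines are illustrative and are not
taken from the patch or its test data:

```python
import re

# Accept both the old 'st:' (state) and the 8.3 'ro:' (role) prefix.
_ROLE_RE = re.compile(r"\b(?:st|ro):(\w+)/(\w+)")


def parse_roles(line):
    """Return (local_role, peer_role) from a /proc/drbd line, or None."""
    match = _ROLE_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```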

Signed-off-by: Iustin Pop <iustin@google.com>

Fix compatibility with DRBD 8.2

This patch adds (and suppresses) the extra ipv4/ipv6 words before the
actual address that newer DRBD versions add.

[iustin@google.com: slightly changed the patch to conform to style
guide, and changed the commit message]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

RunCmd: log command line for missing cmd case

In case of missing programs, currently utils.RunCmd doesn't show any
information to help debugging, only 'No such file or directory'. This
patch adds error handling for the ENOENT case such that at least we have
this information in the node daemon logs.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
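
As an illustration of the ENOENT handling described above, the sketch below
logs the full command line before re-raising the error. It is written against
the standard subprocess module; the function name run_cmd and the log message
are hypothetical and do not reproduce utils.RunCmd itself.

```python
import errno
import logging
import subprocess


def run_cmd(argv):
    """Run a command and return (exit_code, stdout, stderr)."""
    try:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError as err:
        if err.errno == errno.ENOENT:
            # Without this, the caller only sees "No such file or
            # directory" and cannot tell which program was missing.
            logging.error("Can't execute '%s': not found (%s)",
                          " ".join(argv), err)
        raise
    out, err_out = proc.communicate()
    return proc.returncode, out, err_out
```

The exception is still propagated, so existing error handling in callers is
unaffected; only the log gains the missing detail.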

Abstract Linux node information in hv_base

Currently both hv_fake and hv_kvm implement practically identical code
to get the node information. Since future container-like hypervisors
will also need this functionality, this patch moves it into the base
class (as a separate function) which can then be called from classes
which need this info.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Fix argument checking in LUSetClusterParams

This patch fixes two issues with LUSetClusterParams and argument
checking.

First, this LU used the wrong function name (CheckParameters instead of
CheckArguments), which means that no parameter checking was done at all;
this impacted the candidate_pool_size checks (the only one done at this
stage).

Second, int() can raise both ValueError and TypeError, and we should
correctly handle both.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
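
The second issue is easy to reproduce: int("abc") raises ValueError, while
int(None) raises TypeError. A minimal sketch of a check that handles both
follows; ParameterError and check_candidate_pool_size are hypothetical names
used only for this example.

```python
class ParameterError(Exception):
    """Hypothetical error type for invalid opcode parameters."""


def check_candidate_pool_size(value):
    """Convert value to an integer or raise ParameterError."""
    try:
        return int(value)
    except (ValueError, TypeError) as err:
        # ValueError: unparsable string; TypeError: None, list, etc.
        raise ParameterError("Invalid candidate_pool_size value: %s" % err)
```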

Small optimisation in utils.WriteFile

Currently we always try to remove the new file, even if the rename
succeeded. This patch tracks the existence of the new file and doesn't
try to remove it if we managed to rename it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
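
The idea can be sketched as a flag that is cleared once the rename succeeds,
so the cleanup step only runs when the temporary file is still present. This
is a simplified illustration, not the actual utils.WriteFile (which takes more
parameters); the name write_file is hypothetical.

```python
import os
import tempfile


def write_file(file_name, data):
    """Atomically replace file_name with the given bytes."""
    dir_name, base_name = os.path.split(file_name)
    fd, new_name = tempfile.mkstemp(prefix=base_name + ".new",
                                    dir=dir_name or ".")
    do_remove = True  # the temporary file exists until it is renamed
    try:
        with os.fdopen(fd, "wb") as new_file:
            new_file.write(data)
        os.rename(new_name, file_name)
        do_remove = False  # rename succeeded, nothing left to clean up
    finally:
        if do_remove:
            os.remove(new_name)
```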

Fix luxi serialization in ganeti-masterd

Currently, lib/luxi.py uses lib/serializer.py for encoding/decoding
messages, but the master daemon uses the simplejson module directly.
This is wrong, as any non-trivial change to serializer.py will break the
master daemon.

The patch changes masterd to use exactly the same functions as luxi.py
for encoding/decoding of messages.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
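
The principle is that client and server must share one pair of
encode/decode functions, so a change to the wire format is made in exactly one
place. The sketch below shows this with the standard json module; all function
names and the "method"/"args" message keys are assumptions for illustration.

```python
import json


def dump_json(data):
    """Single encoding entry point, shared by client and server."""
    return json.dumps(data, sort_keys=True)


def load_json(text):
    """Single decoding entry point, shared by client and server."""
    return json.loads(text)


def format_request(method, args):
    """Client side: build a request message."""
    return dump_json({"method": method, "args": args})


def parse_request(message):
    """Server side: decode a request with the same serializer."""
    request = load_json(message)
    return request["method"], request["args"]
```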

Allow gnt-debug submit-job to take multiple args

Currently “gnt-debug submit-job” takes a single argument and has
non-trivial startup-costs; in order to exercise the job system, it is
better to be able to submit multiple jobs with a single invocation of
the script.

This patch extends it to take multiple arguments, de-serialize the
opcodes and then submit all of them as fast as possible, in order to
increase pressure on the master daemon.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>

Include node name in hypervisor validation errors

The current validation routine just says "failed", without specifying
the node name. This is very confusing, and we should log the node name
too.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>

Fix gnt-cluster getmaster on non-master nodes

The current implementation of “gnt-cluster getmaster” doesn't work on
non-master nodes, which is a regression from 1.2. This patch implements
it (again) via ssconf.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>
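
Reading a value from ssconf only requires access to a local file, which is why
it works without contacting the master daemon. A minimal sketch follows; the
"ssconf_<key>" file naming, the "master_node" key and the function names are
assumptions made for this example, and the directory is passed in explicitly.

```python
import os


def read_ssconf_key(key, ssconf_dir):
    """Return the value stored for key in a local ssconf file."""
    path = os.path.join(ssconf_dir, "ssconf_" + key)
    with open(path) as fobj:
        return fobj.read().rstrip("\n")


def get_master_node(ssconf_dir):
    """Return the master node name as recorded on this node."""
    return read_ssconf_key("master_node", ssconf_dir)
```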

Release 2.0rc4

Reviewed-by: ultrotter

Update gnt-instance(8) for info

Add the --all argument, and reword the basic information a bit.

Reviewed-by: iustinp

gnt-instance info --all

Don't show all instances info by default, but require --all to be passed
for this time consuming operation.

Reviewed-by: iustinp

LUDiagnoseOS: change locking and error handling

Since the “list OSes” call is exported via RAPI, this can be used pretty
easily to DOS the master daemon during long jobs.

The implementation of LUDiagnoseOS makes an RPC call to all nodes; we
lock nodes here in order to prevent node removal.

However, after closer examination, the worst case is:
  - we get the list of nodes from the config
  - another thread removes a node
  - our RPC queries reach the removed node

At this point, if ganeti-noded is stopped or doesn't accept our queries,
the RPC call will return failed, and in the current implementation all
OSes will become invalid.

If we change the ‘failed RPC’ handling to ignore such nodes, this allows
us to both remove locking, and to handle transient RPC failures better
(not invalidating all OSes).

This patch does both these things, with a single drawback: in gnt-os
diagnose, the down nodes do not appear at all. I think this is a small
drawback, and the alternative is to add them with status failed; this
works (3-line patch), but then the output of “list” and “diagnose” will
no longer be consistent. As such, my proposal is to not list the nodes.

Reviewed-by: ultrotter
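
The changed error handling can be sketched as follows: nodes whose RPC call
failed are skipped instead of invalidating every OS. The data layout (a dict
mapping node names to a list of OS names, or None for a failed call), the rule
that an OS counts as valid when every responding node reports it, and the
function name are assumptions made for illustration.

```python
def merge_os_lists(node_results):
    """Return the sorted OS names reported by all responding nodes."""
    valid = None
    for node_name, os_list in node_results.items():
        if os_list is None:
            # Failed RPC (node down or removed): ignore this node rather
            # than marking all OSes as invalid.
            continue
        names = set(os_list)
        valid = names if valid is None else valid & names
    return sorted(valid or [])
```

With this approach a down node simply does not contribute to the result, which
matches the drawback noted above: it does not appear in the output at all.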

Fix verify-disks with broken volume groups

When a remote node returns invalid LVM data, we check it, but we don't
stop and continue with the rest of the checks (which require a valid
volume group). This raises an internal error and breaks verify disks.

This seems to have been unchanged for a long while; I don't know why it
surfaced only recently.

Reviewed-by: ultrotter

Prevent errors when xenvg is broken cluster verify

When vg_name is not returned at all, we currently abort with an internal
error. This is because we don't catch KeyError.

This patch adds a custom message for this case, and also adds KeyError
to the list of caught exceptions, just for safety.

On the other hand, we could also just remove this piece of code, since
the ["dfree"] value is not used at all.

Reviewed-by: ultrotter

A bunch of doc and other small fixes

This patch fixes a number of issues reported both externally and
internally:
  - missing SGML tags (Issue 54), report and patch by superdupont
  - wrong variable used in the init.d script, report and patch by
    Karsten Keil <karsten-keil@t-online.de>
  - man page for gnt-instance reinstall needs clarification (Issue 56)
  - gnt-instance man page missing --disks documentation for
    replace-disks
  - gnt-node modify help output is unclear about the -C/-D/-O input
    format, and the man page doesn't document this command at all
  - “gnt-node modify -C yes” for offline or drained nodes had wrong
    error message
  - “gnt-instance reinstall --select-os” has wrong prompt, we only
    accept a number for the OS and not the template name

Reviewed-by: ultrotter

Trivial typo fix in error message

Reviewed-by: iustinp

Release 2.0rc3

Burnin tests were successful, release rc3.

Reviewed-by: imsnah

Distribute built documentation

This patch changes the way documentation is built in order to distribute
the generated output in the 'dist' archive, and thus no longer
requiring the presence of the docbook/rst toolchains during build time.
This will lower the requirements for installation and also makes the
build time insignificant.

First, we remove the docbook2pdf rules and variables, since we no longer
build this kind of docs. Furthermore, the rst source files are not
(today) processed via replace_vars_sed, so the whole .in rules for doc/
go away.

Next, we change the ".sgml|.rst -> replace_vars_sed -> .in -> processor
-> final file" processing to ".sgml|.rst -> generator -> .in ->
replace_vars_sed -> final file"; this means we first process the file
using the formatter, with the @VARIABLE@ entries in it, and save the
output as .in; this output we distribute, and on the user side, the
replace_vars_sed will use the new configure flags to transform the
(almost final .in form) to the final form, without needing the
toolchain.

In configure.ac we also change from ERROR to WARN for the documentation
generators, and extra tests in Makefile.am check that the programs have
been found.

This was tested with distcheck and works as expected.

Reviewed-by: ultrotter
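
The final step of the new pipeline, turning a distributed .in file into the
final file, only needs plain text substitution of the @VARIABLE@ entries. The
sketch below mimics that step in Python purely for illustration; the real
build uses a sed-based replace_vars_sed rule, and the variable name in the
usage example is hypothetical.

```python
import re

PLACEHOLDER_RE = re.compile(r"@([A-Z_][A-Z0-9_]*)@")


def replace_vars(text, variables):
    """Substitute @NAME@ placeholders; unknown names are left untouched."""
    def _substitute(match):
        return variables.get(match.group(1), match.group(0))
    return PLACEHOLDER_RE.sub(_substitute, text)


if __name__ == "__main__":
    print(replace_vars("Logs are kept in @LOCALSTATEDIR@/log.",
                       {"LOCALSTATEDIR": "/var"}))
```

Since this step needs no document formatter, it can run on the user side
without the docbook/rst toolchains.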

Disable synchronous (locking) queries

This patch raises an error in the master daemon in case the user
requests a locking query; accordingly, all clients were modified to send
only lockless queries. This is a short-term fix; for a proper fix, the
clients should be modified to submit a job when the user requests a
locking query.

The other approach would be to ignore the flag passed by the client;
this would be worse, as clients wouldn't even get an error.

The possible impact of this is multiple:
  - some commands might not have been converted, and would thus fail;
    this can be remedied easily
  - the consistency of commands is lost; e.g. node failover will not
    lock the node *while we get the node info*, so we could miss some
    data; this is again in the thread of atomic operations which are
    missing in the current model of query-and-act from gnt-* scripts

Reviewed-by: imsnah, ultrotter

Fix the output of watcher on non-master nodes

Currently the watcher spews error messages on non-master nodes. This
cleans it up.

Reviewed-by: imsnah

Change the watcher to use jobs instead of queries

As per the mailing list discussion, this patch changes the watcher to
use a single job (two opcodes) for getting the cluster state (node list
and instance list); it will then compute the needed actions based on
this data.

The patch also archives this job and the verify-disks job.

Reviewed-by: imsnah

Fix Xen soft reboot via polling

This patch fixes the Xen soft reboot ("xm reboot") via polling for a specific
time for either changed domain ID or decreased CPU run-time.

This should prevent the race conditions discussed on the mailing list for
reboots.

Reviewed-by: imsnah
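
The polling approach can be sketched as a loop that repeatedly fetches the
domain information until either the domain ID has changed or the CPU run-time
has dropped, giving up after a timeout. All names, the default timeout and the
(domain_id, cpu_time) tuple layout are assumptions; the callable that queries
Xen is injected so that the example stays self-contained.

```python
import time


def wait_for_reboot(get_domain_info, old_id, old_cpu_time,
                    timeout=60.0, interval=1.0,
                    sleep=time.sleep, clock=time.time):
    """Return True once a reboot is detected, False on timeout.

    get_domain_info() must return (domain_id, cpu_time), or None while
    the domain is not visible.
    """
    deadline = clock() + timeout
    while True:
        info = get_domain_info()
        if info is not None:
            new_id, new_cpu_time = info
            # A new domain ID or a smaller CPU run-time both indicate
            # that the domain has been restarted.
            if new_id != old_id or new_cpu_time < old_cpu_time:
                return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Checking two independent signals covers both the case where the reboot
creates a new domain and the case where the ID is kept but the run-time
counter restarts.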

Add a new ssconf file with the cluster tags

Since the cluster tags are/should be more-or-less static, add them as an
ssconf key, so that querying them is possible without creating a
job/requiring the masterd to be running.

Reviewed-by: imsnah

Add some more debugging info to masterd

This patch will log data about queries, which are today completely
invisible (at the default log level) in the master log file.

Reviewed-by: imsnah

Release 2.0rc2

This updates the NEWS file and bumps up the version number.

Reviewed-by: ultrotter