ganeti-local
Bernardo Dal Seno [Mon, 5 Dec 2011 19:10:42 +0000 (20:10 +0100)]
Add new back-end parameter "always_failover"

Instances that have this parameter set to True are never migrated;
they can only fail over.  There are some cases where freezing the
kernel may cause problems, and hence this behavior is preferable.
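
A minimal sketch of the decision this parameter implies. The helper name
choose_move_method and the plain dict of back-end parameters are assumptions
made for illustration; this is not the actual cmdlib code:

```python
def choose_move_method(beparams, migration_requested=True):
    """Return "failover" or "migrate" for an instance (illustrative only)."""
    # Instances flagged with always_failover must never be live-migrated.
    if beparams.get("always_failover", False):
        return "failover"
    # Otherwise honour what the caller asked for.
    return "migrate" if migration_requested else "failover"
```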

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Bernardo Dal Seno [Wed, 30 Nov 2011 17:05:29 +0000 (18:05 +0100)]
manpages: Fix small errors in documentation

Mostly typos, except for the output of "gnt-instance migrate" in an
example, which has been updated to the current version.

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

René Nussbaumer [Mon, 28 Nov 2011 15:03:43 +0000 (16:03 +0100)]
gnt-cluster: Allow modify disk/hv state

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

René Nussbaumer [Mon, 28 Nov 2011 14:45:27 +0000 (15:45 +0100)]
gnt-group: Allow modify disk/hv state

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

René Nussbaumer [Fri, 25 Nov 2011 13:54:29 +0000 (14:54 +0100)]
gnt-node: Allow modify disk/hv state

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

René Nussbaumer [Wed, 30 Nov 2011 13:57:15 +0000 (14:57 +0100)]
cmdlib: Adding hv/disk state dict helper functions

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

René Nussbaumer [Fri, 25 Nov 2011 13:53:47 +0000 (14:53 +0100)]
cli: Add common command flags for hv/disk state

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

René Nussbaumer [Fri, 25 Nov 2011 10:49:41 +0000 (11:49 +0100)]
cmdlib: Adding _UpdateAndVerifySubDict helper

This helps with two-dimensional dicts, for example the hv_state and
disk_state dicts.
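
A rough sketch of what such a helper could look like, not the code added by
this patch. The function name, its signature and the parameter names
(mem_total, cpu_total) are assumptions for illustration:

```python
def update_and_verify_sub_dict(base, updates, type_check):
    """Return a copy of the two-level dict `base` with `updates` merged in.

    `type_check` maps each allowed inner key to the type its value must have.
    """
    # Copy both levels so the caller's dict is left untouched.
    result = {outer: dict(inner) for outer, inner in base.items()}
    for outer, inner_updates in updates.items():
        inner = result.setdefault(outer, {})
        for name, value in inner_updates.items():
            if name not in type_check:
                raise KeyError("unknown parameter %r in %r" % (name, outer))
            if not isinstance(value, type_check[name]):
                raise TypeError("parameter %r in %r has the wrong type"
                                % (name, outer))
            inner[name] = value
    return result
```

Verifying each inner key separately means an error can name both the outer
key (e.g. the hypervisor) and the offending parameter.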

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Wed, 7 Dec 2011 17:28:01 +0000 (18:28 +0100)]
Cleanup hlint errors

First, we update the recommended hlint version to what I used to get a
clean output (1.8.15). Most of the changes are:

- remove unneeded parentheses
- some simplifications (intercalate " " → unwords, maybe … id →
  fromMaybe, etc.)
- removal of some duplicate code (in previous patches)

There are still some warnings which I didn't clean up but simply
ignored:

- 'Eta reduce' in some specific files, because the type inference
  specialises the function on the first call, and annotating the type
  properly would be too verbose
- use of 'first', 'comparing', and 'on', since these don't seem to be
  widely or consistently used (outside ganeti/htools, I mean)
- use of Control.Exception.catch, as we only care about I/O errors; at
  some point we will need to transition to this new API
- 'Reduce duplication', since hlint warns even for 3 duplicate lines,
  and abstracting that away seems overkill to me

After this patch, make hlint is clean and doesn't exit with an error
anymore; we could enable it automatically on 'make lint' if hlint is
detected (future patch).

Note that we explicitly exclude the THH.hs file from checking because
hlint doesn't seem to parse the splice notation correctly for now.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Iustin Pop [Wed, 7 Dec 2011 17:22:19 +0000 (18:22 +0100)]
Abstract some common hspace code into a function

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Iustin Pop [Wed, 7 Dec 2011 17:21:15 +0000 (18:21 +0100)]
Abstract some common Cluster.hs code into a function

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Andrea Spadaccini [Tue, 6 Dec 2011 22:14:02 +0000 (22:14 +0000)]
Add DRBD dynamic resync speed params to design doc

* Expand the Name column of the table (for c-delay-target)
* Add the c-* DRBD parameters to the table containing the disk parameters
* Add the unit of measurement in square brackets, when needed
* Document the supported DRBD version, warn about the DRBD version
  needed for barriers and for the dynamic resync speed parameters.
* Add links to some documentation about the dynamic resync speed
  parameters

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Tue, 6 Dec 2011 15:39:32 +0000 (16:39 +0100)]
Convert opcode TH code to the use of Field type

This makes the field behaviour more explicit: previously an optional
field was detected via a "Maybe" constructor, and a default-valued one
via a "Just defval" one. With this, field behaviour becomes explicit
rather than auto-deduced.

In THH.hs, I slightly changed the fieldVariable function to use the
field name (if the field is not renamed), so that we have the exact
same output as before.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Iustin Pop [Tue, 6 Dec 2011 14:26:01 +0000 (15:26 +0100)]
Unify some file lists in Makefile.am

These were repeated needlessly; I hope I grouped them correctly.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

Andrea Spadaccini [Mon, 28 Nov 2011 18:17:04 +0000 (18:17 +0000)]
Add DRBD barriers disk parameters

Add the disk-barriers and meta-barriers parameters described in the
design doc.

constants.py:
* add the needed LD and DT-level parameters, use the defaults provided
  at ./configure time;
* add constants representing which barriers should be disabled and the
  set of valid options.

lib/bdev.py:
* factor the barriers handling code to a class method, for testing
  purposes;
* implement the more granular version checking logic;
* use the LD level parameters;
* add stricter check on DRBD version (8.0, 8.2 or 8.3), as we do not
  support 8.4 yet.
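
The stricter version check can be pictured roughly as follows. This is a
sketch under assumptions: the constant and function names are hypothetical,
and the real code in lib/bdev.py may obtain and represent the version
differently:

```python
# Only these major.minor series are accepted; 8.4 is deliberately absent.
SUPPORTED_DRBD_VERSIONS = frozenset([(8, 0), (8, 2), (8, 3)])


def check_drbd_version(version):
    """Return (major, minor) or raise ValueError for unsupported versions."""
    parts = version.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        raise ValueError("cannot parse DRBD version %r" % version)
    if (major, minor) not in SUPPORTED_DRBD_VERSIONS:
        raise ValueError("unsupported DRBD version %d.%d" % (major, minor))
    return major, minor
```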

lib/cmdlib.py:
* translate DT level parameters to LD level ones.

configure.ac, Makefile.am:
* set both disk and meta barriers parameters depending on the value of
  --enable-drbd-barriers.

test/ganeti.bdev_unittest.py:
* unit tests for the code that sets DRBD barrier parameters depending on
  the version.

doc/design-resource-model.rst:
* reword the description of meta-barriers;
* change all disk parameters names to use dashes instead of underscores.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Tue, 6 Dec 2011 09:44:27 +0000 (10:44 +0100)]
Style fixes on confd-client

Oops, forgot to check this before initial commit, sorry!

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Tue, 6 Dec 2011 09:57:03 +0000 (10:57 +0100)]
NEWS: Add missing space

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Iustin Pop [Sun, 20 Nov 2011 17:03:30 +0000 (18:03 +0100)]
htools: small change in error message in THH.hs

We should also display the value we can't parse, otherwise debugging
is very hard.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Iustin Pop [Fri, 25 Nov 2011 11:21:52 +0000 (12:21 +0100)]
htools: improvements to JSON deserialisation

This fixes two problems:

- first, when we deserialise a big object, showing its value is not
  useful, as it will hide the actual error message
- second, we shouldn't deserialise a container at once, because then
  we will lose the detail of which 'key' failed to deserialise; we
  change to manual deserialisation of each key/value pair, so that we
  can keep this information

The last point requires that we import JSON.hs into THH.hs, in order
not to duplicate functionality.
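
The per-key approach from the second point, transcribed to Python purely for
illustration (the patch itself is Haskell; the function name and the choice of
exceptions here are assumptions):

```python
def deserialise_container(raw, convert):
    """Convert every value of `raw`, naming the offending key on failure."""
    result = {}
    for key, value in raw.items():
        try:
            result[key] = convert(value)
        except (TypeError, ValueError) as err:
            # Report the key rather than the (possibly huge) value, so the
            # actual error message stays visible.
            raise ValueError("key %r: %s" % (key, err))
    return result
```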

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Iustin Pop [Fri, 18 Nov 2011 09:54:02 +0000 (10:54 +0100)]
htools: add new template haskell system

This system is based on explicit types instead of ad-hoc rules
(e.g. instead of deducing from "Maybe Int" an optional field, we now
can say explicitly OptionalField ''Int). In the first phase, this will
be used for the equivalent of lib/objects.py, which has slightly
different rules than luxi/opcodes.

We should look at merging the two systems later.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Iustin Pop [Tue, 22 Nov 2011 10:47:38 +0000 (11:47 +0100)]
Add a small confd client

This can be used to test live servers; currently there's no direct
way to interact with a confd server, except for burnin's builtin tests
(which were the source of this file).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Iustin Pop [Sun, 20 Nov 2011 00:42:11 +0000 (01:42 +0100)]
A few updates to the confd design (2.1)

While the 2.1 design is old and should be “immutable”, I can't find
documentation about the confd protocol anywhere else, so let's correct
the design doc.

The patch is mostly style changes, plus a clarification on the ‘query’
field of the request, which varies *a lot* per request type.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Michael Hanselmann [Thu, 1 Dec 2011 13:11:06 +0000 (14:11 +0100)]
cmdlib: Make use of cluster's new “primary_hypervisor” property

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 1 Dec 2011 12:59:25 +0000 (13:59 +0100)]
objects.Cluster: Add property for primary hypervisor

This is useful for working with a node's hypervisor state, where only
the primary hypervisor will be authoritative.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Mon, 28 Nov 2011 10:47:35 +0000 (10:47 +0000)]
LV stripes parameters for plain and drbd

configure.ac:
* change the documentation of --with-lvm-stripecount parameter to
  reflect the change

doc/design-resource-model.rst:
* change drbd/stripes to drbd/data-stripes and drbd/metastripes to
  drbd/meta-stripes

rest of files:
* add the plain/stripes, drbd/data-stripes and drbd/meta-stripes disk
  parameters

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Mon, 21 Nov 2011 14:51:11 +0000 (14:51 +0000)]
Add DRBD8 static resync speed disk parameter

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Wed, 23 Nov 2011 11:43:40 +0000 (11:43 +0000)]
Use disk parameters in Logical Units

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Mon, 21 Nov 2011 13:48:58 +0000 (13:48 +0000)]
Use disk parameters in noded

* add the params attribute to BlockDev, and add the corresponding
  parameter to all the BlockDev classes;
* change the Create, Assemble and FindDevice factory functions interface
  to accept as parameters an objects.Disk instance and a list of
  children block devices; update their callers;
* make the factory functions provide default values for params if
  needed;
* factor out a check in the block device factory functions.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Wed, 23 Nov 2011 22:53:34 +0000 (22:53 +0000)]
qa: add gnt-cluster tests related to disk params

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Mon, 21 Nov 2011 13:43:09 +0000 (13:43 +0000)]
Add basic support for disk parameters

objects.py:
  * add disk parameters to Disk, Cluster, NodeGroup.

constants.py:
  * add dictionaries that will hold types and default values for disk
    parameters (for now, empty).

test/ganeti.constants_unittest.py:
  * add unit tests for consistency in disk parameters default values.

rest of files:
  * add to gnt-cluster and gnt-group the options to manipulate disk
    parameters.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Wed, 30 Nov 2011 13:51:12 +0000 (14:51 +0100)]
More fixes after commit 78519c106

A quick QA run successfully finished with these changes.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

Michael Hanselmann [Wed, 30 Nov 2011 13:38:15 +0000 (14:38 +0100)]
Fix “node_info” RPC result

Commit 78519c106 broke everything. Here's the fix.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Mon, 28 Nov 2011 14:07:39 +0000 (15:07 +0100)]
query: Add fields for node's disk/hv state

These fields just return the node attribute's contents. They will be
used by the watcher to detect out of date node states.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Fri, 25 Nov 2011 10:33:43 +0000 (11:33 +0100)]
hv_xen: Report memory used by hypervisor

- Report memory used by hypervisor (“mem_hv” as per resource model
  design document, “xmem” in htools)
- Also report number of CPUs available to Dom0
- Some other, small changes

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 13:21:03 +0000 (14:21 +0100)]
hv_xen: Export number of CPUs for Dom0

This will be stored in the node object and used for calculations.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Tue, 22 Nov 2011 09:13:59 +0000 (10:13 +0100)]
Add objects for disk/hv state

- Data objects
- Serialization/deserialization
- Unittests

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Mon, 21 Nov 2011 10:34:03 +0000 (11:34 +0100)]
objects.Node: Add static hv/disk state

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 13:17:34 +0000 (14:17 +0100)]
hv_xen: Use constant for “Domain-0” name

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Tue, 29 Nov 2011 16:32:48 +0000 (17:32 +0100)]
Change “node_info” RPC to accept multiple VGs/hypervisors

Keeping the node state up to date will require information from multiple
VGs and hypervisors. Instead of requiring multiple calls this change
allows a single call to return all needed information. Existing users
are changed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Tue, 29 Nov 2011 15:36:26 +0000 (16:36 +0100)]
locking: Allow checking if lock is owned in certain mode

With this patch the “LockSet” and “GanetiLockManager” classes have a new
function to check if a single lock or a group of locks (at a certain level)
have been acquired in a specific mode. This will be used for additional
assertions. Until now they could only check if a lock has been acquired,
but not in which mode. One use-case will be updating the node state in
various places, where the node lock must be acquired in exclusive mode.
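
To illustrate the idea only: the class and method names below are
hypothetical and much simpler than the real “LockSet”. Recording the mode at
acquire time is what makes a mode-aware ownership check possible:

```python
class OwnedLocks:
    """Track which locks the current owner holds, and in which mode."""

    def __init__(self):
        self._owned = {}  # lock name -> True if shared, False if exclusive

    def acquire(self, name, shared=False):
        self._owned[name] = bool(shared)

    def release(self, name):
        del self._owned[name]

    def check_owned(self, names, shared=None):
        """Check ownership of one lock name or a list of names.

        With shared=None only ownership is checked; with True/False the
        lock must also be held in shared/exclusive mode respectively.
        """
        if isinstance(names, str):
            names = [names]
        for name in names:
            if name not in self._owned:
                return False
            if shared is not None and self._owned[name] != bool(shared):
                return False
        return True
```

An assertion such as "the node lock is held exclusively" then becomes
check_owned(node_name, shared=False).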

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 13:59:52 +0000 (14:59 +0100)]
Merge branch 'devel-2.5'

* devel-2.5:
  ConfigWriter: Fix epydoc error
  ConfigWriter: Fix epydoc error

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 12:22:28 +0000 (13:22 +0100)]
Merge branch 'devel-2.4' into devel-2.5

* devel-2.4:
  ConfigWriter: Fix epydoc error

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 12:15:46 +0000 (13:15 +0100)]
Merge branch 'stable-2.5' into devel-2.5

* stable-2.5:
  ConfigWriter: Fix epydoc error

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 12:02:36 +0000 (13:02 +0100)]
ConfigWriter: Fix epydoc error

The parameter is called “mods”, not “modes”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
(cherry picked from commit 1730d4a1ab56ef36d082b614d3d0ab13f3e14a85)

Michael Hanselmann [Thu, 24 Nov 2011 12:02:36 +0000 (13:02 +0100)]
ConfigWriter: Fix epydoc error

The parameter is called “mods”, not “modes”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 09:50:35 +0000 (10:50 +0100)]
Merge branch 'devel-2.5'

* devel-2.5:
  LUGroupAssignNodes: Fix node membership corruption
  LUGroupAssignNodes: Fix node membership corruption
  Fix pylint warning on unreachable code
  LUNodeEvacuate: Disallow migrating all instances at once
  Separate OpNodeEvacuate.mode from iallocator
  LUNodeEvacuate: Locking fixes
  Fix error when removing node

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 09:06:39 +0000 (10:06 +0100)]
Merge branch 'devel-2.4' into devel-2.5

* devel-2.4:
  LUGroupAssignNodes: Fix node membership corruption

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 08:40:10 +0000 (09:40 +0100)]
Merge branch 'stable-2.5' into devel-2.5

* stable-2.5:
  LUGroupAssignNodes: Fix node membership corruption
  Fix pylint warning on unreachable code
  LUNodeEvacuate: Disallow migrating all instances at once
  LUNodeEvacuate: Locking fixes
  Fix error when removing node

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 07:43:04 +0000 (08:43 +0100)]
LUGroupAssignNodes: Fix node membership corruption

Note: This bug only manifests itself in Ganeti 2.5, but since the
problematic code also exists in 2.4, I decided to fix it there.

If a node was assigned to a new group using “gnt-group assign-nodes” the
node object's group would be changed, but not the duplicate member list
in the group object. The latter is an optimization to require fewer
locks for other operations. The per-group member list is only kept in
memory and not written to disk.

Ganeti 2.5 starts to make use of the data kept in the per-group member
list and consequently fails when it is out of date. The following
commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was
confirmed using additional logging):

  $ gnt-group add foo
  $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
  $ gnt-cluster verify  # Fails with KeyError

This patch moves the code modifying node and group objects into
“config.ConfigWriter” to do the complete operation under the config
lock, and also to avoid making use of side-effects of modifying objects
without calling “ConfigWriter.Update”. A unittest is included.
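
A simplified model of the fix, assuming a toy Config class rather than the
real “config.ConfigWriter”: the node's group and the per-group member list
are updated together while the config lock is held, so they cannot drift
apart. All names here are illustrative:

```python
import threading


class Config:
    """Toy configuration holding node->group and group->members mappings."""

    def __init__(self, node_groups):
        self._lock = threading.Lock()
        self.node_group = dict(node_groups)  # node name -> group name
        self.members = {}                    # group name -> set of node names
        for node, group in self.node_group.items():
            self.members.setdefault(group, set()).add(node)

    def assign_group_nodes(self, mods):
        """Apply a list of (node, new_group) pairs under the config lock."""
        with self._lock:
            # Validate everything first so a bad entry changes nothing.
            for node, _ in mods:
                if node not in self.node_group:
                    raise KeyError("unknown node %r" % node)
            for node, new_group in mods:
                old_group = self.node_group[node]
                self.members[old_group].discard(node)
                self.members.setdefault(new_group, set()).add(node)
                self.node_group[node] = new_group
```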

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit 218f4c3de706aca7e4521d7e1975f517cf5ecb9b)

Michael Hanselmann [Thu, 24 Nov 2011 07:43:04 +0000 (08:43 +0100)]
LUGroupAssignNodes: Fix node membership corruption

Note: This bug only manifests itself in Ganeti 2.5, but since the
problematic code also exists in 2.4, I decided to fix it there.

If a node was assigned to a new group using “gnt-group assign-nodes” the
node object's group would be changed, but not the duplicate member list
in the group object. The latter is an optimization to require fewer
locks for other operations. The per-group member list is only kept in
memory and not written to disk.

Ganeti 2.5 starts to make use of the data kept in the per-group member
list and consequently fails when it is out of date. The following
commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was
confirmed using additional logging):

  $ gnt-group add foo
  $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
  $ gnt-cluster verify  # Fails with KeyError

This patch moves the code modifying node and group objects into
“config.ConfigWriter” to do the complete operation under the config
lock, and also to avoid making use of side-effects of modifying objects
without calling “ConfigWriter.Update”. A unittest is included.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 24 Nov 2011 07:58:56 +0000 (08:58 +0100)]
Fix pylint warning on unreachable code

Commit c50452c3186 added an exception when all instances should be
evacuated off a node, but did so in a way which made pylint complain
about unreachable code.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Wed, 23 Nov 2011 13:01:23 +0000 (14:01 +0100)]
LUNodeEvacuate: Disallow migrating all instances at once

There is a design issue in the iallocator interface which prevents us
from doing this.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Wed, 23 Nov 2011 12:40:46 +0000 (13:40 +0100)]
Separate OpNodeEvacuate.mode from iallocator

Until now the iallocator constants for node evacuation
(IALLOCATOR_NEVAC_*) were also used for the opcode. However, it turned
out this was due to a misunderstanding and is incorrect. This patch adds
new constants (with the same values) and changes the affected places.
Fortunately the RAPI client already used good names, so no changes are
necessary.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Wed, 23 Nov 2011 12:16:14 +0000 (13:16 +0100)]
LUNodeEvacuate: Locking fixes

When evacuating a node, only an assertion without informative text was
used to check if the necessary node locks had been acquired. This was on
top of evaluating the list of nodes without having a node group lock, so
this was changed as well.

Also update some exception messages to include “retry the operation”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Wed, 23 Nov 2011 07:15:18 +0000 (08:15 +0100)]
Fix error when removing node

ConfigWriter.GetAllInstancesInfo returns a dictionary, not a list.
Removing a node would fail with “too many values to unpack”.
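
The failure mode is easy to reproduce in isolation. The instance data below is
made up; only the dict-versus-list behaviour is the point:

```python
instances = {"instance1.example.com": {"primary_node": "node1"}}

# Wrong: iterating a dict yields only its keys, so each (string) key is
# unpacked character by character into the two loop variables.
try:
    for name, info in instances:
        pass
    error = None
except ValueError as err:
    error = str(err)

# Right: iterate over the (key, value) pairs explicitly.
pairs = [(name, info["primary_node"]) for name, info in instances.items()]
```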

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Guido Trotter [Mon, 21 Nov 2011 11:49:21 +0000 (11:49 +0000)]
manpages: update beparams explanations

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Tue, 22 Nov 2011 10:07:45 +0000 (10:07 +0000)]
constants: reindent a few dicts

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Mon, 21 Nov 2011 11:18:22 +0000 (11:18 +0000)]
Remove BE_MEMORY from beparams but keep compatibility

Queries are already compatible (be/memory is an alias for be/maxmem) and
import/exports work. This patch fixes it for cluster init, modify
and instance add/start/modify.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Mon, 21 Nov 2011 11:12:00 +0000 (11:12 +0000)]
burnin: use mem_size as max and min

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Mon, 21 Nov 2011 11:05:31 +0000 (11:05 +0000)]
unittests: use max/min memory

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Mon, 21 Nov 2011 10:56:50 +0000 (10:56 +0000)]
cmdlib: use MAXMEM for all operations

Since for now we can only start instances at their maximum memory, we
modify all checks to use that value. Once we have better support for
using a value in between, some of these checks will have to move to
minimum memory.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Mon, 21 Nov 2011 10:43:28 +0000 (10:43 +0000)]
qa: use maximum and minimum memory

Test modification of either parameter, but also both at once.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Mon, 21 Nov 2011 10:22:35 +0000 (10:22 +0000)]
hypervisors: use maximum memory for all operations

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Thu, 17 Nov 2011 14:09:32 +0000 (14:09 +0000)]
ImportExport: use max and min memory params

Import uses the old "memory" parameter to populate the two new ones, if
they're not overridden already.

FinalizeExport exports minmem and maxmem, but also memory, as maxmem, to
allow importing to older ganeti clusters.
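
The compatibility rules in both directions can be sketched like this. The
function names are hypothetical and back-end parameters are modelled as a
plain dict, which is a simplification of the real import/export code:

```python
def fill_memory_params(beparams):
    """Import side: derive minmem/maxmem from the old "memory" value."""
    result = dict(beparams)
    old = result.pop("memory", None)
    if old is not None:
        # Explicitly given new-style values take precedence.
        result.setdefault("maxmem", old)
        result.setdefault("minmem", old)
    return result


def export_memory_params(beparams):
    """Export side: also write "memory" (as maxmem) for older clusters."""
    result = dict(beparams)
    result["memory"] = result["maxmem"]
    return result
```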

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Thu, 17 Nov 2011 14:39:43 +0000 (14:39 +0000)]
Query: allow query on maximum and minimum memory

be/memory is kept as an alias.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Thu, 17 Nov 2011 14:12:44 +0000 (14:12 +0000)]
ShowInstanceConfig: show max and min memory

The old "memory" value is kept as maxmem, for now, for
retrocompatibility.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Thu, 17 Nov 2011 15:07:55 +0000 (15:07 +0000)]
instance hooks: pass maximum and minimum memory

Also pass the "memory" value for retrocompatibility, for now.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Guido Trotter [Thu, 10 Nov 2011 16:19:12 +0000 (16:19 +0000)]
beparams: add min/max memory values

For now the old "memory" parameter stays there, but it will be removed
later. The new values are just taken from the old one, in this patch.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Mon, 21 Nov 2011 10:43:28 +0000 (10:43 +0000)]
design-resource-model: update disk params section

Simplify design by moving all the parameters to disk template level,
explaining why this is sub-optimal. Add notes about DRBD versions,
corner cases and parameters application time.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Wed, 16 Nov 2011 11:31:18 +0000 (11:31 +0000)]
Set DRBD sync speed in DRBD8.Assemble

Instead of relying on clients of the class for setting the device speed
(and, in general, the DRBD parameters), move this responsibility inside
the Assemble method.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Mon, 21 Nov 2011 09:34:53 +0000 (10:34 +0100)]
build-rpc: Fail if call is defined more than once

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Andrea Spadaccini [Fri, 18 Nov 2011 12:07:27 +0000 (12:07 +0000)]
Reapply commit 2a6de57 after merge

In the last merge I erroneously discarded the changes introduced by
commit 2a6de57 "Check the results of master IP RPCs". This commit
reintroduces them.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoFix QA breakage caused by merge 0e82dcf9
Michael Hanselmann [Mon, 21 Nov 2011 07:22:24 +0000 (08:22 +0100)]
Fix QA breakage caused by merge 0e82dcf9

Patch tested and confirmed to work by Andrea Spadaccini
<spadaccio@google.com>.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

12 years agomasterd: Initialize job queue only after RPC client
Michael Hanselmann [Thu, 17 Nov 2011 11:08:32 +0000 (12:08 +0100)]
masterd: Initialize job queue only after RPC client

Otherwise jobs started after an unclean master shutdown will fail as
they depend on the RPC client.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agomasterd: Shutdown only once running jobs have been processed
Michael Hanselmann [Thu, 17 Nov 2011 11:07:57 +0000 (12:07 +0100)]
masterd: Shutdown only once running jobs have been processed

Until now, if masterd received a fatal signal, it would start shutting
down immediately. In the meantime it would hang while jobs were still
being processed, and clients could no longer connect to retrieve a job's
status.

With this patch, masterd checks whether any job is running before shutting
down. If there is, it checks again every five seconds. Once all jobs
are finished, it waits another five seconds to give clients a chance to
retrieve the jobs' status. After that, masterd shuts down cleanly.

If a second signal is received the old behaviour is preserved.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
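
The sequence described in this commit message can be sketched as below. This
is an illustration, not the masterd implementation: the function and its
parameters are hypothetical, and the second-signal case is not modelled.

```python
import time


def wait_for_jobs_then_shutdown(has_running_jobs, shutdown,
                                poll_interval=5.0, grace_period=5.0,
                                sleep=time.sleep):
    """Delay shutdown until no job is running, then allow a grace period.

    All parameter names are assumptions; `sleep` is injectable for testing.
    """
    # Re-check every poll_interval seconds while any job is still running.
    while has_running_jobs():
        sleep(poll_interval)
    # Give clients a last chance to retrieve the jobs' status.
    sleep(grace_period)
    shutdown()
```

Passing `sleep` as a parameter lets the polling logic be exercised without
actually waiting.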

12 years agodaemon: Support clean daemon shutdown
Michael Hanselmann [Thu, 17 Nov 2011 11:01:33 +0000 (12:01 +0100)]
daemon: Support clean daemon shutdown

Instead of aborting the main loop as soon as a fatal signal (SIGTERM or
SIGINT) is received, additional logic allows waiting for tasks to finish
while I/O is still being processed.

If no callback function is provided the old behaviour--shutting down
on the first signal--is preserved.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
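
A rough sketch of the two behaviours, assuming a much simplified main loop.
The class, the `can_stop` callback and its boolean return value are
hypothetical and do not reflect the actual daemon.Mainloop interface.

```python
class MainloopSketch:
    """Toy main loop that can defer shutdown until a callback allows it."""

    def __init__(self):
        self.fatal_signals = 0  # number of fatal signals seen so far

    def handle_fatal_signal(self):
        # Would be called from a SIGTERM/SIGINT handler.
        self.fatal_signals += 1

    def run(self, process_io, can_stop=None):
        """Run until shutdown is allowed; return the number of iterations."""
        iterations = 0
        while True:
            process_io()  # I/O keeps being processed while waiting
            iterations += 1
            if self.fatal_signals == 0:
                continue
            # Without a callback, stop on the first signal (old behaviour);
            # with one, keep looping until it reports that tasks are done.
            if can_stop is None or can_stop():
                return iterations
```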

12 years agodaemon: Allow custom maximum timeout for scheduler
Michael Hanselmann [Thu, 17 Nov 2011 10:56:34 +0000 (11:56 +0100)]
daemon: Allow custom maximum timeout for scheduler

This is needed in case the scheduler user (daemon.Mainloop in this case)
has other timeouts at the same time. Needed for clean master shutdown.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
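
One way to picture a capped scheduler timeout, as a sketch only. The helper
and its parameters are hypothetical and are not the scheduler's real
interface.

```python
def next_timeout(now, next_event_time, max_delay=None):
    """Return how long the loop may block, or None for "no limit".

    `next_event_time` is None when no event is scheduled; `max_delay`
    is the optional cap requested by the scheduler's user.
    """
    if next_event_time is None:
        timeout = None
    else:
        timeout = max(0.0, next_event_time - now)
    if max_delay is not None:
        # Never block longer than the caller's own deadline allows.
        timeout = max_delay if timeout is None else min(timeout, max_delay)
    return timeout
```

With a cap in place, the caller regains control at least every `max_delay`
seconds, which is what a periodic "are jobs still running?" check needs.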

12 years agojqueue: Add code to prepare for queue shutdown
Michael Hanselmann [Thu, 17 Nov 2011 10:55:18 +0000 (11:55 +0100)]
jqueue: Add code to prepare for queue shutdown

Doing so will prevent job submissions (similar to a drained queue),
but won't affect currently running jobs. No further jobs will be
executed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
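
The behaviour described above, sketched with a toy queue. The class and
method names are assumptions for illustration, not the jqueue API.

```python
class QueueSketch:
    """Toy job queue that can be prepared for shutdown."""

    def __init__(self):
        self.accepting = True
        self.pending = []
        self.running = []

    def prepare_shutdown(self):
        # From now on: no new submissions and no new job starts.
        self.accepting = False

    def submit(self, job):
        if not self.accepting:
            raise RuntimeError("queue is shutting down")
        self.pending.append(job)

    def start_next(self):
        """Move the next pending job to running, unless shutting down."""
        if not self.accepting or not self.pending:
            return None
        job = self.pending.pop(0)
        self.running.append(job)
        return job
```

Jobs already in `running` are left alone, so they can finish normally.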

12 years agoworkerpool: Export function to check for running tasks
Michael Hanselmann [Wed, 16 Nov 2011 11:35:07 +0000 (12:35 +0100)]
workerpool: Export function to check for running tasks

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agodaemon: Use counter instead of boolean for mainloop abortion
Michael Hanselmann [Wed, 16 Nov 2011 11:34:50 +0000 (12:34 +0100)]
daemon: Use counter instead of boolean for mainloop abortion

Also log a message when a fatal signal was received and use dict.items.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agohtools: adjust imports for newer compilers
Iustin Pop [Thu, 17 Nov 2011 14:40:18 +0000 (15:40 +0100)]
htools: adjust imports for newer compilers

While testing with ghc 7.2, I saw that some imports we are using are
very old (from the ghc 6.8 era), even though current libraries use
different names.

We fix this and bump the minimum documented version to ghc 6.12, as I
can no longer test with 6.10 (it possibly still works with that
version, but better safe than sorry; both Ubuntu Lucid and Debian
Squeeze ship with 6.12 nowadays).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoMerge branch 'devel-2.5'
Andrea Spadaccini [Fri, 18 Nov 2011 11:27:16 +0000 (11:27 +0000)]
Merge branch 'devel-2.5'

* devel-2.5: (24 commits)
  LUInstanceCreate: Release unused node locks
  htools: rework message display construction
  hbal: handle empty node groups
  Document OpNodeMigrate's result for RAPI
  Ensure unused ports return to the free port pool
  Re-wrap a paragraph to eliminate a sphinx warning
  Fix newer pylint's E0611 error in compat.py
  Fail if node/group evacuation can't evacuate instances
  Update init script description
  LUInstanceRename: Compare name with name
  LUClusterRepairDiskSizes: Acquire instance locks in exclusive mode
  Update synopsis for “gnt-cluster repair-disk-sizes”
  Move hooks PATH environment variable to constants
  Check the results of master IP RPCs
  Add documentation for the master IP hooks
  Add master IP turnup and turndown hooks
  Add RunLocalHooks decorator
  Generalize HooksMaster
  Update NEWS for 2.5.0~rc4
  Bump version to 2.5.0~rc4
  ...

Conflicts:
NEWS
doc/hooks.rst
lib/backend.py
lib/cmdlib.py
lib/constants.py

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoMerge branch 'stable-2.5' into devel-2.5
Michael Hanselmann [Fri, 18 Nov 2011 10:09:33 +0000 (11:09 +0100)]
Merge branch 'stable-2.5' into devel-2.5

* stable-2.5:
  htools: rework message display construction
  hbal: handle empty node groups
  Document OpNodeMigrate's result for RAPI
  Fail if node/group evacuation can't evacuate instances
  LUInstanceRename: Compare name with name
  LUClusterRepairDiskSizes: Acquire instance locks in exclusive mode
  Update NEWS for 2.5.0~rc4
  Bump version to 2.5.0~rc4
  jqueue: Allow zero jobs to be submitted at once
  hail: don't select the primary as new secondary
  hail: add an extra safety check in relocate
  Bump version to 2.5.0~rc3

Conflicts:
configure.ac: Trivial

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoMerge branch 'devel-2.4' into devel-2.5
Michael Hanselmann [Fri, 18 Nov 2011 07:27:30 +0000 (08:27 +0100)]
Merge branch 'devel-2.4' into devel-2.5

* devel-2.4:
  Ensure unused ports return to the free port pool
  Re-wrap a paragraph to eliminate a sphinx warning

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoadmin.rst update regarding offline state of the instance
Agata Murawska [Thu, 17 Nov 2011 14:59:39 +0000 (15:59 +0100)]
admin.rst update regarding offline state of the instance

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoNEWS update - offline instance state
Agata Murawska [Wed, 16 Nov 2011 15:59:37 +0000 (16:59 +0100)]
NEWS update - offline instance state

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoBackwards compatibility - added admin_up to query
Agata Murawska [Wed, 16 Nov 2011 16:08:49 +0000 (17:08 +0100)]
Backwards compatibility - added admin_up to query

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoMan page update: online/offline state of instance
Agata Murawska [Wed, 16 Nov 2011 15:41:15 +0000 (16:41 +0100)]
Man page update: online/offline state of instance

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdd small note in admin.rst about confd disabling
Iustin Pop [Thu, 17 Nov 2011 11:33:28 +0000 (12:33 +0100)]
Add small note in admin.rst about confd disabling

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoWarn if we enable maintain-node-health without confd
Iustin Pop [Thu, 17 Nov 2011 11:31:10 +0000 (12:31 +0100)]
Warn if we enable maintain-node-health without confd

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdapt daemon-util to ENABLE_CONFD
Iustin Pop [Thu, 17 Nov 2011 11:19:22 +0000 (12:19 +0100)]
Adapt daemon-util to ENABLE_CONFD

We still allow explicit shutdown of confd, but we prevent manual
or automatic start-up.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdapt watcher for ENABLE_CONFD
Iustin Pop [Thu, 17 Nov 2011 11:04:58 +0000 (12:04 +0100)]
Adapt watcher for ENABLE_CONFD

If confd is disabled, do not automatically restart it. Furthermore, we
can't run maintenance actions if it is disabled, so log a warning.

Note that I haven't completely disabled the NodeMaintenance class with
ENABLE_CONFD = False because I think they are at two different levels
(e.g. we might have other maintenance actions done even with confd
disabled).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoPrevent running of confd tests in burnin
Iustin Pop [Thu, 17 Nov 2011 10:55:03 +0000 (11:55 +0100)]
Prevent running of confd tests in burnin

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdd toggle for enabling/disabling confd
Iustin Pop [Thu, 17 Nov 2011 10:49:56 +0000 (11:49 +0100)]
Add toggle for enabling/disabling confd

Doesn't do anything yet.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoFix unittest bug related to offline instances
Iustin Pop [Thu, 17 Nov 2011 10:19:41 +0000 (11:19 +0100)]
Fix unittest bug related to offline instances

Currently, the code in Node.hs is overly strict: once a node's free
memory reaches 0, it will refuse to add any instances (offline or
not). I think this is a reasonable safeguard (I don't expect nodes to
run without at least 1MB of free memory), so rather than change this
behaviour we need to restrict the Node generation in the unittest to
skip such nodes.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

12 years agohtools: reindent the rest of the files
Iustin Pop [Wed, 16 Nov 2011 18:14:48 +0000 (19:14 +0100)]
htools: reindent the rest of the files

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: re-indent IAlloc.hs
Iustin Pop [Wed, 16 Nov 2011 17:53:47 +0000 (18:53 +0100)]
htools: re-indent IAlloc.hs

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: reindent hspace
Iustin Pop [Wed, 16 Nov 2011 17:19:42 +0000 (18:19 +0100)]
htools: reindent hspace

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: reindent hbal
Iustin Pop [Wed, 16 Nov 2011 17:18:16 +0000 (18:18 +0100)]
htools: reindent hbal

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: reindent CLI.hs
Iustin Pop [Wed, 16 Nov 2011 17:15:57 +0000 (18:15 +0100)]
htools: reindent CLI.hs

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>