ganeti-local
12 years agoAdd gnt-instance start --pause
Stephen Shirley [Thu, 23 Jun 2011 15:15:28 +0000 (17:15 +0200)]
Add gnt-instance start --pause

Creates the instance, but pauses execution before booting. This combined
with 'gnt-instance console' unpausing instances means that the entire
boot process can be viewed and monitored.

Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdding a wrapper around connecting to kvm console
Stephen Shirley [Fri, 24 Jun 2011 11:29:48 +0000 (13:29 +0200)]
Adding a wrapper around connecting to kvm console

The wrapper will connect to the console, and check in the background if
the instance is paused, unpausing it as necessary.

Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdding a wrapper around "xm console"
Stephen Shirley [Thu, 23 Jun 2011 09:42:23 +0000 (11:42 +0200)]
Adding a wrapper around "xm console"

The wrapper will connect to the console, and check in the background if
the instance is paused, unpausing it as necessary.

Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoFix lint error
Michael Hanselmann [Tue, 5 Jul 2011 22:54:42 +0000 (00:54 +0200)]
Fix lint error

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoRAPI: Document all feature strings
Michael Hanselmann [Mon, 27 Jun 2011 22:13:23 +0000 (00:13 +0200)]
RAPI: Document all feature strings

- Use constants and an assertion
- Update documentation for node migration

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoRemove old node evacuation opcode
Michael Hanselmann [Mon, 23 May 2011 13:58:29 +0000 (15:58 +0200)]
Remove old node evacuation opcode

LUNodeEvacStrategy has been replaced with LUNodeEvacuate.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

12 years agoChange RAPI for new node evacuation opcode
Michael Hanselmann [Mon, 23 May 2011 12:43:33 +0000 (14:43 +0200)]
Change RAPI for new node evacuation opcode

The change is not backwards compatible, see the updated NEWS file.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

12 years agoChange “gnt-node evacuate” to use new opcode
Michael Hanselmann [Fri, 20 May 2011 14:31:54 +0000 (16:31 +0200)]
Change “gnt-node evacuate” to use new opcode

By default it'll now evacuate all instances from the node, not
just secondaries.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

12 years agoAdd new opcode to evacuate node
Michael Hanselmann [Fri, 20 May 2011 13:30:33 +0000 (15:30 +0200)]
Add new opcode to evacuate node

This new opcode will replace LUNodeEvacStrategy, which used to return a
list of instances and new secondary nodes. With the new opcode the
iallocator (if available) is tasked to generate the necessary operations
in the form of opcodes. This moves some logic from the client to the
master daemon.

At the same time support is added to evacuate primary instances, which
are also evacuated by default.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

12 years agoAlias gnt-job show to gnt-job info
Guido Trotter [Tue, 5 Jul 2011 16:35:25 +0000 (17:35 +0100)]
Alias gnt-job show to gnt-job info

Am I the only one to make that mistake 10 times a week?

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoOne Haskell and integer sizes fix
Iustin Pop [Wed, 29 Jun 2011 13:06:12 +0000 (15:06 +0200)]
One Haskell and integer sizes fix

Haskell has two main integer types:

- Int, which is a native-type, and is guaranteed to have at least
  [-2²⁹, 2²⁹-1] range; on 64-bit platforms, it has much higher range
- Integer, which is a software type (implemented using libgmp), and
  thus unbounded

For performance reasons, the node/instance properties use Int for
their attributes (and Double for some, but that's another story). This
is all fine and doesn't cause problems. However, the CStats type which
holds the overall cluster resources starts to fail when we analyse
clusters with more than around 400 nodes and big memory/disk sizes on
32 bit platforms.

The simple fix would be to restrict cluster sizes, but that's not
nice. I've benchmarked and changing to Integer doesn't show a visible
slowdown on 64-bit platforms (as far as I can read on the internets,
GHC knows to optimise Integer and only use software types when the
values are large enough), and it also fixes the 32-bit problem. So
this patch changes the CStats types to Integer, except for the
instance count (which I don't expect to overflow 2²⁹ anytime soon).
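
As a rough illustration of the overflow (not part of the patch): Python
integers are unbounded, much like Haskell's Integer, so the wrap-around
of a 32-bit Int is simulated by hand. The helper wrap_int32, the unit
(MiB) and the per-node disk size are assumptions chosen for the example.

```python
def wrap_int32(value):
    """Simulate storing value in a signed 32-bit integer (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


# Assumed figures: 400 nodes, 6 TiB of disk each, sizes counted in MiB.
NODES = 400
DISK_PER_NODE_MIB = 6 * 1024 * 1024

total = NODES * DISK_PER_NODE_MIB   # unbounded arithmetic, like Integer
wrapped = wrap_int32(total)         # what a 32-bit wide Int would hold

print("exact total:  ", total)
print("32-bit result:", wrapped)
```

With these assumed sizes the cluster-wide sum exceeds 2³¹-1 and the
wrapped value turns negative, while every per-node value still fits.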

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoFix cluster verify for empty node groups
Iustin Pop [Fri, 1 Jul 2011 10:25:26 +0000 (12:25 +0200)]
Fix cluster verify for empty node groups

There were some implicit assertions in the code that all node groups
have nodes, which is not necessarily true.

Additionally, the patch does a wrapping change.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix a typo and punctuation in iallocator.rst
Iustin Pop [Thu, 30 Jun 2011 11:19:42 +0000 (13:19 +0200)]
Fix a typo and punctuation in iallocator.rst

Besides the 'dscription' typo, also make the punctuation more
consistent.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoFix htools, QuickCheck library detection and tests
Iustin Pop [Tue, 28 Jun 2011 16:12:54 +0000 (18:12 +0200)]
Fix htools, QuickCheck library detection and tests

Just saw this while testing the migration to QuickCheck v2: while
configure.ac detects that QuickCheck-2.x is not available, the test in
Makefile.am was against WANT_HTOOLS (overall htools compilation), not
on a more-specific WANT_HTOOLSTESTS.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agohtools: fix prop_Node_rMem corner case
Iustin Pop [Mon, 20 Jun 2011 13:29:52 +0000 (15:29 +0200)]
htools: fix prop_Node_rMem corner case

This patch fixes a bug in the test specification where we allowed nodes
with zero free memory (hence no instance can be added, at all) and adds
a simple labeling of the way this test can fail.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agohtools: fix corner case in prop_Text_Load_Instance
Iustin Pop [Mon, 20 Jun 2011 13:28:57 +0000 (15:28 +0200)]
htools: fix corner case in prop_Text_Load_Instance

This unittest had a corner case where it could fail if the same
primary/secondary node names were generated.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agohtools: allow unittest to be replayed
Iustin Pop [Mon, 20 Jun 2011 13:04:59 +0000 (16:04 +0300)]
htools: allow unittest to be replayed

This just adds glue to allow replaying of tests using a given RNG state
and test size (both are needed for exact replayability).
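
A minimal sketch of why both values are needed, using Python's random
module rather than QuickCheck; generate_case is a hypothetical generator
and not the glue added by this patch.

```python
import random


def generate_case(seed, size):
    """Build one pseudo-random test case from an RNG seed and a test size."""
    rng = random.Random(seed)
    # The size bounds both the number of values and their range, so the
    # same seed with a different size yields a different case.
    return [rng.randint(0, size) for _ in range(size)]


first = generate_case(seed=42, size=5)
replayed = generate_case(seed=42, size=5)
other_size = generate_case(seed=42, size=8)
```

Replaying with the same seed and size reproduces the case exactly;
changing either one does not.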

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohtools: improve more unittests
Iustin Pop [Sun, 19 Jun 2011 21:56:31 +0000 (00:56 +0300)]
htools: improve more unittests

Using new functionality in QuickCheck 2 (the suchThat function), we
now generate better test cases, such that (heh) we no longer have
incomplete tests.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohtools: more fixes to unsatisfiable tests
Iustin Pop [Sun, 19 Jun 2011 21:43:42 +0000 (00:43 +0300)]
htools: more fixes to unsatisfiable tests

Currently the way we generate nodes in some cases is by creating a
totally random node, then restricting the test based on whether the node
'size' (as defined by multiples of base unit) satisfies some high/low
rules. This results in hard-to-satisfy conditions, so we change this
model to be able to specify node sizes directly in the generation
process, thus no longer needing post-creation filters.
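
The difference between the two models can be sketched as follows. This
is a Python illustration with made-up bounds, not the htools generator
code; both function names are hypothetical.

```python
import random


def node_size_filtered(rng, low, high, max_tries=1000):
    """Old model: draw any size, then discard draws outside [low, high]."""
    for tries in range(1, max_tries + 1):
        size = rng.randint(0, 10 ** 6)
        if low <= size <= high:
            return size, tries
    # The condition was too hard to satisfy within the allowed tries.
    return None, max_tries


def node_size_direct(rng, low, high):
    """New model: the bounds are part of the generation itself."""
    return rng.randint(low, high)


rng = random.Random(1)
filtered, tries = node_size_filtered(rng, 100, 200)
direct = node_size_direct(rng, 100, 200)
```

The filtered variant may exhaust its tries without producing a usable
node, whereas the direct variant always yields a valid size.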

This fixes prop_ClusterAllocBalance which before had at most 1-2
satisfiable tests.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohtools: Rework some unittests
Iustin Pop [Sun, 19 Jun 2011 21:11:51 +0000 (00:11 +0300)]
htools: Rework some unittests

The new scaffolding which replaced the batch driver of QuickCheck 1 now
shows how many passes we have for incomplete tests. Some tests show very
low pass counts, so we rework them to have more actually valid test
cases.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agohtools: Switch to QuickCheck 2.x
Iustin Pop [Sun, 19 Jun 2011 20:46:15 +0000 (23:46 +0300)]
htools: Switch to QuickCheck 2.x

Since current distros no longer package QuickCheck 1.x, let's move
to 2.x.

This also requires a few changes to the code:

- Test.QuickCheck.Batch doesn't exist anymore, so we need to write some
  scaffolding code to replace it
- the way test sizes are generated has changed, and we need to restrict
  (in some tests) the cluster size, as our code is not yet ready for
  hundreds of thousands of nodes in a cluster and we run out of stack
  (which could be a bug somewhere by itself, needs investigation)
- at least with GHC 7, floating point errors make a perfect cluster
  score even bigger, so we need to bump up the max. rounding error
  allowed

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohtools: some lint fixes
Iustin Pop [Sun, 19 Jun 2011 10:38:32 +0000 (13:38 +0300)]
htools: some lint fixes

Removal of duplicate parentheses, removal of extra 'do', conversion from
nested if to guards, use hierarchical imports. All per hlint.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agohtools: rewrite Cluster.filterMGResults
Iustin Pop [Sun, 19 Jun 2011 02:46:58 +0000 (05:46 +0300)]
htools: rewrite Cluster.filterMGResults

filterMGResults was built using a sequence of map and filter calls;
while this was logically correct, it used some incomplete pattern
matching which with the new GHC 7 triggers a warning.

The patch rewrites it using a single foldl that does both the filtering
and the mapping, in a more type-safe way.
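
For readers unfamiliar with the idiom, a single fold that filters and
maps in one pass might look like this. The sketch is in Python using
functools.reduce; the data shape (a group name paired with either None
or a dict holding a "score") is an assumption and does not mirror the
real Haskell types.

```python
from functools import reduce


def filter_group_results(results):
    """Keep successful results and extract their score in one pass."""
    def step(acc, item):
        group, outcome = item
        if outcome is None:
            # Failed group: filtered out, nothing to map.
            return acc
        acc.append((group, outcome["score"]))
        return acc
    return reduce(step, results, [])


sample = [("g1", {"score": 0.5}), ("g2", None), ("g3", {"score": 0.1})]
good = filter_group_results(sample)
```

Because each element is inspected exactly once, there is no intermediate
list whose shape a later step has to assume.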

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohtools: use the new Group.isAllocable
Iustin Pop [Sun, 19 Jun 2011 02:48:05 +0000 (05:48 +0300)]
htools: use the new Group.isAllocable

… instead of the hardcoded test against AllocUnallocable.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agohtools: add a helper function
Iustin Pop [Sat, 18 Jun 2011 15:32:39 +0000 (17:32 +0200)]
htools: add a helper function

… that checks if a group is allocable.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoEnable using ghc parallel version 3
Guido Trotter [Thu, 9 Jun 2011 09:28:08 +0000 (10:28 +0100)]
Enable using ghc parallel version 3

Currently htools cannot be compiled under sid because the parallel
Haskell library is version 3. Using it issues a few warnings, but
compiles and passes unit tests. Ship it?

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'devel-2.4'
Guido Trotter [Thu, 23 Jun 2011 13:56:00 +0000 (14:56 +0100)]
Merge branch 'devel-2.4'

* devel-2.4:
  LUInstanceCreate: use opcodes.RequireFileStorage
  Don't add ",boot=on" to disks on kvm >= 0.14
  KVM: fix per-instance stored UID value

Conflicts:
lib/cmdlib.py
          - use RequireSharedFileStorage there

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoremove bootstrap._InitSharedFileStorage
Guido Trotter [Fri, 17 Jun 2011 14:27:36 +0000 (17:27 +0300)]
remove bootstrap._InitSharedFileStorage

This function is a copy of bootstrap._InitFileStorage with the following
differences:
  - check constants.ENABLE_SHARED_FILE_STORAGE and not
    constants.ENABLE_FILE_STORAGE
  - use different local variable names
  - one different error string

Thus:
  - move the constant check outside of the function call
  - change error string so it's clear where the error is
  - call the same function twice

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoLUInstanceCreate: use opcodes.RequireFileStorage
Guido Trotter [Wed, 22 Jun 2011 13:14:27 +0000 (14:14 +0100)]
LUInstanceCreate: use opcodes.RequireFileStorage

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoDon't add ",boot=on" to disks on kvm >= 0.14
Guido Trotter [Thu, 9 Jun 2011 09:01:30 +0000 (09:01 +0000)]
Don't add ",boot=on" to disks on kvm >= 0.14

Under newer kvm this prevents the vm from starting.
Ah, change!

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoKVM: fix per-instance stored UID value
Apollon Oikonomopoulos [Wed, 22 Jun 2011 15:41:29 +0000 (18:41 +0300)]
KVM: fix per-instance stored UID value

When using the pool security model, _ExecuteKVMRuntime was storing the
instance's UID using str(uid), which would result in storing the
LockedUid.__repr__() result:

 $ cat /var/run/ganeti/kvm-hypervisor/uid/xxxxxxxxxxxxx
 <ganeti.uidpool.LockedUid object at 0x1f30610>

This patch restores the intended behaviour, by using LockedUid.AsStr().
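
The failure mode can be reproduced with a small stand-in class. The
class below is a hypothetical simplification, not the real
ganeti.uidpool.LockedUid; only the AsStr() method name is taken from
the description above.

```python
class LockedUid(object):
    """Hypothetical stand-in holding a numeric UID."""

    def __init__(self, uid):
        self._uid = uid

    def AsStr(self):
        """Return the numeric UID as a string."""
        return str(self._uid)


uid = LockedUid(1234)

# No __str__ is defined, so str() falls back to the default object repr.
buggy = str(uid)
# The explicit accessor returns the value that should be stored.
fixed = uid.AsStr()

print(buggy)
print(fixed)
```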

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoMerge branch 'devel-2.4'
Guido Trotter [Fri, 17 Jun 2011 12:57:41 +0000 (15:57 +0300)]
Merge branch 'devel-2.4'

* devel-2.4:
  Add one forgotten element to the file disk path

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoAdd one forgotten element to the file disk path
Guido Trotter [Fri, 17 Jun 2011 11:44:55 +0000 (14:44 +0300)]
Add one forgotten element to the file disk path

This was left out during the fix/refactoring

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMerge branch 'devel-2.4'
Guido Trotter [Fri, 17 Jun 2011 11:30:51 +0000 (14:30 +0300)]
Merge branch 'devel-2.4'

* devel-2.4:
  LUInstanceCreate: fix file storage dir calculation
  Check that filestorage is enabled when requested
  Remove self.op.file_storage_dir isabs check

Conflicts:
lib/cmdlib.py
          - use constants.DTS_FILEBASED
          - handle DT_SHARED_FILE correctly

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoAdd DTS_FILEBASED constant
Guido Trotter [Fri, 17 Jun 2011 09:19:44 +0000 (12:19 +0300)]
Add DTS_FILEBASED constant

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoLUInstanceCreate: fix file storage dir calculation
Guido Trotter [Fri, 17 Jun 2011 09:39:43 +0000 (12:39 +0300)]
LUInstanceCreate: fix file storage dir calculation

- Move the calculation to the beginning of CheckPrereq, since it doesn't
  modify any state, but still keeps locks
- Only perform the calculation if the actual disk template is filebased
- Error out if there is no defined file storage dir
- Only join the optional --file-storage-dir extra-path if one is passed
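
The rules listed above could be sketched roughly as below. The function
name, its parameters and the template values are assumptions for
illustration; only the DTS_FILEBASED constant name appears in this log.

```python
import posixpath

# Assumed values; the real constant lives in Ganeti's constants module.
DTS_FILEBASED = frozenset(["file", "sharedfile"])


def compute_file_storage_dir(disk_template, cluster_dir, extra_path=None):
    """Return the storage dir for file-based templates, else None."""
    if disk_template not in DTS_FILEBASED:
        # Nothing to calculate for other disk templates.
        return None
    if not cluster_dir:
        raise ValueError("Cluster file storage dir not defined")
    if extra_path:
        # extra_path is expected to be relative to cluster_dir.
        return posixpath.join(cluster_dir, extra_path)
    return cluster_dir
```

The function only reads its arguments, which is what makes it safe to
run early in the prerequisite check.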

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoCheck that filestorage is enabled when requested
Guido Trotter [Fri, 17 Jun 2011 09:23:51 +0000 (12:23 +0300)]
Check that filestorage is enabled when requested

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoRemove self.op.file_storage_dir isabs check
Guido Trotter [Fri, 17 Jun 2011 09:16:58 +0000 (09:16 +0000)]
Remove self.op.file_storage_dir isabs check

As the manpage says, and the code does, self.op.file_storage_dir is an
additional relative path under the cluster file storage dir. As such it
should not be absolute.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agohtools live test: test instance selection as well
Guido Trotter [Fri, 10 Jun 2011 13:24:49 +0000 (14:24 +0100)]
htools live test: test instance selection as well

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years ago--select-instances hbal manpage update
Guido Trotter [Fri, 10 Jun 2011 12:27:12 +0000 (12:27 +0000)]
--select-instances hbal manpage update

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoCheck that the selected instances are known
Guido Trotter [Mon, 13 Jun 2011 12:01:08 +0000 (12:01 +0000)]
Check that the selected instances are known

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoLoader.updateMovable: evaluate selected instances
Guido Trotter [Mon, 13 Jun 2011 11:44:58 +0000 (11:44 +0000)]
Loader.updateMovable: evaluate selected instances

This also adds docstrings for the function arguments and renames exinst
to exinsts, which is how it is called in other functions, since it's a
list.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd instance selection list to Loader.mergeData
Guido Trotter [Fri, 10 Jun 2011 13:45:09 +0000 (14:45 +0100)]
Add instance selection list to Loader.mergeData

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd --select-instances hbal flag
Guido Trotter [Fri, 10 Jun 2011 13:44:30 +0000 (14:44 +0100)]
Add --select-instances hbal flag

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoRemove double whitespace in help string
Guido Trotter [Fri, 10 Jun 2011 13:30:10 +0000 (14:30 +0100)]
Remove double whitespace in help string

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd gnt-network design doc
Apollon Oikonomopoulos [Thu, 16 Jun 2011 12:15:30 +0000 (15:15 +0300)]
Add gnt-network design doc

This design covers high level network block definition and pool
management.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoReplace iallocator's mreloc w/ change-group and node-evac
Michael Hanselmann [Tue, 14 Jun 2011 16:27:06 +0000 (18:27 +0200)]
Replace iallocator's mreloc w/ change-group and node-evac

This patch removes all occurrences of the “multi-relocate” iallocator
mode. Commit 25ee7fd845 updated the design document and introduced
separate modes, “change-group” and “node-evacuate”. The constants aren't
removed yet as they're still used by htools.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix a couple of typos
Stephen Shirley [Mon, 6 Jun 2011 08:59:46 +0000 (10:59 +0200)]
Fix a couple of typos

Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMakefile: Add version check for iallocator.rst
Michael Hanselmann [Mon, 6 Jun 2011 15:18:30 +0000 (17:18 +0200)]
Makefile: Add version check for iallocator.rst

iallocator.rst contains the Ganeti version at the top.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoUpdate iallocator design for node group-aware operations
Michael Hanselmann [Mon, 6 Jun 2011 15:10:43 +0000 (17:10 +0200)]
Update iallocator design for node group-aware operations

A while ago a new ``multi-relocate`` mode was proposed and documented.
As it turned out, the interface had some deficiencies. With this patch
the relocation modes are reduced to two and split into separate
iallocator request modes: node-evacuate and change-group. Some request
and response requirements are clarified in the documentation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agojqueue: Allow loading of archived jobs
Michael Hanselmann [Wed, 1 Jun 2011 15:44:56 +0000 (17:44 +0200)]
jqueue: Allow loading of archived jobs

Chained jobs need to look at previous jobs, including archived ones. A
nice side-effect of this change is the ability to look at archived jobs
using “gnt-job info <id>” as long as the ID is known.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdding basic abstraction layer for caching
René Nussbaumer [Tue, 31 May 2011 09:54:23 +0000 (11:54 +0200)]
Adding basic abstraction layer for caching

This includes our own simple cache implementation and an
interface to a memcache instance.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoFix _checkRsaPrivateKey for newer key generation
Guido Trotter [Thu, 9 Jun 2011 10:03:39 +0000 (10:03 +0000)]
Fix _checkRsaPrivateKey for newer key generation

Keys generated under debian sid just read "BEGIN PRIVATE KEY" rather
than "BEGIN RSA PRIVATE KEY".
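
A check accepting both header forms might look like the sketch below.
The function name and structure are hypothetical and do not reproduce
the actual _checkRsaPrivateKey code.

```python
# Both PEM header variants mentioned above.
_PRIVATE_KEY_HEADERS = (
    "-----BEGIN RSA PRIVATE KEY-----",
    "-----BEGIN PRIVATE KEY-----",
)


def looks_like_private_key(pem_text):
    """Return True if the first non-empty line is a known key header."""
    lines = pem_text.strip().splitlines()
    return bool(lines) and lines[0].strip() in _PRIVATE_KEY_HEADERS
```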

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoFix locking issues in LUClusterVerifyGroup
Michael Hanselmann [Tue, 7 Jun 2011 06:48:55 +0000 (08:48 +0200)]
Fix locking issues in LUClusterVerifyGroup

- Use functions in ConfigWriter instead of custom loops
- Calculate nodes only once instance locks are acquired, removes one
  potential race condition
- Don't retrieve lists of all node/instance information without locks
- Additionally move the end of the node time check window after the
  first RPC call--the second call isn't involved in checking the
  node time at all

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agocmdlib: Acquire BGL for LUClusterVerifyConfig
Michael Hanselmann [Tue, 7 Jun 2011 05:17:09 +0000 (07:17 +0200)]
cmdlib: Acquire BGL for LUClusterVerifyConfig

LUClusterVerifyConfig verifies a number of configuration settings. For
doing so, it needs a consistent list of nodes, groups and instances. So
far no locks were acquired at all (except for the BGL in shared mode).
This is a race condition (e.g. if a node group is added in parallel) and
can be fixed by acquiring the BGL in exclusive mode. Since this LU
verifies the cluster-wide configuration, doing so instead of acquiring
individual locks is justified.

Includes one typo fix and one docstring update.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoExport/import instance tags
Michael Hanselmann [Tue, 7 Jun 2011 10:46:05 +0000 (12:46 +0200)]
Export/import instance tags

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix issue with tags on instance creation
Michael Hanselmann [Tue, 7 Jun 2011 10:45:53 +0000 (12:45 +0200)]
Fix issue with tags on instance creation

Commit 720f56c85a added the ability to specify tags when creating an
instance. The “tags” attribute of an instance object needs to be a set,
but the patch's code saved it as a list, causing breakage in other parts
of Ganeti. This patch changes the code to use TaggableObject.AddTag,
which has a nice side-effect of doing some verification (including max.
number of tags). Instance import was also broken (no “tags” attribute in
options).
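
To illustrate why going through AddTag helps, here is a simplified
stand-in. The class body, the validation rules and the MAX_TAGS value
are assumptions; only the TaggableObject.AddTag name comes from the
message above.

```python
class TaggableObject(object):
    """Hypothetical, simplified stand-in for Ganeti's TaggableObject."""

    MAX_TAGS = 3  # assumed limit, for illustration only

    def __init__(self):
        # Tags are kept as a set, never as a list.
        self.tags = set()

    def AddTag(self, tag):
        """Validate a tag and add it to the set."""
        if not isinstance(tag, str) or not tag:
            raise ValueError("Invalid tag: %r" % (tag,))
        if tag not in self.tags and len(self.tags) >= self.MAX_TAGS:
            raise ValueError("Too many tags")
        self.tags.add(tag)


inst = TaggableObject()
for tag in ["web", "prod", "web"]:
    inst.AddTag(tag)
```

Assigning the incoming list directly would skip both the type
guarantee and the verification.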

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix incomplete merge
Iustin Pop [Fri, 3 Jun 2011 09:53:04 +0000 (11:53 +0200)]
Fix incomplete merge

Commit 66bd7445 changed the semantics of _JobProcessor on finished
jobs, and updated the related unittests in the 2.4 branch. It was then
merged to master, however on master there was an additional test for
this case, which was not updated.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoExport instance tags to instance hooks
Apollon Oikonomopoulos [Tue, 31 May 2011 12:50:28 +0000 (15:50 +0300)]
Export instance tags to instance hooks

Instance hooks now get an INSTANCE_TAGS environment variable, which contains a
space-delimited list of the affected instance's tags.
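
A hook written in Python could read the variable along these lines.
This is a sketch only: the helper name is made up, and the exact
variable name seen by a hook (for example with a prefix) is an
assumption to check against the hooks documentation.

```python
import os


def instance_tags(environ=None, key="INSTANCE_TAGS"):
    """Return the instance's tags as a list, empty if none are set."""
    environ = os.environ if environ is None else environ
    # The value is space-delimited, so a plain split() is enough.
    return environ.get(key, "").split()


if __name__ == "__main__":
    for tag in instance_tags():
        print(tag)
```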

Also update the documentation to reflect the change.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd tagging option to gnt-instance create
Apollon Oikonomopoulos [Tue, 31 May 2011 12:49:58 +0000 (15:49 +0300)]
Add tagging option to gnt-instance create

Add TAG_ADD_OPT option to cli.py and use it in gnt-instance. Modify
cli.GenericInstanceCreate() accordingly.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd tag handling to {Op,LU}InstanceCreate
Apollon Oikonomopoulos [Tue, 31 May 2011 12:49:28 +0000 (15:49 +0300)]
Add tag handling to {Op,LU}InstanceCreate

Add a tag slot to opcodes.OpInstanceCreate. We do not reuse _PTags, as this is
intended for OpTagsSet and thus:

  a) is not documented
  b) does not carry a default value, making it mandatory

Also pass the tags to the iallocator during instance creation.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agohttp.client: Make debug log less noisy
Michael Hanselmann [Thu, 26 May 2011 14:40:50 +0000 (16:40 +0200)]
http.client: Make debug log less noisy

The HTTP client code generates quite a lot of debug log messages. With
this patch they're hidden unless explicitly enabled in the code.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'devel-2.4'
Michael Hanselmann [Wed, 1 Jun 2011 16:10:06 +0000 (18:10 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  jqueue: Fix potential race condition when cancelling queued jobs
  Fix argument order in ReserveLV and ReserveMAC

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agohtools: introduce a type alias for JSON objects
Iustin Pop [Wed, 1 Jun 2011 14:51:39 +0000 (16:51 +0200)]
htools: introduce a type alias for JSON objects

This makes the type definitions a bit more readable/simpler.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agohail: stop using old-style 'nodes' key
Iustin Pop [Mon, 30 May 2011 12:13:13 +0000 (14:13 +0200)]
hail: stop using old-style 'nodes' key

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohail: add parsing of multi-relocate request
Iustin Pop [Mon, 30 May 2011 11:50:49 +0000 (13:50 +0200)]
hail: add parsing of multi-relocate request

This is not handled yet, this patch just adds parsing of the incoming
request.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohail: add option for displaying the parsed request
Iustin Pop [Mon, 30 May 2011 08:43:50 +0000 (10:43 +0200)]
hail: add option for displaying the parsed request

This can be used for debugging.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agohail: add new data types for the multi-reloc mode
Iustin Pop [Thu, 26 May 2011 13:24:36 +0000 (15:24 +0200)]
hail: add new data types for the multi-reloc mode

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoAdd --no-instance-moves to the htools live tests
Guido Trotter [Tue, 31 May 2011 11:14:26 +0000 (13:14 +0200)]
Add --no-instance-moves to the htools live tests

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoUpdate hbal manpage for --no-instance-moves
Guido Trotter [Tue, 31 May 2011 10:38:50 +0000 (12:38 +0200)]
Update hbal manpage for --no-instance-moves

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoImplement balancing with no instance moves
Guido Trotter [Tue, 31 May 2011 14:57:51 +0000 (16:57 +0200)]
Implement balancing with no instance moves

Note that --no-disk-moves and --no-instance-moves are not incompatible,
but if both are used no solution can possibly exist.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoPass the instance moves option in hbal
Guido Trotter [Tue, 31 May 2011 10:34:19 +0000 (10:34 +0000)]
Pass the instance moves option in hbal

While still being ignored, now it gets passed down to the iteration
function.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd --no-instance-moves cli htools option
Guido Trotter [Mon, 30 May 2011 14:10:13 +0000 (16:10 +0200)]
Add --no-instance-moves cli htools option

This option doesn't currently do anything.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agojqueue: Fix potential race condition when cancelling queued jobs
Michael Hanselmann [Tue, 31 May 2011 14:49:45 +0000 (16:49 +0200)]
jqueue: Fix potential race condition when cancelling queued jobs

When a job was cancelled, its status would be changed and the file
written again. Since this was a final status, the job file could be
moved anytime for archival. If the job was still in the queue, however,
it would be processed (not fully, just updating the “end_timestamp”
attribute) and written again. This was bad as it could leave the same
job in two different files.

With this patch the processor is changed to return early for finished
jobs. Cancelling a queued job will finalize it right away. Unittests are
updated.
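
The early return can be sketched as follows. The status names, the Job
class and process_job are hypothetical simplifications, not the real
_JobProcessor code.

```python
# Assumed names for statuses after which a job must not change anymore.
FINAL_STATUSES = frozenset(["success", "error", "canceled"])


class Job(object):
    """Minimal job record that counts how often it is written to disk."""

    def __init__(self, status):
        self.status = status
        self.end_timestamp = None
        self.writes = 0


def process_job(job, now):
    """Process a job; return False without writing if it is already final."""
    if job.status in FINAL_STATUSES:
        # A finished job may already have been moved for archival, so it
        # must not be updated or written again.
        return False
    job.status = "success"
    job.end_timestamp = now
    job.writes += 1
    return True


cancelled = Job("canceled")
queued = Job("queued")
touched_cancelled = process_job(cancelled, now=100)
touched_queued = process_job(queued, now=100)
```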

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoiallocator: add ht-checking for the request
Iustin Pop [Mon, 30 May 2011 10:52:22 +0000 (12:52 +0200)]
iallocator: add ht-checking for the request

Currently, we only ht-check the result value from the iallocator, and
we send whatever we happen to check manually in the LUs that call the
iallocator.

This is not good, as we have to duplicate checks in many places, and
still we might miss checks. So we add ht information to the
per-request variables. As the cluster data is built in one place (the
iallocator code itself) and is more consistent, I didn't add checks
to that too.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoiallocator: rename mem_size to memory
Iustin Pop [Mon, 30 May 2011 11:14:18 +0000 (13:14 +0200)]
iallocator: rename mem_size to memory

Currently, the iallocator in 'allocate' requires mem_size on input
but serialises that as 'memory'. This inconsistency makes it hard to
automatically validate the parameters, hence this patch renames
mem_size.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoiallocator: change default for target_groups
Iustin Pop [Mon, 30 May 2011 09:56:45 +0000 (11:56 +0200)]
iallocator: change default for target_groups

Per the design doc, the target_groups request key "if present, it must
either be the empty list, or contain a list of group UUIDs". Currently
it defaults to None/null, which is not valid.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoiallocator: export the hypervisor value
Iustin Pop [Mon, 30 May 2011 11:24:39 +0000 (13:24 +0200)]
iallocator: export the hypervisor value

In 'allocate' mode, the documentation specifies that we export the
hypervisor value (“Allocation needs, in addition: … hypervisor, the
hypervisor of this instance”) and we need that on input, however we
don't actually export it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoiallocator: fix incomplete refactoring
Iustin Pop [Mon, 30 May 2011 10:54:44 +0000 (12:54 +0200)]
iallocator: fix incomplete refactoring

Commit fdbe29ee changed the iallocator modes from 'r'/'w' to
'ro'/'rw', but forgot one check in LUTestAllocator. This patch just
completes the replacements.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agognt-node migrate: Use LU-generated jobs
Michael Hanselmann [Mon, 23 May 2011 16:55:50 +0000 (18:55 +0200)]
gnt-node migrate: Use LU-generated jobs

Until now LUNodeMigrate used multiple tasklets to evacuate all primary
instances on a node. In some cases it would acquire all node locks,
which isn't good on big clusters. With upcoming improvements to the LUs
for instance failover and migration, switching to separate jobs looks
like a better option. This patch changes LUNodeMigrate to use
LU-generated jobs.

While working on this patch, I identified a race condition in
LUNodeMigrate.ExpandNames. A node's instances were retrieved without a
lock and no verification was done.

For RAPI, a new feature string is added and can be used to detect
clusters which support more parameters for node migration. The client
is updated.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix argument order in ReserveLV and ReserveMAC
Apollon Oikonomopoulos [Mon, 30 May 2011 11:02:25 +0000 (14:02 +0300)]
Fix argument order in ReserveLV and ReserveMAC

ConfigWriter.ReserveLV() and ConfigWriter.ReserveMAC() called
TemporaryReservationManager.Reserve() with the ec_id and resource arguments
swapped. As a result, two reservation attempts for the same resource type
within the same LU would fail, even if the resources requested were different,
e.g.:

  $ gnt-instance add -t sharedfile -o debootstrap+default \
       --net 0:mac=00:01:02:03:04:00 \
       --net 1:mac=00:01:02:03:04:ff \
       --disk 0:size=2g  test_instance
  Failure: prerequisites not met for this operation:
  error type: resource_not_unique, error details:
  MAC address 00:01:02:03:04:ff already in use in cluster

This patch fixes the argument order in the call to Reserve().
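
The effect of the swapped arguments can be reproduced with a small model. The class and method names follow the description above, but the bodies are a simplified assumption and not the actual code; `reserve_mac_swapped` and `reserve_mac_fixed` are hypothetical helpers.

```python
# Simplified illustration; the method bodies are assumed, not copied.
class ReservationError(Exception):
    pass


class TemporaryReservationManager(object):
    def __init__(self):
        self._ec_reserved = {}  # ec_id -> set of reserved resources

    def Reserved(self, resource):
        return any(resource in held for held in self._ec_reserved.values())

    def Reserve(self, ec_id, resource):
        if self.Reserved(resource):
            raise ReservationError("%s already in use" % resource)
        self._ec_reserved.setdefault(ec_id, set()).add(resource)


def reserve_mac_swapped(manager, mac, ec_id):
    # Bug: the ec_id ends up being treated as the resource.
    manager.Reserve(mac, ec_id)


def reserve_mac_fixed(manager, mac, ec_id):
    manager.Reserve(ec_id, mac)
```

With the swapped call, the second reservation made under the same ec_id is rejected although the two MAC addresses differ.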

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoht: Accept both int and long as integers
Michael Hanselmann [Mon, 30 May 2011 11:02:14 +0000 (13:02 +0200)]
ht: Accept both int and long as integers

This fixes a unittest failure on 32 bit systems. A recently added
unittest for ht.TJobId uses a rather large number (2347625220). On 64
bit systems it is stored as “int”. On 32 bit systems however, Python
uses “long”. The two types can be intermixed in Python as the
interpreter will take care of conversions. Without this fix, once more
than 2**31 jobs had been processed on a 32 bit system, ht would no longer
accept the job IDs.
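
A minimal sketch of a check that accepts both types follows. `is_integer` is a hypothetical stand-in for the ht check; the try/except is only there so the example also runs on Python 3, which has no separate “long” type.

```python
# Python 2 has separate "int" and "long" types; Python 3 only has "int".
try:
    _INTEGER_TYPES = (int, long)  # Python 2
except NameError:
    _INTEGER_TYPES = (int,)  # Python 3


def is_integer(value):
    """Hypothetical stand-in for an ht-style integer check."""
    return isinstance(value, _INTEGER_TYPES)


# The number from the unittest does not fit into a signed 32 bit integer.
LARGE_JOB_ID = 2347625220
FITS_IN_32_BIT = LARGE_JOB_ID <= 2**31 - 1
```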

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoDesign doc for CPU pinning
Tsachy Shacham [Wed, 18 May 2011 17:00:00 +0000 (19:00 +0200)]
Design doc for CPU pinning

Signed-off-by: Tsachy Shacham <tsachy@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoht: Add checks for anything, regexp, job ID, container items
Michael Hanselmann [Fri, 27 May 2011 10:49:39 +0000 (12:49 +0200)]
ht: Add checks for anything, regexp, job ID, container items

The check for container items is useful for tuples and/or lists with
non-uniform values. The “anything” check can be used when any value
should be accepted for an item.

The job ID check, which uses the regexp check, will be used for
expressing opcode dependencies on other jobs.
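
The checks can be pictured as small composable predicates. The sketch below uses hypothetical function names and an assumed job ID pattern; it is not the ht module's actual API, and requiring an exact length match in `items` is a choice made only for this example.

```python
import re


def anything(_value):
    """Accept any value."""
    return True


def regexp(pattern):
    """Build a check that accepts strings matching the pattern."""
    compiled = re.compile(pattern)

    def check(value):
        return isinstance(value, str) and compiled.match(value) is not None
    return check


def items(checks):
    """Build a check that applies one check per container position."""
    def check(value):
        return (isinstance(value, (list, tuple)) and
                len(value) == len(checks) and
                all(fn(item) for fn, item in zip(checks, value)))
    return check


# Assumed pattern for illustration: job IDs as non-empty digit strings.
job_id = regexp(r"^[0-9]+$")

# A dependency expressed as (job ID, arbitrary extra value).
dependency = items([job_id, anything])
```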

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'devel-2.4'
Michael Hanselmann [Thu, 26 May 2011 12:57:34 +0000 (14:57 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  TLReplaceDisks: Move assertion checking locks

Conflicts:
lib/cmdlib.py: Trivial

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoTLReplaceDisks: Move assertion checking locks
Michael Hanselmann [Thu, 26 May 2011 12:36:50 +0000 (14:36 +0200)]
TLReplaceDisks: Move assertion checking locks

Commit 1bee66f3 added assertions for ensuring only the necessary locks
are kept while replacing disks. One of them makes sure locks have been
released during the operation. Unfortunately the commit added the check
as part of a “finally” branch, which is also run when an exception is
thrown (in which case the locks may not have been released yet). Errors
could be masked by the assertion error. Moving the check out of the
“finally” branch fixes the issue.
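
The masking effect is plain Python behaviour and can be shown with a small example. The function names and the `locks` set are hypothetical; only the placement of the check relative to “finally” mirrors the description above.

```python
def replace_disks_masking(locks, fail):
    try:
        if fail:
            raise RuntimeError("disk replacement failed")
        locks.clear()  # locks are released on the successful path only
    finally:
        # Also runs when the body raised, so this error hides the real one.
        if locks:
            raise AssertionError("locks still held")


def replace_disks_fixed(locks, fail):
    try:
        if fail:
            raise RuntimeError("disk replacement failed")
        locks.clear()
    finally:
        pass  # cleanup only, no checks here
    # Reached only if no exception was raised above.
    if locks:
        raise AssertionError("locks still held")
```

In the first variant a failure surfaces as an AssertionError; in the second the original RuntimeError reaches the caller.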

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agocli.JobExecutor: Handle empty name, allow adding job IDs
Michael Hanselmann [Fri, 20 May 2011 13:17:46 +0000 (15:17 +0200)]
cli.JobExecutor: Handle empty name, allow adding job IDs

With LU-generated jobs only the ID is known.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agocli.JobExecutor: Use counter for indexing jobs
Michael Hanselmann [Wed, 25 May 2011 15:32:56 +0000 (17:32 +0200)]
cli.JobExecutor: Use counter for indexing jobs

If “SubmitPending” were mixed with calls to “QueueJob”, jobs in the
internal structures would get duplicate indices. With this change each
queued job is assigned a unique index, which will be used for sorting
the results.
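
One way to hand out unique indices is a shared counter, sketched below. Only the method names “QueueJob” and “SubmitPending” come from the description above; the class name, its attributes and the fake job IDs are assumptions for illustration.

```python
import itertools


class JobExecutorSketch(object):
    """Hypothetical, heavily simplified model of a job executor."""

    def __init__(self):
        self._counter = itertools.count()  # one counter for all jobs
        self.queue = []  # (index, name) not yet submitted
        self.jobs = []   # (index, job_id, name) already submitted

    def QueueJob(self, name):
        self.queue.append((next(self._counter), name))

    def SubmitPending(self):
        for idx, name in self.queue:
            # A real executor would submit the job; here the ID is faked.
            self.jobs.append((idx, "job-%d" % idx, name))
        self.queue = []

    def SortedResults(self):
        return [name for _idx, _job_id, name in sorted(self.jobs)]
```

Because the counter is never reset, indices stay unique even when submissions and further queueing are interleaved.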

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix bug in LUNodeMigrate
Michael Hanselmann [Wed, 25 May 2011 14:45:24 +0000 (16:45 +0200)]
Fix bug in LUNodeMigrate

Commit aac4511a added CheckArguments to LUNodeMigrate with a call to
_CheckIAllocatorOrNode. When no default iallocator is defined,
evacuating a node would always fail:

$ gnt-node migrate node123
Migrate instance(s) '...'?
y/[n]/?: y
Failure: prerequisites not met for this operation:
No iallocator or node given and no cluster-wide default iallocator
found; please specify either an iallocator or a node, or set a
cluster-wide default iallocator

This patch adds a new parameter to specify a target node. This doesn't
solve all issues, but will make the most important cases work again in
the meantime. This opcode will receive more work for node group support.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoconfig: Add method to get members of nodes' groups
Michael Hanselmann [Fri, 20 May 2011 13:35:28 +0000 (15:35 +0200)]
config: Add method to get members of nodes' groups

This will be used for locking during node evacuation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoYet another attempt to fix builds
Iustin Pop [Wed, 25 May 2011 08:54:34 +0000 (10:54 +0200)]
Yet another attempt to fix builds

It seems that abs_top_srcdir is not a good option, so I tested again
with just using the same as in doc/examples/bash_completion.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix build breakage
Iustin Pop [Tue, 24 May 2011 16:46:39 +0000 (18:46 +0200)]
Fix build breakage

Sorry, I already had PYTHONPATH exported in my env, and as I said I
wasn't able to test this on buildbot.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoMerge branch 'devel-2.4'
Michael Hanselmann [Tue, 24 May 2011 16:50:13 +0000 (18:50 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  node evac: don't call IAllocator if no instances

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agonode evac: don't call IAllocator if no instances
Iustin Pop [Tue, 24 May 2011 09:29:39 +0000 (11:29 +0200)]
node evac: don't call IAllocator if no instances

Currently we generate an empty list only for the '-n node' invocation,
but for iallocator we still call the iallocator (which needs an RPC
call, etc.). By moving the computation of instances outside of the if
block, we can return early from the LU.
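
The control flow can be sketched as follows. The function `evacuate_node` and its parameters are hypothetical and the iallocator is modelled as a plain callable; this is not the LU's actual code.

```python
def evacuate_node(instances_on_node, iallocator=None, target_node=None):
    """Return (instance, target) pairs for evacuating a node."""
    # Computed once, outside of the iallocator/node branch.
    instances = list(instances_on_node)
    if not instances:
        return []  # nothing to do; the iallocator is never called

    if iallocator is not None:
        return iallocator(instances)
    return [(name, target_node) for name in instances]
```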

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoMerge branch 'devel-2.4'
Michael Hanselmann [Tue, 24 May 2011 16:35:56 +0000 (18:35 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  RPC/Backend: Make UploadFile uid and gid agnostic
  Resolve uid/gid upon mainloop run
  GetEntResolver: Make it possible to resolve uid/gid to name
  utils.algo: Add InvertDict to invert a dict
  autotools: Add noded group

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agognt-debug: rename allocator to iallocator
Iustin Pop [Tue, 24 May 2011 09:34:32 +0000 (11:34 +0200)]
gnt-debug: rename allocator to iallocator

I'm always confused by this strange difference, so let's rename the
command to match what it tests.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoMisc other conversions
Iustin Pop [Thu, 19 May 2011 16:44:51 +0000 (18:44 +0200)]
Misc other conversions

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoConvert job status strings to constants
Iustin Pop [Thu, 19 May 2011 16:44:37 +0000 (18:44 +0200)]
Convert job status strings to constants

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoConvert group policies to constants
Iustin Pop [Thu, 19 May 2011 16:29:06 +0000 (18:29 +0200)]
Convert group policies to constants

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoReplace instance states hardcoded with constants
Iustin Pop [Thu, 19 May 2011 16:21:13 +0000 (18:21 +0200)]
Replace instance states hardcoded with constants

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>