Stephen Shirley [Thu, 23 Jun 2011 15:15:28 +0000 (17:15 +0200)]
Add gnt-instance start --pause
Creates the instance, but pauses execution before booting. This combined
with 'gnt-instance console' unpausing instances means that the entire
boot process can be viewed and monitored.
Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Stephen Shirley [Fri, 24 Jun 2011 11:29:48 +0000 (13:29 +0200)]
Adding a wrapper around connecting to kvm console
The wrapper will connect to the console, and check in the background if
the instance is paused, unpausing it as necessary.
Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Stephen Shirley [Thu, 23 Jun 2011 09:42:23 +0000 (11:42 +0200)]
Adding a wrapper around "xm console"
The wrapper will connect to the console, and check in the background if
the instance is paused, unpausing it as necessary.
Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 5 Jul 2011 22:54:42 +0000 (00:54 +0200)]
Fix lint error
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 27 Jun 2011 22:13:23 +0000 (00:13 +0200)]
RAPI: Document all feature strings
- Use constants and an assertion
- Update documentation for node migration
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 23 May 2011 13:58:29 +0000 (15:58 +0200)]
Remove old node evacuation opcode
LUNodeEvacStrategy has been replaced with LUNodeEvacuate.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Mon, 23 May 2011 12:43:33 +0000 (14:43 +0200)]
Change RAPI for new node evacuation opcode
The change is not backwards compatible, see the updated NEWS file.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Fri, 20 May 2011 14:31:54 +0000 (16:31 +0200)]
Change “gnt-node evacuate” to use new opcode
By default it'll now evacuate all instances from the node, not
just secondaries.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Fri, 20 May 2011 13:30:33 +0000 (15:30 +0200)]
Add new opcode to evacuate node
This new opcode will replace LUNodeEvacStrategy, which used to return a
list of instances and new secondary nodes. With the new opcode the
iallocator (if available) is tasked to generate the necessary operations
in the form of opcodes. This moves some logic from the client to the
master daemon.
At the same time support is added to evacuate primary instances, which
are also evacuated by default.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Tue, 5 Jul 2011 16:35:25 +0000 (17:35 +0100)]
Alias gnt-job show to gnt-job info
Am I the only one to make that mistake 10 times a week?
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 29 Jun 2011 13:06:12 +0000 (15:06 +0200)]
One Haskell and integer sizes fix
Haskell has two main integer types:
- Int, which is a native-type, and is guaranteed to have at least
[-2²⁹, 2²⁹-1] range; on 64-bit platforms, it has much higher range
- Integer, which is a software type (implemented using libgmp), and
thus unbounded
For performance reasons, the node/instance properties use Int for
their attributes (and Double for some, but that's another story). This
is all fine and doesn't cause problems. However, the CStats type which
holds the overall cluster resources starts to fail when we analyse
clusters with more than around 400 nodes and big memory/disk sizes on
32 bit platforms.
The simple fix would be to restrict cluster sizes, but that's not
nice. I've benchmarked and changing to Integer doesn't show a visible
slowdown on 64-bit platforms (as far as I can read on the internets,
GHC knows to optimise Integer and only use software types when the
values are large enough), and it also fixes the 32-bit problem. So
this patch changes the CStats types to Integer, except for the
instance count (which I don't expect to overflow 2²⁹ anytime soon).
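As a rough illustration of the overflow, here is a Python sketch that emulates a 32-bit signed accumulator next to an unbounded one. The node count and per-node disk size are made-up figures, `wrap_int32` is a hypothetical helper, and none of this is htools code.

```python
def wrap_int32(value):
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


# Hypothetical cluster: 400 nodes, each with 8 TiB of disk expressed in MiB.
NODES = 400
DISK_MIB_PER_NODE = 8 * 1024 * 1024

bounded_total = 0    # behaves like a 32-bit native integer
unbounded_total = 0  # Python integers are arbitrary precision

for _ in range(NODES):
    bounded_total = wrap_int32(bounded_total + DISK_MIB_PER_NODE)
    unbounded_total += DISK_MIB_PER_NODE
```

The bounded total ends up negative while the unbounded one stays correct, which is the failure mode the switch to Integer avoids.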
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 1 Jul 2011 10:25:26 +0000 (12:25 +0200)]
Fix cluster verify for empty node groups
There were some implicit assertions in the code that all node groups
have nodes, which is not necessarily true.
Additionally, the patch does a wrapping change.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Thu, 30 Jun 2011 11:19:42 +0000 (13:19 +0200)]
Fix a typo and punctuation in iallocator.rst
Besides the 'dscription' typo, also make the punctuation more
consistent.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 28 Jun 2011 16:12:54 +0000 (18:12 +0200)]
Fix htools, QuickCheck library detection and tests
Just saw this while testing the migration to QuickCheck v2: while
configure.ac detects that QuickCheck-2.x is not available, the test in
Makefile.am was against WANT_HTOOLS (overall htools compilation), not
on a more-specific WANT_HTOOLSTESTS.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 20 Jun 2011 13:29:52 +0000 (15:29 +0200)]
htools: fix prop_Node_rMem corner case
This patch fixes a bug in the test specification where we allowed nodes
with zero free memory (hence no instance can be added, at all) and adds
a simple labeling of the way this test can fail.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 20 Jun 2011 13:28:57 +0000 (15:28 +0200)]
htools: fix corner case in prop_Text_Load_Instance
This unittest had a corner case where it could fail if the same
primary/secondary node names were generated.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 20 Jun 2011 13:04:59 +0000 (16:04 +0300)]
htools: allow unittest to be replayed
This just adds glue to allow replaying of tests using a given RNG state
and test size (both are needed for exact replayability).
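A minimal sketch of why both values are needed, written in Python rather than Haskell and using the standard `random` module as a stand-in for QuickCheck's generator; `generate_case` is a hypothetical name.

```python
import random


def generate_case(seed, size):
    """Build a test input that depends on both the RNG seed and the size."""
    rng = random.Random(seed)
    return [rng.randint(0, size) for _ in range(size)]


first = generate_case(seed=1234, size=8)
replayed = generate_case(seed=1234, size=8)
other_size = generate_case(seed=1234, size=9)
```

Only the same seed and the same size together are guaranteed to reproduce the input.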
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jun 2011 21:56:31 +0000 (00:56 +0300)]
htools: improve more unittests
Using new functionality in QuickCheck 2 (the suchThat function), we
now generate better test cases, such that (heh) we no longer have
incomplete tests.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jun 2011 21:43:42 +0000 (00:43 +0300)]
htools: more fixes to unsatisfiable tests
Currently the way we generate nodes in some cases is by creating a
totally random node, then restricting the test based on whether the node
'size' (as defined by multiples of base unit) satisfies some high/low
rules. This results in hard-to-satisfy conditions, so we change this
model to be able to specify node sizes directly in the generation
process, thus no longer needing post-creation filters.
This fixes prop_ClusterAllocBalance which before had at most 1-2
satisfiable tests.
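The difference between the two generation models can be sketched in Python (not the actual Haskell generators; the function names and the size bounds are invented for illustration):

```python
import random


def random_then_filter(rng, low, high, attempts):
    """Old model: draw unconstrained sizes, then discard those out of range."""
    drawn = (rng.randint(0, 1000000) for _ in range(attempts))
    return [size for size in drawn if low <= size <= high]


def generate_in_range(rng, low, high, attempts):
    """New model: pass the bounds to the generator, so every draw is usable."""
    return [rng.randint(low, high) for _ in range(attempts)]


rng = random.Random(0)
filtered = random_then_filter(rng, 10, 20, 100)
direct = generate_in_range(rng, 10, 20, 100)
```

With a narrow range the filtered list keeps only a small fraction of the attempts, while direct generation yields one valid case per attempt.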
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jun 2011 21:11:51 +0000 (00:11 +0300)]
htools: Rework some unittests
The new scaffolding which replaced the batch driver of QuickCheck 1 now
shows how many passes we have for incomplete tests. Some tests show very
low pass counts, so we rework them to have more actually valid test
cases.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 19 Jun 2011 20:46:15 +0000 (23:46 +0300)]
htools: Switch to QuickCheck 2.x
Since current distros no longer package QuickCheck 1.x, let's move
to 2.x.
This requires also a few changes to the code:
- Test.QuickCheck.Batch doesn't exist anymore, so we need to write some
scaffolding code to replace it
- the way test sizes are generated has changed, and we need to restrict
(in some tests) the cluster size, as our code is not yet ready for
hundreds of thousands of nodes in a cluster and we run out of stack
(which could be a bug somewhere by itself, needs investigation)
- at least with GHC 7, floating point errors make a perfect cluster
score even bigger, so we need to bump up the max. rounding error
allowed
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jun 2011 10:38:32 +0000 (13:38 +0300)]
htools: some lint fixes
Removal of duplicate parentheses, removal of extra 'do', conversion from
nested if to guards, use hierarchical imports. All per hlint.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 19 Jun 2011 02:46:58 +0000 (05:46 +0300)]
htools: rewrite Cluster.filterMGResults
filterMGResults was built using a sequence of map and filter calls;
while this was logically correct, it used some incomplete pattern
matching which with the new GHC 7 triggers a warning.
The patch rewrites it using a single foldl that does both the filtering
and the mapping, in a more type-safe way.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 19 Jun 2011 02:48:05 +0000 (05:48 +0300)]
htools: use the new Group.isAllocable
… instead of the hardcoded test against AllocUnallocable.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sat, 18 Jun 2011 15:32:39 +0000 (17:32 +0200)]
htools: add a helper function
… that checks if a group is allocable.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 9 Jun 2011 09:28:08 +0000 (10:28 +0100)]
Enable using ghc parallel version 3
Currently htools cannot be compiled under sid because the parallel
haskell library is version 3. Using it issues a few warnings, but
compiles and passes unit tests. Ship it?
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Thu, 23 Jun 2011 13:56:00 +0000 (14:56 +0100)]
Merge branch 'devel-2.4'
* devel-2.4:
LUInstanceCreate: use opcodes.RequireFileStorage
Don't add ",boot=on" to disks on kvm >= 0.14
KVM: fix per-instance stored UID value
Conflicts:
lib/cmdlib.py
- use RequireSharedFileStorage there
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 17 Jun 2011 14:27:36 +0000 (17:27 +0300)]
remove bootstrap._InitSharedFileStorage
This function is a copy of bootstrap._InitFileStorage with the following
differences:
- check constants.ENABLE_SHARED_FILE_STORAGE and not
constants.ENABLE_FILE_STORAGE
- use different local variable names
- one different error string
Thus:
- move the constant check outside of the function call
- change error string so it's clear where the error is
- call the same function twice
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 22 Jun 2011 13:14:27 +0000 (14:14 +0100)]
LUInstanceCreate: use opcodes.RequireFileStorage
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 9 Jun 2011 09:01:30 +0000 (09:01 +0000)]
Don't add ",boot=on" to disks on kvm >= 0.14
Under newer kvm this prevents the vm from starting.
Ah, change!
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Apollon Oikonomopoulos [Wed, 22 Jun 2011 15:41:29 +0000 (18:41 +0300)]
KVM: fix per-instance stored UID value
When using the pool security model, _ExecuteKVMRuntime was storing the
instance's UID using str(uid), which would result in storing the
LockedUid.__repr__() result:
$ cat /var/run/ganeti/kvm-hypervisor/uid/xxxxxxxxxxxxx
<ganeti.uidpool.LockedUid object at 0x1f30610>
This patch restores the intended behaviour, by using LockedUid.AsStr().
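The effect is easy to reproduce with a stand-in class. This sketch is not the real ganeti.uidpool.LockedUid; it only assumes an object that defines AsStr() but no __str__:

```python
class LockedUid(object):
    """Stand-in with an AsStr() method and no custom __str__/__repr__."""

    def __init__(self, uid):
        self._uid = uid

    def AsStr(self):
        return "%s" % self._uid


uid = LockedUid(1234)
broken = str(uid)    # falls back to the default object representation
fixed = uid.AsStr()  # the numeric UID as a string
```

Here `broken` is the default "<... LockedUid object at 0x...>" text, while `fixed` is "1234".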
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Fri, 17 Jun 2011 12:57:41 +0000 (15:57 +0300)]
Merge branch 'devel-2.4'
* devel-2.4:
Add one forgotten element to the file disk path
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 17 Jun 2011 11:44:55 +0000 (14:44 +0300)]
Add one forgotten element to the file disk path
This was left out during the fix/refactoring
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 17 Jun 2011 11:30:51 +0000 (14:30 +0300)]
Merge branch 'devel-2.4'
* devel-2.4:
LUInstanceCreate: fix file storage dir calculation
Check that filestorage is enabled when requested
Remove self.op.file_storage_dir isabs check
Conflicts:
lib/cmdlib.py
- use constants.DTS_FILEBASED
- handle DT_SHARED_FILE correctly
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 17 Jun 2011 09:19:44 +0000 (12:19 +0300)]
Add DTS_FILEBASED constant
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 17 Jun 2011 09:39:43 +0000 (12:39 +0300)]
LUInstanceCreate: fix file storage dir calculation
- Move the calculation at the beginning of CheckPrereq, since it doesn't
modify any state, but still keeps locks
- Only perform the calculation if the actual disk template is filebased
- Error out if there is no defined file storage dir
- Only join the optional --file-storage-dir extra-path if one is passed
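The checks listed above could look roughly like the following Python sketch. It is not the LUInstanceCreate code: the constant values, `PrereqError` and `compute_file_storage_dir` are assumptions made for illustration.

```python
import posixpath

# Assumed stand-ins for the constants named in these commits.
DT_FILE = "file"
DT_SHARED_FILE = "sharedfile"
DTS_FILEBASED = frozenset([DT_FILE, DT_SHARED_FILE])


class PrereqError(Exception):
    """Hypothetical error type for unmet prerequisites."""


def compute_file_storage_dir(disk_template, cluster_dir, extra_path=None):
    # Only file-based templates need a storage directory at all.
    if disk_template not in DTS_FILEBASED:
        return None
    # Error out if the cluster has no file storage directory defined.
    if not cluster_dir:
        raise PrereqError("Cluster file storage dir not defined")
    # Join the optional extra path only if one was passed.
    if extra_path:
        return posixpath.join(cluster_dir, extra_path)
    return cluster_dir
```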
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 17 Jun 2011 09:23:51 +0000 (12:23 +0300)]
Check that filestorage is enabled when requested
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 17 Jun 2011 09:16:58 +0000 (09:16 +0000)]
Remove self.op.file_storage_dir isabs check
As the manpage says, and the code does, self.op.file_storage_dir is an
additional relative path under the cluster file storage dir. As such it
should not be absolute.
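Why a relative path is expected can be seen from how path joining treats absolute components. A small Python illustration, using `posixpath` so it behaves the same on any platform; the directory names are invented.

```python
import posixpath

cluster_dir = "/srv/ganeti/file-storage"  # invented example path

# A relative extra path is appended below the cluster directory.
relative = posixpath.join(cluster_dir, "customers/acme")

# An absolute extra path discards everything joined before it.
absolute = posixpath.join(cluster_dir, "/tmp/elsewhere")
```

`relative` stays under the cluster directory, while the absolute argument replaces it entirely.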
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 10 Jun 2011 13:24:49 +0000 (14:24 +0100)]
htools live test: test instance selection as well
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 10 Jun 2011 12:27:12 +0000 (12:27 +0000)]
--select-instances hbal manpage update
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 13 Jun 2011 12:01:08 +0000 (12:01 +0000)]
Check that the selected instances are known
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 13 Jun 2011 11:44:58 +0000 (11:44 +0000)]
Loader.updateMovable: evaluate selected instances
This also adds docstrings for the function arguments and renames exinst
to exinsts, which is how it is called in other functions, since it's a
list.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 10 Jun 2011 13:45:09 +0000 (14:45 +0100)]
Add instance selection list to Loader.mergeData
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 10 Jun 2011 13:44:30 +0000 (14:44 +0100)]
Add --select-instances hbal flag
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 10 Jun 2011 13:30:10 +0000 (14:30 +0100)]
Remove double whitespace in help string
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Apollon Oikonomopoulos [Thu, 16 Jun 2011 12:15:30 +0000 (15:15 +0300)]
Add gnt-network design doc
This design covers high level network block definition and pool
management.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 14 Jun 2011 16:27:06 +0000 (18:27 +0200)]
Replace iallocator's mreloc w/ change-group and node-evac
This patch removes all occurrences of the “multi-relocate” iallocator
mode. Commit 25ee7fd845 updated the design document and introduced
separate modes, “change-group” and “node-evacuate”. The constants aren't
removed yet as they're still used by htools.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Stephen Shirley [Mon, 6 Jun 2011 08:59:46 +0000 (10:59 +0200)]
Fix a couple of typos
Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 6 Jun 2011 15:18:30 +0000 (17:18 +0200)]
Makefile: Add version check for iallocator.rst
iallocator.rst contains the Ganeti version at the top.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 6 Jun 2011 15:10:43 +0000 (17:10 +0200)]
Update iallocator design for node group-aware operations
A while ago a new ``multi-relocate`` mode was proposed and documented.
As it turned out, the interface had some deficiencies. With this patch
the relocation modes are reduced to two and split into separate
iallocator request modes: node-evacuate and change-group. Some request
and response requirements are clarified in the documentation.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 1 Jun 2011 15:44:56 +0000 (17:44 +0200)]
jqueue: Allow loading of archived jobs
Chained jobs need to look at previous jobs, including archived ones. A
nice side-effect of this change is the ability to look at archived jobs
using “gnt-job info <id>” as long as the ID is known.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
René Nussbaumer [Tue, 31 May 2011 09:54:23 +0000 (11:54 +0200)]
Adding basic abstraction layer for caching
This includes our own simple cache implementation and an
interface to a memcache instance.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 9 Jun 2011 10:03:39 +0000 (10:03 +0000)]
Fix _checkRsaPrivateKey for newer key generation
Keys generated under debian sid just read "BEGIN PRIVATE KEY" rather
than "BEGIN RSA PRIVATE KEY".
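A check that accepts both header forms might look like this Python sketch. It is not the actual _checkRsaPrivateKey code; the function name and the regular expression are assumptions.

```python
import re

# Matches both the traditional header and the newer, type-less one.
_KEY_HEADER_RE = re.compile(r"^-----BEGIN (RSA )?PRIVATE KEY-----$", re.M)


def looks_like_private_key(pem_text):
    """Return True if the text contains a recognised private key header."""
    return _KEY_HEADER_RE.search(pem_text) is not None
```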
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 7 Jun 2011 06:48:55 +0000 (08:48 +0200)]
Fix locking issues in LUClusterVerifyGroup
- Use functions in ConfigWriter instead of custom loops
- Calculate nodes only once instances locks are acquired, removes one
potential race condition
- Don't retrieve lists of all node/instance information without locks
- Additionally move the end of the node time check window after the
first RPC call--the second call isn't involved in checking the
node time at all
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Tue, 7 Jun 2011 05:17:09 +0000 (07:17 +0200)]
cmdlib: Acquire BGL for LUClusterVerifyConfig
LUClusterVerifyConfig verifies a number of configuration settings. For
doing so, it needs a consistent list of nodes, groups and instances. So
far no locks were acquired at all (except for the BGL in shared mode).
This is a race condition (e.g. if a node group is added in parallel) and
can be fixed by acquiring the BGL in exclusive mode. Since this LU
verifies the cluster-wide configuration, doing so instead of acquiring
individual locks is just.
Includes one typo fix and one docstring update.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Tue, 7 Jun 2011 10:46:05 +0000 (12:46 +0200)]
Export/import instance tags
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Tue, 7 Jun 2011 10:45:53 +0000 (12:45 +0200)]
Fix issue with tags on instance creation
Commit 720f56c85a added the ability to specify tags when creating an
instance. The “tags” attribute of an instance object needs to be a set,
but the patch's code saved it as a list, causing breakage in other parts
of Ganeti. This patch changes the code to use TaggableObject.AddTag,
which has a nice side-effect of doing some verification (including max.
number of tags). Instance import was also broken (no “tags” attribute in
options).
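The idea of routing every tag through one verifying method can be sketched as follows. This is a simplified stand-in, not the real TaggableObject: the limit value and the validation rules are assumptions.

```python
MAX_TAGS_PER_OBJ = 3  # assumed small limit, for demonstration only


class TaggableObject(object):
    """Stand-in that keeps tags in a set and verifies each addition."""

    def __init__(self):
        self.tags = set()

    def AddTag(self, tag):
        if not isinstance(tag, str) or not tag:
            raise ValueError("Invalid tag: %r" % (tag,))
        if tag not in self.tags and len(self.tags) >= MAX_TAGS_PER_OBJ:
            raise ValueError("Too many tags")
        self.tags.add(tag)


instance = TaggableObject()
for tag in ["web", "prod", "web"]:
    instance.AddTag(tag)
```

Because the attribute is a set, the duplicate "web" is stored once, and code elsewhere that expects set semantics keeps working.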
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Fri, 3 Jun 2011 09:53:04 +0000 (11:53 +0200)]
Fix incomplete merge
Commit 66bd7445 changed the semantics of _JobProcessor on finished
jobs, and updated the related unittests in the 2.4 branch. It was then
merged to master, however on master there was an additional test for
this case, which was not updated.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Apollon Oikonomopoulos [Tue, 31 May 2011 12:50:28 +0000 (15:50 +0300)]
Export instance tags to instance hooks
Instance hooks now get an INSTANCE_TAGS environment variable, which contains a
space-delimited list of the affected instance's tags.
Also update the documentation to reflect the change.
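A hook written in Python could read the variable roughly like this. The sketch takes the name from this commit message; the hook environment may export it with a prefix (such as GANETI_), so the name is a parameter here, and `instance_tags` is a hypothetical helper.

```python
import os


def instance_tags(environ=None, variable="INSTANCE_TAGS"):
    """Return the instance's tags as a list; empty if the variable is unset."""
    if environ is None:
        environ = os.environ
    # The value is space-delimited; split() also copes with an empty string.
    return environ.get(variable, "").split()
```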
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Apollon Oikonomopoulos [Tue, 31 May 2011 12:49:58 +0000 (15:49 +0300)]
Add tagging option to gnt-instance create
Add TAG_ADD_OPT option to cli.py and use it in gnt-instance. Modify
cli.GenericInstanceCreate() accordingly.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Apollon Oikonomopoulos [Tue, 31 May 2011 12:49:28 +0000 (15:49 +0300)]
Add tag handling to {Op,LU}InstanceCreate
Add a tag slot to opcodes.OpInstanceCreate. We do not reuse _PTags, as this is
intended for OpTagsSet and thus:
a) is not documented
b) does not carry a default value, making it mandatory
Also pass the tags to the iallocator during instance creation.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 26 May 2011 14:40:50 +0000 (16:40 +0200)]
http.client: Make debug log less noisy
The HTTP client code generates quite a lot of debug log messages. With
this patch they're hidden unless explicitly enabled in the code.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 1 Jun 2011 16:10:06 +0000 (18:10 +0200)]
Merge branch 'devel-2.4'
* devel-2.4:
jqueue: Fix potential race condition when cancelling queued jobs
Fix argument order in ReserveLV and ReserveMAC
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 1 Jun 2011 14:51:39 +0000 (16:51 +0200)]
htools: introduce a type alias for JSON objects
This makes the type definitions a bit more readable/simpler.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 30 May 2011 12:13:13 +0000 (14:13 +0200)]
hail: stop using old-style 'nodes' key
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 30 May 2011 11:50:49 +0000 (13:50 +0200)]
hail: add parsing of multi-relocate request
This is not handled yet, this patch just adds parsing of the incoming
request.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 30 May 2011 08:43:50 +0000 (10:43 +0200)]
hail: add option for displaying the parsed request
This can be used for debugging.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 26 May 2011 13:24:36 +0000 (15:24 +0200)]
hail: add new data types for the multi-reloc mode
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Tue, 31 May 2011 11:14:26 +0000 (13:14 +0200)]
Add --no-instance-moves to the htools live tests
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 31 May 2011 10:38:50 +0000 (12:38 +0200)]
Update hbal manpage for --no-instance-moves
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 31 May 2011 14:57:51 +0000 (16:57 +0200)]
Implement balancing with no instance moves
Note that --no-disk-moves and --no-instance-moves are not incompatible,
but if both are used no solution can possibly exist.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 31 May 2011 10:34:19 +0000 (10:34 +0000)]
Pass the instance moves option in hbal
While still being ignored, now it gets passed down to the iteration
function.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 30 May 2011 14:10:13 +0000 (16:10 +0200)]
Add --no-instance-moves cli htools option
This option doesn't currently do anything.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 31 May 2011 14:49:45 +0000 (16:49 +0200)]
jqueue: Fix potential race condition when cancelling queued jobs
When a job was cancelled, its status would be changed and the file
written again. Since this was a final status, the job file could be
moved anytime for archival. If the job was still in the queue, however,
it would be processed (not fully, just updating the “end_timestamp”
attribute) and written again. This was bad as it could leave the same
job in two different files.
With this patch the processor is changed to return early for finished
jobs. Cancelling a queued job will finalize it right away. Unittests are
updated.
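The changed control flow can be sketched like this in Python. It is a toy model, not the jqueue code: the status strings, the `Job` class and the write counter are all invented to show the early return.

```python
FINAL_STATUSES = frozenset(["success", "error", "canceled"])


class Job(object):
    def __init__(self):
        self.status = "queued"
        self.writes = 0  # how often the job file would have been written


def cancel_queued_job(job):
    """Finalize a queued job right away."""
    if job.status == "queued":
        job.status = "canceled"
        job.writes += 1


def process_job(job):
    """Return early for finished jobs so their file is never rewritten."""
    if job.status in FINAL_STATUSES:
        return False
    job.status = "success"
    job.writes += 1
    return True


job = Job()
cancel_queued_job(job)
processed = process_job(job)
```

After cancellation the processor leaves the job untouched, so the job file is written exactly once.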
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 30 May 2011 10:52:22 +0000 (12:52 +0200)]
iallocator: add ht-checking for the request
Currently, we only ht-check the result value from the iallocator, and
we send whatever we happen to check manually in the LUs that call the
iallocator.
This is not good, as we have to duplicate checks in many places, and
still we might miss checks. So we add ht information to the
per-request variables. As the cluster data is built in one place (the
iallocator code itself) and is therefore more consistent, I didn't add
checks to that too.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 30 May 2011 11:14:18 +0000 (13:14 +0200)]
iallocator: rename mem_size to memory
Currently, the iallocator in 'allocate' requires mem_size on input
but serialises that as 'memory'. This inconsistency makes it hard to
automatically validate the parameters, hence this patch renames
mem_size.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 30 May 2011 09:56:45 +0000 (11:56 +0200)]
iallocator: change default for target_groups
Per the design doc, the target_groups request key "if present, it must
either be the empty list, or contain a list of group UUIDs". Currently
it defaults to None/null, which is not valid.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 30 May 2011 11:24:39 +0000 (13:24 +0200)]
iallocator: export the hypervisor value
In 'allocate' mode, the documentation specifies that we export the
hypervisor value (“Allocation needs, in addition: … hypervisor, the
hypervisor of this instance”) and we need that on input, however we
don't actually export it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 30 May 2011 10:54:44 +0000 (12:54 +0200)]
iallocator: fix incomplete refactoring
Commit fdbe29ee changed the iallocator modes from 'r'/'w' to
'ro'/'rw', but forgot one check in LUTestAllocator. This patch just
completes the replacements.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Mon, 23 May 2011 16:55:50 +0000 (18:55 +0200)]
gnt-node migrate: Use LU-generated jobs
Until now LUNodeMigrate used multiple tasklets to evacuate all primary
instances on a node. In some cases it would acquire all node locks,
which isn't good on big clusters. With upcoming improvements to the LUs
for instance failover and migration, switching to separate jobs looks
like a better option. This patch changes LUNodeMigrate to use
LU-generated jobs.
While working on this patch, I identified a race condition in
LUNodeMigrate.ExpandNames. A node's instances were retrieved without a
lock and no verification was done.
For RAPI, a new feature string is added and can be used to detect
clusters which support more parameters for node migration. The client
is updated.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Apollon Oikonomopoulos [Mon, 30 May 2011 11:02:25 +0000 (14:02 +0300)]
Fix argument order in ReserveLV and ReserveMAC
ConfigWriter.ReserveLV() and ConfigWriter.ReserveMAC() called
TemporaryReservationManager.Reserve() with the ec_id and resource arguments
swapped. As a result, two reservation attempts for the same resource type
within the same LU would fail, even if the resources requested were different,
e.g.:
$ gnt-instance add -t sharedfile -o debootstrap+default \
--net 0:mac=00:01:02:03:04:00 \
--net 1:mac=00:01:02:03:04:ff \
--disk 0:size=2g test_instance
Failure: prerequisites not met for this operation:
error type: resource_not_unique, error details:
MAC address 00:01:02:03:04:ff already in use in cluster
This patch fixes the argument order in the call to Reserve().
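To see why swapped arguments produce this error, consider the following simplified stand-in for the reservation manager. It is a sketch under the assumption that Reserve() takes (ec_id, resource) and rejects any resource that is already held; it is not the real class.

```python
class ReservationError(Exception):
    pass


class TemporaryReservationManager(object):
    """Simplified stand-in keyed by execution context ID."""

    def __init__(self):
        self._ec_reserved = {}

    def Reserved(self, resource):
        return any(resource in held for held in self._ec_reserved.values())

    def Reserve(self, ec_id, resource):
        if self.Reserved(resource):
            raise ReservationError("Duplicate reservation for %r" % resource)
        self._ec_reserved.setdefault(ec_id, set()).add(resource)


def reserve_all(manager, calls):
    """Return True if every Reserve() call in the sequence succeeds."""
    try:
        for args in calls:
            manager.Reserve(*args)
    except ReservationError:
        return False
    return True


EC_ID = "ec-1"
MACS = ["00:01:02:03:04:00", "00:01:02:03:04:ff"]

correct_ok = reserve_all(TemporaryReservationManager(),
                         [(EC_ID, mac) for mac in MACS])
swapped_ok = reserve_all(TemporaryReservationManager(),
                         [(mac, EC_ID) for mac in MACS])
```

With the arguments swapped, the execution context ID is treated as the resource, so the second call within the same context always looks like a duplicate.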
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 30 May 2011 11:02:14 +0000 (13:02 +0200)]
ht: Accept both int and long as integers
This fixes a unittest failure on 32 bit systems. A recently added
unittest for ht.TJobId uses a rather large number (2347625220). On 64
bit systems it is stored as “int”. On 32 bit systems however, Python
uses “long”. The two types can be intermixed in Python as the
interpreter will take care of conversions. If one processed too many
jobs (2**31) on a 32 bit system, ht would no longer accept the job IDs.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Tsachy Shacham [Wed, 18 May 2011 17:00:00 +0000 (19:00 +0200)]
Design doc for CPU pinning
Signed-off-by: Tsachy Shacham <tsachy@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 27 May 2011 10:49:39 +0000 (12:49 +0200)]
ht: Add checks for anything, regexp, job ID, container items
The check for container items is useful for tuples and/or lists with
non-uniform values. The “anything” check can be used when any value
should be accepted for an item.
The job ID check, which uses the regexp check, will be used for
expressing opcode dependencies on other jobs.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
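The four kinds of check named in the ht commit above could look roughly like the sketch below. All function names, the job ID pattern and the exact semantics (for example, that extra trailing items are tolerated) are assumptions made for illustration; the real implementations in ht may differ.

```python
import re


def check_anything(_):
    """Accept any value."""
    return True


def check_regexp(pattern):
    """Return a check accepting strings matched by a compiled pattern."""
    return lambda val: isinstance(val, str) and bool(pattern.match(val))


def check_items(checks):
    """Return a check applying one check per position of a list or tuple.

    Useful for containers with non-uniform values, e.g. (job ID, anything).
    """
    def fn(val):
        return (isinstance(val, (list, tuple)) and
                len(val) >= len(checks) and
                all(check(item) for (check, item) in zip(checks, val)))
    return fn


# Assumed job ID format: a non-empty string of decimal digits
_JOB_ID_RE = re.compile(r"^[0-9]+\Z")


def check_job_id(val):
    """Accept a non-negative integer or a string of digits."""
    if isinstance(val, int) and not isinstance(val, bool):
        return val >= 0
    return check_regexp(_JOB_ID_RE)(val)


# A pair of job ID and an arbitrary second item
check_dependency = check_items([check_job_id, check_anything])
```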
Michael Hanselmann [Thu, 26 May 2011 12:57:34 +0000 (14:57 +0200)]
Merge branch 'devel-2.4'
* devel-2.4:
TLReplaceDisks: Move assertion checking locks
Conflicts:
lib/cmdlib.py: Trivial
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 26 May 2011 12:36:50 +0000 (14:36 +0200)]
TLReplaceDisks: Move assertion checking locks
Commit 1bee66f3 added assertions for ensuring only the necessary locks
are kept while replacing disks. One of them makes sure locks have been
released during the operation. Unfortunately the commit added the check
as part of a “finally” branch, which is also run when an exception is
thrown (in which case the locks may not have been released yet). Errors
could be masked by the assertion error. Moving the check out of the
“finally” branch fixes the issue.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
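The masking effect described in the TLReplaceDisks commit above can be reproduced with a small sketch. The function names and the set of locks are hypothetical; only the placement of the assertion relative to "finally" mirrors the commit message.

```python
def replace_disks_buggy(operation, locks):
    """Assertion inside "finally": also runs when the operation raised."""
    try:
        operation(locks)
    finally:
        assert not locks, "Locks still held"


def replace_disks_fixed(operation, locks):
    """Assertion after the operation: only reached on success."""
    operation(locks)
    assert not locks, "Locks still held"


def failing_operation(locks):
    """Fail before any lock has been released."""
    raise RuntimeError("disk replacement failed")


def raised_type(func):
    """Return the name of the exception type that reaches the caller."""
    try:
        func(failing_operation, set(["node1"]))
    except Exception as err:
        return type(err).__name__
    return None
```

In the buggy variant the caller sees the assertion error instead of the original failure; in the fixed variant the original error propagates unchanged.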
Michael Hanselmann [Fri, 20 May 2011 13:17:46 +0000 (15:17 +0200)]
cli.JobExecutor: Handle empty name, allow adding job IDs
With LU-generated jobs only the ID is known.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 25 May 2011 15:32:56 +0000 (17:32 +0200)]
cli.JobExecutor: Use counter for indexing jobs
If “SubmitPending” were mixed with calls to “QueueJob”, jobs in the
internal structures would get duplicate indices. With this change each
queued job is assigned a unique index, which will be used for sorting
the results.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
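A sketch of the counter-based indexing described in the cli.JobExecutor commit above. The class is a hypothetical, stripped-down stand-in and not the real cli.JobExecutor; it only shows why a running counter keeps indices unique when queueing and submitting are interleaved.

```python
import itertools


class JobExecutorSketch(object):
    """Hypothetical stand-in that tracks only job indices and names."""

    def __init__(self):
        self.queue = []  # jobs waiting to be submitted
        self.jobs = []   # submitted jobs as (index, name)
        self._counter = itertools.count()

    def QueueJob(self, name):
        # The counter never repeats a value. An index derived from the
        # current queue length (an assumed example of a non-unique scheme)
        # would start again at zero once the queue has been emptied.
        self.queue.append((next(self._counter), name))

    def SubmitPending(self):
        self.jobs.extend(self.queue)
        self.queue = []


def interleaved_indices():
    """Queue and submit twice, then return the indices in sorted order."""
    executor = JobExecutorSketch()
    executor.QueueJob("first")
    executor.SubmitPending()
    executor.QueueJob("second")
    executor.SubmitPending()
    return sorted(idx for (idx, _) in executor.jobs)
```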
Michael Hanselmann [Wed, 25 May 2011 14:45:24 +0000 (16:45 +0200)]
Fix bug in LUNodeMigrate
Commit aac4511a added CheckArguments to LUNodeMigrate with a call to
_CheckIAllocatorOrNode. When no default iallocator is defined,
evacuating a node would always fail:
$ gnt-node migrate node123
Migrate instance(s) '...'?
y/[n]/?: y
Failure: prerequisites not met for this operation:
No iallocator or node given and no cluster-wide default iallocator
found; please specify either an iallocator or a node, or set a
cluster-wide default iallocator
This patch adds a new parameter to specify a target node. This doesn't
solve all issues, but will make the most important cases work again in
the meantime. This opcode will receive more work for node group support.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
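The kind of check behind the failure quoted in the LUNodeMigrate commit above can be sketched as follows. This is not the code of _CheckIAllocatorOrNode; the function name, its parameters, the return value and the rule that both options may not be given together are assumptions for illustration.

```python
class PrereqError(Exception):
    """Stand-in for a "prerequisites not met" error."""


def choose_target(iallocator=None, node=None, default_iallocator=None):
    """Return ("node", name) or ("iallocator", name), or raise PrereqError."""
    if iallocator is not None and node is not None:
        # Assumed rule: the two options are mutually exclusive
        raise PrereqError("Specify either an iallocator or a node, not both")
    if node is not None:
        return ("node", node)
    if iallocator is not None:
        return ("iallocator", iallocator)
    if default_iallocator:
        return ("iallocator", default_iallocator)
    raise PrereqError("No iallocator or node given and no cluster-wide"
                      " default iallocator found")
```

Without a cluster-wide default, a call with neither option always fails, which is why a way to pass an explicit target node makes the command usable again.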
Michael Hanselmann [Fri, 20 May 2011 13:35:28 +0000 (15:35 +0200)]
config: Add method to get members of nodes' groups
This will be used for locking during node evacuation.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 25 May 2011 08:54:34 +0000 (10:54 +0200)]
Yet another attempt to fix builds
It seems that abs_top_srcdir is not a good option, so I tested again
with just using the same as in doc/examples/bash_completion.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Tue, 24 May 2011 16:46:39 +0000 (18:46 +0200)]
Fix build breakage
Sorry, I already had PYTHONPATH exported in my env, and as I said I
wasn't able to test this on buildbot.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 24 May 2011 16:50:13 +0000 (18:50 +0200)]
Merge branch 'devel-2.4'
* devel-2.4:
node evac: don't call IAllocator if no instances
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 24 May 2011 09:29:39 +0000 (11:29 +0200)]
node evac: don't call IAllocator if no instances
Currently we generate an empty list only for the '-n node' invocation,
but for iallocator we still call the iallocator (which needs an RPC
call, etc.). By moving the computation of instances outside of the if
block, we can return early from the LU.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 24 May 2011 16:35:56 +0000 (18:35 +0200)]
Merge branch 'devel-2.4'
* devel-2.4:
RPC/Backend: Make UploadFile uid and gid agnostic
Resolve uid/gid upon mainloop run
GetEntResolver: Make it possible to resolve uid/gid to name
utils.algo: Add InvertDict to invert a dict
autotools: Add noded group
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 24 May 2011 09:34:32 +0000 (11:34 +0200)]
gnt-debug: rename allocator to iallocator
I'm always confused by this strange difference, so let's rename the
command to match what it tests.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 19 May 2011 16:44:51 +0000 (18:44 +0200)]
Misc other conversions
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 19 May 2011 16:44:37 +0000 (18:44 +0200)]
Convert job status strings to constants
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 19 May 2011 16:29:06 +0000 (18:29 +0200)]
Convert group policies to constants
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 19 May 2011 16:21:13 +0000 (18:21 +0200)]
Replace hardcoded instance states with constants
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>