ganeti-local
13 years agoMakefile: Merge build-time reST copying
Michael Hanselmann [Wed, 5 Jan 2011 17:52:29 +0000 (18:52 +0100)]
Makefile: Merge build-time reST copying

No need to copy this snippet around, “make” can work harder for us.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMove doc/upgrade.rst to UPGRADE, copy at build-time
Michael Hanselmann [Wed, 5 Jan 2011 17:48:29 +0000 (18:48 +0100)]
Move doc/upgrade.rst to UPGRADE, copy at build-time

This will allow distributions to install the file as text documentation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoImport upgrade notes into documentation
Michael Hanselmann [Wed, 5 Jan 2011 15:22:33 +0000 (16:22 +0100)]
Import upgrade notes into documentation

This patch formats the upgrade notes currently in the wiki[1] as reST
and adds them to the documentation.

[1] http://code.google.com/p/ganeti/wiki/UpgradeNotes

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix typo in gnt-instance manpage
Michael Hanselmann [Fri, 31 Dec 2010 12:11:05 +0000 (13:11 +0100)]
Fix typo in gnt-instance manpage

s/os-name/os-type/. This was reported in issue 133.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agojqueue: Fix cancelling while in waitlock in queue
Michael Hanselmann [Tue, 21 Dec 2010 18:10:32 +0000 (19:10 +0100)]
jqueue: Fix cancelling while in waitlock in queue

Since the recent change to leave jobs in the “waitlock” status (commit
5fd6b6947), cancelling a job while it's back in the queue would break.
This patch handles these cases and adds a unittest.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agocli: Extend message for LUXI timeouts
Michael Hanselmann [Mon, 20 Dec 2010 21:23:13 +0000 (22:23 +0100)]
cli: Extend message for LUXI timeouts

Point out that jobs already submitted continue to run.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix timeout handling in LUXI client
Michael Hanselmann [Mon, 20 Dec 2010 19:20:18 +0000 (20:20 +0100)]
Fix timeout handling in LUXI client

If the socket can't be read in time, it raises “socket.timeout”, for
which there is special handling code. Unfortunately the exception block
was in the wrong order and “socket.error” caught it before.
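
The ordering problem can be reproduced in isolation. The following is a minimal sketch, not the LUXI client code; the function names are made up for illustration. It relies only on the fact that “socket.timeout” is a subclass of “socket.error”, so the more specific handler has to come first:

```python
import socket


def classify(exc):
    """Raise exc and report which handler catches it (specific one first)."""
    try:
        raise exc
    except socket.timeout:
        # Must precede socket.error: socket.timeout is a subclass of it.
        return "timeout"
    except socket.error:
        return "error"


def classify_wrong_order(exc):
    """Same handlers in the wrong order: the timeout branch is unreachable."""
    try:
        raise exc
    except socket.error:
        return "error"
    except socket.timeout:
        return "timeout"
```

With the handlers in the wrong order, a timeout is reported as a generic socket error and the special handling never runs.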

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'stable-2.3' into devel-2.3
Michael Hanselmann [Mon, 20 Dec 2010 14:18:36 +0000 (15:18 +0100)]
Merge branch 'stable-2.3' into devel-2.3

* stable-2.3:
  Prepare 2.3.1 release
  Fix disk status verification in LUClusterVerify

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoPrepare 2.3.1 release v2.3.1
Michael Hanselmann [Mon, 20 Dec 2010 13:15:19 +0000 (14:15 +0100)]
Prepare 2.3.1 release

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoQA: Run cluster-verify as part of all instance tests
Michael Hanselmann [Thu, 16 Dec 2010 14:19:52 +0000 (15:19 +0100)]
QA: Run cluster-verify as part of all instance tests

“gnt-cluster verify” looks at some per-instance information as well, so
it should be run for each instance type that QA tests.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoQA: Fix typo and add “not”
Michael Hanselmann [Wed, 15 Dec 2010 19:03:18 +0000 (20:03 +0100)]
QA: Fix typo and add “not”

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoensure-dirs: Speed up when using big queues
Michael Hanselmann [Wed, 15 Dec 2010 17:53:34 +0000 (18:53 +0100)]
ensure-dirs: Speed up when using big queues

The “ensure-dirs” script as included in Ganeti 2.3 is very slow when
working with big queues requiring a change of permissions on many or all
files.

$ find /var/lib/ganeti/queue/ | wc -l
52354

Before this change:
$ time /usr/local/lib/ganeti/ensure-dirs -f
real    16m4.739s

While not addressed in this patch, I'd like to record the overall
inefficiency of the “ensure-dirs” script, even after this change:

$ time /usr/local/lib/ganeti/ensure-dirs -f
real    5m57.362s
[…]
$ strace -e clone,execve -f -c /usr/local/lib/ganeti/ensure-dirs -f
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.08    5.147090          49    104774           clone
 49.92    5.131094          49    104739           execve

More changes will be needed. Just for comparison, a small Python
snippet changing permissions on all files (“ensure-dirs” changes the
owner too):

$ time python -c 'import os; from ganeti import utils;
[os.chmod(i, 0644) for i in
utils.ListVisibleFiles("/var/lib/ganeti/queue/archive/big")]'
real    0m0.605s
[…]

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix gnt-cluster verify with diskless instances
Adeodato Simo [Wed, 15 Dec 2010 17:40:30 +0000 (17:40 +0000)]
Fix gnt-cluster verify with diskless instances

`gnt-cluster verify` was failing with KeyError if there was any
diskless instance in the cluster. This was because _CollectDiskInfo()
was not including these instances in the returned dictionary, but they
were expected to be present in LUVerifyCluster.Exec().

With this commit, we ensure that the dictionary returned by _CollectDiskInfo
includes entries for diskless instances as well.
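
A minimal sketch of the idea, assuming simplified data structures (the function name, arguments and dictionary layout below are hypothetical and do not mirror the real _CollectDiskInfo): every instance gets an entry up front, so a diskless instance ends up with an empty entry instead of a missing key.

```python
def collect_disk_info(instances, node_reports):
    """Map every instance name to its per-node disk statuses.

    instances: dict of instance name -> list of node names holding its
               disks (an empty list for a diskless instance).
    node_reports: dict of node name -> dict of instance name -> statuses.
    """
    # Pre-seed an entry for every instance, so diskless ones are present too.
    result = dict((name, {}) for name in instances)
    for name, nodes in instances.items():
        for node in nodes:
            result[name][node] = node_reports.get(node, {}).get(name, [])
    return result
```

A later lookup such as result[instance] then works for all instances, with or without disks.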

Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agojqueue: Keep jobs in “waitlock” while returning to queue
Michael Hanselmann [Tue, 14 Dec 2010 16:56:39 +0000 (17:56 +0100)]
jqueue: Keep jobs in “waitlock” while returning to queue

Iustin Pop reported that a job's file is updated many times while it
waits for locks held by other thread(s). After an investigation it was
concluded that the reason was a design decision for job priorities to
return jobs to the “queued” status if they couldn't acquire all locks.
Changing a job's status or priority requires an update to permanent
storage.

In a high-level view this is what happens:
1. Mark as waitlock
2. Write to disk as permanent storage (jobs left in this state by a
   crashing master daemon are resumed on restart)
3. Wait for lock (assume lock is held by another thread)
4. Mark as queued
5. Write to disk again
6. Return to workerpool

Another option originally discussed was to leave the job in the
“waitlock” status. Ignoring priority changes, this is what would happen:
1. If not in waitlock
1.1. Assert state == queued
1.2. Mark as waitlock
1.3. Set start_timestamp
1.4. Write to disk as permanent storage
3. Wait for locks (assume lock is held by another thread)
4. Leave in waitlock
5. Return to workerpool

Now let's assume the lock is released by the other thread:
[…]
3. Wait for locks and get them
4. Assert state == waitlock
5. Set state to running
6. Set exec_timestamp
7. Write to disk

This change reduces the number of writes from two per lock acquire
attempt to two per opcode plus one per priority increase. A priority
increase happens after 24 acquire attempts (see
mcpu._CalculateLockAttemptTimeouts) until the highest priority is
reached. Here's the patch to implement it. Unittests are updated.
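
To make the arithmetic concrete, here is a small sketch that only counts writes according to the step lists above. The function names are hypothetical and nothing here touches the real job queue:

```python
def writes_old(failed_attempts):
    """Writes for one opcode when the job returns to "queued" after a failure."""
    # Each failed attempt: mark waitlock (write), then mark queued (write).
    writes = 2 * failed_attempts
    # Successful attempt: mark waitlock (write), then mark running (write).
    return writes + 2


def writes_new(failed_attempts, priority_increases=0):
    """Writes for one opcode when the job stays in "waitlock" between attempts."""
    # One write on entering waitlock, one on switching to running, plus one
    # per priority increase. Failed attempts themselves cost no writes, so
    # failed_attempts is accepted only for symmetry with writes_old().
    return 2 + priority_increases
```

For ten failed attempts without a priority change this gives 22 writes before and 2 writes after the change.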

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoImprove jqueue unittests
Michael Hanselmann [Mon, 13 Dec 2010 17:32:27 +0000 (18:32 +0100)]
Improve jqueue unittests

- Verify job file updates
- Ensure queue lock is released while executing opcode

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoUpdate manpages to display version 2.3
Miguel Di Ciurcio Filho [Mon, 13 Dec 2010 19:07:34 +0000 (17:07 -0200)]
Update manpages to display version 2.3

Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoFix disk status verification in LUClusterVerify
Iustin Pop [Thu, 9 Dec 2010 13:03:18 +0000 (14:03 +0100)]
Fix disk status verification in LUClusterVerify

Commit b8d26c6 added disk status verification, but it has two
(different) bugs for unhealthy nodes.

For offline nodes, we don't add the disk status to the
instance/node dict at all, with the result that the instance is not present in
the instdisk dict if all of its nodes are offline. This creates a
KeyError later when we call VerifyInstance with instdisk[instance].

For online nodes, but which don't return a valid disk status, we simply
set the status to None for each disk, but the code in _VerifyInstance
presumes and requires that each status is a valid tuple of length two.

For both these bugs, we redo the instdisk computations to always include
valid data, and we enhance the asserts to check for consistency.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoMerge branch 'devel-2.2' into devel-2.3
Guido Trotter [Thu, 9 Dec 2010 15:13:00 +0000 (16:13 +0100)]
Merge branch 'devel-2.2' into devel-2.3

* devel-2.2:
  Fix rename for file-backed instances

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'stable-2.2' into devel-2.2
Guido Trotter [Thu, 9 Dec 2010 15:12:18 +0000 (16:12 +0100)]
Merge branch 'stable-2.2' into devel-2.2

* stable-2.2:
  Fix rename for file-backed instances

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'stable-2.2' into stable-2.3
Guido Trotter [Thu, 9 Dec 2010 15:10:54 +0000 (16:10 +0100)]
Merge branch 'stable-2.2' into stable-2.3

* stable-2.2:
  Fix rename for file-backed instances

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix rename for file-backed instances
Guido Trotter [Wed, 8 Dec 2010 14:53:48 +0000 (15:53 +0100)]
Fix rename for file-backed instances

Currently the code wrongly changes the disk logical/physical id
component representing the path from "$storage_dir/$iname/disk$seq" to
"$storage_dir/$iname/disk/$seq" (note the additional slash) breaking the
rename.
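
The difference between the two layouts, as a minimal sketch. The helper names are hypothetical and this is not the code touched by the patch; it only shows how an extra path component changes the result:

```python
import posixpath


def disk_path(storage_dir, iname, seq):
    """Expected layout: $storage_dir/$iname/disk$seq."""
    return posixpath.join(storage_dir, iname, "disk%d" % seq)


def disk_path_buggy(storage_dir, iname, seq):
    """Layout with the stray slash: $storage_dir/$iname/disk/$seq."""
    return posixpath.join(storage_dir, iname, "disk", str(seq))
```

The second variant points at a directory named “disk” that does not exist for file-backed instances, so a rename based on it cannot succeed.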

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'stable-2.3' into devel-2.3
Michael Hanselmann [Thu, 2 Dec 2010 15:47:56 +0000 (16:47 +0100)]
Merge branch 'stable-2.3' into devel-2.3

* stable-2.3:
  Bump version for 2.3.1~rc1 release

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agolocking: Clarify message for removed locks
Michael Hanselmann [Wed, 1 Dec 2010 17:33:27 +0000 (18:33 +0100)]
locking: Clarify message for removed locks

Just being told that a lock doesn't exist can be confusing. One case
where this happens is when a job (e.g. instance modify) waits for a job
removing the instance (e.g. export with remove).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoBump version for 2.3.1~rc1 release v2.3.1rc1
Michael Hanselmann [Wed, 1 Dec 2010 19:45:06 +0000 (20:45 +0100)]
Bump version for 2.3.1~rc1 release

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoimpexpd: Disable OpenSSL compression in socat if possible
Michael Hanselmann [Wed, 10 Nov 2010 18:43:01 +0000 (19:43 +0100)]
impexpd: Disable OpenSSL compression in socat if possible

This uses an option only available in patched socat versions. More
information is available from the INSTALL update included in this
patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'stable-2.3' into devel-2.3
Michael Hanselmann [Wed, 1 Dec 2010 15:55:47 +0000 (16:55 +0100)]
Merge branch 'stable-2.3' into devel-2.3

* stable-2.3:
  Bump version for 2.3.0

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoBump version for 2.3.0 v2.3.0
Michael Hanselmann [Wed, 1 Dec 2010 15:03:56 +0000 (16:03 +0100)]
Bump version for 2.3.0

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'devel-2.2' into devel-2.3
Michael Hanselmann [Tue, 30 Nov 2010 18:26:46 +0000 (19:26 +0100)]
Merge branch 'devel-2.2' into devel-2.3

* devel-2.2:
  Correct version check for release candidates
  Fix version check
  Add script to check version format

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoCorrect version check for release candidates
Michael Hanselmann [Tue, 30 Nov 2010 17:50:44 +0000 (18:50 +0100)]
Correct version check for release candidates

The tilde needs to be escaped and I forgot the space which should be
used instead.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoconfig.py: need explicit %-formatting in errors.OpPrereqError.
Adeodato Simo [Tue, 30 Nov 2010 16:05:47 +0000 (16:05 +0000)]
config.py: need explicit %-formatting in errors.OpPrereqError.
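
A minimal illustration of why the explicit formatting is needed, using a stand-in class rather than the real errors.OpPrereqError: unlike logging calls, exception constructors do not interpolate their arguments.

```python
class OpPrereqError(Exception):
    """Stand-in for errors.OpPrereqError; a plain Exception subclass."""


name = "instance1"

# Arguments are stored as a tuple; the "%s" is never expanded.
unformatted = OpPrereqError("Instance %s not found", name)

# Explicit %-formatting produces the intended message.
formatted = OpPrereqError("Instance %s not found" % name)
```

Printing the first exception shows the raw tuple including the literal “%s”, while the second one reads as intended.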

Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoFix version check
Michael Hanselmann [Wed, 24 Nov 2010 19:50:46 +0000 (20:50 +0100)]
Fix version check

Don't ask … all I say is distcheck.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd script to check version format
Michael Hanselmann [Wed, 24 Nov 2010 19:18:14 +0000 (20:18 +0100)]
Add script to check version format

Only versions of the format “x.y.z” and “x.y.z~(rc|beta)N” (for N>0) are
allowed.
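
The accepted format can be expressed as a regular expression. This is a sketch in Python, not the script added by this patch; it assumes that x, y and z are plain decimal numbers and that N is written without leading zeros:

```python
import re

# x.y.z, optionally followed by ~rcN or ~betaN with N > 0.
_VERSION_RE = re.compile(r"\A\d+\.\d+\.\d+(?:~(?:rc|beta)[1-9]\d*)?\Z")


def is_valid_version(version):
    """Return True if version has the form x.y.z or x.y.z~(rc|beta)N."""
    return _VERSION_RE.match(version) is not None
```

Under these assumptions “2.3.0” and “2.3.1~rc1” are accepted, while “2.3”, “2.3.1~rc0” and “2.3.1rc1” are rejected.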

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'devel-2.2' into devel-2.3
Iustin Pop [Wed, 24 Nov 2010 17:01:57 +0000 (17:01 +0000)]
Merge branch 'devel-2.2' into devel-2.3

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoFix coverage reports
Iustin Pop [Wed, 24 Nov 2010 16:06:32 +0000 (16:06 +0000)]
Fix coverage reports

Currently, the coverage reports include the unittests themselves, and
this unfairly skews the reports, as the coverage for the tests is very
high (since they all run).

To fix this, we export the ganeti temp dir from run-in-temp-dir, and we
use that to exclude the tests directory. The patch also fixes a bug
related to multiple directories to be omitted (--omit a --omit b is
wrong, it needs to be --omit a,b).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoUpdates NEWS and configure.ac for 2.3.0~rc1 v2.3.0rc1
Iustin Pop [Fri, 19 Nov 2010 13:10:54 +0000 (14:10 +0100)]
Updates NEWS and configure.ac for 2.3.0~rc1

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoMerge branch 'devel-2.2' into devel-2.3
Iustin Pop [Fri, 19 Nov 2010 13:01:14 +0000 (14:01 +0100)]
Merge branch 'devel-2.2' into devel-2.3

* devel-2.2:
  Update NEWS & configure.ac for the 2.2.2 release
  Fix documentation regarding conversion to drbd

Conflicts:
NEWS         (integrated 2.2 changes)
configure.ac (kept our version)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoUpdate NEWS & configure.ac for the 2.2.2 release v2.2.2
Iustin Pop [Fri, 19 Nov 2010 10:42:35 +0000 (11:42 +0100)]
Update NEWS & configure.ac for the 2.2.2 release

This imports the 2.1.8 NEWS entry and adds the 2.2.2 one, then updates the
configure.ac version.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix documentation regarding conversion to drbd
Iustin Pop [Fri, 19 Nov 2010 10:17:12 +0000 (11:17 +0100)]
Fix documentation regarding conversion to drbd

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoReinstall instance: disallow offline secondaries
Iustin Pop [Thu, 18 Nov 2010 09:37:34 +0000 (10:37 +0100)]
Reinstall instance: disallow offline secondaries

Currently, reinstallation of a DRBD instance with the secondary node offline does:

node1# gnt-instance reinstall -f instance1
Waiting for job 139053 for instance1...
Thu Nov 18 01:36:09 2010  - WARNING: Could not prepare block device disk/0 on node node3 (is_primary=False, pass=1): Node is marked offline
Thu Nov 18 01:36:09 2010  - WARNING: Could not shutdown block device disk/0 on node node3: Node is marked offline
Job 139053 for instance1 has failed: Failure: command execution error:
Disk consistency error

Since this fails anyway, let's check the secondary nodes, thus
preventing any modifications to the instance (e.g. OS type change):

node1# gnt-instance reinstall -f instance1
Waiting for job 139058 for instance1...
Job 139058 for instance1 has failed: Failure: prerequisites not met for this operation:
error type: wrong_state, error details:
Instance secondary node offline, cannot reinstall: node3

The patch needs modifications to the _CheckNodeOnline function, in order
to display meaningful messages ("Can't use offline node" would be very
confusing for an instance reinstall, since we didn't select a node
manually).
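
A sketch of the kind of change meant here, with hypothetical names and a stand-in exception class (the real _CheckNodeOnline has a different signature): the caller can pass a message that fits the operation instead of the generic one.

```python
class OpPrereqError(Exception):
    """Stand-in for a failed prerequisite check."""


def check_node_online(node_name, offline_nodes, msg=None):
    """Raise OpPrereqError if node_name is marked offline.

    msg lets the caller replace the generic text with one that fits the
    operation, e.g. an instance reinstall.
    """
    if msg is None:
        msg = "Can't use offline node"
    if node_name in offline_nodes:
        raise OpPrereqError("%s: %s" % (msg, node_name))
```

Called with msg="Instance secondary node offline, cannot reinstall" for node3, this produces the error details shown in the second transcript above.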

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoQA: check that doubly modifying an OS state is OK
Iustin Pop [Thu, 18 Nov 2010 09:23:48 +0000 (10:23 +0100)]
QA: check that doubly modifying an OS state is OK

This would have prevented the bug fixed in the previous patch :(

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix breakage in OS state modify
Iustin Pop [Thu, 18 Nov 2010 09:20:06 +0000 (10:20 +0100)]
Fix breakage in OS state modify

I was using the feedback_fn function incorrectly (it doesn't
automatically expand the arguments).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMerge branch 'devel-2.2' into devel-2.3
Iustin Pop [Wed, 17 Nov 2010 15:28:23 +0000 (16:28 +0100)]
Merge branch 'devel-2.2' into devel-2.3

* devel-2.2:
  QA: add tests for gnt-cluster modify -B
  LUSetClusterParms: fix validation of beparams

Conflicts:
lib/cmdlib.py (reverted & applied manually the change)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoQA: add tests for gnt-cluster modify -B
Iustin Pop [Wed, 17 Nov 2010 10:53:10 +0000 (11:53 +0100)]
QA: add tests for gnt-cluster modify -B

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoLUSetClusterParms: fix validation of beparams
Iustin Pop [Wed, 17 Nov 2010 10:52:04 +0000 (11:52 +0100)]
LUSetClusterParms: fix validation of beparams

Since the contents of the dict are validated via ForceDictType, we can
simply require that it is a dict here. The previous check was wrong, as it was
copied from the HV checks (which also don't verify the leaf dict type).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd unittests for TemporaryReservationManager
Iustin Pop [Thu, 11 Nov 2010 09:38:44 +0000 (10:38 +0100)]
Add unittests for TemporaryReservationManager

And fix an error message.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoTempReservationManager: Reserved() doesn't work
David Knowles [Wed, 10 Nov 2010 20:57:19 +0000 (15:57 -0500)]
TempReservationManager: Reserved() doesn't work

Note: It appears this has been around since the initial checkin of
TemporaryReservationManager. I have no idea what this could break, so
someone else may want to test this more thoroughly.

Signed-off-by: David Knowles <dknowles@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMerge branch 'devel-2.2' into devel-2.3
Michael Hanselmann [Tue, 9 Nov 2010 13:56:49 +0000 (14:56 +0100)]
Merge branch 'devel-2.2' into devel-2.3

* devel-2.2:
  devel/release: Use release-specific Makefile targets
  Makefile: Add new dist target for releases
  Makefile: Stricter checks for release distchecks

Conflicts:
Makefile.am: Trivial

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agodevel/release: Use release-specific Makefile targets
Michael Hanselmann [Mon, 8 Nov 2010 19:44:00 +0000 (20:44 +0100)]
devel/release: Use release-specific Makefile targets

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMakefile: Add new dist target for releases
Michael Hanselmann [Mon, 8 Nov 2010 19:43:39 +0000 (20:43 +0100)]
Makefile: Add new dist target for releases

A new script, autotools/check-tar, is used to check the resulting
.tar.gz file for unwanted contents like wrong file owners or
permissions.
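
A rough Python sketch of such a check, for illustration only. The policy below (members owned by uid/gid 0 and not writable by group or others) is an assumption and may differ from what autotools/check-tar actually enforces:

```python
import tarfile


def check_tar(fileobj):
    """Return a list of complaints about members of a .tar.gz archive."""
    problems = []
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar:
            # Assumed policy: members must be owned by uid 0 and gid 0 ...
            if member.uid != 0 or member.gid != 0:
                problems.append("%s: owner %d:%d" %
                                (member.name, member.uid, member.gid))
            # ... and must not be writable by group or others.
            if member.mode & 0o022:
                problems.append("%s: mode %04o" % (member.name, member.mode))
    return problems
```

An empty result means the archive passed; a release target could refuse to continue if any complaint is returned.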

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoUpdate ganeti-os-interface documentation
Apollon Oikonomopoulos [Fri, 5 Nov 2010 14:32:48 +0000 (16:32 +0200)]
Update ganeti-os-interface documentation

man/ganeti-os-interface.sgml lacked complete information for the NIC-related
environment variables. Added a reference to NIC_%N_LINK and NIC_%N_MODE and
clarified the reference to NIC_%N_BRIDGE.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoMakefile: Check for empty files and dirs on distcheck
Michael Hanselmann [Thu, 4 Nov 2010 13:39:12 +0000 (14:39 +0100)]
Makefile: Check for empty files and dirs on distcheck

Including empty files can cause unnecessary warnings for packagers.
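
For illustration, a Python sketch of the same kind of check. The function name is hypothetical, and the Makefile rule itself is not shown here and may work differently:

```python
import os


def find_empty(root):
    """Return sorted paths of empty files and empty directories under root."""
    empty = []
    for dirpath, dirnames, filenames in os.walk(root):
        # A directory with no entries at all is reported itself.
        if not dirnames and not filenames:
            empty.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

A distcheck hook could fail whenever this returns a non-empty list for the unpacked distribution directory.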

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoRevert commit e7e23e73, work around Automake bug
Michael Hanselmann [Thu, 4 Nov 2010 14:56:13 +0000 (15:56 +0100)]
Revert commit e7e23e73, work around Automake bug

After commit e7e23e73 the build would fail in distcheck on systems with
Automake 1.10. An investigation identified Automake bug #533[1] as the
cause. Applying the changes in Automake commit 3a12ed5e[2] to the
generated Makefile.in file made distcheck work again.

The underlying problem is that in our case both doc/html and
doc/html/.dir were included in the distributed files. When distcheck
copied the former from the source to the staging directory, it was
marked as read-only (distcheck makes the whole source read-only). It
then tried to copy doc/html/.dir from the build directory, which failed.
Automake 1.11 and newer avoid this problem by adjusting the permissions.

Since depending on Automake 1.11 or above is not an option at this time,
a work-around was found by not using a “.dir” file in doc/html, but
using “index.html” as a flag for creating the directory.

[1] http://sourceware.org/cgi-bin/gnatsweb.pl?cmd=view&database=automake&pr=533
[2] http://git.savannah.gnu.org/gitweb/?p=automake.git;a=commit;h=3a12ed5e97dc193a38dd14e031658cbd329b50ca

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix disk checks in “gnt-cluster verify”
Michael Hanselmann [Wed, 3 Nov 2010 12:56:16 +0000 (13:56 +0100)]
Fix disk checks in “gnt-cluster verify”

Tests have shown that the changes in commit b8d26c6e5 don't work as
wanted. If any disk wasn't found on the node, all disks located on the
same node would show as faulty. The cause was incorrect exception
handling on the node.

This patch changes the RPC call to return a per-disk success/error
status, avoiding the problem.
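
A minimal sketch of a per-disk result, assuming a hypothetical find_disk callable that raises when a disk cannot be found (this is not the actual RPC code): each disk gets its own (success, payload) tuple, so one failure no longer marks its neighbours as faulty.

```python
def get_disk_statuses(disks, find_disk):
    """Return one (success, payload) tuple per disk.

    find_disk is a hypothetical callable that returns a status for a disk
    or raises an exception if the disk cannot be found.
    """
    result = []
    for disk in disks:
        try:
            result.append((True, find_disk(disk)))
        except Exception as err:
            # A failure affects only this disk, not the others on the node.
            result.append((False, str(err)))
    return result
```

The caller can then report exactly which disk failed and why.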

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>

13 years agoQA: Run “gnt-cluster verify” while DRBD instance exists
Michael Hanselmann [Wed, 3 Nov 2010 12:49:43 +0000 (13:49 +0100)]
QA: Run “gnt-cluster verify” while DRBD instance exists

This tests some parts of the disk information collection.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>

13 years agoRemove empty file from doc/html in distribution
Michael Hanselmann [Tue, 2 Nov 2010 13:42:32 +0000 (14:42 +0100)]
Remove empty file from doc/html in distribution

It's not needed and some packaging systems complain about empty
files.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoRemove shebang line from ganeti.server.*
Michael Hanselmann [Tue, 2 Nov 2010 13:41:58 +0000 (14:41 +0100)]
Remove shebang line from ganeti.server.*

Some of them were forgotten.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoFix typos in NEWS
Michael Hanselmann [Tue, 2 Nov 2010 13:16:34 +0000 (14:16 +0100)]
Fix typos in NEWS

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoBump version for Ganeti 2.3 v2.3.0rc0
Michael Hanselmann [Tue, 2 Nov 2010 10:49:55 +0000 (11:49 +0100)]
Bump version for Ganeti 2.3

Also update cfgupgrade and NEWS.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoAdd -s option to gnt-node modify
Guido Trotter [Sat, 30 Oct 2010 08:39:22 +0000 (09:39 +0100)]
Add -s option to gnt-node modify

We can now change a node's secondary IP.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoluxi: disable two lint errors
Guido Trotter [Mon, 1 Nov 2010 10:17:21 +0000 (10:17 +0000)]
luxi: disable two lint errors

This is already disabled for the same type of request a couple of lines
above. The new code was introduced in e986f20c but didn't have the
disables.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoRemove private ip mention in error message
Guido Trotter [Mon, 1 Nov 2010 12:44:22 +0000 (12:44 +0000)]
Remove private ip mention in error message

There is no "private" ip in Ganeti, we only have primary and secondary
ones. Whether they are public or private is a per-installation detail.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd ConfigWriter.GetNodeGroup
Guido Trotter [Sat, 30 Oct 2010 09:16:20 +0000 (10:16 +0100)]
Add ConfigWriter.GetNodeGroup

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoImprove LookupNodeGroup's docstring
Guido Trotter [Sat, 30 Oct 2010 09:15:58 +0000 (10:15 +0100)]
Improve LookupNodeGroup's docstring

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoMerge the common options between import and add
Guido Trotter [Fri, 29 Oct 2010 11:43:13 +0000 (12:43 +0100)]
Merge the common options between import and add

The "I always wanted to do this" commit.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoDrop the -g shortcut for --vg-name
Guido Trotter [Fri, 29 Oct 2010 10:42:11 +0000 (11:42 +0100)]
Drop the -g shortcut for --vg-name

Changing the volume group is a lot less frequent than acting on a node
group. As such we drop the "-g" shortcut and require the long option to
be passed. In 2.3 the commands which used to accept the volume group as
"-g" won't have any node group option, so no confusion will arise. Later
on we may pass "-g" as the initial node group name to gnt-cluster init,
although that's not strictly necessary, as modifying it later is always
possible.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoUpdate NEWS for Ganeti 2.3
Michael Hanselmann [Mon, 1 Nov 2010 15:00:57 +0000 (16:00 +0100)]
Update NEWS for Ganeti 2.3

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoQA: Test ssconf_instance_list file on rename and creation
Michael Hanselmann [Mon, 1 Nov 2010 13:01:23 +0000 (14:01 +0100)]
QA: Test ssconf_instance_list file on rename and creation

This test would've caught the bug fixed in the previous patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoconfig: Write ssconf after renaming instance
Michael Hanselmann [Mon, 1 Nov 2010 13:00:33 +0000 (14:00 +0100)]
config: Write ssconf after renaming instance

This fixes a bug where the ssconf_instance_list file was
not updated after an instance rename.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoChange qa_utils.ResolveInstanceName to take name
Michael Hanselmann [Mon, 1 Nov 2010 12:59:47 +0000 (13:59 +0100)]
Change qa_utils.ResolveInstanceName to take name

… instead of an object. Allows it to be used in places where
only the name is available.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMakefile: Add PYTHON_BOOTSTRAP to linted code
Michael Hanselmann [Fri, 29 Oct 2010 14:55:05 +0000 (16:55 +0200)]
Makefile: Add PYTHON_BOOTSTRAP to linted code

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMake *.in non-executable
Michael Hanselmann [Fri, 29 Oct 2010 14:26:28 +0000 (16:26 +0200)]
Make *.in non-executable

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMove ganeti-rapi to ganeti.server.rapi
Michael Hanselmann [Fri, 29 Oct 2010 14:10:50 +0000 (16:10 +0200)]
Move ganeti-rapi to ganeti.server.rapi

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMove ganeti-noded to ganeti.server.noded
Michael Hanselmann [Fri, 29 Oct 2010 14:08:48 +0000 (16:08 +0200)]
Move ganeti-noded to ganeti.server.noded

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMove ganeti-confd to ganeti.server.confd
Michael Hanselmann [Fri, 29 Oct 2010 14:05:20 +0000 (16:05 +0200)]
Move ganeti-confd to ganeti.server.confd

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMove ganeti-masterd to ganeti.server.masterd
Michael Hanselmann [Fri, 29 Oct 2010 13:26:20 +0000 (15:26 +0200)]
Move ganeti-masterd to ganeti.server.masterd

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoPrepare move of daemons to ganeti.server
Michael Hanselmann [Fri, 29 Oct 2010 13:13:51 +0000 (15:13 +0200)]
Prepare move of daemons to ganeti.server

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMove ganeti-watcher to ganeti.watcher
Michael Hanselmann [Wed, 27 Oct 2010 17:51:30 +0000 (19:51 +0200)]
Move ganeti-watcher to ganeti.watcher

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMakefile: Generalize bootstrap script generator
Michael Hanselmann [Wed, 27 Oct 2010 17:52:17 +0000 (19:52 +0200)]
Makefile: Generalize bootstrap script generator

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoMakefile: Stricter checks for release distchecks
Michael Hanselmann [Wed, 27 Oct 2010 15:20:20 +0000 (17:20 +0200)]
Makefile: Stricter checks for release distchecks

This should avoid cases like commit f64de30f where the release
date was forgotten from NEWS.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agosetup-ssh: Better error reporting
René Nussbaumer [Fri, 29 Oct 2010 12:52:52 +0000 (14:52 +0200)]
setup-ssh: Better error reporting

Together with Michael, we refactored the code to make error reporting
better and easier. Backtraces are no longer printed for authentication
and verification issues.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoMakefile: Streamline directory creation
Michael Hanselmann [Thu, 28 Oct 2010 15:06:21 +0000 (17:06 +0200)]
Makefile: Streamline directory creation

Some directories don't exist in the repository, but are required at build time
(e.g. doc/html). Until now some were created explicitly, some through the
target “stamp-directories” and other targets simply relied on a previous target
to create the directory.

This patch tries to clean this up by getting rid of “stamp-directories” and
instead using rules to recreate any missing directory. As described in a comment
in the code, a file inside each directory is necessary, named “.dir”.

Order-only dependencies are used for directory creation to avoid rebuilding
where only the “.dir” file is missing (see “info make”, section “4.3 Types of
Prerequisites”).

The target for building the documentation is also changed to use “…/index.html”
instead of a hidden file. Some style changes are also made.
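
The effect of order-only prerequisites can be modelled in a few lines. The
following is only a sketch of the rebuild decision, not the Makefile rules
from this patch; the function names and the use of plain numbers for
modification times are assumptions made for illustration.

```python
def needs_rebuild(target_mtime, normal_mtimes, order_only_mtimes):
    """Decide whether a target must be rebuilt.

    Each mtime is a number, or None when the file does not exist.
    """
    # order_only_mtimes is deliberately unused: such prerequisites must exist
    # before the recipe runs, but their timestamps never force a rebuild.
    if target_mtime is None:
        return True
    return any(m is None or m > target_mtime for m in normal_mtimes)


def missing_order_only(order_only):
    """Return the order-only prerequisites that have to be created first.

    `order_only` maps a file name (e.g. a ".dir" marker) to its mtime or None.
    """
    return [name for name, mtime in order_only.items() if mtime is None]
```

A missing or newer “.dir” file therefore causes the directory rule to run,
but does not by itself mark the files built inside that directory as stale.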

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoAdd support and checks for version in LUXI
Michael Hanselmann [Thu, 28 Oct 2010 16:48:20 +0000 (18:48 +0200)]
Add support and checks for version in LUXI

A new constant, LUXI_VERSION, is used to verify the peer's version. The
version is optional, so old(er) clients and servers talking to peers not
supporting it won't break. Example with mismatching library:

$ gnt-instance list
Unhandled Ganeti error: LUXI version mismatch, server 2020000, request
1010000
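
A minimal sketch of such an optional version check, assuming the request is
a dictionary and the version travels in a key named "version" (the key name,
the function name and the constant's value are assumptions; the value is
simply taken from the example output above):

```python
LUXI_VERSION = 2020000  # assumed value, copied from the example above


class LuxiError(Exception):
    """Stand-in for errors.LuxiError."""


class ProtocolError(LuxiError):
    """Stand-in for luxi.ProtocolError."""


def check_version(request, local_version=LUXI_VERSION):
    """Reject a request only if it carries a version that differs from ours."""
    version = request.get("version")  # hypothetical key name
    # Peers that do not send a version are accepted for compatibility.
    if version is not None and version != local_version:
        raise ProtocolError("LUXI version mismatch, server %s, request %s" %
                            (local_version, version))
    return request
```

Treating a missing version as acceptable is what keeps old(er) peers working.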

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoluxi.ProtocolError: Derive from errors.LuxiError
Michael Hanselmann [Thu, 28 Oct 2010 15:11:43 +0000 (17:11 +0200)]
luxi.ProtocolError: Derive from errors.LuxiError

This allows LUXI errors to be encoded and serialized.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoLUExportInstance: Accept instance already shut down
Michael Hanselmann [Thu, 28 Oct 2010 16:03:14 +0000 (18:03 +0200)]
LUExportInstance: Accept instance already shut down

To remove the instance after an export it needs to be stopped. This can
be achieved using the parameter “shutdown”, or by explicitly shutting
down the instance before exporting. The latter would still require the
“shutdown” parameter to be set. To make it more intuitive, this
requirement is changed with this patch. Instances already stopped are
accepted for automatic removal.
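
The changed requirement can be summarised as a small predicate. This is a
sketch only; the function and parameter names are hypothetical and do not
come from LUExportInstance itself.

```python
def removal_allowed(remove_instance, shutdown, instance_running):
    """Return True if the export request may proceed.

    Removing the instance after the export requires it to be stopped,
    either by the export itself ("shutdown") or because it already is.
    """
    if not remove_instance:
        return True
    return shutdown or not instance_running
```

Before this patch, the equivalent check would have been `shutdown` alone.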

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoQA: Allow job queue test to be disabled
Michael Hanselmann [Thu, 28 Oct 2010 15:58:45 +0000 (17:58 +0200)]
QA: Allow job queue test to be disabled

On my machine it takes over 30 seconds; disabling it can
speed up the QA.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoConfigWriter.GetNodeGroupList
Guido Trotter [Wed, 27 Oct 2010 13:47:12 +0000 (14:47 +0100)]
ConfigWriter.GetNodeGroupList

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoGanetiLockManager, remove default values
Guido Trotter [Wed, 27 Oct 2010 13:14:47 +0000 (14:14 +0100)]
GanetiLockManager, remove default values

The nodes and instances parameters to the constructor are mandatory
anyway, as a value of None will fail when creating the LockSet. Rather
than fixing this by adding more code, since we never used the default
values, let's remove them and require that the parameters are passed.

This also fixes the only places where we initialized GanetiLockManager
with keyword parameters and without arguments.
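
To illustrate why the defaults were never usable, here is a simplified
sketch. The classes below are stand-ins written for this example, not the
real LockSet or GanetiLockManager.

```python
class LockSet:
    """Stand-in: holds a set of lock names."""

    def __init__(self, names):
        self.names = set(names)  # set(None) raises TypeError


class OldStyleManager:
    """Defaults of None look optional but fail as soon as they are used."""

    def __init__(self, nodes=None, instances=None):
        self.nodes = LockSet(nodes)
        self.instances = LockSet(instances)


class NewStyleManager:
    """Both parameters are required, so the mistake is caught at the call."""

    def __init__(self, nodes, instances):
        self.nodes = LockSet(nodes)
        self.instances = LockSet(instances)
```

Both variants raise TypeError when called without arguments, but only the
second one reports the missing parameters rather than an error from inside
LockSet.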

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd test for modifiable locking levels
Guido Trotter [Wed, 27 Oct 2010 14:09:01 +0000 (15:09 +0100)]
Add test for modifiable locking levels

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoUpdate 2.3 design doc regarding node group features and behavior.
Adeodato Simo [Thu, 28 Oct 2010 11:57:04 +0000 (12:57 +0100)]
Update 2.3 design doc regarding node group features and behavior.

In particular:

  - introduce a "gnt-group" command to hold group-level operations.
  - ditch the concept of "default node group", except for single-group
    clusters.
  - introduce an "alloc_policy" attribute for node groups, indicating
    how they should be treated by automated allocation tools.
  - introduce a "drain" operation on node groups.
  - define iallocator modes for new instance allocation and
    inter-group moves (choosing among all groups, or providing a
    limiting list).
  - indicate and explain that changing the group of a node will be
    initially only supported for nodes that are empty.

Signed-off-by: Adeodato Simo <dato@google.com>
Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoPrevent onlining a node without working noded
Iustin Pop [Thu, 28 Oct 2010 12:32:07 +0000 (14:32 +0200)]
Prevent onlining a node without working noded

This is just a basic check, plus a warning. In the future, we might do
more checks, or prevent simple onlining (without readd) if --force is
not passed.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoYet another rework in LUSetNodeParms
Iustin Pop [Thu, 28 Oct 2010 11:57:28 +0000 (13:57 +0200)]
Yet another rework in LUSetNodeParms

We will need the new role in CheckPrereq, so move its computation there
and save the new role to self.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

13 years agoPrevent moving/creating instances on non-vm nodes
Iustin Pop [Wed, 27 Oct 2010 15:17:14 +0000 (17:17 +0200)]
Prevent moving/creating instances on non-vm nodes

This small patch modifies LUCreateInstance, LUReplaceDisks and
LUMoveInstance to not use non-vm_capable nodes.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd a CheckNodeVmCapable helper in cmdlib
Iustin Pop [Wed, 27 Oct 2010 15:11:05 +0000 (17:11 +0200)]
Add a CheckNodeVmCapable helper in cmdlib

Also changes the error code for the other CheckNode* helpers to
ECODE_STATE, not ECODE_INVAL: ECODE_INVAL is for requests that are
invalid (e.g. create drbd instance with one node), whereas ECODE_STATE
denotes requests that are not satisfiable due to cluster/node/instance
state.
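
The distinction between the two error codes might look as follows. This is
a sketch under assumptions: the exception class, the string values of the
constants, the node dictionaries and both helper functions are written for
this example and are not the cmdlib code.

```python
ECODE_INVAL = "wrong_input"  # assumed value: the request itself is invalid
ECODE_STATE = "wrong_state"  # assumed value: current state forbids the request


class OpPrereqError(Exception):
    """Stand-in for a prerequisite error carrying an error code."""

    def __init__(self, message, ecode):
        super().__init__(message, ecode)
        self.ecode = ecode


def check_node_vm_capable(node):
    """State problem: the same request could succeed on another node."""
    if not node.get("vm_capable", True):
        raise OpPrereqError("Can't use non-vm_capable node %s" % node["name"],
                            ECODE_STATE)


def check_drbd_node_count(nodes):
    """Input problem: a DRBD instance with one node is never valid."""
    if len(nodes) != 2:
        raise OpPrereqError("DRBD instances need exactly two nodes",
                            ECODE_INVAL)
```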

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd the capability flags in node info output
Iustin Pop [Wed, 27 Oct 2010 15:06:23 +0000 (17:06 +0200)]
Add the capability flags in node info output

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd the master/vm_capable flags in node add
Iustin Pop [Wed, 27 Oct 2010 15:03:03 +0000 (17:03 +0200)]
Add the master/vm_capable flags in node add

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd support for vm_capable in file distribution
Iustin Pop [Wed, 27 Oct 2010 14:36:42 +0000 (16:36 +0200)]
Add support for vm_capable in file distribution

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd an UploadHelper to cmdlib
Iustin Pop [Wed, 27 Oct 2010 14:34:01 +0000 (16:34 +0200)]
Add an UploadHelper to cmdlib

This is used in two places already, and will be needed in a third, so
let's abstract it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd support for vm_capable in cluster verify
Iustin Pop [Wed, 27 Oct 2010 14:12:01 +0000 (16:12 +0200)]
Add support for vm_capable in cluster verify

The method to make vm_capable integrate easily into cluster verify is as follows:

- we add a new NV_VMNODES that represents *non*-vm-capable nodes
- the LU populates this list (it's expected that non-vm_capable nodes
  are few compared to vm_capable nodes)
- backend skips the checks that are related to VM hosting
- in the LU, we reorder the VM-related checks so that they occur after
  the non-VM (generic) tests, and we only execute them conditionally

Additionally, we add some support to the instance checks to detect
instances living on bad nodes.
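
The approach described above can be sketched roughly like this. The function
name, the callables and the argument layout are hypothetical; only the idea
of passing the *non*-vm-capable nodes and running the VM-related checks last
and conditionally comes from the description.

```python
def verify_node(node_name, non_vm_nodes, generic_checks, vm_checks):
    """Run generic checks always, VM-related checks only on vm_capable nodes.

    `non_vm_nodes` lists the nodes that are NOT vm_capable, which is
    expected to be the shorter list.
    """
    results = [check(node_name) for check in generic_checks]
    if node_name not in non_vm_nodes:
        # VM-related checks come after the generic ones and are conditional.
        results.extend(check(node_name) for check in vm_checks)
    return results
```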

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd vm_capable to gnt-node modify
Iustin Pop [Wed, 27 Oct 2010 12:43:32 +0000 (14:43 +0200)]
Add vm_capable to gnt-node modify

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>