ganeti-local
Klaus Aehlig [Thu, 30 Jan 2014 11:54:14 +0000 (12:54 +0100)]
Document changes to file-based disks in NEWS

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

Klaus Aehlig [Thu, 30 Jan 2014 11:46:03 +0000 (12:46 +0100)]
Preserve disk basename on instance rename

For file-based instances, upon rename, the directory containing
the instance disks is moved. Therefore, the basename needs to
be preserved in this case. Fix this. Note that so far, this
worked by accident as before 94e252a3 file names used to be
"disk" followed by the index.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

Hrvoje Ribicic [Tue, 28 Jan 2014 19:21:37 +0000 (19:21 +0000)]
Update NEWS file

This patch updates the NEWS file with NEWS of the bugfix, adding the
new 2.9.4 version in progress.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Wed, 29 Jan 2014 13:04:57 +0000 (14:04 +0100)]
Modify test to reflect RAPI operation changes

A rlib2 unittest tested for the wrong behaviour, and this patch changes
the inputs and expected values to account for this.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Tue, 28 Jan 2014 18:04:44 +0000 (18:04 +0000)]
Add QA tests for RAPI multi-instance allocation

The instance multi-allocation had no tests to detect its breakage, and
this patch fixes that.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Tue, 28 Jan 2014 15:26:32 +0000 (15:26 +0000)]
Fix multi-allocation RAPI method

The OpInstanceMultiAlloc that the instances-multi-alloc RAPI method
uses accepts a list of OpInstanceCreate opcodes rather than a list of
dictionaries as provided by the method. This patch correctly constructs
the opcodes, allowing the RAPI call to work as expected.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Ilias Tsitsimpis [Tue, 28 Jan 2014 15:23:46 +0000 (17:23 +0200)]
Assign unique filenames to filebased disks

With the new format for cmdline arguments, the user is able to add a
disk to an instance at a specific index. But filebased disks' filenames
have the form "{0}/disk{1}" where '{0}' is the file_storage_dir and
'{1}' is the index of the disk. So if an instance has 3 disks and we
try to create a new one at index 1, the operation will fail because the
filename "{0}/disk1" already exists.

This patch fixes the above problem and also makes the naming of file and
shared disks uniform with other templates.

Signed-off-by: Ilias Tsitsimpis <iliastsi@grnet.gr>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
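
The collision described in this commit message can be sketched as
follows. The index-based path format is taken from the message; the
replacement scheme (a random UUID plus a ".disk" suffix) and both
function names are assumptions made for illustration, since the log
does not show the naming scheme the patch actually uses.

```python
import os
import uuid


def indexed_disk_path(storage_dir, index):
    # Old scheme quoted in the commit message: "{0}/disk{1}".
    return "{0}/disk{1}".format(storage_dir, index)


def unique_disk_path(storage_dir, existing):
    # Hypothetical replacement: draw random names until one is unused,
    # so the result no longer depends on the disk index.
    while True:
        name = "{0}.disk".format(uuid.uuid4())
        candidate = os.path.join(storage_dir, name)
        if candidate not in existing:
            return candidate


# An instance with three disks, named by index.
existing = {indexed_disk_path("/srv/files/inst1", i) for i in range(3)}

# Adding a new disk at index 1 reuses a name that is already taken.
collides = indexed_disk_path("/srv/files/inst1", 1) in existing

# A name that does not encode the index avoids the clash.
fresh = unique_disk_path("/srv/files/inst1", existing)
```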

Klaus Aehlig [Fri, 24 Jan 2014 10:42:08 +0000 (11:42 +0100)]
Revision bump for 2.9.3

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>

Klaus Aehlig [Fri, 24 Jan 2014 10:41:44 +0000 (11:41 +0100)]
Schedule 2.9.3 release

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>

Klaus Aehlig [Fri, 24 Jan 2014 10:32:59 +0000 (11:32 +0100)]
Document fix of issue 691 in NEWS

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>

Guido Trotter [Thu, 23 Jan 2014 16:07:25 +0000 (17:07 +0100)]
NEWS: fix typo in 2.8.4 release

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Petr Pudlák <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Jose A. Lopes [Fri, 24 Jan 2014 00:24:51 +0000 (01:24 +0100)]
Fix 'hvparams' of '_InstanceStartupMemory' on hypervisors

Most hypervisors were calling '_InstanceStartupMemory' but not passing
the 'hvparams' keyword argument.  Actually, it is not necessary to
pass this argument given that it is an attribute in the instance
object, which is passed.  This patch removes the 'hvparams' arg
altogether, fixes the function and the calls to it.

Fixes issue 691.

Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Thu, 23 Jan 2014 17:24:50 +0000 (18:24 +0100)]
Add missing option to gnt-instance documentation

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Klaus Aehlig [Thu, 23 Jan 2014 15:40:02 +0000 (16:40 +0100)]
Update NEWS file

With the merge of stable-2.8 into stable-2.9, quite a few fixes
got inherited.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

Klaus Aehlig [Thu, 23 Jan 2014 13:24:08 +0000 (14:24 +0100)]
Merge branch 'stable-2.8' into stable-2.9

* stable-2.8
  Version bump for 2.8.4 and NEWS update
  Update NEWS file with news about job cancellation bugfix
  Fix QA flakiness
  Linting fix: remove unused import
  Add missing parameter entry to man file
  Add QA test for job cancellation
  Add correct locking of master node to gnt-debug delay
  Add job id type assert to jqueue.py
  Add job id transformation/check to Luxi Python client
  Start-master/stop-master always fail if confd is disabled
  Improve backwards compatibility of Issue 649 fix
  Add missing NEWS entries from stable-2.8
  Change usb_devices separator to whitespace

Conflicts:
NEWS: take both additions
configure.ac: ignore revision bump
lib/cmdlib/test.py: manually redo changes of stable-2.8
    on stable-2.9 version

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Hrvoje Ribicic [Thu, 23 Jan 2014 10:20:40 +0000 (10:20 +0000)]
Fix disk_type error in hypervisor parameter documentation

According to the code, presenting disks as paravirtual is supported on
both HVM and KVM, while IDE works only on KVM. This patch updates docs
to be accurate.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Michele Tartara [Thu, 23 Jan 2014 08:52:42 +0000 (08:52 +0000)]
Version bump for 2.8.4 and NEWS update

Update the version number to 2.8.4 and insert the final details for this
release in the NEWS file, including the release date.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Mon, 20 Jan 2014 16:25:02 +0000 (17:25 +0100)]
Update NEWS file with news about job cancellation bugfix

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Mon, 20 Jan 2014 16:22:23 +0000 (17:22 +0100)]
Fix QA flakiness

The newly added job QA has some flakiness with respect to its use of
gnt-job watch. Fix this by waiting until the canceling status is
replaced with the canceled status, or a timeout is reached.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
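
The "wait until the status changes, or a timeout is reached" approach
described above can be sketched as a small polling loop. This is an
illustration only: wait_for_status and the get_status callable are
hypothetical names, and the real QA code may query job status
differently.

```python
import time


def wait_for_status(get_status, wanted, timeout=10.0, interval=0.01):
    """Poll get_status() until it returns wanted or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated job that reports "canceling" twice before settling.
_states = iter(["canceling", "canceling", "canceled"])
settled = wait_for_status(lambda: next(_states), "canceled")

# A job that never leaves "canceling" runs into the timeout instead.
stuck = wait_for_status(lambda: "canceling", "canceled", timeout=0.05)
```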

Hrvoje Ribicic [Mon, 20 Jan 2014 15:30:05 +0000 (16:30 +0100)]
Linting fix: remove unused import

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Klaus Aehlig [Mon, 20 Jan 2014 13:12:38 +0000 (14:12 +0100)]
Update NEWS file: issue 687 and configure fix

Add entries to the NEWS file for the two user-visible changes that
happened since the last update: issue 687 got fixed, and configure
now supports Sphinx versions 1.2+.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

Apollon Oikonomopoulos [Mon, 20 Jan 2014 12:55:26 +0000 (14:55 +0200)]
luxid: fix detection of master node in node query

Ganeti.Config.getNodeRole would rely on clusterMasterNode returning the
master node name, however clusterMasterNode returns the master node's
UUID. We fix this and a similar issue in Ganeti.Query.Node.nodeFields.

Together with 1ec34e26, this fixes issue #687.

Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Mon, 20 Jan 2014 12:06:45 +0000 (13:06 +0100)]
Add missing parameter entry to man file

The gnt-instance manual was lacking an entry for the vnc-password-file
hypervisor parameter. This patch adds one, and also some information on
the default value of the parameter.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Thu, 16 Jan 2014 10:14:08 +0000 (10:14 +0000)]
Add QA test for job cancellation

This patch introduces a QA test in which a job is cancelled while
waiting.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Thu, 16 Jan 2014 12:08:18 +0000 (12:08 +0000)]
Add correct locking of master node to gnt-debug delay

The gnt-debug delay command required locks for all nodes except the
master - this patch fixes the issue by adding master to the locks
whenever needed.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Wed, 15 Jan 2014 13:51:01 +0000 (13:51 +0000)]
Add job id type assert to jqueue.py

While the changes introduced in previous patches should stop any job
id parameters reaching the queue as strings, add an assertion here to
catch any strings making it through.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Hrvoje Ribicic [Wed, 15 Jan 2014 13:48:51 +0000 (13:48 +0000)]
Add job id transformation/check to Luxi Python client

This patch adds checks to the Luxi client, making sure that job ids
are converted from strings to ints before being passed on, or that an
error is reported.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
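
A minimal sketch of the kind of check described above: job ids that
arrive as strings are converted to ints, and anything else is rejected.
The function name is hypothetical and ValueError is a stand-in; the
Luxi client may raise a Ganeti-specific error instead.

```python
import re


def normalize_job_id(job_id):
    """Return job_id as an int, or raise ValueError if it is not one."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(job_id, bool):
        raise ValueError("Invalid job id: %r" % (job_id,))
    if isinstance(job_id, int):
        return job_id
    # Accept only plain decimal strings such as "42".
    if isinstance(job_id, str) and re.fullmatch(r"[0-9]+", job_id):
        return int(job_id)
    raise ValueError("Invalid job id: %r" % (job_id,))
```

With this sketch, normalize_job_id("42") and normalize_job_id(42) both
yield the int 42, while a value such as "abc" raises an error rather
than reaching the queue as a string.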

Apollon Oikonomopoulos [Fri, 17 Jan 2014 15:32:38 +0000 (17:32 +0200)]
query: fix detection of master in _GetNodeRole()

Commit 1c3231aa changed the invocation of _GetNodeRole() to pass the
master node by UUID and not by name, but left the implementation
comparing the nodes by name. As a result, the master node (which is
also a master candidate) would always fall through to the second
option and be marked as 'C' instead of 'M'.

We fix this by modifying the implementation of _GetNodeRole() to perform
the comparison by UUID and also change the respective tests to test the
new API.

Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
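
The mismatch described above can be reproduced with a simplified
stand-in for _GetNodeRole(). The Node tuple, both function names and
the catch-all "R" value are assumptions for illustration; only the 'M'
and 'C' markers come from the commit message.

```python
from collections import namedtuple

# Hypothetical, stripped-down node object.
Node = namedtuple("Node", ["name", "uuid", "master_candidate"])


def get_node_role_by_name(node, master_uuid):
    # Pre-fix behaviour: the caller passes a UUID, but the comparison
    # uses the node name, so the first branch never matches.
    if node.name == master_uuid:
        return "M"
    if node.master_candidate:
        return "C"
    return "R"  # placeholder for every other role


def get_node_role_by_uuid(node, master_uuid):
    # Post-fix behaviour: compare like with like.
    if node.uuid == master_uuid:
        return "M"
    if node.master_candidate:
        return "C"
    return "R"  # placeholder for every other role


master = Node("node1.example.com", "uuid-1", True)
before = get_node_role_by_name(master, "uuid-1")
after = get_node_role_by_uuid(master, "uuid-1")
```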

Jose A. Lopes [Fri, 17 Jan 2014 00:43:43 +0000 (01:43 +0100)]
Start-master/stop-master always fail if confd is disabled

In 'daemons/daemon-util.in', 'start-master' and 'stop-master' always
fail if confd is disabled.

Fixes issue 685.

Signed-off-by: Jose A. Lopes <jabolopes@gmail.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Klaus Aehlig [Mon, 13 Jan 2014 12:54:14 +0000 (13:54 +0100)]
Break line longer than 80 chars in configure.ac

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>

Jose A. Lopes [Mon, 13 Jan 2014 12:37:06 +0000 (13:37 +0100)]
Technical writing: improve documentation and glossary

Improve structure and content on the main documentation page of Ganeti
and the glossary.

Signed-off-by: Betsy Beyer <bbeyer@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>

Apollon Oikonomopoulos [Mon, 13 Jan 2014 12:12:14 +0000 (14:12 +0200)]
configure: allow detection of Sphinx 1.2+

The regular expression used for parsing the Sphinx version does not work
with Sphinx versions after 1.1, as reported in issue #502. The reason
for this is that upstream commit 8f28af8e2ed8[1] introduced proper
support for --version, which ganeti was already using but sphinx-build
was lacking (outputting generic usage information instead).

Since it seems that upstream has no reason to change the output format
again, we support the new versioning scheme with a strict-as-possible
match.

This fixes issue 502.

[1] https://bitbucket.org/birkenfeld/sphinx/commits/8f28af8e2ed8619087738d83b4f55e3db938a104

Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Dimitris Aragiorgis [Sat, 11 Jan 2014 16:48:18 +0000 (18:48 +0200)]
Remove deprecated _ERROR_DATA_KEY in QMP

Commit de253f14 of the QEMU repo "BREAKS QMP's compatibility for
the error response" as it removes the "data" key from QMP error
response messages.  We therefore only log the "class" and "desc"
values of the message.

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
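
A sketch of error reporting that relies only on the "class" and "desc"
keys, as described above. The function name and the fallback values
are assumptions; the sample responses follow the general QMP shape of
an "error" object versus a "return" object.

```python
def describe_qmp_error(response):
    """Return "class: desc" for a QMP error response, or None on success."""
    error = response.get("error")
    if error is None:
        return None
    # Any "data" key is deliberately ignored, since newer QEMU omits it.
    return "{0}: {1}".format(error.get("class", "unknown"),
                             error.get("desc", ""))


old_style = {"error": {"class": "GenericError",
                       "desc": "Device not found",
                       "data": {"device": "hotdisk0"}}}
new_style = {"error": {"class": "GenericError",
                       "desc": "Device not found"}}

# Both response styles produce the same log text.
same = describe_qmp_error(old_style) == describe_qmp_error(new_style)
```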

Jose A. Lopes [Thu, 9 Jan 2014 15:50:11 +0000 (16:50 +0100)]
Technical writing: improve main documentation page

Improve structure and content on the main documentation page of
Ganeti.

Signed-off-by: Betsy Beyer <bbeyer@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>

Michele Tartara [Wed, 8 Jan 2014 14:01:24 +0000 (14:01 +0000)]
Improve backwards compatibility of Issue 649 fix

Commit e6e4ff4cf8d0100f331f94f7a27aa1e03a5d0e7d fixed Issue 649 by switching the
separator for usb_devices from comma to space. That solved the problem with
the command line, but RAPI was able to work with commas too, so, for backwards
compatibility we need to keep supporting that as well.

Also, in order to avoid changing the format of the config file, the default
internal representation is brought back to being comma-based, and it is changed
at the interface level (CLI or RAPI) before being passed on.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
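
The interface-level conversion described above might look like the
following sketch: input may use whitespace or commas, and the stored
form is always comma-based. The function name and the device tokens
are placeholders, not the real usb_devices syntax.

```python
import re


def normalize_usb_devices(value):
    """Convert a space- or comma-separated device list to comma form."""
    parts = [p for p in re.split(r"[\s,]+", value.strip()) if p]
    return ",".join(parts)


from_cli = normalize_usb_devices("dev-a dev-b")    # whitespace, CLI style
from_rapi = normalize_usb_devices("dev-a,dev-b")   # commas, still accepted
```

Keeping the comma-based form internally means the config file format
does not change, which is the stated goal of the patch.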

Michele Tartara [Tue, 7 Jan 2014 16:04:48 +0000 (16:04 +0000)]
Add missing NEWS entries from stable-2.8

Some fixes were pushed to the stable-2.8 branch without a corresponding NEWS
entry. This patch adds them.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Michele Tartara [Tue, 7 Jan 2014 15:05:54 +0000 (16:05 +0100)]
Change usb_devices separator to whitespace

The usb_devices parameter was using comma as a list separator, but this cannot
work because comma is already used as the hypervisor parameter separator.

Change it to use whitespace as a separator, in accordance with what is already
done for the extra parameters.

The NEWS file is updated accordingly.

Fixes Issue 649.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Michele Tartara [Thu, 19 Dec 2013 18:29:24 +0000 (19:29 +0100)]
Update the NEWS file with the Issue 640 fix

Add an entry in the NEWS file describing the fix of Issue 640.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Michele Tartara [Thu, 19 Dec 2013 17:27:38 +0000 (18:27 +0100)]
Ensure that all the hypervisors exist in the config file

All the hypervisors are supposed to exist in the config file, but it might not
be so after upgrades from old versions. This patch ensures that all the missing
hypervisors are added with their default values to the config file.

Also, some tests are adapted, because now they receive the default values
instead of an empty dictionary when they are working using a minimal cluster
config as their input.

Fixes Issue 640.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
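
The upgrade step described above amounts to filling in missing
dictionary entries from a table of defaults. The sketch below uses a
hypothetical function name and made-up parameter names; it is not the
actual configuration upgrade code.

```python
def fill_missing_hypervisors(cluster_hvparams, defaults):
    """Return a copy of cluster_hvparams with an entry per hypervisor."""
    filled = {name: dict(params)
              for name, params in cluster_hvparams.items()}
    for name, params in defaults.items():
        # Existing entries win; only absent hypervisors get defaults.
        filled.setdefault(name, dict(params))
    return filled


# Placeholder defaults; "param_a" and "param_b" are not real parameters.
defaults = {"kvm": {"param_a": 1}, "xen-pvm": {"param_b": 2}}

# A config written by an old version that only knows about kvm.
old_config = {"kvm": {"param_a": 5}}
upgraded = fill_missing_hypervisors(old_config, defaults)
```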

Michele Tartara [Thu, 19 Dec 2013 17:13:13 +0000 (18:13 +0100)]
Fix testEncodeInstance test input

The input of the testEncodeInstance test did not adhere to the actual format
of the Ganeti configuration file: kvm has no HV_BLOCKDEV_PREFIX, and "hvparams"
inside an instance should only contain the values of the hypervisor parameters,
not the hypervisor name, which is already declared in the "hypervisor" field,
and which was not correctly aligned with the parameters in the "hvparams"
section.

All these problems are now fixed, and the assertions are changed accordingly.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Klaus Aehlig [Mon, 16 Dec 2013 16:23:19 +0000 (17:23 +0100)]
Merge branch 'stable-2.8' into stable-2.9

* stable-2.8
  Add support for blktap2 file-driver
  Update opcodes test to include network tags
  Make network tags searchable
  Add network tag tests to QA
  Fix RAPI network tag handling
  Fix gnt-network list-tags

Conflicts:
lib/cmdlib/tags.py
test/py/ganeti.hypervisor.hv_xen_unittest.py
Resolution: manually apply the changes from stable-2.8 to
            the stable-2.9 code.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Michele Tartara [Fri, 13 Dec 2013 13:26:23 +0000 (13:26 +0000)]
Add support for blktap2 file-driver

Newer Xen versions use blktap2 instead of blktap. This patch adds support
for it in Ganeti.

Fixes Issue 638.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

Hrvoje Ribicic [Thu, 12 Dec 2013 15:33:04 +0000 (16:33 +0100)]
Update opcodes test to include network tags

This patch adds the network tags to the list of all other tag types
that can be tried in QuickCheck tests.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Hrvoje Ribicic [Fri, 13 Dec 2013 12:47:27 +0000 (12:47 +0000)]
Make network tags searchable

This patch adds the network tags to the tags searched by gnt-cluster
search-tags, and in the process cleans up the code slightly.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Hrvoje Ribicic [Thu, 12 Dec 2013 15:36:04 +0000 (16:36 +0100)]
Add network tag tests to QA

The QA did not have a test for network tags until now, and this patch
remedies the situation.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Hrvoje Ribicic [Mon, 16 Dec 2013 13:21:04 +0000 (14:21 +0100)]
Fix RAPI network tag handling

The network tags were absent from an if check used to actually list
tags. The patch fixes the oversight, and adds a proper error message in
case the issue occurs again for a new tag type.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Dimitris Aragiorgis [Thu, 12 Dec 2013 13:04:11 +0000 (15:04 +0200)]
Fix gnt-network list-tags

Define network tags in the Haskell part.

This fixes issue 641.

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

Klaus Aehlig [Fri, 13 Dec 2013 12:03:41 +0000 (13:03 +0100)]
Bump revision for 2.9.2

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Klaus Aehlig [Fri, 13 Dec 2013 12:03:19 +0000 (13:03 +0100)]
Update NEWS for 2.9.2 release

Besides a few local fixes, the main improvements are the changes
inherited from stable-2.8.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Klaus Aehlig [Thu, 12 Dec 2013 13:59:23 +0000 (14:59 +0100)]
Pass hvparams to GetInstanceInfo

...so that the xen command to be called can be determined. This
fixes another semantic conflict of the last merge.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>

Klaus Aehlig [Thu, 12 Dec 2013 12:40:57 +0000 (13:40 +0100)]
Adapt parameters that moved to instance variables

Due to a change in the code organization in stable-2.9, some
method variables became instance variables, causing a semantic
merge conflict. Fix this.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Klaus Aehlig [Thu, 12 Dec 2013 09:05:20 +0000 (10:05 +0100)]
Avoid lines longer than 80 chars

...as they're a lint error.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Klaus Aehlig [Wed, 11 Dec 2013 13:35:02 +0000 (14:35 +0100)]
Merge branch 'stable-2.8' into stable-2.9

* stable-2.8
  Version bump for 2.8.3
  Update NEWS for 2.8.3 release
  Support reseting arbitrary params of ext disks
  Allow modification of arbitrary params for ext
  Do not clear disk.params in UpgradeConfig()
  SetDiskID() before accepting an instance
  Lock group(s) when creating instances
  Fix job error message after unclean master shutdown
  Add default file_driver if missing
  Update tests
  Xen handle domain shutdown
  Fix evacuation out of drained node
  Refactor reading live data in htools
  master-up-setup: Ping multiple times with a shorter interval
  Add a packet number limit to "fping" in master-ip-setup
  Fix a bug in InstanceSetParams concerning names
  build_chroot: hard-code the version of blaze-builder
  Fix error printing
  Allow link local IPv6 gateways
  Fix NODE/NODE_RES locking in LUInstanceCreate
  eta-reduce isIpV6
  Ganeti.Rpc: use brackets for ipv6 addresses
  Update NEWS file with socket permission fix info
  Fix socket permissions after master-failover

Conflicts:
NEWS
configure.ac
lib/cmdlib/instance.py
lib/cmdlib/instance_migration.py
lib/hypervisor/hv_xen.py
lib/masterd/iallocator.py
lib/objects.py
src/Ganeti/HTools/Backend/IAlloc.hs
src/Ganeti/HTools/Backend/Luxi.hs
src/Ganeti/HTools/Backend/Rapi.hs
Resolution:
NEWS: take both additions
configure.ac: ignore revision bump on stable-2.8
Rest: manually apply the stable-2.8 changes on stable-2.9 code;
              for lib/hypervisor/hv_xen.py this also includes passing
              the additional hvparams around, and adapting tests.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

Michele Tartara [Mon, 9 Dec 2013 13:21:12 +0000 (14:21 +0100)]
Version bump for 2.8.3

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Michele Tartara [Mon, 9 Dec 2013 13:20:28 +0000 (14:20 +0100)]
Update NEWS for 2.8.3 release

List all the changes that happened between 2.8.2 and 2.8.3.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Dimitris Aragiorgis [Tue, 10 Dec 2013 09:14:54 +0000 (11:14 +0200)]
Support reseting arbitrary params of ext disks

If param=default and the param already exists then we remove
it from the params dict. This behaviour is borrowed from
GetUpdatedParams(), which is used for hvparams
modification/inheritance.

This means that the literal value 'default' cannot be stored for an
arbitrary param of an ext disk.

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
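
The reset rule described above can be sketched as a dictionary update
in which one sentinel string means "remove this key". The function
name and constant are illustrative; GetUpdatedParams() itself may
differ in signature and details.

```python
VALUE_DEFAULT = "default"  # sentinel string taken from the message above


def apply_param_changes(old_params, changes):
    """Return new params; a value of "default" removes the key."""
    new_params = dict(old_params)
    for key, value in changes.items():
        if value == VALUE_DEFAULT:
            # Resetting drops the stored value, if there is one.
            new_params.pop(key, None)
        else:
            new_params[key] = value
    return new_params


stored = {"shared_dir": "/mnt/a", "mode": "rw"}
reset = apply_param_changes(stored, {"shared_dir": "default"})
changed = apply_param_changes(stored, {"shared_dir": "/mnt/b"})
```

A side effect of this design is the limitation noted in the commit
message: the string "default" can never be stored as a real value.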

Dimitris Aragiorgis [Tue, 10 Dec 2013 16:00:55 +0000 (18:00 +0200)]
Allow modification of arbitrary params for ext

Disks of the ext template are allowed to have arbitrary parameters
stored in the Disk object's params slot. Those parameters can be
passed during creation of a new disk, either in LUInstanceCreate()
or in LUInstanceSetParams(). However, those parameters cannot be
changed afterwards. With this patch we lift this limitation.

Currently, for the other disk templates we allow modifying only
'name' and 'mode'. Therefore, we introduce new constants
MODIFIABLE_IDISK_PARAM* to include those params. If any other
parameter is passed, _VerifyDiskModification() will raise an
exception.

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Dimitris Aragiorgis [Tue, 10 Dec 2013 09:14:52 +0000 (11:14 +0200)]
Do not clear disk.params in UpgradeConfig()

Commits 5dbee5e and cce4616 fix disk upgrades concerning params
slot. Since 2.7 params slot should be empty and gets filled
any time needed.

Still ext template allows passing arbitrary params per disk.
These params should be saved in config file for future use.
For instance if we have the shared-filer provider and we
specify shared_dir param during instance create, this param
is needed when we want to attach the disk e.g., during
retrieving instance info. If it gets overridden during a daemon
restart or a config reload we fail to get the instance's info.

To avoid such a failure, we set the params slot to an empty dict
only if params are not found in the first place.

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
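
In outline, the change described above replaces an unconditional reset
with a guarded one. The sketch uses plain dictionaries and hypothetical
function names in place of the real Disk object and UpgradeConfig().

```python
def upgrade_disk_clearing(disk):
    # Earlier behaviour as described: params are always emptied,
    # which loses provider-specific values such as shared_dir.
    disk["params"] = {}
    return disk


def upgrade_disk_preserving(disk):
    # Patched behaviour as described: only fill in an empty dict
    # when no params are present at all.
    if disk.get("params") is None:
        disk["params"] = {}
    return disk


ext_disk = {"params": {"shared_dir": "/mnt/shared"}}
kept = upgrade_disk_preserving(dict(ext_disk))
lost = upgrade_disk_clearing(dict(ext_disk))
```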

Dimitris Aragiorgis [Mon, 9 Dec 2013 12:00:27 +0000 (14:00 +0200)]
SetDiskID() before accepting an instance

SetDiskID() fills physical_id slot of a Disk object.

LUInstanceSetParams() does not invoke SetDiskID() upon creation of a
new disk. As a result the physical_id slot of the Disk object in
config data is missing.

In case of ext disk template, in AcceptInstance() we invoke
_GatherAndLinkBlockDevices(). This takes `instance` as an argument
which includes current disks info. So, after adding a disk,
migration of ext instances will fail because FindDevice() expects
the physical_id slot.

With this patch we invoke SetDiskID() for every disk of the instance
before accept_instance() RPC.

Fixes Issue 633.

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Petr Pudlak [Thu, 28 Nov 2013 14:38:57 +0000 (15:38 +0100)]
Lock group(s) when creating instances

This is required to prevent race conditions such as removing a network
from a group and adding an instance at the same time. (See issue 621#2.)

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Hrvoje Ribicic [Thu, 5 Dec 2013 09:49:01 +0000 (10:49 +0100)]
Fix job error message after unclean master shutdown

According to commit 599ee321eb, any job-related error messages should
be encoded within a Ganeti-specific error and not passed on as a
string, to allow for easier parsing.

For jobs suffering from an undesirable status after an unclean master
daemon shutdown, the message was not encoded, as reported in issue 618.
This patch fixes the problem.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Michele Tartara [Wed, 4 Dec 2013 17:49:50 +0000 (18:49 +0100)]
Add default file_driver if missing

If the file driver of an instance with file based storage is not specified, the
default one is automatically added by the UpgradeConfig function.

Fixes Issue 571.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Jose A. Lopes [Mon, 2 Dec 2013 12:07:39 +0000 (13:07 +0100)]
Update tests

Update hypervisor unit tests.

Partial cherry-pick from d2e4e099e4248832fef8ed7b0755d01bd4178e3a

Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Jose A. Lopes [Mon, 2 Dec 2013 11:41:33 +0000 (12:41 +0100)]
Xen handle domain shutdown

Update Xen backend to properly recognize when a domain has been
shut down by the user and to properly clean up a shut-down domain when
Ganeti requests Xen to stop this domain.

Partial cherry-pick from 9d22cc90609e3ee8f0f2b34b793a3daced3c0e61

Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

Jose A. Lopes [Thu, 28 Nov 2013 10:04:25 +0000 (11:04 +0100)]
Fix evacuation out of drained node

* fix node daemon not to skip data, such as memory and disk size,
  when building the node list to send to HBal, given that these data
  are important for HBal to determine whether an evacuation is
  possible
* fix iallocator to properly load drained nodes from the list passed
  by the node daemon, instead of zeroing all the data, such as the
  memory and disk size
* this fixes issue 615

Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

Bernardo Dal Seno [Tue, 4 Jun 2013 16:38:11 +0000 (18:38 +0200)]
Refactor reading live data in htools

This simplifies different handling of individual items.

Cherry-picked from 8c72f7119f50a11661aacba2a1abffdfdc6f7cfa.

Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

Petr Pudlak [Tue, 3 Dec 2013 08:03:28 +0000 (09:03 +0100)]
master-up-setup: Ping multiple times with a shorter interval

In the case of network problems, one ping packet can possibly get lost.
Sending multiple packets is safer. The interval between packets is set
to 200ms so that the check finishes faster.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

10 years agoAdd a packet number limit to "fping" in master-ip-setup
Petr Pudlak [Mon, 2 Dec 2013 10:04:54 +0000 (11:04 +0100)]
Add a packet number limit to "fping" in master-ip-setup

This fixes issue #630. Apparently there is a bug in fping 3.5 where it
loops forever without "-c" given an unreachable host, even though
"-c 1" should be the default according to the man page.

The "-c" flag works on Squeeze. Checking the man pages on the Internet,
fping supported "-c" at least since 2007. So there should be no backward
compatibility problems.
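
A minimal sketch of a bounded, closely spaced fping invocation, written
in Python purely for illustration (the actual check lives in a shell
script). The helper name and the default count of 3 are assumptions;
"-c" (packet count), "-p" (interval between packets to one target, in
milliseconds) and "-q" (quiet) are standard fping options.

```python
def build_fping_command(address, count=3, interval_ms=200):
    """Return an fping argv that always passes an explicit packet limit.

    Hypothetical helper: passing "-c" explicitly avoids relying on the
    documented default, which the message above reports as unreliable.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return ["fping", "-q", "-c", str(count), "-p", str(interval_ms), address]
```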

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>

10 years agoFix a bug in InstanceSetParams concerning names
Dimitris Aragiorgis [Thu, 28 Nov 2013 08:19:19 +0000 (10:19 +0200)]
Fix a bug in InstanceSetParams concerning names

In case no name is passed in disk modifications we should
keep the old one. If name=none then set disk name to None.
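
The intended semantics can be sketched as follows. This is an
illustration only, not the code of InstanceSetParams; the function name,
the sentinel, and the dictionary-shaped parameters are assumptions.

```python
_UNSET = object()  # sentinel: "no name was passed in the modification"


def resolve_disk_name(old_name, params):
    """Return the disk name that results from a disk modification."""
    name = params.get("name", _UNSET)
    if name is _UNSET:
        return old_name  # nothing passed: keep the old name
    if isinstance(name, str) and name.lower() == "none":
        return None  # name=none: explicitly clear the name
    return name  # any other value: rename the disk
```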

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>

10 years agoSingleNotifyPipeCondition: don't share pollers
Guido Trotter [Fri, 29 Nov 2013 10:09:07 +0000 (11:09 +0100)]
SingleNotifyPipeCondition: don't share pollers

As is widely known, Ganeti uses a better[1] lock condition notification
library based on operating system pipes.

Inside this library we were using a shared poller for all threads
waiting for a condition. Although a poller is not thread-safe, our usage
*is* actually safe, since (1) we hold the condition lock while calling
poll and while parsing the results, and (2) we don't reuse the poller
between different conditions or with newer fds. Unfortunately, newer
versions of Python take a hand-holding approach[2] and don't trust us to
do the right thing. As such, we are forced to create a new poller each
time we call wait.

This costs one more system call per wait, but practical measurements
have shown no significant impact on Ganeti. This is also a temporary
measure, as newer versions will do away with the threading altogether
and move job schedulers to luxid.

The patch is loosely based on a patch submitted by Daniel Néri, but has
been modified to reduce even further the scope of the poller variable to
just the waiter class.

[1] because I say so.
[1 bis] also because it produces fairer results, avoids possible
starvation and eliminates busy-wait polling.
[2] http://bugs.python.org/issue8865
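
A minimal sketch of the "one poller per wait" idea, assuming a pipe read
end as the notification channel. The function name and the event mask
are illustrative assumptions, not the code of the waiter class itself.

```python
import select


def wait_for_notification(read_fd, timeout_s=None):
    """Block until read_fd is readable or closed; return False on timeout.

    The poller is created inside the call, so it is never shared
    between threads or reused across waits.
    """
    poller = select.poll()  # one extra system call per wait
    poller.register(read_fd, select.POLLIN | select.POLLHUP)
    timeout_ms = None if timeout_s is None else int(timeout_s * 1000)
    return bool(poller.poll(timeout_ms))
```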

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agobuild_chroot: hard-code the version of blaze-builder
Petr Pudlak [Thu, 28 Nov 2013 10:46:38 +0000 (11:46 +0100)]
build_chroot: hard-code the version of blaze-builder

The newest version does not build on Debian squeeze, so avoid
it being pulled in as a dependency.

This is the same issue that has been fixed in [1e078ef3] on master.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

10 years agoFix error printing
Jose A. Lopes [Fri, 22 Nov 2013 13:44:02 +0000 (14:44 +0100)]
Fix error printing

Fixes issue 616.

Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoAllow link local IPv6 gateways
Thomas Thrainer [Mon, 25 Nov 2013 14:48:41 +0000 (15:48 +0100)]
Allow link local IPv6 gateways

Each host using IPv6 always has a link local address in fe80::/10. It is
common to use fe80::1 as default gateway to ease client configuration.
Ganeti prevented this usage, because it made sure that the IPv6 gateway
is in the IPv6 network the instance is connected to.

This patch allows an IPv6 gateway to be in the link-local network as
well as in the network the instance is connected to.
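
The relaxed check can be sketched with Python's standard ipaddress
module. This is an illustration under assumed names, not the validation
code changed by the patch.

```python
import ipaddress

# Link-local unicast range, as stated in the message above.
LINK_LOCAL = ipaddress.ip_network("fe80::/10")


def gateway6_allowed(network, gateway):
    """Accept a gateway inside the instance network or in fe80::/10."""
    gw = ipaddress.ip_address(gateway)
    net = ipaddress.ip_network(network)
    return gw in net or gw in LINK_LOCAL
```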

This fixes issue 624.

Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoFix NODE/NODE_RES locking in LUInstanceCreate
Thomas Thrainer [Mon, 25 Nov 2013 10:37:06 +0000 (11:37 +0100)]
Fix NODE/NODE_RES locking in LUInstanceCreate

Both NODE and NODE_RES locks were acquired opportunistically if so
requested by the user. LUInstanceCreate requires, however, that the
actually locked elements on NODE and NODE_RES level are the same.

This patch changes the locking of NODE_RES such that those locks are not
acquired opportunistically any more. Instead, the mandatory locks are
set to the acquired NODE locks once they are actually granted.

This fixes issue 622.

Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

10 years agoeta-reduce isIpV6
Klaus Aehlig [Tue, 26 Nov 2013 19:45:39 +0000 (20:45 +0100)]
eta-reduce isIpV6

This is not only better style, but also fixes a lint error.
Also use the infix form of `elem` to increase readability.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoGaneti.Rpc: use brackets for ipv6 addresses
Guido Trotter [Tue, 26 Nov 2013 15:27:17 +0000 (16:27 +0100)]
Ganeti.Rpc: use brackets for ipv6 addresses

We detect an IPv6 vs. IPv4 address based on colons, rather than passing
the family from the cluster object, to be more future-proof (in case
we ever support mixed clusters).
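
The detection rule can be sketched as below. The change itself is in the
Haskell module Ganeti.Rpc; this Python version, including the function
name, is only an assumed illustration of the same idea.

```python
def format_host_port(address, port):
    """Bracket the address if it contains a colon, then append the port."""
    # A colon can only appear in an IPv6 literal, never in dotted IPv4.
    host = "[%s]" % address if ":" in address else address
    return "%s:%d" % (host, port)
```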

Unfortunately quite a bit more code is required to test this: we need an
arbitrary node that does the right thing w.r.t. ip addresses and also
test-only exports. As such we'll do this out of the stable branch.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoKVM: use custom KVM path if set for version checking
Santi Raffa [Tue, 26 Nov 2013 11:19:26 +0000 (12:19 +0100)]
KVM: use custom KVM path if set for version checking

This commit fixes two TODOs from 2008 about using the hardcoded
"default" path for KVM where a custom one could've been set through
`gnt-cluster modify`.

As a result, `gnt-cluster verify` will no longer fail if a custom
path was set in such a manner.

Signed-off-by: Santi Raffa <rsanti@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

10 years agoUpdate NEWS file with socket permission fix info
Hrvoje Ribicic [Fri, 15 Nov 2013 16:39:56 +0000 (16:39 +0000)]
Update NEWS file with socket permission fix info

The NEWS file now contains a 2.8.3 entry, describing the fix of the
previous patch.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoFix socket permissions after master-failover
Hrvoje Ribicic [Fri, 15 Nov 2013 10:44:04 +0000 (10:44 +0000)]
Fix socket permissions after master-failover

When using gnt-cluster master-failover, on the soon-to-be-master the
luxi daemon is started by the node daemon. This makes the luxi
daemon inherit the node daemon's umask 077, making the communication
socket unreadable to group members. When using Ganeti with non-root
users, this causes problems, as reported in issue 477.

To fix this, the socket permissions are set explicitly.
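
A minimal sketch of setting the permissions explicitly instead of
relying on the inherited umask. The helper name and the mode 0660 are
assumptions for illustration; the patch may use a different mode.

```python
import os
import socket


def bind_unix_socket(path, mode=0o660):
    """Bind a UNIX socket and force its file mode regardless of umask."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)  # the file is created with mode 0777 & ~umask
    os.chmod(path, mode)  # explicit mode, so a umask of 077 no longer matters
    return sock
```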

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoBump revision for 2.9.1 v2.9.1
Klaus Aehlig [Tue, 12 Nov 2013 15:44:22 +0000 (16:44 +0100)]
Bump revision for 2.9.1

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

10 years agoUpdate NEWS and schedule release for 2.9.1
Klaus Aehlig [Tue, 12 Nov 2013 15:43:46 +0000 (16:43 +0100)]
Update NEWS and schedule release for 2.9.1

Now that issue 608 is fixed, schedule a new release date
for 2.9.1.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

10 years agoFix retrieval of xen command in class method
Helga Velroyen [Tue, 12 Nov 2013 13:45:53 +0000 (14:45 +0100)]
Fix retrieval of xen command in class method

This patch fixes issue 608. When introducing the
configurability of the xen toolstack in commit
8ef418bb92, the hypervisor API was accidentally changed
in a way that led to this error in KVM.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

10 years agoFix docstring for ganeti.storage.filestorage_unittest.py
Santi Raffa [Mon, 11 Nov 2013 17:44:29 +0000 (18:44 +0100)]
Fix docstring for ganeti.storage.filestorage_unittest.py

Signed-off-by: Santi Raffa <rsanti@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

10 years agoUndo revision bump
Klaus Aehlig [Fri, 8 Nov 2013 16:06:09 +0000 (17:06 +0100)]
Undo revision bump

Before releasing 2.9.1, we still have issue 608 to fix; if
no release date is set, we still have to be at the lower
version.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoPostpone release of 2.9.1
Klaus Aehlig [Fri, 8 Nov 2013 15:08:55 +0000 (16:08 +0100)]
Postpone release of 2.9.1

...until issue 608 is fixed.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoRevision bump for 2.9.1
Klaus Aehlig [Thu, 7 Nov 2013 15:21:46 +0000 (16:21 +0100)]
Revision bump for 2.9.1

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoUpdate NEWS for 2.9.1 release
Klaus Aehlig [Thu, 7 Nov 2013 15:20:23 +0000 (16:20 +0100)]
Update NEWS for 2.9.1 release

Add a section in the file for the new upcoming release. Besides
the fix of the DRBD race condition inherited from 2.8.2, this
also fixes handling and readding of offline nodes.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoReadd nodes as online
Klaus Aehlig [Thu, 7 Nov 2013 14:19:22 +0000 (15:19 +0100)]
Readd nodes as online

Patch d0d7d7cf accidentally removed the offline-flag reset
when readding a node. Readd it.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

10 years agoWhen verifying DRBD version, ignore missing values
Klaus Aehlig [Thu, 7 Nov 2013 13:42:10 +0000 (14:42 +0100)]
When verifying DRBD version, ignore missing values

When checking the DRBD versions for consistency, some
versions might not be available via RPC, typically if the
node is offline. In this case, leave these nodes out of the
test, instead of failing with an internal Python error.
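
The idea can be sketched as follows, assuming the per-node results
arrive as a mapping whose value is None when the version could not be
retrieved. Names and data shape are illustrative assumptions.

```python
def drbd_versions_consistent(versions_by_node):
    """Return True if all nodes that reported a version agree on it."""
    # Nodes without a value (e.g. offline) are left out of the comparison.
    known = set(v for v in versions_by_node.values() if v is not None)
    return len(known) <= 1
```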

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

10 years agoMerge branch 'stable-2.8' into stable-2.9
Klaus Aehlig [Thu, 7 Nov 2013 12:37:08 +0000 (13:37 +0100)]
Merge branch 'stable-2.8' into stable-2.9

* stable-2.8
  Version bump for 2.8.2
  Update NEWS file for 2.8.2 release
  DRBD: ensure peers are UpToDate for dual-primary

Conflicts:
NEWS: trivial
configure.ac: ignore version bump on stable-2.8
lib/bdev.py: manually apply the part of commit
            73e15b5e that applies to lib/bdev.py to
            lib/storage/drbd_info.py, and keep lib/bdev.py
            removed

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoVersion bump for 2.8.2 v2.8.2
Michele Tartara [Wed, 6 Nov 2013 12:26:24 +0000 (12:26 +0000)]
Version bump for 2.8.2

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

10 years agoUpdate NEWS file for 2.8.2 release
Michele Tartara [Wed, 6 Nov 2013 12:25:16 +0000 (12:25 +0000)]
Update NEWS file for 2.8.2 release

Add a section in the file for the new upcoming release.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

10 years agoDRBD: ensure peers are UpToDate for dual-primary
Apollon Oikonomopoulos [Tue, 5 Nov 2013 14:30:45 +0000 (16:30 +0200)]
DRBD: ensure peers are UpToDate for dual-primary

DrbdAttachNet supports both normal primary/secondary node operation and
(during live migration) dual-primary operation. When resources are newly
attached, we poll until we find all of them in connected or syncing state.

Although aggressive, this is enough for primary/secondary operation, because
the primary/secondary role is not changed from within DrbdAttachNet. However,
in the dual-primary ("multimaster") case, both peers are subsequently upgraded
to the primary role.  If - for unspecified reasons - both disks are not
UpToDate, then a resync may be triggered after both peers have switched to
primary, causing the resource to disconnect:

  kernel: [1465514.164009] block drbd2: I shall become SyncTarget, but I am
    primary!
  kernel: [1465514.171562] block drbd2: ASSERT( os.conn == C_WF_REPORT_PARAMS )
    in /build/linux-rrsxby/linux-3.2.51/drivers/block/drbd/drbd_receiver.c:3245

This seems to be extremely racy and is possibly triggered by some underlying
network issues (e.g. high latency), but it has been observed in the wild. By
logging the DRBD resource state in the old secondary, we managed to see a
resource getting promoted to primary while it was:

  WFSyncUUID Secondary/Primary Outdated/UpToDate

We fix this by explicitly waiting for "Connected" cstate and
"UpToDate/UpToDate" disks, as advised in [1]:

  "For this purpose and scenario,
   you only want to promote once you are Connected UpToDate/UpToDate."

[1] http://lists.linbit.com/pipermail/drbd-user/2013-July/020173.html
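
A minimal sketch of the stricter wait condition, assuming a caller-supplied
function that returns the connection state and the local and peer disk
states. All names, the timeout, and the polling interval are hypothetical.

```python
import time


def ready_for_dual_primary(cstate, local_disk, peer_disk):
    """True only for "Connected" with "UpToDate/UpToDate" disks."""
    return (cstate == "Connected"
            and local_disk == "UpToDate"
            and peer_disk == "UpToDate")


def wait_until_ready(get_status, timeout_s=60.0, interval_s=1.0,
                     sleep=time.sleep):
    """Poll get_status() until promotion is safe; False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if ready_for_dual_primary(*get_status()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

A state such as "WFSyncUUID Secondary/Primary Outdated/UpToDate" fails
this check, so promotion to primary is deferred until the resync is done.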

Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com>
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

10 years agoRevision bump for 2.9.0 v2.9.0
Klaus Aehlig [Mon, 4 Nov 2013 09:40:13 +0000 (10:40 +0100)]
Revision bump for 2.9.0

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoSchedule release of 2.9.0
Klaus Aehlig [Mon, 4 Nov 2013 09:39:36 +0000 (10:39 +0100)]
Schedule release of 2.9.0

...and mention the last change pulled in from stable-2.8.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoMerge branch 'stable-2.8' into stable-2.9
Klaus Aehlig [Mon, 4 Nov 2013 15:36:02 +0000 (16:36 +0100)]
Merge branch 'stable-2.8' into stable-2.9

* stable-2.8
  Improve error message for replace-disks

Conflicts:
lib/cmdlib/instance_storage.py
Resolved by manually applying the node name to uuid
transition on the version of stable-2.9.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoImprove error message for replace-disks
Michele Tartara [Mon, 4 Nov 2013 15:20:07 +0000 (15:20 +0000)]
Improve error message for replace-disks

In some conditions, replace-disks will fail if the disks are not properly
activated. Improve the error message to suggest running activate-disks before
executing replace-disks.

Fixes Issue 606.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

10 years agoMerge branch 'stable-2.8' into stable-2.9
Klaus Aehlig [Wed, 30 Oct 2013 12:33:58 +0000 (13:33 +0100)]
Merge branch 'stable-2.8' into stable-2.9

* stable-2.8
  Add all dependencies for confd as test dependencies
  Add snap-server to the test-relevant packages
  Placate warnings on ganeti.outils_unittest.py

Conflicts:
configure.ac: take both additions (and fix)

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoAdd all dependencies for confd as test dependencies
Klaus Aehlig [Wed, 30 Oct 2013 10:13:48 +0000 (11:13 +0100)]
Add all dependencies for confd as test dependencies

Since our tests pull in confd as a dependency, all build dependencies
for confd are also necessary to run the tests.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>

10 years agoAdd snap-server to the test-relevant packages
Klaus Aehlig [Tue, 29 Oct 2013 15:09:14 +0000 (16:09 +0100)]
Add snap-server to the test-relevant packages

While snap-server is only needed for the optional monitoring daemon,
some tests, notably those testing these optional features, still depend
on it. So, if snap-server is missing, the Haskell tests should not be
run, as they cannot even be built.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>