Bernardo Dal Seno [Tue, 30 Apr 2013 14:07:17 +0000 (16:07 +0200)]
cfgupgrade: Downgrade is a NO-OP
The configuration is still the same as in 2.8 (the reference stable version
for this branch), so downgrade shouldn't do anything.
Unit tests are also updated, with a new 2.8 configuration file. The
configuration file used for the upgrade+downgrade test was tailored to the
2.7 downgrade, and it's not needed any more.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Michele Tartara [Fri, 17 May 2013 09:42:47 +0000 (10:42 +0100)]
Design doc for internal shutdown detection
Ganeti is currently not able to detect a legit shutdown request performed by a
user from inside a Xen domain.
This patch provides a design document to implement a mechanism able to cope with
such events.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Wed, 22 May 2013 12:19:17 +0000 (14:19 +0200)]
Document recent hroller changes in the NEWS file
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Wed, 22 May 2013 12:57:16 +0000 (14:57 +0200)]
Document hroller options recently added
hroller now also supports the options --skip-non-redundant and
--ignore-non-redundant, and this should be documented in the
man page as well.
While there, also use the same order in the options section
as in the synopsis, and in the synopsis group the algorithms
into
- those that modify the set of nodes to be scheduled, and
- those that modify the constraints to be taken into account.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Wed, 22 May 2013 10:49:17 +0000 (12:49 +0200)]
Extend hroller tests by options for non-redundant instances
The cluster now consists of 3 nodes, with drbd instances between
nodes 1 and 2, and 2 and 3. Additionally, nodes 1 and 3 each contain
a non-redundant instance, but node 2 cannot hold two additional
instances.
So,
- if we take non-redundant instances into account (the new default
behavior), the nodes have to be rebooted individually,
- if we ignore non-redundant instances, nodes 1 and 3 can be rebooted
simultaneously, and
- if we skip nodes with non-redundant instances, only a single node
remains (of course, forming a single reboot group).
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Tue, 14 May 2013 09:45:22 +0000 (11:45 +0200)]
Test for hroller taking non-redundant instances into account
The example cluster consists of 6 nodes, each hosting 2 instances and
having capacity for 3. So, while the drbd-induced graph consists of
only isolated nodes, no more than two nodes can be rebooted at the
same time.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
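The bound of two nodes in the test above follows from a simple capacity
count. The sketch below is not hroller code; the helper name is hypothetical,
and it assumes identical nodes, one slot per instance, and that an evacuated
instance may be placed on any node that stays up.

```python
def max_simultaneous_reboots(nodes, hosted, capacity):
    """Largest k such that the instances of k nodes fit on the other nodes.

    Assumption: every node hosts `hosted` instances and has room for
    `capacity`, so each node that stays up offers capacity - hosted slots.
    """
    free_per_node = capacity - hosted
    best = 0
    for k in range(nodes + 1):
        to_move = k * hosted                      # instances to evacuate
        available = (nodes - k) * free_per_node   # slots on remaining nodes
        if to_move <= available:
            best = k
    return best


# The cluster from the test: 6 nodes, 2 instances each, capacity 3.
print(max_simultaneous_reboots(nodes=6, hosted=2, capacity=3))
```

With k nodes down, 2k instances must fit into the 6 - k free slots left, which
holds for k = 2 (4 <= 4) but not for k = 3 (6 > 3).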
Klaus Aehlig [Thu, 16 May 2013 17:36:47 +0000 (19:36 +0200)]
hroller: option to ignore non-redundant instances
Add an option to hroller restoring the old behavior of not taking
any non-redundant instances into account when forming reboot
groups.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Tue, 14 May 2013 09:34:03 +0000 (11:34 +0200)]
Make hroller also plan for non-redundant instances
Non-redundant instances need to be moved to a different node
before maintenance of the node. Even though they can be moved to
any node, there must be enough capacity to host the instances of the
reboot group to be evacuated.
This is achieved by greedily moving the non-redundant instances
to other nodes, till we run out of capacity. In this way we
refine the groups obtained by coloring the drbd-induced graph.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
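The greedy refinement described above can be sketched roughly as follows.
This is an illustration only, not the hroller implementation: the function
names are hypothetical, and capacity is reduced to a count of free instance
slots instead of the memory and disk checks a real placement would need.

```python
def _fits(rebooting, non_redundant, free_slots):
    """True if the non-redundant instances of `rebooting` fit elsewhere."""
    needed = sum(non_redundant.get(node, 0) for node in rebooting)
    spare = sum(slots for node, slots in free_slots.items()
                if node not in rebooting)
    return needed <= spare


def refine_group(group, non_redundant, free_slots):
    """Split one colour class into reboot groups that can be evacuated.

    group:         nodes of one colour of the drbd-induced graph
    non_redundant: node -> number of non-redundant instances (assumed input)
    free_slots:    node -> spare instance slots, for every cluster node
    """
    subgroups = []
    current = []
    for node in group:
        candidate = current + [node]
        if _fits(candidate, non_redundant, free_slots):
            current = candidate          # still enough capacity: keep growing
        else:
            if current:
                subgroups.append(current)
            current = [node]             # out of capacity: start a new group
    if current:
        subgroups.append(current)
    return subgroups


nodes = ["n1", "n2", "n3", "n4", "n5", "n6"]
groups = refine_group(nodes,
                      non_redundant={n: 2 for n in nodes},
                      free_slots={n: 1 for n in nodes})
print(groups)
```

Note that in this sketch a node that cannot be evacuated at all still ends up
in a group of its own; how hroller treats that case is not stated in the
commit message.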
Klaus Aehlig [Thu, 16 May 2013 15:12:33 +0000 (17:12 +0200)]
hroller: option to skip nodes with non-redundant instances
So far, hroller ignores the fact that non-redundant instances exist.
One option to deal with non-redundant instances is to not schedule those
nodes for reboot. This is supported by adding the option --skip-non-redundant.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Thu, 16 May 2013 10:48:55 +0000 (12:48 +0200)]
Remove trailing whitespace
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Thomas Thrainer [Tue, 21 May 2013 09:53:06 +0000 (11:53 +0200)]
Improve installation documentation
Based on user feedback the installation documentation is clarified and
extended.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Helga Velroyen [Mon, 6 May 2013 20:10:21 +0000 (22:10 +0200)]
RPC 'node_info': <storage_type,key> instead of vg_names
This replaces the field 'vg_names' in the RPC call of 'node info' by
'storage_units'. A storage unit is a tuple <storage_type,key>
and a generalization of a vg_name. The list of vg names is replaced by
a list of storage units. The modified RPC call will be used to report
storage space for more than just lvm volume groups. What the 'key' is
depends on the storage type. For storage type lvm-vg, the key is the
volume group name. To keep backward compatibility, all functions that
use the old vg_names convert them to a list where every volume group
is mapped to a tuple [('lvm-vg',volume_group)] before making the call.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
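The backward-compatible conversion described above amounts to a one-line
mapping. The sketch below is illustrative: only the 'lvm-vg' storage type
string comes from the commit message, while the constant and function names
are hypothetical.

```python
# Storage type for LVM volume groups, as named in the commit message.
ST_LVM_VG = "lvm-vg"


def vg_names_to_storage_units(vg_names):
    """Map each volume group name to a <storage_type, key> tuple."""
    return [(ST_LVM_VG, vg_name) for vg_name in vg_names]


print(vg_names_to_storage_units(["xenvg"]))
```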
Bernardo Dal Seno [Fri, 17 May 2013 13:18:13 +0000 (15:18 +0200)]
Merge branch 'stable-2.8' into master
* stable-2.8: (45 commits)
Update NEWS with disk creation fixes
Sort cmdlib-related entries in Makefile.am
cmdlib: Cleanup public/private functions
cmdlib: Extract instance query related functionality
cmdlib: Extract instance operation functionality
cmdlib: Extract migration related functionality
cmdlib: Extract storage related functionality
Reformat and define exports in cmdlib/__init__.py
Extract miscellaneous logical units from cmdlib
Extract os related logical units from cmdlib
Extract query related logical units from cmdlib
Extract backup related logical units from cmdlib
Extract instance related logical units from cmdlib
Extract node related logical units from cmdlib
Extract group related logical units from cmdlib
Extract cluster related logical units from cmdlib
Extract test logical units from cmdlib
Extract network related logical units from cmdlib
Extract tags related logical units from cmdlib
Extract base classes from cmdlib
...
Conflicts:
devel/build_chroot
lib/cmdlib.py
devel/build_chroot is straightforward: one side has added versions, the
other has added one library. lib/cmdlib.py has been split in many files in
stable-2.8, so I've semi-manually applied the changes from master.
This merge also fixes a problem with merge f2d87a5e5, which partially
reverted changes from 912737ba by mistake.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Bernardo Dal Seno [Fri, 17 May 2013 11:12:01 +0000 (13:12 +0200)]
Merge branch 'stable-2.7' into stable-2.8
* stable-2.7:
Update NEWS with disk creation fixes
Don't fail to deactivate master IP if already down
Add QA for recreating single instance disks
Add QA for gnt-instance modify --disk
Clean up when "gnt-instance modify" fails to create a disk
recreate-disks honors the prealloc_wipe_disks flag
Introduce wrapper for cmdlib._WipeDisks()
Don't catch an exception that cannot be raised
Wipe disks added through "gnt-instance modify"
Support /var/run being a symlink in upload
Final NEWS and configure.ac update for 2.7.0~rc1
gnt-job list: deal with non-ascii encoding in jobs
Conflicts:
NEWS
lib/cmdlib.py
qa/ganeti-qa.py
qa/qa-sample.json
NEWS, qa/ganeti-qa.py and qa/qa-sample.json had trivial conflicts. But I've
updated QA changes to use the new interfaces. lib/cmdlib.py was renamed and
split, so I had to semi-manually apply the changes to the new files. I had
to change the names of some functions by removing or adding the initial
underscore and update the imported names.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Bernardo Dal Seno [Fri, 17 May 2013 00:40:21 +0000 (02:40 +0200)]
Update NEWS with disk creation fixes
Also document a couple more fixes.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Thomas Thrainer [Fri, 17 May 2013 09:21:30 +0000 (11:21 +0200)]
Sort cmdlib-related entries in Makefile.am
Files in the cmdlib directory are sorted alphabetically in
Makefile.am.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 12:30:08 +0000 (14:30 +0200)]
cmdlib: Cleanup public/private functions
All functions/classes which are used outside of their defining module
(with tests as an exception) no longer have a leading underscore.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 12:02:29 +0000 (14:02 +0200)]
cmdlib: Extract instance query related functionality
Split instance.py further by extracting instance querying related
logical units and functions to instance_query.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 11:52:28 +0000 (13:52 +0200)]
cmdlib: Extract instance operation functionality
Split instance.py further by extracting instance operations
(start/stop/reboot/etc.) related logical units and functions to
instance_operation.py.
The extracted operations have in common that they affect the operating
system in a running instance directly.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 11:38:23 +0000 (13:38 +0200)]
cmdlib: Extract migration related functionality
Split instance.py further by extracting migration related logical units
and functions to instance_migration.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Thu, 16 May 2013 07:13:48 +0000 (09:13 +0200)]
cmdlib: Extract storage related functionality
Split instance.py further by extracting storage related logical units
and functions to instance_storage.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Thu, 16 May 2013 07:12:59 +0000 (09:12 +0200)]
Reformat and define exports in cmdlib/__init__.py
cmdlib/__init__.py now simply defines the interface of the cmdlib module
by importing all classes which should be visible to clients.
Also don't ignore C0302 (Too many lines in module) any more.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 08:24:50 +0000 (10:24 +0200)]
Extract miscellaneous logical units from cmdlib
All remaining classes in __init__.py are extracted to misc.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 08:17:04 +0000 (10:17 +0200)]
Extract os related logical units from cmdlib
All LUOs* classes are extracted to operating_system.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 08:07:29 +0000 (10:07 +0200)]
Extract query related logical units from cmdlib
All LUQuery* classes are extracted to query.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 07:55:31 +0000 (09:55 +0200)]
Extract backup related logical units from cmdlib
All LUBackup* classes are extracted to backup.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 07:19:16 +0000 (09:19 +0200)]
Extract instance related logical units from cmdlib
All LUInstance* classes are extracted to instance.py. Common functions
are moved to common.py if used by non-instance logical units as well.
Additionally, helper functions which are only used by LUBackup* and
LUInstance* are moved to instance_utils.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Mon, 13 May 2013 13:16:27 +0000 (15:16 +0200)]
Extract node related logical units from cmdlib
All LUNode* classes are extracted to node.py. Common functions are moved
to common.py if used by non-node logical units as well.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Mon, 13 May 2013 12:54:49 +0000 (14:54 +0200)]
Extract group related logical units from cmdlib
All LUGroup* classes are moved to group.py. Common functions are
extracted to common.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Mon, 13 May 2013 11:49:33 +0000 (13:49 +0200)]
Extract cluster related logical units from cmdlib
All LUCluster* classes are extracted to cluster.py. Shared functions are
extracted to common.py, helper functions only used by LUCluster* are
extracted to cluster.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Mon, 13 May 2013 10:17:01 +0000 (12:17 +0200)]
Extract test logical units from cmdlib
LUTest* are moved to test.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Mon, 13 May 2013 09:38:08 +0000 (11:38 +0200)]
Extract network related logical units from cmdlib
LUNetwork* and associated helper functions are extracted to network.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Mon, 13 May 2013 09:16:43 +0000 (11:16 +0200)]
Extract tags related logical units from cmdlib
LUTags* and their base class, TagsLU, are extracted to tags.py. An
additional shared function, _ShareAll, is extracted to common.py for
shared usage.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Mon, 13 May 2013 08:48:48 +0000 (10:48 +0200)]
Extract base classes from cmdlib
Base classes holding common functionality are extracted into base.py.
Utility functions used by both base classes and subclasses are moved to
common.py.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Thomas Thrainer [Tue, 14 May 2013 14:21:25 +0000 (16:21 +0200)]
Don't fail to deactivate master IP if already down
The master IP setup script now checks if the master IP is actually
configured on the machine before trying to remove the IP.
This fixes issue 460.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Helga Velroyen [Wed, 15 May 2013 10:46:24 +0000 (12:46 +0200)]
Compatibility test for instances
This patch introduces a test to check the compatibility
of the Haskell and the Python representation of instances.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 15 May 2013 11:46:46 +0000 (13:46 +0200)]
Instance generators
This patch introduces and enhances generators for
instances:
- 'genInstWithNets' is split into the generation of an
arbitrary instance and enhancing an instance with nets
- 'genInst' calls 'genInstWithNets' with an empty set
of initial networks to provide a reasonable default
- the Arbitrary instance of 'Instance' now uses 'genDisks'
to create instances with a reasonable set of disks
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 15 May 2013 13:01:28 +0000 (15:01 +0200)]
Annotate every arbitrary instance field
The Arbitrary instance of the 'Instance' object is written
using the <*> syntax. Since it often uses the 'arbitrary'
generator for the instance's fields it is hard to figure
out which 'arbitrary' fills which instance field. This
patch annotates all fields with their name to make
maintenance of this code easier.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 15 May 2013 12:52:47 +0000 (14:52 +0200)]
Generators for disks
This patch adds generators for Disk instances to the Haskell
test code. It uses somewhat more reasonable generators to
fill the fields instead of just arbitrary values.
'genDiskWithChildren' is a generator that generates a disk
with a specified number of disk children. To avoid shooting
ourselves in the foot we do not generate further (grand)
child disks for the child disks. 'genDisk' calls
'genDiskWithChildren' by requesting three children as a
reasonable default.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 15 May 2013 09:45:40 +0000 (11:45 +0200)]
Use os.statvfs to determine free disk space
This simplifies my previous commit (820bade90) by using os.statvfs
instead of parsing the output of 'df'.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
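For reference, querying disk space through os.statvfs looks roughly like the
sketch below. It is not the code from the commit: the function name and the
choice of MiB as the unit are assumptions, and it counts the blocks available
to unprivileged users (f_bavail) as free.

```python
import os


def get_space_info(path):
    """Return (total_mib, free_mib) for the filesystem containing path."""
    st = os.statvfs(path)
    total_bytes = st.f_frsize * st.f_blocks   # size of the filesystem
    free_bytes = st.f_frsize * st.f_bavail    # space usable by non-root
    mib = 1024 * 1024
    return total_bytes // mib, free_bytes // mib


total, free = get_space_info("/")
print("total: %d MiB, free: %d MiB" % (total, free))
```

Compared with parsing 'df', this avoids spawning a process and does not depend
on the tool's output format.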
Helga Velroyen [Tue, 14 May 2013 15:59:12 +0000 (17:59 +0200)]
Reorder unit tests in Makefile.am
In a previous commit, I accidentally changed the order of
unit tests in Makefile.am so that it was no longer alphabetical.
This fixes it.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Helga Velroyen [Tue, 14 May 2013 13:23:32 +0000 (15:23 +0200)]
Backend function for file storage space reporting
This adds functionality to retrieve disk space information
for file storage. It calls the 'df' tool and parses its
output to extract the total and free amount of disk space
on the disk where the given path is located.
The code is not integrated yet, but thoroughly unit-tested.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Bernardo dal Seno <bdalseno@google.com>
Michele Tartara [Tue, 14 May 2013 17:26:07 +0000 (18:26 +0100)]
Remove extra newline
Also, properly set the date of the last modification.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Michele Tartara [Tue, 14 May 2013 11:04:33 +0000 (12:04 +0100)]
Make HS ConfD client IPv6 compatible
The Haskell ConfD client was assuming internet addresses to be IPv4. This
patch modifies the client so that it is able to automatically detect the
protocol it should use by analyzing the address it is told to connect to.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
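The commit above concerns the Haskell client; the same idea of choosing the
protocol from the textual form of the address can be sketched in Python as
follows. The function name is hypothetical, and the sketch only handles
literal addresses, not host names.

```python
import socket


def detect_family(address):
    """Return socket.AF_INET or socket.AF_INET6 for a literal IP address."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            # inet_pton raises OSError if the text is not valid for `family`.
            socket.inet_pton(family, address)
            return family
        except OSError:
            continue
    raise ValueError("not a valid IPv4 or IPv6 address: %r" % (address,))


print(detect_family("127.0.0.1") == socket.AF_INET)
print(detect_family("::1") == socket.AF_INET6)
```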
Michele Tartara [Tue, 14 May 2013 15:52:06 +0000 (16:52 +0100)]
Factor out resolveAddr function
This function can be useful to many parts of the code to convert the string
representation of an IP (v4 or v6) address into the proper data type.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Michele Tartara [Mon, 13 May 2013 14:05:08 +0000 (14:05 +0000)]
Add MonD to the watcher
The monitoring daemon should always be alive, therefore it's added to the
watcher.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michele Tartara [Mon, 13 May 2013 14:00:17 +0000 (14:00 +0000)]
Start the monitoring and node daemons together
Add the monitoring daemon to the command starting the node daemon, given that
they both have to be started on all nodes.
Note that daemon-util only supports starting one daemon at a time, so the
actual command has to be composed as a sequence of two different daemon-util
invocations.
Also, the monitoring daemon invocation is conditional, depending on whether it
was enabled at configure time.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
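How such a command sequence might be composed is sketched below. This is an
illustration, not the code from the commit: the function name, the "start"
subcommand spelling and the daemon names "ganeti-noded" and "ganeti-mond" are
assumptions made for the example.

```python
def node_daemon_start_commands(daemon_util, monitoring_enabled):
    """Return one daemon-util invocation per daemon to start.

    daemon_util:        path to the daemon-util script (caller-supplied)
    monitoring_enabled: whether monitoring was enabled at configure time
    """
    # Assumed daemon names; one invocation per daemon, since only one
    # daemon can be started per call.
    commands = [[daemon_util, "start", "ganeti-noded"]]
    if monitoring_enabled:
        commands.append([daemon_util, "start", "ganeti-mond"])
    return commands


cmds = node_daemon_start_commands("daemon-util", monitoring_enabled=True)
print(" && ".join(" ".join(cmd) for cmd in cmds))
```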
Michele Tartara [Mon, 13 May 2013 13:57:46 +0000 (13:57 +0000)]
Add a constant stating whether monitoring is enabled
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Bernardo Dal Seno [Fri, 10 May 2013 23:37:45 +0000 (01:37 +0200)]
Add QA for recreating single instance disks
So far QA only recreated the whole set of disks at once.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Bernardo Dal Seno [Fri, 10 May 2013 23:23:08 +0000 (01:23 +0200)]
Add QA for gnt-instance modify --disk
Just a very basic test that adds and then removes a disk.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Bernardo Dal Seno [Fri, 10 May 2013 14:27:20 +0000 (16:27 +0200)]
Clean up when "gnt-instance modify" fails to create a disk
cmdlib.LUInstanceSetParams now uses helper functions to create and wipe
disks, so that when the creation of a disk fails, any leftover device is
cleaned up. As a bonus, exceptions raised by _CreateBlockDev() are caught
correctly.
Now cmdlib._CreateDisks() is used every time there are disks to create.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
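The create-then-clean-up pattern described above can be sketched as follows.
This is a generic illustration with hypothetical names, not the cmdlib code:
create_one and remove_one stand in for whatever creates and removes a single
disk, and remove_one is assumed to tolerate a disk that was only partially
created.

```python
import logging


def create_disks(disks, create_one, remove_one):
    """Create all disks; on failure remove whatever was left behind."""
    attempted = []
    try:
        for disk in disks:
            # Record the disk first, so a half-created device is cleaned too.
            attempted.append(disk)
            create_one(disk)
    except Exception:
        for disk in reversed(attempted):
            try:
                remove_one(disk)
            except Exception as err:
                # Keep cleaning up the other disks; just report this one.
                logging.warning("Could not remove disk %s: %s", disk, err)
        raise
    return attempted


removed = []

def _failing_create(disk):
    if disk == "c":
        raise RuntimeError("cannot create %s" % disk)

try:
    create_disks(["a", "b", "c"], _failing_create, removed.append)
except RuntimeError:
    print("cleaned up:", removed)
```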
Bernardo Dal Seno [Fri, 10 May 2013 13:05:56 +0000 (15:05 +0200)]
recreate-disks honors the prealloc_wipe_disks flag
Now even recreate-disks wipes the newly-created disks, if the flag is set.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Bernardo Dal Seno [Fri, 10 May 2013 13:44:01 +0000 (15:44 +0200)]
Introduce wrapper for cmdlib._WipeDisks()
The wrapper handles errors by logging them and cleaning up freshly-created
disks.
Also, the correct disk is used in the error message when an error happens
in cmdlib._CreateDisks() and the resulting disk clean-up fails.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Bernardo Dal Seno [Thu, 9 May 2013 17:07:34 +0000 (19:07 +0200)]
Don't catch an exception that cannot be raised
Since 9b221ea4, _CreateBlockDev() doesn't raise OpExecError any more. Yet
some code was left in place to catch it. By removing that code we have two
advantages:
1. Dead code is removed.
2. If for whatever reason _CreateBlockDev() raises OpExecError, the
exception is not silently dropped and we notice (so we can fix it).
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 25 Feb 2013 15:27:27 +0000 (16:27 +0100)]
Wipe disks added through "gnt-instance modify"
In issue 353 Sascha Lucas reported that disks are not wiped when added
through “gnt-instance modify”. This patch adds this functionality and
updates the docstring for “_WipeDisks”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit 965e0e6a88e09f96d4c9b6030ab8753366c84a78)
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Helga Velroyen [Tue, 14 May 2013 12:31:04 +0000 (14:31 +0200)]
Move 'container.py' to storage directory
Moving 'container.py' to the storage directory.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Helga Velroyen [Tue, 14 May 2013 11:33:28 +0000 (13:33 +0200)]
Rename dir 'block' to 'storage'
Renaming the 'block' directory to 'storage', because I plan to
place code there that is related to file storage and leaving
it named 'block' would be misleading.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Helga Velroyen [Tue, 14 May 2013 11:04:53 +0000 (13:04 +0200)]
Rename storage.py to container.py
Renaming 'storage.py' to 'container.py' to avoid a name clash with the new
'storage' directory, which will come in later patches and into which the
file will be moved.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Michele Tartara [Tue, 14 May 2013 13:07:18 +0000 (14:07 +0100)]
Monitoring QA: Remove superfluous import
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Michele Tartara [Mon, 13 May 2013 18:10:11 +0000 (19:10 +0100)]
Non-Xen support for monitoring QA
The QA tests the Xen instance status collector, but that is expected to fail
when run on machines that do not use Xen.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michele Tartara [Fri, 10 May 2013 10:07:07 +0000 (10:07 +0000)]
Add QA for instance status collector
This commit introduces the QA for the instance status collector.
Being the first QA for a monitoring-related component, the files and some
functions are named after monitoring because they are meant to contain
future monitoring QAs as well.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Fri, 10 May 2013 09:10:32 +0000 (09:10 +0000)]
QA: factor out some instance management functions
Some functions for managing instances will have to be used by new upcoming
unit tests, so they are taken out of the instances QA file and put in a new
utilities file accessible by other QA files as well.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Wed, 8 May 2013 13:00:21 +0000 (15:00 +0200)]
Add inst-status-xen to the monitoring daemon
Enable the monitoring daemon to invoke the Xen instance status data collector.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Wed, 8 May 2013 12:51:33 +0000 (14:51 +0200)]
Run the monitoring daemon as root
The monitoring daemon needs to be able to run some commands that require root
access (such as "xm") in order to fulfill its duties.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Tue, 7 May 2013 16:37:05 +0000 (16:37 +0000)]
Export the Instance Status collector report
It will need to be accessed by the monitoring daemon.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Mon, 28 Jan 2013 13:36:53 +0000 (14:36 +0100)]
Add instance status collector to mon-collector man page
Add a section related to the new collector.
Also, fix some formatting issues (white spaces, lines longer than 80 chars)
in the DRBD collector section.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Tue, 30 Apr 2013 12:17:54 +0000 (12:17 +0000)]
Add global status field to the instance status collector
The global status is computed from the statuses of the single instances.
The output json format is adapted to include this piece of information, as
prescribed by the design document.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
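One way to derive a global status from the per-instance statuses is sketched
below. It is not the collector's code: the function name, the (code, message)
representation and the rule that a larger numeric code means a worse status
are assumptions made for illustration.

```python
def merge_statuses(statuses):
    """Combine (code, message) pairs into a single global (code, message).

    Assumption: codes are integers and a higher code is a worse status,
    so the global code is the worst individual one.
    """
    worst = 0
    messages = []
    for code, message in statuses:
        worst = max(worst, code)
        if message:
            messages.append(message)
    return worst, "; ".join(messages)


print(merge_statuses([(0, ""), (2, "inst1: unknown state"), (0, "")]))
```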
Michele Tartara [Tue, 30 Apr 2013 12:14:30 +0000 (12:14 +0000)]
Factor out the mergeStatuses function
It will be used by multiple data collectors, not only the DRBD collector.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Tue, 30 Apr 2013 12:23:48 +0000 (12:23 +0000)]
Monitoring design doc: better specify field names
The name of the list of instances was not specified.
Also, fix a line that was longer than 80 characters.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Tue, 30 Apr 2013 11:11:49 +0000 (11:11 +0000)]
Use dcName in mon-collector
Instead of manually specifying the name of the data collectors in mon-collector,
just use the dcName field each of them exports.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Tue, 30 Apr 2013 10:34:57 +0000 (12:34 +0200)]
Factor out function for building report
Instead of building the report as part of the "Main" function, have it
built by its own dedicated function, so that it can be exported
directly to the monitoring daemon when needed.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Fri, 26 Apr 2013 06:50:07 +0000 (06:50 +0000)]
Export Instance Status collector information
Name, version, format version, category and kind of the Instance Status data
collector are now exported.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Thu, 25 Apr 2013 13:54:59 +0000 (13:54 +0000)]
Include the reason trail in the instance collector output
Fetch the reason trail from file, failing gracefully if it is not found, and
include it in the output of the instance status data collector.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
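"Failing gracefully" here means that a missing file yields an empty trail
instead of an error. A minimal Python sketch of that behavior follows; the
function name is hypothetical, and the assumption that the file holds JSON is
made for the example only.

```python
import json


def read_reason_trail(path):
    """Return the reason trail stored at path, or [] if it is unavailable."""
    try:
        with open(path) as trail_file:
            return json.load(trail_file)   # assumed JSON-encoded trail
    except (IOError, ValueError):
        # Missing or unreadable file, or content that is not valid JSON.
        return []


print(read_reason_trail("/nonexistent/reason-trail"))
```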
Michele Tartara [Thu, 25 Apr 2013 13:03:43 +0000 (13:03 +0000)]
Determine status of one instance
Add a function that determines whether the status of an instance is ok and
represents this information in the corresponding field of the report.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Thu, 25 Apr 2013 14:36:10 +0000 (14:36 +0000)]
Export the actual instance state
Compute the actual state of the instance and export it.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Thu, 25 Apr 2013 14:28:58 +0000 (14:28 +0000)]
Add the core of the instance status collector
Add the Xen instance status data collector with only its core features.
The next commits will add more reporting functionalities.
The access to the collector is made possible through the mon-collector
tool.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Thu, 25 Apr 2013 14:20:05 +0000 (14:20 +0000)]
Add module containing function for getting info from Xen
The Xen instance status data collector will need to get some information
from the hypervisor. This commit introduces a module providing such functions.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Michele Tartara [Thu, 25 Apr 2013 08:27:48 +0000 (08:27 +0000)]
Add HS functions for getting the instance reason path
The getInstReasonFilename is built to resemble the corresponding Python
function.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Wed, 24 Apr 2013 14:55:01 +0000 (14:55 +0000)]
Add dependency on the process library
The tests are already using this library, so it's not really a new build
dependency, but it was not specified explicitly.
Furthermore, it's going to be used by the instance status collector, so it's
added to the requirements for the monitoring subsystem.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Fri, 10 May 2013 13:52:39 +0000 (15:52 +0200)]
Add example for online rolling reboots using tags
While this use case was described in the design document, and
mentioned several times as motivation for changes in commit messages,
it has never been added to user-facing documentation. This commit
adds at least an example to the man page.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Fri, 10 May 2013 16:12:55 +0000 (18:12 +0200)]
Move cmdlib.py to cmdlib/__init__.py
cmdlib.py has grown *really* too much. Move it into its own package to
allow splitting it further.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Michele Tartara [Fri, 10 May 2013 13:46:51 +0000 (15:46 +0200)]
Allow build_chroot to work from any directory
build_chroot used to work only if launched from ./devel/, whereas now
it can be launched from anywhere, and it will store the resulting files
in the current directory.
Fixes Issue 459.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michele Tartara [Fri, 10 May 2013 13:45:14 +0000 (15:45 +0200)]
build_chroot: check whether the data dir exists
If the data directory is not in the expected place, the script complains
with an error message and stops, instead of giving obscure messages.
Partially fixes Issue 459.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Fri, 10 May 2013 11:26:27 +0000 (13:26 +0200)]
Extend hroller test to also verify tag-based node selection
While the multiple-tags test was added to verify that coloring is done
only after node selection (otherwise it wouldn't be possible to get in
both cases a single reboot group), it can easily be extended to also
verify that the correct nodes are selected by --node-tags.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Wed, 8 May 2013 16:21:59 +0000 (18:21 +0200)]
Add a test for online rolling reboot scheduling
In the example configuration, the graph constructed by just connecting
primary and secondary instances is two-colorable. However, when taking
conflicting locations of secondary nodes into account, three reboot
groups are needed. Moreover, these reboot groups are not subordinated
to any two-coloring of the first-mentioned graph.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Klaus Aehlig [Tue, 7 May 2013 13:21:19 +0000 (15:21 +0200)]
Support online-maintenance in hroller
Make hroller take into account the nodes (redundant) instances
will be migrated to. This behavior can be overridden by the
--offline-maintenance option which will make hroller plan under
the assumption that all instances will be shut down before starting
with the rolling reboots.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Klaus Aehlig [Tue, 7 May 2013 12:51:48 +0000 (14:51 +0200)]
Support construction of the graph of all reboot constraints
For online rolling reboots, there are two kinds of restrictions. First,
we cannot reboot the primary and secondary nodes of an instance
together. Secondly, two nodes cannot be rebooted simultaneously, if
they are the primary nodes of two instances with the same secondary
node. The second condition requires knowledge of all nodes, not only
those the graph is to be constructed on.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
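The two restrictions in the commit above can be sketched as follows. This
is a minimal Python illustration, not the htools implementation (which is
written in Haskell); the function name reboot_conflicts, the node names and
the (primary, secondary) pair representation are assumptions made for the
example, and only redundant instances are modeled.

```python
from itertools import combinations


def reboot_conflicts(instances, selected):
    """Return conflict edges among the selected nodes.

    instances: (primary, secondary) node-name pairs for ALL redundant
    instances of the cluster, not only those on selected nodes.
    selected: the nodes the graph is to be constructed on.
    """
    selected = set(selected)
    edges = set()
    primaries_by_secondary = {}
    for primary, secondary in instances:
        # Restriction 1: the primary and secondary node of an instance
        # cannot be rebooted together.
        if primary in selected and secondary in selected:
            edges.add(frozenset((primary, secondary)))
        primaries_by_secondary.setdefault(secondary, set()).add(primary)
    # Restriction 2: primaries of instances sharing a secondary conflict,
    # even when that secondary is not among the selected nodes.
    for primaries in primaries_by_secondary.values():
        for a, b in combinations(sorted(primaries & selected), 2):
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical cluster: two instances share node3 as secondary.
instances = [("node1", "node3"), ("node2", "node3")]
only_primaries = reboot_conflicts(instances, ["node1", "node2"])
whole_cluster = reboot_conflicts(instances, ["node1", "node2", "node3"])
```

In this example the edge between node1 and node2 only appears because
node3 is known to the function, although node3 itself is not selected in
the first call. With all three nodes selected the graph is a triangle, so
three reboot groups are needed.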
Klaus Aehlig [Mon, 6 May 2013 13:51:20 +0000 (15:51 +0200)]
Add option --one-step-only to hroller
Add a new option to hroller to only output information about the first
reboot group. Together with the option --node-tags this allows for the
following work flow. First tag all nodes; then repeatedly compute the
first node group, handle these nodes and remove the tags. In between
these steps, other operations can be carried out on the cluster.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
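The work flow described in the commit above can be simulated as below.
This is only a sketch under stated assumptions: first_reboot_group is a
hypothetical stand-in for a call to hroller with --node-tags and
--one-step-only, it uses a simple greedy choice rather than hroller's
actual algorithms, and the tag bookkeeping is modeled as a plain set.

```python
def first_reboot_group(tagged, conflicts):
    """Hypothetical stand-in for one hroller run: greedily pick tagged
    nodes that do not conflict with each other."""
    group = []
    for node in sorted(tagged):
        if all(frozenset((node, other)) not in conflicts for other in group):
            group.append(node)
    return group


def rolling_maintenance(nodes, conflicts):
    """Simulate: tag all nodes, then repeatedly handle the first group."""
    tagged = set(nodes)  # first tag all nodes
    rounds = []
    while tagged:
        group = first_reboot_group(tagged, conflicts)
        rounds.append(group)  # handle these nodes (e.g. reboot them)
        tagged -= set(group)  # remove the tags of the handled nodes
        # Other cluster operations could be carried out here.
    return rounds


# Hypothetical conflicts between three nodes.
conflicts = {
    frozenset(("node1", "node2")),
    frozenset(("node2", "node3")),
}
rounds = rolling_maintenance(["node1", "node2", "node3"], conflicts)
```

Because the group is recomputed in every round, changes made to the
cluster between two rounds would be taken into account by the next run.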
Klaus Aehlig [Wed, 8 May 2013 11:46:40 +0000 (13:46 +0200)]
Sort reboot groups by size
Make hroller output the node groups not containing the master node
sorted by size, largest group first. The master node still remains
the last node of the last reboot group. In this way, most progress
is made when switching back to normal cluster operations after the
first reboot group.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
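The ordering described in the commit above could look like this in
Python. It is an illustrative sketch only; the function name
order_reboot_groups and the list-of-lists representation are assumptions,
not the hroller code.

```python
def order_reboot_groups(groups, master):
    """Order groups largest first; the master's group goes last, with the
    master as its last node."""
    others = [list(g) for g in groups if master not in g]
    with_master = [list(g) for g in groups if master in g]
    # Largest group first, so most progress is made in the first step.
    others.sort(key=len, reverse=True)
    ordered = others
    for group in with_master:  # at most one group contains the master
        ordered.append([n for n in group if n != master] + [master])
    return ordered


# Hypothetical reboot groups for a cluster whose master is "master".
groups = [["n1"], ["master", "n2"], ["n3", "n4", "n5"], ["n6", "n7"]]
ordered = order_reboot_groups(groups, "master")
```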
Klaus Aehlig [Wed, 8 May 2013 18:20:34 +0000 (20:20 +0200)]
Fix expectation in hroller test
Regular expressions are not shell globs. So "any symbol" is expressed
by a dot, not a question mark. In this case, the confusion led to a
too liberal expectation, hence the test passed. Fix it nevertheless.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
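The difference mentioned in the commit above can be demonstrated with
Python's standard fnmatch and re modules. The pattern "node?" is a made-up
example, not the expectation from the actual test.

```python
import fnmatch
import re

# As a shell glob, "?" stands for exactly one arbitrary character.
glob_matches = fnmatch.fnmatchcase("node1", "node?")

# As a regular expression, "?" makes the preceding "e" optional, so the
# pattern as a whole matches only "nod" or "node".
regex_full = re.fullmatch(r"node?", "node1")

# An unanchored search still succeeds, and also accepts unrelated input:
# the expectation is too liberal, so a test using it passes anyway.
regex_search = re.search(r"node?", "node1")
too_liberal = re.search(r"node?", "nod-something-else")

# "Any symbol" in a regular expression is a dot.
regex_dot = re.fullmatch(r"node.", "node1")
```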
Bernardo Dal Seno [Tue, 7 May 2013 13:01:50 +0000 (15:01 +0200)]
Refactor check for exclusive_storage in LUInstanceCreate
The order of evaluation of the conditions is changed, so it's easier to add
more (foreseen) checks for exclusive_storage.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Bernardo Dal Seno [Tue, 7 May 2013 14:43:18 +0000 (16:43 +0200)]
Refactor disk checks in LUInstanceSetParams
Prereq checks related to disks are grouped together and moved into a separate
method. This reduces the clutter in CheckPrereq().
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Weiwei Jia [Wed, 8 May 2013 12:46:22 +0000 (20:46 +0800)]
Fix a misspelled word in design-storagetypes
Signed-off-by: Weiwei Jia <harryxiyou@gmail.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Tue, 7 May 2013 16:28:21 +0000 (18:28 +0200)]
Fix lint errors (redundant bracket)
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Fri, 3 May 2013 09:12:58 +0000 (11:12 +0200)]
Add a test demonstrating the --node-tags option of hroller
The example is a cluster of 6 nodes, paired into 3 groups by three
instances. So the whole cluster would need two reboot groups. The two
tags select, in two different ways, one node of each group. So, when
restricting to one tag, a single reboot group suffices, but no
coloring of the whole cluster would achieve this.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
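The situation in the test above can be reproduced with a small Python
sketch. The greedy grouping below is an assumption made for illustration
and is not hroller's coloring code; node and tag names are hypothetical.

```python
def reboot_groups(nodes, edges):
    """Greedily place each node into the first group without a conflict."""
    groups = []
    for node in sorted(nodes):
        for group in groups:
            if all(frozenset((node, m)) not in edges for m in group):
                group.append(node)
                break
        else:
            groups.append([node])
    return groups


# Six nodes, paired by three instances.
pairs = [("node1", "node2"), ("node3", "node4"), ("node5", "node6")]
edges = {frozenset(p) for p in pairs}
all_nodes = {n for p in pairs for n in p}

# Two tags that each select one node per pair, in different ways.
tags = {
    "tagA": {"node1", "node3", "node5"},
    "tagB": {"node2", "node3", "node6"},
}

whole = reboot_groups(all_nodes, edges)
by_tag = {tag: reboot_groups(nodes, edges) for tag, nodes in tags.items()}
```

The whole cluster needs two groups, while each tag on its own yields a
single group. Since tagA and tagB are not both contained in the groups of
any one coloring of the whole cluster, the single group can only be
obtained when node selection happens before coloring.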
Klaus Aehlig [Thu, 2 May 2013 13:12:00 +0000 (15:12 +0200)]
Add option to hroller to select nodes based on tags
Add option --node-tags to tell hroller to consider only nodes
with these tags. A use case would be a tag tracking on which
nodes the maintenance has not yet been carried out, e.g., if
rolling reboots are interleaved with other cluster operations.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Thu, 2 May 2013 12:31:04 +0000 (14:31 +0200)]
Make Rapi backend set node tags correctly
Since the htools representation of a node now allows adding
the node tags, populate this field correctly in the Rapi
backend.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Thu, 2 May 2013 11:41:53 +0000 (13:41 +0200)]
Make LUXI backend set node tags correctly
Since the htools representation of a node now allows adding
the node tags, populate this field correctly in the LUXI
backend.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Klaus Aehlig [Thu, 2 May 2013 11:10:42 +0000 (13:10 +0200)]
Extend the text format to contain node tags
In order to allow htools to make use of node tags, add them to the
text format. This is done by adding a new column at the end of the
node lines. If this column is missing, the default value (which
is the empty list) is left unchanged, thus yielding the current
behavior.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
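The backward-compatible handling of the new column described above can be
sketched in Python. The column layout here is purely hypothetical: the
three base columns, the "|" column separator and the "," tag separator are
assumptions for the example and are not taken from the htools text format.

```python
BASE_COLUMNS = 3  # hypothetical: name, memory, disk


def parse_node_line(line):
    """Parse a node line whose last column (the tags) is optional."""
    fields = line.rstrip("\n").split("|")
    if len(fields) not in (BASE_COLUMNS, BASE_COLUMNS + 1):
        raise ValueError("unexpected number of columns: %d" % len(fields))
    # Default value: the empty list, kept when the column is missing.
    node = {"name": fields[0], "tags": []}
    if len(fields) == BASE_COLUMNS + 1 and fields[-1]:
        node["tags"] = fields[-1].split(",")
    return node


old_style = parse_node_line("node1|4096|100")
new_style = parse_node_line("node1|4096|100|needs-reboot,rack1")
```

Files written before the column existed parse exactly as before, which is
the property the commit message describes.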
Klaus Aehlig [Thu, 2 May 2013 10:36:29 +0000 (12:36 +0200)]
Extend the Node in the htools to allow adding node tags
Since hroller (and probably other tools in the future) will support
node selection based on node tags, extend the node data structure to
allow adding this information.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>