Guido Trotter [Thu, 18 Mar 2010 13:40:59 +0000 (13:40 +0000)]
Merge branch 'devel-2.1'
* devel-2.1:
burnin: implement basic confd testing
AsyncUDPSocket.process_next_packet
WaitForSocketCondition: rename, handle EINTR
move http.WaitForSocketCondition to utils
ConfdCountingCallback
ConfdClient: add synchronous features
Replace @keyword with @param in confd client
AsyncUDPSocket: abstract do_read function
Burnin: don't add/remove routed nics
Only override any and all if not defined
Conflicts:
lib/http/__init__.py
trivial, double removal
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 15 Mar 2010 13:23:01 +0000 (13:23 +0000)]
burnin: implement basic confd testing
Just a few queries are checked, but this should give us confidence that
at least the basic confd framework is working properly.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 15 Mar 2010 11:43:21 +0000 (11:43 +0000)]
AsyncUDPSocket.process_next_packet
This function allows receiving socket data synchronously.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Tue, 16 Mar 2010 15:08:31 +0000 (15:08 +0000)]
WaitForSocketCondition: rename, handle EINTR
- Rename WaitForSocketCondition to SingleWaitForFdCondition
- Avoid potentially infinite loop, if we continue to get interrupted
- Handle EINTR correctly
- Avoid the poller try/finally, as the poller object gets destroyed
anyway
- Provide a new WaitForFdCondition
- Using retry, guarantee to continue checking until the timeout
expires
- Needs an extra helper class, as it uses retry in a very custom way
(no sleep happens, because the poller sleeps by itself)
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
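The split described in "WaitForSocketCondition: rename, handle EINTR" could look roughly like the sketch below. It is not the Ganeti code: the function names are lower-cased stand-ins, and a plain deadline loop replaces the retry helper class the commit mentions. Recent Python versions already retry interrupted system calls on their own, so the EINTR branch is kept only to show the idea.

```python
import errno
import math
import select
import time


def single_wait_for_fd_condition(fdobj, event, timeout):
    """Poll once; return the event mask, or None on timeout or EINTR."""
    poller = select.poll()
    poller.register(fdobj, event)
    try:
        # select.poll() takes its timeout in milliseconds
        ready = poller.poll(int(math.ceil(timeout * 1000)))
    except OSError as err:
        if err.errno != errno.EINTR:
            raise
        return None  # interrupted: the caller decides whether to retry
    return ready[0][1] if ready else None


def wait_for_fd_condition(fdobj, event, timeout):
    """Keep polling until an event arrives or the full timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        # No explicit sleep is needed: the poller itself blocks
        result = single_wait_for_fd_condition(fdobj, event, remaining)
        if result is not None:
            return result
```

Because the remaining time is recomputed on every pass, repeated interruptions cannot extend the wait beyond the requested timeout.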
Guido Trotter [Tue, 16 Mar 2010 13:59:25 +0000 (13:59 +0000)]
move http.WaitForSocketCondition to utils
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 15 Mar 2010 11:42:12 +0000 (11:42 +0000)]
ConfdCountingCallback
This new confd callback counts received replies for the registered
queries.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 15 Mar 2010 11:40:54 +0000 (11:40 +0000)]
ConfdClient: add synchronous features
By sending requests with async=False, and receiving replies with
ReceiveReply we can more easily use confd from a synchronous client.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 12 Mar 2010 12:39:46 +0000 (12:39 +0000)]
Replace @keyword with @param in confd client
@keyword was used inappropriately.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 15 Mar 2010 13:21:22 +0000 (13:21 +0000)]
AsyncUDPSocket: abstract do_read function
This basically implements read handling, without catching all
exceptions. When using the socket in synchronous mode, it's useful to
avoid losing exception data (which, in an async daemon, can only be
logged)
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 11 Mar 2010 15:17:17 +0000 (15:17 +0000)]
Burnin: don't add/remove routed nics
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 15 Mar 2010 11:37:38 +0000 (11:37 +0000)]
Only override any and all if not defined
If any or all are already defined (because we're using a new version of
python) just link them inside "utils" rather than redefining them.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 17 Mar 2010 16:47:14 +0000 (17:47 +0100)]
Add RPC calls to create and remove X509 certificates
Certificates and keys generated using these functions will be used for
inter-cluster instance moves. As per design, the private key should never
leave the node.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 17 Mar 2010 17:01:41 +0000 (18:01 +0100)]
Merge remote branch 'origin/devel-2.1'
* origin/devel-2.1:
backend: Two small style fixes
Allow cluster copy file over the replication net
Enhance cli.GetOnlineNodes query/filtering
Instance creation: implement --no-install mode
Allow OS changes without reinstallation
cmdlib: Abstract OS checks
Conflicts:
lib/cli.py: Trivial
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 17 Mar 2010 16:52:41 +0000 (17:52 +0100)]
backend: Two small style fixes
- Pass keyword parameter as such
- Replace “not x == y” with “x != y”
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 17 Mar 2010 16:08:13 +0000 (17:08 +0100)]
Allow cluster copy file over the replication net
This patch introduces the option “--use-replication-network” for the
cluster copyfile functionality, which is useful if the primary and
secondary network are significantly different (see issue 32).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Wed, 17 Mar 2010 15:49:59 +0000 (16:49 +0100)]
Enhance cli.GetOnlineNodes query/filtering
This patch allows GetOnlineNodes to return the secondary IPs instead of
the node names, and to provide filtering of the master node (required to
be done in this function in case we return the secondary IPs).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Wed, 17 Mar 2010 15:18:08 +0000 (16:18 +0100)]
utils: Add functions to sign and verify X509 certs using HMAC
Certificates exchanged via an untrusted third party should be
signed to ensure they haven't been modified.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
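Signing a certificate with HMAC, as in "utils: Add functions to sign and verify X509 certs using HMAC", can be sketched with the standard library alone. The function names, the salt, the "salt/digest" layout and the choice of SHA-256 are all assumptions made for this example; the commit message does not specify them.

```python
import hashlib
import hmac


def sign_cert(cert_pem, key, salt):
    """Return a 'salt/hexdigest' signature for a PEM string (assumed layout)."""
    mac = hmac.new(key, (salt + cert_pem).encode("ascii"), hashlib.sha256)
    return "%s/%s" % (salt, mac.hexdigest())


def verify_cert(cert_pem, key, signature):
    """Recompute the signature with the embedded salt and compare."""
    salt, _, _ = signature.partition("/")
    expected = sign_cert(cert_pem, key, salt)
    # compare_digest avoids leaking information through timing
    return hmac.compare_digest(expected, signature)
```

A receiver that shares the key can then reject a certificate that an intermediary has altered in transit.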
Michael Hanselmann [Tue, 16 Mar 2010 13:49:26 +0000 (14:49 +0100)]
Add cluster domain secret
Information exchanged between different clusters via untrusted
third parties (e.g. for remote instance import/export) must be
signed with a secret shared between all involved clusters to
ensure the third party doesn't modify the information.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 17 Mar 2010 14:00:14 +0000 (15:00 +0100)]
Instance creation: implement --no-install mode
This is a simple patch that adds the no-install mode for instance
creation, allowing import from foreign source of the actual OS (instead
of requiring the preparation of data in a form expected by the import
scripts).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Wed, 17 Mar 2010 13:33:44 +0000 (14:33 +0100)]
Allow OS changes without reinstallation
This patch modifies LUSetInstanceParams to allow OS name changes, without
reinstallation, in case an OS gets renamed on-disk.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Wed, 17 Mar 2010 13:19:50 +0000 (14:19 +0100)]
cmdlib: Abstract OS checks
This patch moves the node-has-os checks to a separate function.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 16 Mar 2010 14:05:50 +0000 (15:05 +0100)]
Merge remote branch 'origin/devel-2.1'
* origin/devel-2.1:
Improve “gnt-cluster renew-crypto”
Fix behaviour of gnt-node modify -C no
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 16 Mar 2010 13:51:38 +0000 (14:51 +0100)]
Improve “gnt-cluster renew-crypto”
- Report exception text immediately instead of just logging it
- Remove leftover assertion from when it still used “gnt-cluster
modify”
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 16 Mar 2010 10:34:06 +0000 (11:34 +0100)]
Fix behaviour of gnt-node modify -C no
The current check on whether we require auto_promote or not is wrong, as
we check whether we will have exactly the correct number of master
candidates left. But it is fine if we have more (e.g. when CPS=10 and
mc_remaining=19) than the current number, and in that case we shouldn't
require auto promotion.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
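The corrected check from "Fix behaviour of gnt-node modify -C no" amounts to replacing an equality test with a "fewer than" test. The sketch below uses hypothetical names and is not the cmdlib code.

```python
def needs_auto_promote(mc_remaining, candidate_pool_size):
    """Auto promotion is only needed if too few candidates would remain.

    Having more candidates than the pool size (e.g. 19 with a pool
    size of 10) is fine and must not trigger the requirement.
    """
    return mc_remaining < candidate_pool_size
```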
Michael Hanselmann [Mon, 15 Mar 2010 16:46:28 +0000 (17:46 +0100)]
Merge remote branch 'origin/devel-2.1'
* origin/devel-2.1:
Rightname confd's HMAC key
Rename SSL_CERT_FILE to NODED_CERT_FILE
Clarify the error message for ':' in PV names
Conflicts:
lib/bootstrap.py: Trivial
lib/constants.py: Trivial
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 15 Mar 2010 15:53:22 +0000 (16:53 +0100)]
Rightname confd's HMAC key
Currently, the ganeti-confd's HMAC key is called “cluster HMAC key” or
simply “HMAC key” everywhere. With the implementation of inter-cluster
instance moves, another HMAC key will be introduced for signing critical
data. They can not be the same, so this patch clarifies the purpose of the
“cluster HMAC key” by renaming it. The actual file name is not changed.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 15 Mar 2010 15:15:47 +0000 (16:15 +0100)]
Rename SSL_CERT_FILE to NODED_CERT_FILE
To be consistent with RAPI_CERT_FILE, the rather generically named
“SSL_CERT_FILE” constant is renamed to “NODED_CERT_FILE”. The actual file
name is not changed.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Mon, 15 Mar 2010 16:14:25 +0000 (17:14 +0100)]
Clarify the error message for ':' in PV names
As described in issue 93, just saying ':' is not a valid char can be
confusing.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 15 Mar 2010 15:47:51 +0000 (16:47 +0100)]
Merge remote branch 'origin/devel-2.1'
* origin/devel-2.1:
Implement QA tests for disk template changes
Update instance modify documentation
Implement conversion from drbd to plain
Implement conversion from plain to drbd
Abstract check that an instance is down
Abstract node free disk space check
Abstract disk template verification
Update documentation for disk adoption
Implement disk adoption mode in gnt-instance
LUCreateInstance: implement disk adoption mode
LUCreateInstance: Move parameter init earlier
ConfigWriter: add an LV reservation manager
Fix two issues related to check-man
utils.RunCmd: Test case with reset_env set and setting variables
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 15:39:01 +0000 (16:39 +0100)]
Implement QA tests for disk template changes
The new test depends on the drbd type tests being enabled, and tests
conversion to plain and back to drbd.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 15:39:00 +0000 (16:39 +0100)]
Update instance modify documentation
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 15:38:59 +0000 (16:38 +0100)]
Implement conversion from drbd to plain
This is much simpler than the opposite, with fewer possibilities of
failures.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 15:38:58 +0000 (16:38 +0100)]
Implement conversion from plain to drbd
This patch adds a new mode to instance modify, the changing of the disk
template. For now only plain to drbd conversion is supported, and the
new secondary node must be specified manually (no iallocator support).
The procedure for conversion works as follows:
- a completely new disk template is created, matching the count, size
and mode of the instance's current disks
- we create manually (not via _CreateDisks) all the missing volumes
- we rename on the primary the LVs to the new name
- we create manually the DRBD devices
Failures during the creation of volumes will leave orphan volumes.
Failure during the rename might leave some disks renamed and some not,
leading to an inconsistent instance.
Once the disks are renamed, we update the instance information and wait
for resync. Any failures of the DRBD sync must be manually handled (like
a normal failure, e.g. by running replace-disks, etc.).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 15:38:57 +0000 (16:38 +0100)]
Abstract check that an instance is down
Multiple LUs require that an instance is not running while they operate
on the instance (reinstall, rename, modify, recreate disks, deactivate
disks). The code to do this check is duplicated many times, and not very
consistent (some use call_instance_list, some call_instance_info).
The patch moves this check into a separate function that is then reused.
The only drawback is that _SafeShutdownInstanceDisks now raises an
OpPrereqError (even though it is run during Exec()), but this use case
is fine (there are no other modifications in that Exec).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 15:38:56 +0000 (16:38 +0100)]
Abstract node free disk space check
Both create instance and grow disk check the free disk space on nodes
using the same, duplicate code. Since we'll need this in other places in
the future, we abstract the check into a new function.
The patch adjusts the error message to be more in-line with the one for
memory checking, and fixes the exception raised for RPC errors.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 15:38:55 +0000 (16:38 +0100)]
Abstract disk template verification
This is a simple check, but we'll need it in multiple places.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sun, 14 Mar 2010 00:55:19 +0000 (01:55 +0100)]
Update documentation for disk adoption
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 14 Mar 2010 00:55:18 +0000 (01:55 +0100)]
Implement disk adoption mode in gnt-instance
This patch modifies the parsing of the “--disk” argument to instance
create to accept “adopt” as a valid key, which builds the correct disk
structure for OpCreateInstance.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 14 Mar 2010 00:55:17 +0000 (01:55 +0100)]
LUCreateInstance: implement disk adoption mode
This new mode, valid only for the plain disk template, allows creation
of an instance based on existing logical volumes (preserving data),
rather than creation of new volumes and OS creation.
The new mode works as follows:
- instead of size, all disks passed in must have an 'adopt' key, which
signifies the LV name to be used
- all disks must have this key, or none should
- we check the volume existence, and from the result we fill in the
actual size
- online (in-use) volumes are not allowed
- 'stealing' of another's instance volumes is prevented via reservation
of the LV names
- during creation, we rename the logical volumes to the standard Ganeti
format (based on UUID)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
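The all-or-none rule from "LUCreateInstance: implement disk adoption mode" can be sketched as a small validation helper. The function name and the use of plain dicts for disks are assumptions made for illustration; the existence, in-use and reservation checks listed in the commit are not covered here.

```python
def uses_adoption(disks):
    """Return True if every disk is adopted, False if none is.

    Raises ValueError when only some of the disks carry an 'adopt' key.
    """
    flags = ["adopt" in disk for disk in disks]
    if any(flags) and not all(flags):
        raise ValueError("either all disks must have an 'adopt' key, or none")
    return bool(flags) and all(flags)
```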
Iustin Pop [Sun, 14 Mar 2010 00:55:16 +0000 (01:55 +0100)]
LUCreateInstance: Move parameter init earlier
This way, the parameters are available in CheckArguments too.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Sun, 14 Mar 2010 00:55:15 +0000 (01:55 +0100)]
ConfigWriter: add an LV reservation manager
This patch adds an LV reservation manager to be used for LV names. Since
we now have four such managers, we create a list for easier release.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 15 Mar 2010 12:55:48 +0000 (13:55 +0100)]
Fix two issues related to check-man
First, we don't need to check man pages at sed time, because this means
everyone building the package will do - we only need to check at docbook
time, which is mostly at developer time.
Second, don't force LC_ALL to C, as this breaks newer man-db. I've
verified that removing LC_ALL works fine across etch, hardy, lenny and
squeeze/sid.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 15 Mar 2010 14:33:47 +0000 (15:33 +0100)]
utils.RunCmd: Test case with reset_env set and setting variables
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 15 Mar 2010 12:54:53 +0000 (13:54 +0100)]
Merge remote branch 'origin/devel-2.1'
* origin/devel-2.1: (116 commits)
Implement replacing cluster certs and keys via “gnt-cluster renew-crypto”
cli: Add helper function to stop and start whole cluster
cfgupgrade: Use new bootstrap function for certs and keys
bootstrap: Add new function to create cluster certs and keys
utils.CreateBackup: Use human-readable instead of seconds since Epoch
Add unittest for daemon-util
Add support for non-Python unittests
daemon-util: Generate daemon path in separate function
daemon-util: Use “return” instead of “exit” in all functions
daemon-util: Add function to start and stop all daemons
ganeti.initd: Move all daemon names from init script to daemon-util
ganeti.initd: Move code checking daemon exit code to daemon-util
ganeti.initd: Move code checking config to daemon-util
daemon-util: Require dashes in commands
Improve ganeti.serializer unittests
Add unittests for ganeti.errors
Verify cluster certificates in LUVerifyCluster
utils: Add function to extract X509 cert validity
Add constant with cluster X509 certificates
Release version 2.1.1
...
Conflicts:
lib/backend.py: Trivial
lib/bootstrap.py: Trivial
lib/constants.py: Trivial
lib/http/server.py: Trivial
lib/utils.py: RunCmd parameter “reset_env”
test/ganeti.utils_unittest.py: Trivial
tools/cfgupgrade: Trivial
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Fri, 12 Mar 2010 15:16:08 +0000 (16:16 +0100)]
Implement replacing cluster certs and keys via “gnt-cluster renew-crypto”
Recent changes to “gnt-cluster verify” made it complain on expiring SSL
certificates. While it was possible to replace the SSL certificates and
other cluster secrets manually before, doing so was cumbersome. Cluster
certificates, keys and secrets can now be replaced easily.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 12 Mar 2010 10:51:22 +0000 (11:51 +0100)]
cli: Add helper function to stop and start whole cluster
Replacing cluster certificates and keys requires all cluster daemons to be
shut down. This might also be handy for the cluster merger tool, though
the function might need a few more extensions.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 12 Mar 2010 10:49:16 +0000 (11:49 +0100)]
cfgupgrade: Use new bootstrap function for certs and keys
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 12 Mar 2010 10:49:47 +0000 (11:49 +0100)]
bootstrap: Add new function to create cluster certs and keys
The code to generate cluster certificates, keys and secrets is currently
spread over several places. It makes sense to move it to a separate
function as we want to provide the user with the ability to automatically
replace all cluster certificates and keys.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 12 Mar 2010 14:35:02 +0000 (15:35 +0100)]
utils.CreateBackup: Use human-readable instead of seconds since Epoch
Seconds since the Epoch are not easily readable by a human. Using a
formatted timestamp makes it easier (e.g.
“….backup-2010-03-12_14_02_43.…”). This patch also makes OS logfiles use
this formatted timestamp.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
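The timestamp shape quoted in "utils.CreateBackup: Use human-readable instead of seconds since Epoch" matches a strftime pattern like the one below. The helper name and the "backup-" prefix handling are assumptions; only the date layout is taken from the example in the commit message.

```python
import datetime


def backup_suffix(when):
    """Format a datetime like the 'backup-2010-03-12_14_02_43' example."""
    return "backup-" + when.strftime("%Y-%m-%d_%H_%M_%S")
```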
Michael Hanselmann [Thu, 11 Mar 2010 17:52:59 +0000 (18:52 +0100)]
Add unittest for daemon-util
This test doesn't cover everything, but it's better than nothing.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 17:28:54 +0000 (18:28 +0100)]
Add support for non-Python unittests
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 16:42:19 +0000 (17:42 +0100)]
daemon-util: Generate daemon path in separate function
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 16:16:44 +0000 (17:16 +0100)]
daemon-util: Use “return” instead of “exit” in all functions
This is important if they're called directly within daemon-util.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 16:16:28 +0000 (17:16 +0100)]
daemon-util: Add function to start and stop all daemons
This is, to some degree, duplicated code from the init script. However,
the init script has to conform to standards of the underlying Linux
distributions, while these functions will be called by Ganeti itself. By
moving more code into daemon-util, the amount of duplication has been
reduced.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 15:52:17 +0000 (16:52 +0100)]
ganeti.initd: Move all daemon names from init script to daemon-util
The list of daemon names will be used in daemon-util, too.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 11:51:44 +0000 (12:51 +0100)]
ganeti.initd: Move code checking daemon exit code to daemon-util
This is again for re-using code.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 11:27:11 +0000 (12:27 +0100)]
ganeti.initd: Move code checking config to daemon-util
This allows for more code re-use. daemon-util will also be used to start
all daemons.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Thu, 11 Mar 2010 16:15:29 +0000 (17:15 +0100)]
daemon-util: Require dashes in commands
Even though the script uses underscores (_) internally, the external
commands are supposed to be written using dashes (-).
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Wed, 10 Mar 2010 17:00:21 +0000 (18:00 +0100)]
Improve ganeti.serializer unittests
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 10 Mar 2010 16:59:54 +0000 (17:59 +0100)]
Add unittests for ganeti.errors
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 23 Feb 2010 16:14:04 +0000 (17:14 +0100)]
Verify cluster certificates in LUVerifyCluster
When using pyOpenSSL 0.7 or above, LUVerifyCluster will start to show a
warning 30 days before a certificate expires. 7 days before the
certificate expires, the warning becomes an error. Once expired,
LUVerifyCluster will always report an error. The latter is also supported
with pyOpenSSL 0.6.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
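The thresholds named in "Verify cluster certificates in LUVerifyCluster" can be sketched as a small classifier. The function name, the string results and the exact handling of the boundary days are assumptions; only the 30-day and 7-day figures come from the commit message.

```python
import time

WARN_DAYS = 30   # warning threshold from the commit message
ERROR_DAYS = 7   # error threshold from the commit message
SECONDS_PER_DAY = 24 * 60 * 60


def cert_expiry_status(not_after, now=None):
    """Classify a certificate by the time left until not_after (Unix time)."""
    if now is None:
        now = time.time()
    remaining = not_after - now
    if remaining < 0:
        return "expired"
    if remaining <= ERROR_DAYS * SECONDS_PER_DAY:
        return "error"
    if remaining <= WARN_DAYS * SECONDS_PER_DAY:
        return "warning"
    return "ok"
```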
Michael Hanselmann [Tue, 23 Feb 2010 16:09:03 +0000 (17:09 +0100)]
utils: Add function to extract X509 cert validity
X509 uses ASN1 GENERALIZEDTIME or UTCTIME to store certificate validity.
pyOpenSSL 0.7 and above allow us to retrieve both “notBefore” and
“notAfter” as strings. Parsing them turned out to be a challenge since
they can be in a variety of formats (YYYYMMDDhhmmssZ, YYYYMMDDhhmmss+hhmm
or YYYYMMDDhhmmss-hhmm).
This will be used to verify the validity of cluster certificates in
LUVerifyCluster.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
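Parsing the three formats listed in "utils: Add function to extract X509 cert validity" might be approached as in this sketch. It is not the function added to utils; the name and the regular expression are assumptions, and a trailing offset is interpreted as "local time ahead of (+) or behind (-) UTC".

```python
import calendar
import re

_ASN1_TIME_RE = re.compile(r"^(\d{14})(Z|[+-]\d{4})$")


def parse_asn1_time(value):
    """Convert YYYYMMDDhhmmss followed by Z or +/-hhmm to a Unix timestamp."""
    match = _ASN1_TIME_RE.match(value)
    if not match:
        raise ValueError("unsupported time format: %r" % value)
    digits, zone = match.groups()
    fields = (int(digits[0:4]), int(digits[4:6]), int(digits[6:8]),
              int(digits[8:10]), int(digits[10:12]), int(digits[12:14]))
    seconds = calendar.timegm(fields)
    if zone != "Z":
        sign = 1 if zone[0] == "+" else -1
        offset = sign * (int(zone[1:3]) * 3600 + int(zone[3:5]) * 60)
        # Local time is UTC plus the offset, so subtract it to get UTC
        seconds -= offset
    return seconds
```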
Michael Hanselmann [Tue, 23 Feb 2010 16:10:37 +0000 (17:10 +0100)]
Add constant with cluster X509 certificates
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Fri, 12 Mar 2010 13:15:27 +0000 (14:15 +0100)]
Merge branch 'stable-2.1' into devel-2.1
* stable-2.1:
Release version 2.1.1
Update NEWS file for the 2.1.1 release
Validate the os-specific hypervisor parameters
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Fri, 12 Mar 2010 10:44:43 +0000 (11:44 +0100)]
Release version 2.1.1
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 12 Mar 2010 08:34:45 +0000 (09:34 +0100)]
Improve cluster verify with hypervisor errors
In case the hypervisor has issues on one node, currently
backend.VerifyNode will exit via an exception (two exit paths possible,
one via HypervisorError from hypervisor.Verify(), and one via RPCFail
from GetInstanceList). This is bad as it invalidates all other checks of
that node.
This patch catches these two errors and allows the rest of the
VerifyNode function to run. This leads to a more complete verify cluster
run, for example now only real missing LVs are reported, not all of
them.
The cluster verify is not perfect as it will skip some tests even if it
has data, but this will require a more complete rewrite (see issue 90).
Also, the patch fixes and improves some error messages in cmdlib.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 11 Mar 2010 15:31:04 +0000 (16:31 +0100)]
Fix wrong indentation
Sorry…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
René Nussbaumer [Wed, 10 Mar 2010 10:25:15 +0000 (11:25 +0100)]
Adding qa tests for gnt-os modify
This adds basic qa tests for gnt-os modify
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 11 Mar 2010 14:07:35 +0000 (15:07 +0100)]
Switch burnin to cli.JobExecutor
Burnin has a custom job executor, because of its need to retry some job
series.
While we cannot replace all of it, at least the execution we can switch
to cli.JobExecutor, to take advantage of the recently-introduced
out-of-order waiting.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 11 Mar 2010 13:54:45 +0000 (14:54 +0100)]
Extend JobExecutor to allow custom feedback_fn
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 11 Mar 2010 12:35:38 +0000 (13:35 +0100)]
cli.JobExecutor: poll jobs in execution order
… rather than submission order. The results are still returned in the
submission order, and for this we needed to track internally the index
of the submission.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 11 Mar 2010 12:33:29 +0000 (13:33 +0100)]
Add a partition function to split a list in two
This is similar to the Haskell function, except that the signature is
reversed to match the 'any' and 'all' Python functions.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
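A function of the kind described in "Add a partition function to split a list in two" could be written as follows. The sequence-first argument order with a default predicate is an assumption, chosen to mirror how the commit relates it to 'any' and 'all'.

```python
def partition(seq, pred=bool):
    """Split seq into (items where pred is true, items where it is false)."""
    true_items, false_items = [], []
    for item in seq:
        if pred(item):
            true_items.append(item)
        else:
            false_items.append(item)
    return true_items, false_items
```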
Iustin Pop [Thu, 11 Mar 2010 12:29:36 +0000 (13:29 +0100)]
Improve burnin's Log function
This makes the Log function able to take multiple args for simplified
message construction, similar to the ToStdout one.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 11 Mar 2010 14:35:25 +0000 (15:35 +0100)]
Fix cluster verify with simulate-errors
In simulate-errors mode, the test "ntime_diff is not None" is ignored,
so the code tries to format a None value with %.01f. We work around this
by formatting the value beforehand and then using only %s, which can
format a 'None' value.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
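The workaround from "Fix cluster verify with simulate-errors" can be illustrated with a short sketch. The function name and the message text are made up for the example; only the format-early, then use %s pattern is taken from the commit message.

```python
def describe_time_diff(node, ntime_diff):
    """Build a message that tolerates a missing time difference."""
    # Format the float up front; "%.01f" % None would raise TypeError
    formatted = None if ntime_diff is None else "%.01fs" % ntime_diff
    # %s accepts both the pre-formatted string and None
    return "Node %s time diverges by %s" % (node, formatted)
```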
Guido Trotter [Thu, 11 Mar 2010 10:57:27 +0000 (10:57 +0000)]
KVM: remove unused variable
We don't need the pwentry when checking whether a username exists; we
only need to be sure that no KeyError is raised. Remove the variable,
and thus shut up lint.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Thu, 11 Mar 2010 09:52:02 +0000 (10:52 +0100)]
Update NEWS file for the 2.1.1 release
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Thu, 11 Mar 2010 08:48:54 +0000 (09:48 +0100)]
Validate the os-specific hypervisor parameters
This adds a validation similar to the one for cluster-wide hypervisor
parameters.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Wed, 10 Mar 2010 12:58:52 +0000 (12:58 +0000)]
Document the security_* hypervisor parameters
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Tue, 9 Mar 2010 17:59:45 +0000 (17:59 +0000)]
KVM: add security model and domain parameters
Initially we only support the "user" model (in which the user running
the virtual machine can be specified as an additional parameter).
We use usernames rather than uids in this mode, because the kvm -runas
flag doesn't support uids anyway, and we check the passed username for
validity.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
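Checking a username for validity, as mentioned in "KVM: add security model and domain parameters", can be done with the standard pwd module on Unix systems. This is a sketch with a hypothetical helper name, not the hypervisor code.

```python
import pwd


def username_exists(username):
    """Return True if the system password database knows the username."""
    try:
        pwd.getpwnam(username)  # the entry itself is not needed
    except KeyError:
        return False
    return True
```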
Guido Trotter [Tue, 9 Mar 2010 11:19:49 +0000 (11:19 +0000)]
KVM security: add global constants
These constants add two new kvm hypervisor parameters, specifying the
security model (user/pool) and the security domain, within that model.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 15 Feb 2010 15:59:17 +0000 (16:59 +0100)]
Update inter-cluster instance move design with HMAC signatures
This also adds a large piece of pseudo code for explanatory purposes.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
René Nussbaumer [Wed, 10 Mar 2010 14:08:12 +0000 (15:08 +0100)]
Adding unittests for objects.Cluster.FillHV
This adds tests for the stacking of objects.Cluster.FillHV to verify
that the override is working as expected.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Wed, 10 Mar 2010 13:56:44 +0000 (13:56 +0000)]
Fix man build error on older distributions
Passing <quote> rather than ' avoids having special characters at the
beginning of the line, which man doesn't like.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 10 Mar 2010 13:57:43 +0000 (14:57 +0100)]
http.auth: Disable pylint warnings
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 10 Mar 2010 13:45:08 +0000 (14:45 +0100)]
Implement verify checks for node/instance names
Since we index the nodes and instances by their name, we should have
checks that the dict key to object.name mapping is correct.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
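Note (illustrative sketch, not part of the commit): a key-to-name
consistency check of the kind described above might be written as follows.
The Item type and find_name_mismatches are hypothetical names for this
example.

```python
from collections import namedtuple

# Stand-in for a node or instance object; only the name matters here.
Item = namedtuple("Item", ["name"])


def find_name_mismatches(objects_by_name):
    """Return the sorted dict keys whose object carries a different name."""
    return sorted(key for key, obj in objects_by_name.items()
                  if obj.name != key)
```

An empty result means every key matches the name stored in its object.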
Iustin Pop [Wed, 10 Mar 2010 10:25:31 +0000 (11:25 +0100)]
Fix a Python 2.6.5 compatibility issue
The upcoming Python 2.6.5 release has a change that makes delattr(obj,
attr) fail for slots-enabled objects if the attr is not already set.
To guard against this, we only run the delattr if the attribute is
already set.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
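Note (illustrative sketch, not part of the commit): the guard described
above amounts to checking for the attribute before deleting it. SlottedObject
and safe_delattr are hypothetical names for this example.

```python
class SlottedObject(object):
    """Minimal slots-enabled class; unset slots have no value at all."""
    __slots__ = ["name", "serial_no"]


def safe_delattr(obj, attr):
    """Delete attr only if it is currently set on obj."""
    if hasattr(obj, attr):
        delattr(obj, attr)
```

With this guard, calling safe_delattr on a slot that was never assigned is a
no-op instead of an error.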
Guido Trotter [Wed, 10 Mar 2010 11:16:55 +0000 (11:16 +0000)]
KVM: pass the instance name as the first kvm flag
This makes it the first argument shown, for example under "ps".
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 10 Mar 2010 10:49:42 +0000 (10:49 +0000)]
KVM: Remove boot restriction for paravirtual nics
Newer virtio can boot from the network perfectly well, so there's no
point in keeping this restriction in place. This will still fail on
older kernels.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 10 Mar 2010 11:56:51 +0000 (11:56 +0000)]
Document boot_order syntax for kvm
The gnt-instance manpage only contained the correct syntax for xen-pvm.
Specify what the kvm syntax is, and also warn about a problem with
virtio+netboot, for older kvm versions.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 10 Mar 2010 09:31:29 +0000 (10:31 +0100)]
Update documentation for hashed passwords
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 9 Mar 2010 20:12:49 +0000 (21:12 +0100)]
http.server: Improve request logging in debug mode
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 9 Mar 2010 20:12:40 +0000 (21:12 +0100)]
Provide unittests for http.auth
To simplify writing unittests, one data structure class in http.server is
also changed. According to the coverage utility, this provides 95%
coverage.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 9 Mar 2010 20:11:55 +0000 (21:11 +0100)]
http.auth: Fix bug with checking hashed passwords
When username and password were sent for a resource not requiring
authentication, it wouldn't be accepted if the user in question had a
hashed password. The reason was that the function GetAuthRealm used to
return None if no authentication was necessary. However, the
authentication realm is necessary to verify hashed passwords. This is
fixed by requiring GetAuthRealm to always return a realm and separating
the decision whether to require authentication or not to a separate
function.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
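Note (illustrative sketch, not part of the commit): the reason a realm is
always needed can be seen from how a digest-style hashed password is
computed, namely MD5 over "username:realm:password" as in HTTP Digest
authentication. The "{HA1}" storage prefix and the function names below are
assumptions for this example, not the http.auth API.

```python
import hashlib

HA1_PREFIX = "{HA1}"  # assumed marker for a stored hashed password


def compute_ha1(username, realm, password):
    """MD5 of 'username:realm:password', hex-encoded."""
    data = "%s:%s:%s" % (username, realm, password)
    return hashlib.md5(data.encode("utf-8")).hexdigest()


def verify_password(username, realm, supplied, stored):
    """Check a supplied password against a plain or hashed stored value."""
    if stored.startswith(HA1_PREFIX):
        if realm is None:
            # Without a realm the hash cannot be recomputed at all.
            raise ValueError("realm is required to verify hashed passwords")
        expected = stored[len(HA1_PREFIX):].lower()
        return compute_ha1(username, realm, supplied) == expected
    return supplied == stored
```

Because the realm is an input to the hash, a lookup that returns no realm
for unauthenticated resources leaves hashed passwords unverifiable, which
matches the failure described above.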
Guido Trotter [Tue, 9 Mar 2010 16:03:19 +0000 (16:03 +0000)]
Clarify cluster nic parameters in install.rst
There were a few outdated options specified there. This patch unifies
the description under only one section, and updates it.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 9 Mar 2010 12:21:54 +0000 (13:21 +0100)]
Add the auto_promote option to cli and gnt-node
This allows one to cleanly set a node offline and promote other nodes
as needed.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 9 Mar 2010 12:07:08 +0000 (13:07 +0100)]
Rework the node modify for mc-demotion
The current code in LUSetNodeParms regarding the demotion from master
candidate role is complicated and duplicates the code in ConfigWriter,
where such decisions should be made. Furthermore, we still cannot demote
nodes (not even with force), if other regular nodes exist.
This patch adds a new opcode attribute ‘auto_promote’, and changes the
decision tree as follows:
- if the node will be set to offline or drained or explicitly demoted
from master candidate, and this parameter is set, then we lock all
nodes in ExpandNames()
- later, in CheckPrereq(), if the node is
indeed a master candidate, and the future state (as computed via
GetMasterCandidateStats with the current node in the exception list)
has fewer nodes than it should, and we didn't lock all nodes, we exit
with an exception
- in Exec, if we locked all nodes, we do an AdjustCandidatePool() run, to
ensure nodes are locked as needed (we do it before updating the node
to remove a warning, and prevent the situation that if the LU fails
between these, we're not left with an inconsistent state)
Note that in Exec we run the AdjustCP irrespective of any node state
change (just based on lock status), so we might simplify the CheckPrereq
even more by not checking the future state, basically requiring
auto_promote/lock_all for master candidates, since the case where we
have more than needed master candidates is rarer; OTOH, this would prevent
manual promotion ahead of time of another node, which is why I didn't
choose this way.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
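Note (illustrative sketch, not part of the commit): the first two steps of
the decision tree above can be modelled as two small functions. All names
and the exception type are hypothetical; the real logic lives in the LU's
ExpandNames() and CheckPrereq().

```python
def needs_all_node_locks(offline, drained, master_candidate, auto_promote):
    """ExpandNames step: lock all nodes only when the node may leave the
    candidate pool and auto_promote was requested.

    offline/drained/master_candidate are the requested new values, or
    None when the corresponding flag is not being changed.
    """
    leaving_pool = (offline is True or drained is True
                    or master_candidate is False)
    return bool(auto_promote and leaving_pool)


def check_demotion(is_master_candidate, future_candidates,
                   wanted_candidates, locked_all):
    """CheckPrereq step: refuse if the pool would end up too small and
    we are not in a position to promote other nodes."""
    if (is_master_candidate and future_candidates < wanted_candidates
            and not locked_all):
        raise RuntimeError("not enough master candidates; retry with"
                           " auto_promote")
```

The third step, running the candidate pool adjustment in Exec when all
nodes are locked, is omitted because it depends on the configuration
writer.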
Iustin Pop [Tue, 9 Mar 2010 14:08:37 +0000 (15:08 +0100)]
Fix node volumes list for striped volumes
Currently backend.NodeVolumes() drops everything except the first PV,
thus we get a truncated result. The patch is not the nicest, as Python
doesn't have a simple `concat' function, so I had to change the list
comprehension to an explicit loop.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
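Note (illustrative sketch, not part of the commit): an explicit loop that
keeps every physical volume of a logical volume could look like this. The
input format (a comma-separated device list such as
"/dev/sda1(0),/dev/sdb1(0)") and the name expand_volumes are assumptions
for the example, not the actual backend.NodeVolumes() code.

```python
def expand_volumes(rows):
    """Return one entry per (logical volume, physical volume) pair.

    rows is an iterable of (name, size, devices) tuples, where devices
    lists every PV backing the LV, separated by commas.
    """
    volumes = []
    for name, size, devices in rows:
        for dev in devices.split(","):
            # Strip the "(extent)" suffix to keep only the device path.
            pv = dev.split("(")[0]
            volumes.append({"name": name, "size": size, "dev": pv})
    return volumes
```

A volume striped over two PVs yields two entries, where taking only the
first element of the device list would have yielded one.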
Iustin Pop [Tue, 9 Mar 2010 14:15:40 +0000 (15:15 +0100)]
Fix typo that makes cluster verify ignore hooks
The return from LUVerifyCluster should be True (or equivalent) for pass,
and False (or equivalent) for fail. The HooksCallBack function uses '1'
(= True) when a hook fails, which is exactly the opposite of what we
want - it makes failed hooks reset the result to success,
overriding actual failures in cluster verify.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
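Note (illustrative sketch, not part of the commit): the intended behaviour
is that a failed hook can only turn the overall result into a failure,
never back into a success. merge_hook_results is a hypothetical name for
this example.

```python
def merge_hook_results(verify_ok, hook_failed_flags):
    """Combine the verify result with per-hook failure flags.

    verify_ok is True when cluster verify itself passed; each flag in
    hook_failed_flags is True when that hook failed.
    """
    result = verify_ok
    for failed in hook_failed_flags:
        if failed:
            # Assigning a true value such as 1 here would reproduce the
            # bug: a failing hook would mask an earlier verify failure.
            result = False
    return result
```

With this shape, a verify failure followed by a hook failure stays a
failure.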
Iustin Pop [Tue, 9 Mar 2010 12:39:29 +0000 (13:39 +0100)]
Fix redistribute config and offline nodes
We need to manually filter out offline nodes before using
rpc.call_upload_file and rpc.call_write_ssconf_files, since these methods
are static (they work without a ConfigWriter instance) and thus do not
know which nodes are offline and which are not.
Note that we add a new ConfigWriter._UnlockedGetOnlineNodeList() method
rather than hardcoding the filtering of online nodes in _WriteConfig.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
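Note (illustrative sketch, not part of the commit): filtering offline nodes
before handing a node list to a static call can be as small as the helper
below. NodeInfo and get_online_node_list are hypothetical names; the real
method is ConfigWriter._UnlockedGetOnlineNodeList().

```python
from collections import namedtuple

# Stand-in for a node object with just the fields the filter needs.
NodeInfo = namedtuple("NodeInfo", ["name", "offline"])


def get_online_node_list(nodes):
    """Return the names of all nodes that are not marked offline."""
    return [node.name for node in nodes if not node.offline]
```

The resulting list is what a caller would pass on to the static upload
calls, which cannot do this filtering themselves.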
René Nussbaumer [Tue, 9 Mar 2010 09:40:47 +0000 (10:40 +0100)]
Adding documentation for “gnt-os modify”
This finishes the integration of per-os-hypervisor parameters by updating
the man page.
Signed-off-by: René Nussbaumer <rn@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>