Make max_running_jobs queryable
As we have introduced a new cluster parameter, it shouldbe also visible when querying about the cluster configuration.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Petr Pudlak <pudlak@google.com>
Add a command-line parameter for max_running_jobs
...so that this opcode parameter can become available for 'gnt-cluster modify'.
Add opcode parameter for the maximal number of running jobs
This parameter of OpClusterSetParams will allow to set themaximal number of jobs to be run simultaneously.
Add parameter max_running_jobs to the cluster configuration
This cluster-wide parameter will determine how many non-finalized jobs maximallyshould be in a not queued state at the same time.
Simplify 'GetMasterInfo' RPC
RPC 'GetMasterInfo' returns several fields, namely, 'master_netdev','master_ip', 'master_netmask', 'master_node', and 'primary_ip_family',of which only the 'master_node' is actually used.
Link Xen instance shutdown design doc with KVM's
Update instance shutdown for Xen design document by linking it to thedesign document for the KVM daemon and also improve the description ofsome paragraphs.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>...
Implement job cancellation in luxid
As luxid handles the job queue, this daemon is the naturalplace to handle job cancellation. Answering to CancelJob requestsis also necessary for luxid to be feature compliant with masterd,even for command-line requests only....
Provide a function to compute the canceled version of a job
When a job gets canceled while still queued, dequeuing requiresluxid to mark it as cancelled. So provide the necessary purefunction to do so.
Support canceling dequeued jobs
Even after jobs have been handed over for execution, it mightstill be possible to cancel them. On such case would be thejob still waiting for a lock. Eventually, we will have tocommunicate to the job directly, but as long as execution is...
Add dequeuing to the job scheduler
This only removes queued jobs from the queueand indicates whether the job was found in the queue.For jobs that are already started from the queue'spoint of view, it might still be possible to cancelthem, e.g., if they are still waiting for locks....
Add certificate of auto-promoted master candidates to map
When a normal node is auto-promoted to be a mastercandidate, its SSL client certificate digest needsto be added to the map of candidate certificatesas well.
Signed-off-by: Helga Velroyen <helgav@google.com>...
Fix Kvmd imports for Ubuntu 13.04 64
Signed-off-by: Jose A. Lopes <jabolopes@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Unit tests for KVM daemon
Add unit tests for KVM daemon.
QA for KVM instance shutdown
Add QA for instance shutdown for KVM.
Manpage for 'gnt-instance'
Modify manpage for 'gnt-instance' detailing the 'user_shutdown'parameter and how it related to the 'acpi' parameter.
Manpage for KVM daemon
Add manpage for the KVM daemon.
Hook KVM hypervisor with KVM daemon shutdown files
User shutdown hypervisor parameter
Add user shutdown parameter for KVM. Based on this parameter, decidewhat information to report for a KVM instance, for example,distinguish between 'ADMIN_down' and 'USER_down'.
Add helper function to tell if a daemon is alive
Add helper function 'utils.IsDaemonAlive' to tell if a daemon is aliveby name. This function will be necessary for the KVM hypervisor todetermine if the KVM daemon is running and otherwise start it.
Add KVM daemon daemonize
Add KVM daemon entry point, command-line options, backgrounding, etc
Add KVM daemon logic
Add KVM daemon logic, which contains monitors for Qmp sockets anddirectory/file watching.
Generalize and reuse Unix domain sockets
Refactor module 'Ganeti.UDSServer' so the KVM daemon can reuse codedeclared in this module to handle Unix domain sockets.
KVM daemon datatype, user and group
Fix whitespace
Fix whitespace in several modules.
Fix according to the Ganeti style guide
Fix docstring for 'AsyncStreamServer'
Document automatic actions taken at upgrade
When upgrading from any version below 2.11 to 2.11 or higher,Ganeti will generate new RPC client certificates when upgradingwith ``gnt-cluster upgrade``. Document this behavior in theUPGRADE notes to avoid user surprises....
Add generating node certificates as post-upgrade task
While, technically, Ganeti is still working without individual nodecertificates, it is considered an error by gnt-cluster verify tonot have it done immediately after upgrading. So, to make automatic...
Add utility to compare versions
This will be needed, e.g., for post-upgrade task, as theyhave to decide whether a feature was not yet present atthe version started from.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Hrvoje Ribicic <riba@google.com>
Merge branch 'stable-2.10' into master
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
Run postupgrade hook after upgrade
To allow for necessary last-moment adaptions, of the new cluster,we run the post-upgrade hook of the target version, providingthe version we originally started from.
Provide path to post-upgrade
Add an empty post-upgrade hook
As 2.10 is the first version from which you can do automatic upgrades,there is nothing to do when going to any other version in the 2.10branch.
design: support post-upgrade hooks
While the general policy for Ganeti is to just accept the situationit finds after being upgraded from an older version, in some casesadditional actions might be necessary. So support a hook for doingso.
Also add the current version to the intent-to-upgrade file
Our design states, that the intent-to-upgrade file contains "the currentversion of ganeti, the version to change to, and the process ID". Make theimplementation fit with that design.
admin.rst: update and reword disk template section
The disk template section was not updated for Gluster. This commitalso refactors the section slightly by unifying the different remarksabout /etc/ganeti/file-storage-paths.
sphinx_ext is also changed in order to not hardcode too much...
Design document for KVM daemon
Design document for KVM daemon which is needed by the instanceshutdown detection for KVM.
Improve the point-free section of the style guide
Distinguish declaring functions in the point-free style and usinga very similar technique to avoid parentheses (which isn't technicallypoint-free).
Signed-off-by: Petr Pudlak <pudlak@google.com>Reviewed-by: Klaus Aehlig <aehlig@google.com>
Document 2.11 to 2.10 specific downgrade tasks
While the recommended way of downgrading from version 2.11 to 2.10is ``gnt-cluster upgrade --to 2.10``, manual downgrade is supported.So, the version-specific steps need to be documented in the UPGRADEnotes....
Remove certification on 2.11 to 2.10 downgrade
While version 2.10 ignores any leftover client certificates, theirpresence will prevent a the cluster working after an upgrade backto version 2.11 again. So we have to remove them right at thedowngrade.
Add support for version-specific downgrade tasks
Upgrading can have no specific knowledge about additionaltasks besides upgrading the configuration, as upgrades needto be able to go to any future version (within the same majorversion). Downgrading, however, is version specific and always...
design: version-specific downgrade actions
Some new features, like client-specific ssl certificates, require additionalsteps at downgrade, so add this to the design. Two things should be noted.
- There cannot be explicit version-specific upgrade actions; upgrades...
Document support for automatic downgrades
The recommended way of downgrading a cluster from 2.11 onwardsis to use the ``gnt-cluster upgrade`` command. Document this inthe section on downgrades.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>
Clean up epydoc comments
Add missing colons, and improve descriptions of parameters.
Signed-off-by: Hrvoje Ribicic <riba@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>
Use options for turning functionality on/off
Two command-line options are added: one for confirming that the testhas been started intentionally, and one for showing the methodinvocation output, which is useful, but not always needed.
Signed-off-by: Hrvoje Ribicic <riba@google.com>...
Add job cancellation workload
To examine if jobs can be cancelled correctly, provide workload relatedto this as well.
Add cluster parameter change workload
One of the few leftover unused RAPI methods is the cluster modifymethod. This patch tests it by setting and unsetting various safevalues.
Make an instance move workload that works in 2.6
The instance move workload present before this patch works on 2.11, butfails on 2.6. The 2.11 workload will still be useful should any laterversion of Ganeti use it as reference, but a 2.6 workload has to be...
Add instance move workload
Through the use of functions provided by the rapi QA, all the requestsrelated to instance moves can be exercised.
Make the move-instance tool more fault tolerant
The move-instance tool raises an exception when used with a clusterrunning an earlier version of Ganeti. As the tool is meant to performinter-cluster moves, this situation could be encountered outside the...
Allow the skipping of checks for inter-cluster move test
The inter-cluster instance move test is very interesting for the RAPIcompatibility tests, as it uses many RAPI requests that are otherwisehard to exercise. It uses no command-line functionality apart from...
Make the finish function return the error status explicitly
The earlier version of the Finish function assumed that checking if thevalue of the response is None would suffice to check if any errors haveoccurred. This is not, and this patch adjusts the function to expose...
Add migration and failover workload
This patch introduces additional calls adding migration and failoverRAPI operations, moving a DRBD-disk template instance between nodes.
Add tracking of used client methods
As a helper or a warning to anyone extending the RAPI client, theclient wrapper now warns of unused methods or method arguments.
Add network workload
This patch exercises the network RAPI commands.
Add miniature query filtering workload
As query filtering was not a part of the previous workloads, this patchadds a single example of its use.
Add per-resource query workload
The query requests are done to receive data about a certain resourcetype. With tests for all the resources barring networks in place, thequery workloads can be added at the point where the existence of enoughresources in the system can be confirmed, making the results of the...
Add group-related workload
This patch further extends the RAPI workload by exercising all thegroup-related functionality.
Add node-related workload
This patch further expands the workload by performing various nodeoperations.
Add warning about the RecreateInstanceDisks invocation
A test relying on RAPI alone cannot exercise the RecreateInstanceDisksfunctionality properly - simply because it cannot damage an instanceto the point where its disks would be missing and in need of...
Add various single instance operations
To further expand the number of RAPI methods in the workload, thesingle instance operations are added in this patch. An instance iscreated, deleted, shutdown, restarted, reinstalled, renamed, andhas its disks activated and deactivated and grown....
Add tag method testing
This patch adds a generic way to test tagging of various entities viaRAPI. More tags testing will be added as other entitt tests are added.
Add helper function that waits for jobs to finish
Some RAPI calls result in the creation of a long-running job,returning a job id to be used to extract the results later. To reducethe amount of boilerplate, introduce a simple function to do thewaiting....
Add simple retrieval operations to workload
This patch expands the RAPI workload with simple Get* commands.
Add the first version of the RAPI workload script
The RAPI workload script supplies work for the RAPI compatibilitytests. The initial version does very little, but can be expandedas needed.
Make the qa_rapi setup method return the RAPI client
Move RAPI secret lookup to qa_rapi
The RAPI secret lookup is a helper function used by the Ganeti QA toretrieve the RAPI password of an already setup cluster. As this couldbe useful to other utilities performing QA, move it to the qa_rapimodule.
Add code style document to documentation
The Ganeti code style has been stored on the project wiki at:
https://code.google.com/p/ganeti/wiki/StyleGuide https://code.google.com/p/ganeti/wiki/HaskellStyleGuide
This commit combines the two pages into an .rst file with minimal...
Correct exception when ssconf file does not exist
After an upgrade to 2.11, the ssconf file for the mastercertificates might not exist. Based on the non-existance,noded falls back to a compatibility mode regarding dealingwith SSL certificates. The check for the ssconf file...
Also downgrade gluster parameters
Support for gluster was added only in version 2.11. So,when downgrading to the 2.10 branch, these parametersneed to be removed.
Create client certificate for normal nodes
The vcluster QA revealed a bug in the SSL certificatehandling code, where certificates were only createdwhen the node is a master-candidate. However, every nodeshould have a certificate, but only the digests of the...
Also consider filter fields for deciding if using live data
If the query fields don't require live data, we use the shortcutand don't request live data. However, we cannot take this shortcutif the fields the filter depends on requires live data.
Catch exceptions when calling curses.setupterm() in QA
If it's running on a non-standard terminal, such asrxvt-unicode-256color, the call fails with an exception. Instead, catchthe exception and proceed without coloring warnings/errors.
Signed-off-by: Petr Pudlak <pudlak@google.com>...
Increase job queue polling interval
Now that all jobs are monitored with inotify, increase the polling interval.
After detecting a finished job, schedule again
In order to obtain a higher throughput of jobs, schedule new jobsas soon as a job was detected to have finished.
Attach a watcher for jobs
Add a function that can serve as an event handler for inotifyupdating a job in the job queue if the corresponding job filechanges. Also attach it to all jobs selected to be run.
JQScheduler: always pass JobWithStat
When attaching inotifies to jobs, we need to preserveit through potential requeuing actions. Also, this informationis needed for cleaning up.
Cleanup inotifies
When cleaning up finished jobs, remove the inotifyattached to them, if any.
Add an optional inotify to jobs in the scheduler
This provides the infrastructure to monitor running jobsby inotify, and hence update the queue promptly uponjob changes.
Make luxid handle SetDrainFlag
Make luxid also handle queries to drain the job queue.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Add RPC for setting the queue drain flag
As luxid is also responsible for handling requests to drain the job queue,we need the corresponding RPC in Haskell as well.
Fix sign in drain_flag request
The drain flag is set, if the queue is not open.
Eliminate installation modes in OS reinstalls doc
Eliminate installation modes in OS reinstalls design doc and insteadallow disk images and OS scripts to be combined, with an optionalvirtualized environment.
Reinstantiate inotify after a lost file
When watching a file, reinstantiate the inotify if notifiedof an event that removes the watch. Such events are likelyto happen, as our usual way to "modify" a file is to atomicallyreplace it by another one.
Improve debug-logging for watch file
Also log, at debug level only, when a change of a watchedfile was observed, but the change did not result in anychange of derived value.
Improve debugging by logging inotify events
At debug level, not only log that an inotify triggered,but also log the actual event.
Update design doc to match implementation
This patch contains some minor changes in the design docto make sure the details match the implementation.
Signed-off-by: Helga Velroyen <helgav@google.com>Reviewed-by: Hrvoje Ribicic <riba@google.com>
Update UPGRADE nodes
Adds to the upgrade nodes that a renewal of the nodecertificates is necessary.
Update NEWS wrt to client RPC certificates
This updates the NEWS file regarding the changes inRPC communication.
Verify client certificates
This patch adds a step to 'gnt-cluster verify' to verifythe existence and validity of the nodes' clientcertificates. Since this is a crucial point of thesecurity concept, the verification is very detailed withexpressive error messages and well tested by unit tests....
Verify incoming RPCs against candidate map
From this patch on, incoming RPC calls are checked againstthe map of valid master candidate certificates. If no mapis present, the cluster is assumed to be inbootstrap/upgrade mode and compares the incoming call...
Handle promoting/demoting nodes wrt to client certificates
This patch makes Ganeti correctly handle the clientcertificates when nodes get promoted to master candidatesor demoted to normal nodes.
Extend RPC call to create SSL certificates
So far the RPC call 'node_crypto_tokens' did only retrievethe certificate digest of an existing certificate. Thiscall is now enhanced to also create a new certificate andreturn the respective digest. This will be used in various...
Create client SSL certificates on cluster init
This patch makes Ganeti create a client SSL certificate forthe master node on cluster initialization. Note that some ofthe code in this patch is later moved into an LU to serverequirements for crypto renewal and updates, but for this...
Store candidate certificates in ssconf
This patch enables Ganeti to store the candidatecertificate map in ssconf. A utility function toread it is provided as well.
Handle client certificates on node add/remove
This patch adds the certificate of a newly added orreadded master candidate node to the map of master candidatecertificates. It removes a master candidate node's certificatedigest from the candidate certificate map if the node is...
Add certificate for master node
On cluster initialization, the master node'sSSL certificate digest is added to the list of mastercandidate certificates.
Add candiate certificate map to configuration
At the end of this patch series, incoming RPC calls arelegitimized against a map of master candidate nodes'SSL certificate digests. This patch adds the map itselfto the cluster's configuration.
Retrieve a node's certificate digest
In various cluster operations, the master node needs toretrieve the digest of a node's SSL certificate. For thispurpose, we add an RPC call to retrieve the digest. Thefunction is designed in a general way to make it possible...
Utility functions to manipulate the candidate map
This patch adds a couple of utility functions to manipulatethe map of master candidate SSL certificate digests.