Anchor OS reinstall design doc in Makefile and index
Add OS reinstall design doc to the list of design docs in the Makefile, otherwise it does not get compiled when modified, and add it also to the index page of the documentation, where all the other design docs are anchored....
When updating job queue, support virtual paths
When replicating parts of the job queue, allow for virtual paths in the RPC call. In this way, replication will also work correctly in a vcluster setup. Note that makeVirtualPath lives in IO, and hence cannot be part of the pure encoding...
Add a module to support virtual clusters
Virtual clusters are an efficient way to test how Ganeti behaves on a large cluster without requiring a large number of machines. Now that more tasks like job replication are done by luxid, provide that functionality in Haskell as well....
Move vcluster-related constants to Constants.hs
...as, in that way, they will also be available in Haskell, where job replication happens as well.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Merge branch 'stable-2.10' into stable-2.11
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Fix 'design-internal-shutdown' not being in a toctree
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Add 'design-2.11.rst' which kvmd and instance shutdown
Add 'design-2.11.rst' which kvmd and instance shutdown to the top-level documentation and Makefile.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Clarify spacing in record syntax
So far, our code base does not have a consistent way of spacing records. To work towards more consistency, add a recommendation to our style guide. We standardize on what seems most common in the Haskell world and also is the dominant form in our code...
Instance shutdown doc from draft to partially implemented
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Update NEWS entry about job scheduling
As the new run-time parameter about job scheduling is user visible, mention the changes to scheduling in the NEWS file.
Clean up luxidMaxRunningJobs
Now that the maximal number of jobs running in parallel is a run-time option, this magic constant is not needed any more.
Make the scheduler use the max_running_jobs config parameter
Use the run-time configuration to decide on the number of jobs scheduled for execution instead of using a hard-coded constant.
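The effect of such a limit can be sketched as follows. This is an illustration in Python, not the luxid (Haskell) scheduler code; the function `select_jobs_to_run` and its arguments are hypothetical, and only the name `max_running_jobs` is taken from the log.

```python
def select_jobs_to_run(queued, running_count, max_running_jobs):
    """Split the queue into (jobs to start now, jobs that stay queued).

    Hypothetical helper: the limit comes from the cluster configuration
    instead of a hard-coded constant, and queue order is preserved.
    """
    # Never go negative if more jobs are running than the limit allows,
    # e.g. right after the limit was lowered at run time.
    free_slots = max(0, max_running_jobs - running_count)
    return queued[:free_slots], queued[free_slots:]
```

Lowering the limit while jobs are running simply yields zero free slots until enough jobs have finished.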
Make configuration available to the scheduler
In this way, scheduling decisions can depend on the configuration of the cluster. At the moment, this is only the maximal number of jobs to be run in parallel, but in the future this will also include job filters....
Make max_running_jobs queryable
As we have introduced a new cluster parameter, it should also be visible when querying about the cluster configuration.
Add a command-line parameter for max_running_jobs
...so that this opcode parameter can become available for 'gnt-cluster modify'.
Add opcode parameter for the maximal number of running jobs
This parameter of OpClusterSetParams will allow setting the maximal number of jobs to be run simultaneously.
Add parameter max_running_jobs to the cluster configuration
This cluster-wide parameter will determine the maximal number of non-finalized jobs that may be in a non-queued state at the same time.
Simplify 'GetMasterInfo' RPC
RPC 'GetMasterInfo' returns several fields, namely, 'master_netdev', 'master_ip', 'master_netmask', 'master_node', and 'primary_ip_family', of which only the 'master_node' is actually used.
Link Xen instance shutdown design doc with KVM's
Update instance shutdown for Xen design document by linking it to the design document for the KVM daemon and also improve the description of some paragraphs.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>...
Implement job cancellation in luxid
As luxid handles the job queue, this daemon is the natural place to handle job cancellation. Answering to CancelJob requests is also necessary for luxid to be feature compliant with masterd, even for command-line requests only....
Provide a function to compute the canceled version of a job
When a job gets canceled while still queued, dequeuing requires luxid to mark it as cancelled. So provide the necessary pure function to do so.
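What "pure" means here can be sketched in Python, although the actual function lives in the Haskell code base. The `Job` and `OpCode` records, the status string `"canceled"`, and the result text are assumptions made for illustration only.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class OpCode:
    status: str
    result: Optional[str] = None


@dataclass(frozen=True)
class Job:
    job_id: int
    ops: Tuple[OpCode, ...]


def cancel_queued_job(job):
    """Return a canceled copy of a queued job without touching the input.

    No I/O happens here; writing the new job state to disk and replicating
    it would be the caller's responsibility.
    """
    canceled_ops = tuple(
        replace(op, status="canceled", result="Job canceled by request")
        for op in job.ops
    )
    return replace(job, ops=canceled_ops)
```

Because the records are frozen, the original job value stays valid, which keeps the function easy to test in isolation.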
Support canceling dequeued jobs
Even after jobs have been handed over for execution, it might still be possible to cancel them. One such case would be the job still waiting for a lock. Eventually, we will have to communicate to the job directly, but as long as execution is...
Add dequeuing to the job scheduler
This only removes queued jobs from the queue and indicates whether the job was found in the queue. For jobs that are already started from the queue's point of view, it might still be possible to cancel them, e.g., if they are still waiting for locks....
Add certificate of auto-promoted master candidates to map
When a normal node is auto-promoted to be a master candidate, its SSL client certificate digest needs to be added to the map of candidate certificates as well.
Signed-off-by: Helga Velroyen <helgav@google.com>...
Fix Kvmd imports for Ubuntu 13.04 64
Unit tests for KVM daemon
Add unit tests for KVM daemon.
QA for KVM instance shutdown
Add QA for instance shutdown for KVM.
Manpage for 'gnt-instance'
Modify manpage for 'gnt-instance' detailing the 'user_shutdown' parameter and how it relates to the 'acpi' parameter.
Manpage for KVM daemon
Add manpage for the KVM daemon.
Hook KVM hypervisor with KVM daemon shutdown files
User shutdown hypervisor parameter
Add user shutdown parameter for KVM. Based on this parameter, decide what information to report for a KVM instance, for example, distinguish between 'ADMIN_down' and 'USER_down'.
Add helper function to tell if a daemon is alive
Add helper function 'utils.IsDaemonAlive' to tell if a daemon is alive by name. This function will be necessary for the KVM hypervisor to determine if the KVM daemon is running and otherwise start it.
Add KVM daemon daemonize
Add KVM daemon entry point, command-line options, backgrounding, etc.
Add KVM daemon logic
Add KVM daemon logic, which contains monitors for Qmp sockets and directory/file watching.
Generalize and reuse Unix domain sockets
Refactor module 'Ganeti.UDSServer' so the KVM daemon can reuse code declared in this module to handle Unix domain sockets.
KVM daemon datatype, user and group
Fix whitespace
Fix whitespace in several modules.
Fix according to the Ganeti style guide
Fix docstring for 'AsyncStreamServer'
Document automatic actions taken at upgrade
When upgrading from any version below 2.11 to 2.11 or higher, Ganeti will generate new RPC client certificates when upgrading with ``gnt-cluster upgrade``. Document this behavior in the UPGRADE notes to avoid user surprises....
Add generating node certificates as post-upgrade task
While, technically, Ganeti is still working without individual node certificates, it is considered an error by gnt-cluster verify to not have it done immediately after upgrading. So, to make automatic...
Add utility to compare versions
This will be needed, e.g., for post-upgrade tasks, as they have to decide whether a feature was not yet present at the version started from.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
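The kind of comparison a post-upgrade task needs can be sketched as below. The helper names `parse_version` and `version_below` are hypothetical and are not claimed to match the utility added by this commit; the sketch compares only the major and minor components.

```python
def parse_version(text):
    """Turn a dotted version such as "2.10" or "2.10.3" into (major, minor)."""
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def version_below(started_from, feature_version):
    """Tell whether the version we upgraded from predates a feature.

    Comparing integer tuples avoids the string-comparison trap where
    "2.9" would sort after "2.10".
    """
    return parse_version(started_from) < parse_version(feature_version)
```

A post-upgrade task could then run only if `version_below(old_version, "2.11")` holds, for example to generate certificates that did not exist before 2.11.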
Merge branch 'stable-2.10' into master
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
Run postupgrade hook after upgrade
To allow for necessary last-moment adaptations of the new cluster, we run the post-upgrade hook of the target version, providing the version we originally started from.
Provide path to post-upgrade
Add an empty post-upgrade hook
As 2.10 is the first version from which you can do automatic upgrades, there is nothing to do when going to any other version in the 2.10 branch.
design: support post-upgrade hooks
While the general policy for Ganeti is to just accept the situation it finds after being upgraded from an older version, in some cases additional actions might be necessary. So support a hook for doing so.
Also add the current version to the intent-to-upgrade file
Our design states that the intent-to-upgrade file contains "the current version of ganeti, the version to change to, and the process ID". Make the implementation fit with that design.
admin.rst: update and reword disk template section
The disk template section was not updated for Gluster. This commit also refactors the section slightly by unifying the different remarks about /etc/ganeti/file-storage-paths.
sphinx_ext is also changed in order to not hardcode too much...
Design document for KVM daemon
Design document for KVM daemon which is needed by the instance shutdown detection for KVM.
Improve the point-free section of the style guide
Distinguish declaring functions in the point-free style and using a very similar technique to avoid parentheses (which isn't technically point-free).
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Document 2.11 to 2.10 specific downgrade tasks
While the recommended way of downgrading from version 2.11 to 2.10 is ``gnt-cluster upgrade --to 2.10``, manual downgrade is supported. So, the version-specific steps need to be documented in the UPGRADE notes....
Remove certification on 2.11 to 2.10 downgrade
While version 2.10 ignores any leftover client certificates, their presence will prevent the cluster from working after an upgrade back to version 2.11 again. So we have to remove them right at the downgrade.
Add support for version-specific downgrade tasks
Upgrading can have no specific knowledge about additional tasks besides upgrading the configuration, as upgrades need to be able to go to any future version (within the same major version). Downgrading, however, is version specific and always...
design: version-specific downgrade actions
Some new features, like client-specific ssl certificates, require additional steps at downgrade, so add this to the design. Two things should be noted.
- There cannot be explicit version-specific upgrade actions; upgrades...
Document support for automatic downgrades
The recommended way of downgrading a cluster from 2.11 onwards is to use the ``gnt-cluster upgrade`` command. Document this in the section on downgrades.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Clean up epydoc comments
Add missing colons, and improve descriptions of parameters.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Use options for turning functionality on/off
Two command-line options are added: one for confirming that the test has been started intentionally, and one for showing the method invocation output, which is useful, but not always needed.
Signed-off-by: Hrvoje Ribicic <riba@google.com>...
Add job cancellation workload
To examine if jobs can be cancelled correctly, provide workload related to this as well.
Add cluster parameter change workload
One of the few leftover unused RAPI methods is the cluster modify method. This patch tests it by setting and unsetting various safe values.
Make an instance move workload that works in 2.6
The instance move workload present before this patch works on 2.11, but fails on 2.6. The 2.11 workload will still be useful should any later version of Ganeti use it as reference, but a 2.6 workload has to be...
Add instance move workload
Through the use of functions provided by the rapi QA, all the requests related to instance moves can be exercised.
Make the move-instance tool more fault tolerant
The move-instance tool raises an exception when used with a cluster running an earlier version of Ganeti. As the tool is meant to perform inter-cluster moves, this situation could be encountered outside the...
Allow the skipping of checks for inter-cluster move test
The inter-cluster instance move test is very interesting for the RAPI compatibility tests, as it uses many RAPI requests that are otherwise hard to exercise. It uses no command-line functionality apart from...
Make the finish function return the error status explicitly
The earlier version of the Finish function assumed that checking if the value of the response is None would suffice to check if any errors have occurred. This is not the case, and this patch adjusts the function to expose...
Add migration and failover workload
This patch introduces additional calls adding migration and failover RAPI operations, moving a DRBD-disk template instance between nodes.
Add tracking of used client methods
As a helper or a warning to anyone extending the RAPI client, the client wrapper now warns of unused methods or method arguments.
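One way to track which client methods a workload has exercised is a thin proxy object. The sketch below is an assumption about how such a wrapper could be built, not the QA code itself; the class `MethodTracker` is hypothetical, and tracking of unused method arguments is left out.

```python
class MethodTracker:
    """Proxy that records which public methods of a client were looked up."""

    def __init__(self, client):
        self._client = client
        self._used = set()

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself, so
        # _client and _used never end up here.
        attr = getattr(self._client, name)
        if callable(attr):
            self._used.add(name)
        return attr

    def unused_methods(self):
        """Return the sorted names of public methods never accessed."""
        public = {
            name for name in dir(self._client)
            if not name.startswith("_")
            and callable(getattr(self._client, name))
        }
        return sorted(public - self._used)
```

At the end of a workload run, a non-empty `unused_methods()` result can be turned into a warning listing the methods that still lack coverage.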
Add network workload
This patch exercises the network RAPI commands.
Add miniature query filtering workload
As query filtering was not a part of the previous workloads, this patch adds a single example of its use.
Add per-resource query workload
The query requests are done to receive data about a certain resource type. With tests for all the resources barring networks in place, the query workloads can be added at the point where the existence of enough resources in the system can be confirmed, making the results of the...
Add group-related workload
This patch further extends the RAPI workload by exercising all the group-related functionality.
Add node-related workload
This patch further expands the workload by performing various node operations.
Add warning about the RecreateInstanceDisks invocation
A test relying on RAPI alone cannot exercise the RecreateInstanceDisks functionality properly - simply because it cannot damage an instance to the point where its disks would be missing and in need of...
Add various single instance operations
To further expand the number of RAPI methods in the workload, the single instance operations are added in this patch. An instance is created, deleted, shut down, restarted, reinstalled, renamed, and has its disks activated and deactivated and grown....
Add tag method testing
This patch adds a generic way to test tagging of various entities via RAPI. More tag testing will be added as other entity tests are added.
Add helper function that waits for jobs to finish
Some RAPI calls result in the creation of a long-running job, returning a job id to be used to extract the results later. To reduce the amount of boilerplate, introduce a simple function to do the waiting....
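A waiting helper of this kind typically polls the job status until it reaches a final state. The following sketch assumes a caller-supplied `get_status` callable and a set of final status strings; both are illustrative and do not reflect the actual RAPI client interface.

```python
import time

# Assumed names for final job states; adjust to the client in use.
FINAL_STATES = frozenset(["success", "error", "canceled"])


def wait_for_job(get_status, job_id, poll_interval=1.0, timeout=60.0):
    """Poll get_status(job_id) until the job is final, then return the status.

    Raises TimeoutError if the job has not finished within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status in FINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job %s still %s after %ss"
                               % (job_id, status, timeout))
        time.sleep(poll_interval)
```

Passing the status lookup in as a callable keeps the helper independent of any particular client object and makes it easy to test with canned status sequences.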
Add simple retrieval operations to workload
This patch expands the RAPI workload with simple Get* commands.
Add the first version of the RAPI workload script
The RAPI workload script supplies work for the RAPI compatibility tests. The initial version does very little, but can be expanded as needed.
Make the qa_rapi setup method return the RAPI client
Move RAPI secret lookup to qa_rapi
The RAPI secret lookup is a helper function used by the Ganeti QA to retrieve the RAPI password of an already set up cluster. As this could be useful to other utilities performing QA, move it to the qa_rapi module.
Add code style document to documentation
The Ganeti code style has been stored on the project wiki at:
https://code.google.com/p/ganeti/wiki/StyleGuide
https://code.google.com/p/ganeti/wiki/HaskellStyleGuide
This commit combines the two pages into an .rst file with minimal...
Correct exception when ssconf file does not exist
After an upgrade to 2.11, the ssconf file for the master certificates might not exist. Based on the non-existence, noded falls back to a compatibility mode regarding dealing with SSL certificates. The check for the ssconf file...
Also downgrade gluster parameters
Support for gluster was added only in version 2.11. So, when downgrading to the 2.10 branch, these parameters need to be removed.
Create client certificate for normal nodes
The vcluster QA revealed a bug in the SSL certificate handling code, where certificates were only created when the node is a master-candidate. However, every node should have a certificate, but only the digests of the...
Also consider filter fields for deciding if using live data
If the query fields don't require live data, we use the shortcut and don't request live data. However, we cannot take this shortcut if the fields the filter depends on require live data.
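The corrected decision can be sketched as a check over both sets of fields. The function and parameter names are hypothetical, and the field names in the usage note are only examples.

```python
def needs_live_data(requested_fields, filter_fields, live_fields):
    """Decide whether a query has to collect live data.

    The shortcut (skipping live data) is only valid when neither the
    requested output fields nor the fields used by the filter are live.
    """
    relevant = set(requested_fields) | set(filter_fields)
    return bool(relevant & set(live_fields))
```

For instance, a query that outputs only `name` but filters on a live field such as an operational state still has to fetch live data, which is the case the fix addresses.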
Catch exceptions when calling curses.setupterm() in QA
If it's running on a non-standard terminal, such as rxvt-unicode-256color, the call fails with an exception. Instead, catch the exception and proceed without coloring warnings/errors.
Signed-off-by: Petr Pudlak <pudlak@google.com>...
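A defensive wrapper around `curses.setupterm()` could look like the sketch below. The function name `terminal_supports_color` is hypothetical, and the broad `except` is a deliberate simplification for a QA helper where colors are optional.

```python
import sys


def terminal_supports_color(stream=sys.stdout):
    """Return True if curses reports color support, False on any failure."""
    try:
        import curses
        # Raises curses.error when the terminal type is unknown to terminfo.
        curses.setupterm(fd=stream.fileno())
        # tigetnum returns a negative value if the capability is absent.
        return curses.tigetnum("colors") > 0
    except Exception:
        # Unknown terminal, no curses module, or a stream without a file
        # descriptor: fall back to plain, uncolored output.
        return False
```

Callers can then emit escape sequences only when the function returns True and print plain text otherwise.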
Increase job queue polling interval
Now that all jobs are monitored with inotify, increase the polling interval.
After detecting a finished job, schedule again
In order to obtain a higher throughput of jobs, schedule new jobs as soon as a job was detected to have finished.
Attach a watcher for jobs
Add a function that can serve as an event handler for inotify updating a job in the job queue if the corresponding job file changes. Also attach it to all jobs selected to be run.
JQScheduler: always pass JobWithStat
When attaching inotifies to jobs, we need to preserve them through potential requeuing actions. Also, this information is needed for cleaning up.
Cleanup inotifies
When cleaning up finished jobs, remove the inotify attached to them, if any.
Add an optional inotify to jobs in the scheduler
This provides the infrastructure to monitor running jobs by inotify, and hence update the queue promptly upon job changes.
Make luxid handle SetDrainFlag
Make luxid also handle queries to drain the job queue.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Add RPC for setting the queue drain flag
As luxid is also responsible for handling requests to drain the job queue,we need the corresponding RPC in Haskell as well.
Fix sign in drain_flag request
The drain flag is set if the queue is not open.
Eliminate installation modes in OS reinstalls doc
Eliminate installation modes in OS reinstalls design doc and instead allow disk images and OS scripts to be combined, with an optional virtualized environment.
Reinstantiate inotify after a lost file
When watching a file, reinstantiate the inotify if notified of an event that removes the watch. Such events are likely to happen, as our usual way to "modify" a file is to atomically replace it by another one.
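The atomic-replace pattern mentioned above is what invalidates a watch on the old file. A generic sketch in Python, not Ganeti's own writer, with the hypothetical helper name `atomic_write`:

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` atomically with the text `data`.

    The data goes to a temporary file in the same directory, which is then
    renamed over the target. Readers see either the old or the new file,
    never a partial one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        # The temporary file still exists if writing or renaming failed.
        os.unlink(tmp_path)
        raise
```

After `os.replace`, the path refers to a different file than before, so a watch tied to the previous file no longer sees changes and has to be set up again on the new one.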
Improve debug-logging for watch file
Also log, at debug level only, when a change of a watched file was observed, but the change did not result in any change of the derived value.
Improve debugging by logging inotify events
At debug level, not only log that an inotify triggered, but also log the actual event.
Update design doc to match implementation
This patch contains some minor changes in the design doc to make sure the details match the implementation.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>