Also consider filter fields for deciding if using live data
If the query fields don't require live data, we use the shortcut and don't request live data. However, we cannot take this shortcut if the fields the filter depends on require live data.
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
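A minimal sketch of the decision described in this commit, written in Python rather than the Haskell of the actual change. The function name and the field names are illustrative assumptions, not taken from the code base.

```python
def needs_live_data(query_fields, filter_fields, live_fields):
    """Return True if any requested field or any field used by the
    filter can only be answered with live data."""
    wanted = set(query_fields) | set(filter_fields)
    return bool(wanted & set(live_fields))


# Hypothetical field names: only "oper_ram" is assumed to need live data.
LIVE = {"oper_ram"}
print(needs_live_data(["name"], ["pnode"], LIVE))     # shortcut is safe
print(needs_live_data(["name"], ["oper_ram"], LIVE))  # filter forces live data
```

The point is that the shortcut is decided on the union of both field sets, not on the query fields alone.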
Increase job queue polling interval
Now that all jobs are monitored with inotify, increase the polling interval.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
After detecting a finished job, schedule again
In order to obtain a higher throughput of jobs, schedule new jobs as soon as a job was detected to have finished.
Attach a watcher for jobs
Add a function that can serve as an event handler for inotify, updating a job in the job queue if the corresponding job file changes. Also attach it to all jobs selected to be run.
JQScheduler: always pass JobWithStat
When attaching inotifies to jobs, we need to preserve them through potential requeuing actions. Also, this information is needed for cleaning up.
Cleanup inotifies
When cleaning up finished jobs, remove the inotify attached to them, if any.
Add an optional inotify to jobs in the scheduler
This provides the infrastructure to monitor running jobs by inotify, and hence update the queue promptly upon job changes.
Make luxid handle SetDrainFlag
Make luxid also handle queries to drain the job queue.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Add RPC for setting the queue drain flag
As luxid is also responsible for handling requests to drain the job queue, we need the corresponding RPC in Haskell as well.
Fix sign in drain_flag request
The drain flag is set, if the queue is not open.
Reinstantiate inotify after a lost file
When watching a file, reinstantiate the inotify if notified of an event that removes the watch. Such events are likely to happen, as our usual way to "modify" a file is to atomically replace it by another one.
Improve debug-logging for watch file
Also log, at debug level only, when a change of a watched file was observed, but the change did not result in any change of the derived value.
Improve debugging by logging inotify events
At debug level, not only log that an inotify triggered, but also log the actual event.
Verify client certificates
This patch adds a step to 'gnt-cluster verify' to verify the existence and validity of the nodes' client certificates. Since this is a crucial point of the security concept, the verification is very detailed with expressive error messages and well tested by unit tests....
Verify incoming RPCs against candidate map
From this patch on, incoming RPC calls are checked against the map of valid master candidate certificates. If no map is present, the cluster is assumed to be in bootstrap/upgrade mode and compares the incoming call...
Extend RPC call to create SSL certificates
So far the RPC call 'node_crypto_tokens' did only retrieve the certificate digest of an existing certificate. This call is now enhanced to also create a new certificate and return the respective digest. This will be used in various...
Store candidate certificates in ssconf
This patch enables Ganeti to store the candidate certificate map in ssconf. A utility function to read it is provided as well.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Add candidate certificate map to configuration
At the end of this patch series, incoming RPC calls are legitimized against a map of master candidate nodes' SSL certificate digests. This patch adds the map itself to the cluster's configuration.
Signed-off-by: Helga Velroyen <helgav@google.com>...
Retrieve a node's certificate digest
In various cluster operations, the master node needs to retrieve the digest of a node's SSL certificate. For this purpose, we add an RPC call to retrieve the digest. The function is designed in a general way to make it possible...
Merge branch 'stable-2.10' into master
break line longer than 80 chars
hsqueeze: tag nodes before offlining them
hsqueeze is supposed to tag nodes before powering them down, so that it can later recognize which nodes can be activated. When showing the commands to execute, also add the tagging commands.
hsqueeze: only consider nodes that are not secondaries
If an instance has a secondary node, it cannot be easily moved to every node (in the same node group), as otherwise no node would be distinguished as secondary. As hsqueeze should only consider nodes where moving the instances away...
Gluster: add the Shared File storage type
The shared file and gluster disk templates should not report their disk space information like file does, because they do not behave the same.
If a cluster pulls from the same, shared source of storage then it is...
Gluster: add userspace access support
Add support for the QEMU gluster: protocol. Also change the access mode routines so they check the access parameter for all templates.
Signed-off-by: Santi Raffa <rsanti@google.com>
Signed-off-by: Thomas Thrainer <thomasth@google.com>...
Gluster: mount automatically
Add parameters to the Gluster disk template so Gluster can manage the mount point autonomously.
Signed-off-by: Santi Raffa <rsanti@google.com>
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Gluster: use ssconf value for mountpoint directory
Gluster still does not mount anything autonomously, but this commit changes where Gluster expects its mountpoint to be.
ssconf: Add Gluster mount directory
This commit adds the gluster storage directory to ssconf (without actually using its value just yet).
Gluster: minimal implementation
Add Gluster to Ganeti by essentially cloning the shared file behaviour everywhere in the code base.
Implement fields query for instance
Support the query for the fields available for instances.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Remove the hvsGlobals from instance query fields
...to be consistent with the python implementation.
When interpreting [] as "all fields", sort nicely
When asked for all fields, we promise to return the list of fields sorted according to niceSort. Keep this promise.
Fix race in watchFile
As the calling of watchFile and the evaluation of the initial getFStatSafe takes non-zero time, the file could have changed before inotify was set up properly. Solve this problem by an additional check for the watched value to have changed immediately...
Merge branch 'stable-2.9' into stable-2.10
Use a data type when generating Python types of OpCodes
Currently they are generated only as Strings.
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Refactor OpCodeDescriptor from a tuple to a data type
This greatly enhances code readability.
Also fix monadic types "Q ExpQ" [which is "Q (Q Exp)"] to "Q Exp".
Add showValueList to PyValue for proper String instances
It's the same trick ShowS uses. We add a type class function for showing a list to PyValue and then just use it in the instance for `[a]`. This way we have the proper String instance without any overlapping/incoherent instances....
Rename PyValueInstances.hs to PyValue.hs
Now the file contains the type class declaration as well.
Move PyValue into PyValueInstances.hs, import it in THH.hs
This puts all PyValue code into one module, getting rid of orphan instances.
Make the duration field optional null-serialized
The time in SetWatcherPause is optional (with Nothing meaning that the pause should be canceled), but the serialization is not that of a Maybe Double; instead Just values serialize as they are and Nothing serializes to null. Fortunately, we already...
Handle QueryConfigValues
Make luxid handle the QueryConfigValues call providing certain simple status information about the cluster.
Add a predicate for watcher pause
Add a predicate, in IO, to test whether the watcher is paused.
Provide path to watcher pause file
Extend Path.hs to also provide the path to the file indicating whether the watcher is paused.
Implement SetWatcherPause in luxid
Make luxid handle SetWatcherPause correctly.
Add the RPC-call set_watcher_pause
With luxid taking over responsibility for handling watcher-pause requests, it needs to know about this RPC. So have it available in Haskell as well.
The time field for SetWatcherPause is optional
A JSON null value is used to indicate that the pause should be canceled.
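The wire format described here can be sketched in a few lines of Python: a present time is serialized as a plain number, an absent one as JSON null. The function names are hypothetical and only illustrate the encoding, not the Haskell implementation.

```python
import json


def encode_pause_until(until):
    """Serialize the optional pause time: a number as-is, None as null."""
    return json.dumps(until)


def decode_pause_until(text):
    """Inverse of encode_pause_until; null means 'cancel the pause'."""
    value = json.loads(text)
    if value is None:
        return None
    return float(value)


print(encode_pause_until(None))    # the cancel request
print(decode_pause_until("3600"))  # a pause with a concrete time
```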
Generate a separate return type for the job queue update RPC
The instantiation of RPC requires a bidirectional functional dependency between call type and return type. Hence we cannot use Unit everywhere.
Avoid lines longer than 80 chars
...as they're a lint error.
Merge branch 'stable-2.8' into stable-2.9
Move the generalized IO client from Luxi to UDSServer
No code is changed in this patch (except imports and qualifiers), only moved.
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Generalize the IO client handling in Luxi
... to be usable for WConfd as well. A daemon handler is encapsulated into a `Handler` data type, which is then passed to a generic `listener`.
The changes are done in Luxi.hs so that the differences are visible and...
Add MonadLog instance for `ReaderT r m`
This allows logging to be used with the ReaderT monad transformer.
Add a MonadLog typeclass for monads that allow logging
This separates logging from IO, allowing unit tests to be created in the future for functions that use it.
Add fromJResultE and fromJVal that uses MonadError
Using MonadError is more correct than just "fail" on an arbitrary monad, and more scalable when using monad type classes or monad stacks.
Add an Error instance for GanetiException
This allows it to be used with MonadError.
Add MonadPlus and MonadError instances for GenericResult
... and ResultT.
While at it, also generalize the MonadPlus instance of GenericResult and add some Functor/Applicative instances.
Generalize "validateCall" to be usable outside LUXI
Return the method (as any instance of JSON) and the arguments of a call.
Add the Unix domain socket path to the Server data type
This simplifies code for closing such a socket.
Encapsulate a server socket and its parameters
Instead of passing a bare server socket around, we pass it encapsulated in a data type together with parameters such as read/write timeouts.
Rename getClient/Server to getLuxiClient/Server
Later they will be split into LUXI-specific and general parts.
Split Luxi.hs into LUXI-specific functions and general ones
This will allow WConfD to use the general functions without importing Luxi.hs.
Make luxid support WaitForJobChange
Make luxid support WaitForJobChange, waiting for a job to change on certain monitored fields.
Add a generic function capable of watching a file
Add a method to return the new value of a function if it changes within the given timeout. If not, return the old value. Make use of the fact that the function only changes if the specified file changes on disk....
Add a safe version of getFStat
The function getFStat causes an IOError if the file to be stated does not exist. In some cases, however, the only thing we care about is whether it has changed, with disappearing being a legitimate change. So add a wrapper that catches the IOError and returns nullFStat....
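The same idea, sketched in Python: wrap the stat call, map a missing file to a fixed sentinel, and compare stat tuples to detect change. `NULL_FSTAT`, the tuple layout and the function names are assumptions made for illustration; they only mirror the roles of nullFStat and the safe getFStat.

```python
import os

# Hypothetical sentinel playing the role of nullFStat:
# (modification time, inode, size) of a file that does not exist.
NULL_FSTAT = (-1.0, -1, -1)


def get_fstat_safe(path):
    """Stat a file, returning NULL_FSTAT instead of raising if it is gone."""
    try:
        st = os.stat(path)
    except OSError:
        return NULL_FSTAT
    return (st.st_mtime, st.st_ino, st.st_size)


def has_changed(old, new):
    """A disappearing (or reappearing) file counts as a change."""
    return old != new
```

With this wrapper, a caller that only wants to know whether a file changed never has to handle the missing-file case separately.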
Make luxid inspect the job queue on startup
Since luxid handles the scheduling, make luxid also read the queue upon restart. In this way, jobs get scheduled in the same way, independent of luxid restarts.
Add a predicate to determine if a job has been started
Add a predicate on jobs indicating that a job has left the queue. This will be needed to allow restarts of luxid (which now handles the queue) independent of jobs (currently running in masterd).
Export getFStat from Utils
Use the jobFinalized predicate in JQScheduler
...to improve readability.
Provide a function to determine whether a job is finalized
While there is a function to calculate the job status, sometimes it is only relevant if the job is finalized. In this case, it is more readable not having to know the internal order of JobStatus....
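A small Python sketch of why a named predicate reads better than an ordering comparison. The status strings are the ones Ganeti reports for jobs; the function name and the set-based formulation are assumptions and not the Haskell definition.

```python
# Statuses after which a job no longer changes.
FINALIZED_STATUSES = frozenset(["canceled", "success", "error"])


def job_finalized(status):
    """True if a job with this status will not change any more."""
    return status in FINALIZED_STATUSES


# Callers ask the question directly instead of comparing against
# "running" and relying on how the statuses happen to be ordered.
print([s for s in ("queued", "running", "success") if job_finalized(s)])
```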
Fix evacuation out of drained node
Refactor reading live data in htools
This simplifies different handling of individual items.
Cherry-picked from 8c72f7119f50a11661aacba2a1abffdfdc6f7cfa.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Don't assume we win the archive race
The job scheduler in luxid regularly watches for changes of the job files to determine progress of jobs. As these files are updated atomically, reading them will always succeed---until they're archived. While luxid is quite...
ganeti-mond: Add the "-b" option to specify the bind address
This parameter was missing for this particular daemon and was requested in issue #629.
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Support size suffixes in minmem/maxmem backend parameters
The backend parameters specifying the minimal/maximal memory can also be passed as values with suffixes. Support parsing these values.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Support fieldRead in partial params
While both full and partial versions are generated from parameters, with all types mapped to Maybe in the partial version, the fieldRead parameter of the field was not wrapped accordingly. So far, that didn't matter, as it was always Nothing in this case, but for supporting special...
Make disk size a special numerical field
For disk sizes, instead of plain numbers (naming the value in MiB), also accept expressions with units like 'GiB'.
Add a field-transformer for accepting parser
Add a transformer for numerical fields, to also accept strings instead of numbers if they can be parsed by the given parser.
Add a new unit parsing function taking all suffixes binary
In Python, when parsing units (like disk sizes) we take all suffixes (M, G, T) as 1024-based. To be backwards compatible while moving job management to luxid, in particular on RAPI, add such a parsing function in Haskell as well....
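A hedged Python sketch of such a parser, treating every suffix as a power of 1024 and returning MiB. The function name, the exact set of accepted suffixes and the rounding to the nearest integer are assumptions made for this example; they are not taken from either implementation.

```python
import re

_UNIT_RE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([a-zA-Z]*)\s*$")

# Every suffix is read as 1024-based; the result is expressed in MiB.
_FACTORS = {
    "": 1, "m": 1, "mb": 1, "mib": 1,
    "g": 1024, "gb": 1024, "gib": 1024,
    "t": 1024 * 1024, "tb": 1024 * 1024, "tib": 1024 * 1024,
}


def parse_unit_binary(text):
    """Parse a size such as '4G' or '1.5 GiB' into an integer MiB count."""
    match = _UNIT_RE.match(text)
    if not match:
        raise ValueError("cannot parse size %r" % text)
    number, suffix = match.groups()
    try:
        factor = _FACTORS[suffix.lower()]
    except KeyError:
        raise ValueError("unknown unit %r" % suffix)
    return int(round(float(number) * factor))
```

Note that under this reading "GB" and "GiB" give the same result, which is what keeps old inputs behaving as before.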
Make JQScheduler handle failure on job starting
Given that luxid (at the moment) connects to masterd for starting jobs, it may be that this inter-process communication fails. In this case, just reschedule the jobs instead of killing the scheduler thread....
fix typo in log message
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>
Differentiate watchers in luxid
luxid has two time-based watcher threads, one for the configuration, and one for the job queue. To improve readability of the debug output, make the two watchers use different debug messages when the timer fires.
Make luxid use the JQScheduler
Make luxid use the job scheduler instead of immediately starting every received job.
Add a scheduler to keep track of the job queue
In order to allow informed decisions on when to start a job, it is necessary for luxid to keep track of the (active part of the) job queue. Add a scheduler, similar to the config reader, that does this, but also schedules jobs to be executed. At the...
Move FStat related function to Utils
In this way, the functions to decide, based on fstat, whether a file needs to be reloaded can be used by other parts as well, in particular to monitor progress in the job queue.
Rename enqueueJobs to startJobs
This reflects better what the method actually does. Later, we will add a job scheduler that will provide a proper enqueue method.
Add default_iallocator_params cluster parameter
Add a cluster parameter to hold the iallocator parameters used by the default instance allocator. Implement the option to modify config.data, query config.data and upgrade man pages, tests and cfgupgrade tool. The new default_iallocator_params is...
Modify --mond to yes|no option
Modify the --mond option used by hail, hbal and hinfo from a non-argument option to a yes|no option.
Signed-off-by: Spyros Trigazis <strigazi@gmail.com>
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Activate QA for rapi queries via luxi
This patch enables QA testing for rapi queries for the newly transformed queries from python to haskell (groups, instances, nodes, export, and networks). So far, the QA did not distinguish between resources that cannot be...
Set the received time stamp for new jobs
Since luxid now handles the job submission requests, it is also its responsibility to set the received time stamps. Do this.
Provide a function to set the received time stamp of a job
This is the pure function for changing the received time stamp; obtaining the actual time stamp has to be done in IO.
Document the jobqueue timestamp format
...and also provide a method to get the current time in that format.
Fix removal of duplicates
Commit ede6df3d02 introduced a bug in the node queries where disk templates were paired up wrongly to their storage unit keys due to removal of duplicates at the wrong place. This patch fixes it.
Fix retrieval of number of instances of a node
This patch fixes a FIXME to make the retrieval of the number of primary and secondary instances share more common code.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Use hypervisor / storage information only when requested
So far, the node queries ignored the list of fields and just requested all available information from the backend. That means, for example if only hypervisor information is requested, still the storage space calculation is...
Remove duplicate storage units in node query
This is a little performance tweak for the node queries. So far, the query code mapped the disk templates to storage units. It could happen that two disk templates were mapped to the same storage unit and therefore the storage space...
eta-reduce isIpV6
This is not only better style, but also fixes a lint error. Also use the infix form of `elem` to increase readability.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Ganeti.Rpc: use brackets for ipv6 addresses
We detect an IPv6 vs V4 address based on colons, rather than passing the family from the cluster object, to be more future proof (in case we'll ever support mixed clusters).
Unfortunately quite a bit more code is required to test this: we need an...
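The detection rule from this commit is simple enough to sketch in Python: an address containing a colon is taken to be IPv6 and is wrapped in brackets when it is placed in a URL. The function names, the URL layout and the port and path used below are illustrative assumptions and not the code in Ganeti.Rpc.

```python
def is_ipv6(address):
    """Treat any address containing a colon as IPv6."""
    return ":" in address


def host_for_url(address):
    """Bracket IPv6 literals so the port separator stays unambiguous."""
    return "[%s]" % address if is_ipv6(address) else address


def rpc_url(address, port, path):
    """Build an https URL for a (hypothetical) RPC endpoint."""
    return "https://%s:%d/%s" % (host_for_url(address), port, path.lstrip("/"))


print(rpc_url("192.0.2.10", 1811, "version"))
print(rpc_url("2001:db8::1", 1811, "version"))
```

Because the check looks only at the address itself, it keeps working if nodes of both families ever coexist in one cluster.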
Make luxid job submission be defined by replication
When receiving jobs to be submitted, make luxid replicate them to all master candidates and then return. The actual execution can be handled asynchronously.
Add function to enqueue jobs
Add a function that ensures that a given set of jobs gets executed at the appropriate time. At the moment, this is still the simple mechanism of handing over everything to masterd; but even at this stage, it has the benefit of allowing to remove code duplication in...
Add a function justBad to filter the Bad value of a list
In the same way as justOk allows to filter the Ok values, add justBad to filter the Bad values. While there, simplify the definition of justOk.