Add function to get master IP parameters from configuration
Add a function to extract the MasterNetworkParameters from the ConfigData. That will be needed to set up the master IP.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Add an object describing the master network parameters
This will be used in the RPC call to the node daemon asking it to set up the master IP address.
Use getMasterOrCandidates
...instead of replicating the functionality on the fly.
Add the compression tools parameter
This patch makes the myriad of changes necessary for the compression tool parameter to be added. The filtering of compression tools for suspicious entries has been added for this exact purpose.
Signed-off-by: Hrvoje Ribicic <riba@google.com>...
Make arbitrary compression tools work
We assume that the compression tools the user supplies use stdin and stdout for handling data, and that a switch is used to distinguish compression from decompression. This patch introduces these constraints by adding the invocation of these tools to the import-export daemon....
Disable protections against unknown compression types
Ganeti took care to restrict all possible compression invocations to the few options that were available. This patch strips away all of those, but does not allow any interesting and dangerous commands...
Extend offered compression types
This patch adds a few new types of compression supported by Ganeti: gzip-fast (gzip -1), gzip-slow (ordinary gzip), and lzop. gzip now becomes a shorthand for gzip-fast kept for compatibility.
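The shorthands above amount to a small lookup from compression-type name to command line. A minimal sketch of that mapping, assuming the command lines named in the message (the function name is hypothetical, not Ganeti's actual code):

```haskell
-- Hypothetical mapping of compression-type names to the command lines
-- described above; "gzip" stays as a backwards-compatible alias for
-- gzip-fast, and unknown names yield Nothing.
compressionCommand :: String -> Maybe [String]
compressionCommand "gzip"      = compressionCommand "gzip-fast"
compressionCommand "gzip-fast" = Just ["gzip", "-1"]
compressionCommand "gzip-slow" = Just ["gzip"]
compressionCommand "lzop"      = Just ["lzop"]
compressionCommand _           = Nothing
```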
Export RPC functions for temp. DRBD reservations in WConfd
These functions will replace the methods in config.py.
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Add the state of temporary DRBD reservations to WConfd
.. and the corresponding functions for reading/modifying them.
The modification functions are somewhat more complex, because they need to support that the modification function uses ConfigData and can...
Utility function for modifying an IORef using a lens
.. and a supplied function that works inside the lens.
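As a rough illustration of the idea (not Ganeti's actual implementation, which builds on its own lens module), a minimal sketch using only base:

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.IORef (IORef, newIORef, readIORef, atomicModifyIORef')
import Data.Functor.Identity (Identity(..))

-- A van Laarhoven lens; real code would use a lens library instead.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Apply a function to the part of a structure focused by a lens.
over :: Lens' s a -> (a -> a) -> s -> s
over l f = runIdentity . l (Identity . f)

-- Modify an IORef's value using a supplied function working inside
-- the lens, leaving the rest of the structure untouched.
modifyIORefWithLens :: IORef s -> Lens' s a -> (a -> a) -> IO ()
modifyIORefWithLens ref l f = atomicModifyIORef' ref (\s -> (over l f s, ()))

-- Example lens focusing the first component of a pair.
fstL :: Lens' (a, b) a
fstL f (a, b) = fmap (\a' -> (a', b)) f a
```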
New module for temporary reservation of config. resources
This patch adds the first step, the reservation of DRBD minors.
A utility function for finding the first unused element
.. in a given set. This is similar to the FindFirst function in our Python code-base, but this one automatically picks the element after the end of the set, if the set has no holes.
Signed-off-by: Petr Pudlak <pudlak@google.com>...
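A naive version of such a function could look as follows (a sketch only; Ganeti's actual implementation differs, and the name is taken from the message):

```haskell
import qualified Data.Set as Set

-- First element >= base that is not in the set; if the set has no
-- holes above base, this yields the element just past its end.
findFirst :: (Ord a, Enum a) => a -> Set.Set a -> a
findFirst base s = head $ dropWhile (`Set.member` s) [base..]
```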
A function for listing the DRBD minors of an instance
This includes nested disk children.
Add DiskParams to Disk object
The 'DiskParams' slot was missing from Haskell's Disk objects. Since WConfd is now responsible for writing the config file, this was causing the 'params' slot to not be written in the config file.
Signed-off-by: Ilias Tsitsimpis <iliastsi@grnet.gr>...
Rename DiskParams to GroupDiskParams
DiskParams was used for the cluster/group disk parameters type. This patch renames it to GroupDiskParams and uses the DiskParams type for the parameters of one single Disk object.
Merge branch 'stable-2.11' into stable-2.12
Merge branch 'stable-2.10' into stable-2.11
Support restricted migration
Make hbal support an option to disallow ReplacePrimary moves and restrict ReplaceAndFailover to instances where the primary node is drained. If used in evacuation mode, the only migration moves will be off the drained nodes....
Add an option for restricted migration
This option will allow node evacuation with migrations only off the nodes to be evacuated.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
For RPCs, never log arguments
...to keep the log to a manageable size.
Limit size of request locking
...as we currently move the whole configuration over the network.
Lift the Disk objects from the Instances
This patch replaces 'instance.disks' with 'GetInstanceDisks' everywhere in the codebase. From now on, the function 'GetInstanceDisks' from the config file has to be used in order to get the disks of an instance. Also the functions 'AddInstanceDisk'/'RemoveInstanceDisk' have to be...
Use 'getInstDisks' function to retrieve the disks
Change Haskell's Query code to use Config's 'getInstDisks' function in order to retrieve the instance's disks.
Signed-off-by: Ilias Tsitsimpis <iliastsi@grnet.gr>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Implement getDisks in Confd
Add 'ReqInstanceDisks' request type and allow Confd to query for the disks of an instance. The helper function 'getInstanceDisks' returns the list of instances on the given node along with their disks and is used by the function 'addInstNameToLv'....
Add methods to config to get disks
'GetInstanceDisks' returns a list of disk objects for the given instance. 'GetDiskInfo' returns information about a disk given its UUID. These functions should be used instead of the Instance's disk method.
Also add the 'getDisk' and 'getInstDisks' functions in Haskell but leave...
Add timestamp/serial_no slot to disk objects
Now that disks are top level citizens in config, they need a timestamp and a serial_no slot.
Add disks entry to config.data
Add disks entry to config.data.
kvm: use a dedicated QMP socket for kvmd
The KVM daemon keeps a persistent connection to the instances' QMP sockets, listening for asynchronous events. As each monitor socket (either human, or QMP) can handle only one client at a time, this has the side-effect that QMP cannot be used for regular instance operations....
Retry forking a new process several times
Apparently due to some library bug, forking sometimes fails: The new process is running, but it doesn't start executing. Therefore we retry the attempt several times.
Add a module with utility functions for MonadPlus operations
In particular, functions for retrying a MonadPlus action: it is repeated until it returns a valid result.
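A minimal sketch of such a combinator, assuming the simplest possible interface (Ganeti's actual module offers more variants, e.g. with delays between attempts):

```haskell
import Control.Monad (MonadPlus, msum)

-- Repeat a MonadPlus action up to n times; the first attempt yielding
-- a valid result wins, and the whole computation is mzero if all fail.
retryN :: MonadPlus m => Int -> m a -> m a
retryN n act = msum (replicate n act)
```

For a pure monad such as Maybe this merely repeats the same answer; the combinator becomes useful for effectful instances, e.g. IO, whose MonadPlus falls through to the next attempt on an IOError.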
Merge branch 'stable-2.11' into master
Pass the debug level to forked jobs
When forking off jobs, make them inherit the debug level of the parent process (i.e., of luxid). In this way, we can debug jobs in test clusters without cluttering production logs. We pass the debug level through the environment instead...
Add support for disk native AIO mode for KVM
This patch adds support for native aio on the KVM hypervisor.
Basically, it adds a new HV-KVM optional parameter "disk_aio" that can be set with the following values: threads (the default for KVM) or native. If not set, it...
Fix a typo in a debug message
Check for own locks when checking job death in Luxi
Otherwise a job that is being started is falsely reported as dead.
Mark a job as failed, if it fails to start
.. and add a reason trail message. Otherwise failed jobs hang, never finishing.
htools metric: use weighted vcpu/pcpu ratio
...as described in doc/design-cpu-speed.rst
Add effective CPU overcommitment as derived node parameter
Add a derived parameter for nodes, providing the ratio of virtual CPUs per CPU-speed weighted physical CPU.
htools: support cpu_speed at luxi backend
Make the htools luxi backend also query for cpu_speed and take the result into account.
htools: add CPU speed to the text backend
Extend the text format by an optional column for each node containing the relative CPU speed, if provided.
htools: add function to set CPU speed
Add a function on nodes modifying the CPU speed parameter.
htools: extend Node by CPU speed
Add an additional parameter to the representation of a node for the relative CPU speed, initially set to 1.
Add a new node parameter cpu_speed
This parameter will describe the speed of the CPU relative to the speed of a "normal" node in this node group.
Add VTypeFloat
...in order not to have to declare floating point values as VTypeInt and rely on the sloppiness of the JSON specification to not distinguish between integers and floating point numbers.
When checking job death, check if its lock is the Luxi lock
In this case, the call trying to acquire a shared lock always succeeds, because the daemon already has an exclusive lock, which falsely reports that the job has died.
Provide more detailed messages when cancelling jobs
In particular, distinguish the cases when a job could not have been cancelled and when a job has already finished.
Add reason-trail entry on failing jobs
When failing a job, add an entry to the reason trail, indicating what made the job fail (e.g., failed to fork or detected job death).
Add lenses for OpCodes
...to simplify manipulation of them.
Add a prism for ValidOpCode
...to be able to operate on the MetaOpCode that is behind an InputOpCode (if we're in the right component of the sum).
Add lenses for the job queue objects
...so that manipulations deep within such an object become simpler.
Move the definition of JQueue objects to a separate file
Move all the object definitions to a separate file. In this way, the lens module for JQueue can use these objects, while JQueue can use the lenses. For use outside, we reexport the objects.
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
Export reasonTrailTimestamp
Use toErrorBase to slightly improve code in WConfd server
.. and get rid of unnecessary variable binding.
Clean up dead jobs from the job queue
Make the onTimeWatcher of the job queue scheduler also verify that all notionally running jobs are indeed alive. If a job is found dead, remove it from the list of running jobs and update the job file to reflect the unexpected death....
Cancel jobs by sending SIGTERM
We can only send the signal if the job is alive and if there is a process ID in the job file (which means that the signal handler has been installed). If it's missing, we need to wait and retry.
In addition, after we send the signal, we wait for the job to actually...
Add debugging statements to Ganeti.Utils.Livelocks
.. so that it can be seen which lock file was tested and with what result.
Enhance watchFile in Ganeti.Utils
The functionality is kept the same, but instead of comparing for equality, a more general version based on a predicate is added. This allows basing the condition on only a part of the output.
In addition, 'bracket' is added so that the inotify data structure is...
Add MonadLog instance for MaybeT
.. so that it's possible to use logging operations there.
When forking a job, close all unnecessary file descriptors
This is a bit problematic as there is no portable way to list all open file descriptors, and we can't track them all, because they're also opened by third party libraries such as inotify. Therefore we use...
Add a utility function for retrying within MonadError
`orElse` works just as `mplus` of ResultT, but it only requires `MonadError` and doesn't accumulate the errors; it just returns the second one, if both actions fail.
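The described semantics fit in a few lines (a sketch assuming the mtl-style MonadError class; Ganeti's real definition lives in its own error-handling modules):

```haskell
import Control.Monad.Except (MonadError, catchError, throwError)

-- Run the first action; on failure fall back to the second.
-- Unlike an error-accumulating mplus, only the second error
-- survives if both actions fail.
orElse :: MonadError e m => m a -> m a -> m a
orElse a b = a `catchError` const b
```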
When starting the Luxi daemon, check if it's able to fork
If a Haskell program is compiled with -threaded, then inheriting open file descriptors doesn't work, which breaks our job death detection mechanism. (And on older GHC versions even forking doesn't work.)...
Make luxid aware of SIGCHLD
As luxid forks off processes now, it may receive SIGCHLD signals. Hence add a handler for this. Since we obtain the success of the child from the job file, ignoring is good enough.
Execute jobs as processes from Luxi
.. instead of just letting the master daemon handle them.
We try to start all given jobs independently and requeue those that failed.
Add a function for failing a queued job
.. which will be used if the Luxi daemon attempts to start a job, but fails.
Add optional fields for job livelocks and process IDs
This will allow checking whether a particular job is alive, and sending signals to it while it's running.
The fields aren't serialized, if missing, for backwards compatibility.
Add utility function for creating fields with process IDs
.. using the POSIX type ProcessID.
Add Haskell and Python modules for running jobs as processes
They will be used by the Luxi daemon to spawn jobs as separate processes.
The communication protocol between the Luxi daemon and a spawned process is described in the documentation of module Ganeti.Query.Exec....
Add a utility function for writing and replicating a job
Use the function where appropriate.
Also, the handling of CancelJob is slightly refactored to use ResultT, which is used by the new function.
Add a livelock file for the Luxi daemon
The file is initialized and kept within JQStatus. It is temporarily assigned to jobs spawned by Luxi until they create their own livelock files.
Move `isDead` from DeathDetection to Utils/Livelock
.. as it has nothing special to do with WConfd and fits the new module better.
Add a module for livelock related functions
Currently it exports a function for creating livelock files.
Add functions for computing the full path of livelock files
.. so that Haskell code can create them at the proper place.
Allow closing a RPC client, keeping its file descriptors
The purpose is to keep the communication channel open, while replacing a 'Client' with something else.
Separate client and server config for Luxi communication
The daemon identity is only required for server connections to set the access mode to its socket appropriately. For client connections it's not needed and requiring it prevents creating standalone clients, for...
Extend 'lockFile' to return the file descriptor
.. of the locked file so that it can be closed later, if needed.
Allow creation of a bi-directional pair of Luxi-like clients
This allows a process and its forked child to communicate with each other using our standard infrastructure.
Separate read and write handle in the Luxi Client data type
This is required for inter-process pipes, which are fully supported only as uni-directional.
Merge branch 'stable-2.9' into stable-2.10
Set exclusion tags correctly in requested instance
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>
Export extractExTags and updateExclTags
...from the htools Loader. These functions are needed when parsing the requested instance of an allocator request.
Report non-existent jobs as such
When WaitForJobChange is queried for a non-existent job, report this as an error.
Fix typo
Clean up from LockAllocation what is no longer used
With the change from LockAllocations to LockWaitings, several manipulation operations had to be implemented for LockWaitings and became unused in LockAllocation. Remove these functions that are no longer used....
Use a LockWaiting structure instead of a LockAllocation
In this way, we will be able to support in WConfD waiting for locks to become available instead of having to poll for them.
Support opportunistically allocating locks
Again, this just wraps around updateLocks, sequentially trying to obtain all the locks mentioned.
Support intersecting locks
...in waiting structures. This is just a convenience wrapper around freeLocksPredicate.
Support downgrade and freeing locks
Add convenience functions for downgrading and freeing locks in a waiting structure. As these functions are guaranteed to always succeed, they will also drop any pending request of the owner.
Do not notify the current requester
The current implementation of lock waiting yields, as notification set, the list of all owners whose requests could be fulfilled. This includes the initiating request. While technically correct, the original requester gets the answer to the request and hence does not...
Fix a race in rescheduling jobs
When handling the queue, in particular at analyzing job dependencies, we assume that all non-finalized jobs are present in the Queue data structure. When rescheduling jobs we move them from the running part of the queue to the scheduled part again. In order to comply with the...
Instance JSON LockWaiting
...so that it can be used by WConfD, which needs to be able to persist all its data.
Instance JSON LockRequest
...as pending requests need to be serialized when serializing lock waiting structures.
Add a function to get a LockWaiting from an extRepr
Add a function to obtain some lock waiting structure from a given extensional representation.
Provide an extensional representation of LockWaiting
The internal representation of the lock waiting structure contains some arbitrariness: pending requests are arbitrarily keyed by one of the lock owners that blocks them. Therefore, LockWaiting is not an instance of Eq. To allow some meaningful form of comparison...
Export the set of pending requests
...from a lock waiting structure. In this way, all the data describing the behavior can be inspected.
Schedule only jobs where all job dependencies are finished
Jobs may depend on other jobs in the sense that they may only be started once a given job is finalized. For a job process, however, it is hard to determine the status of a different job without significant overhead....
Add function to obtain job dependencies
Add a function that computes the list of job ids a job depends on. This will allow scheduling for execution only those jobs where all the jobs they depend on have been finalized.
Add function to obtain job id from a dependency
Add a function that extracts the job id, if given in absolute form, from a job dependency.
Add process id to lock-owner description
...so that we can notify owners when their pending request got granted.