Let WConfd distribute the configuration to MCs
.. and remove the distribution from lib/config.py
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Add a new RPC server call for uploading a single file
The server side processes the request exactly the same as for "upload_file".
Unlike "upload_file", the new call "upload_file_single" declares all required fields without requiring additional preprocessing....
Add more meaningful error messages to asserts in vcluster
.. to simplify debugging of RPC calls.
Use correct lockfile for gnt-debug wconfd
As jobs are currently running in masterd, use the masterd livelock file.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Add utility to guess livelock file for an owner
As livelock files are constructed in a systematic manner, we can guess what the livelock file for a given owner is. While this will not necessarily work perfectly, it will be useful to simplify direct debugging of WConfD....
Make masterd create a livelock file
...so that it can request resources from WConfd.
Rename setup_queue to setup_context in masterd
...as this function sets up a much richer context than just the job queue, including the current lock management.
Add utilities for liveliness lock files
To request resources from WConfD, requesters have to provide the name of a file they own an exclusive lock on. In this way, their death can be detected. Add utility functions to obtain such a file name.
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
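The liveliness-lock idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the utility code added by the commit: the helper names `acquire_livelock` and `owner_is_alive`, the file naming scheme, and the use of `flock` are all hypothetical choices made for the example.

```python
import fcntl
import os
import tempfile


def acquire_livelock(directory, prefix):
    """Create a uniquely named file in `directory` and lock it exclusively.

    Hypothetical helper: the returned descriptor must stay open for as long
    as the owner wants to be considered alive.
    """
    fd, path = tempfile.mkstemp(prefix=prefix + "_", dir=directory)
    # The lock is released automatically when the process exits or dies.
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return fd, path


def owner_is_alive(path):
    """Return True if someone still holds the exclusive lock on `path`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Try to take the lock without blocking; failure means it is held.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

A checker that can take the lock itself concludes that the owner has gone away, which is the detection mechanism the commit message refers to.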
Ensure the existence of LIVELOCK_DIR
Add a path to store the lock files presented to WConfD
When requesting resources from WConfD, a file has to be presented on which an exclusive lock is held, so that WConfD can detect when the requester dies. Add a path to a directory where these files are kept....
Merge branch 'stable-2.11' into master
Convert int to float when checking config. consistency
When reading the configuration file from RPC JSON, values without a floating point are parsed as 'int', not as 'float', and later the consistency check fails.
This patch adds an automatic conversion from 'int' to 'float' during...
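The int/float mismatch described above is easy to reproduce with any JSON parser. The following is a minimal sketch, assuming a hypothetical helper `coerce_int_to_float`; where the real patch performs the conversion is not visible in the truncated message.

```python
import json


def coerce_int_to_float(value, expected_type):
    """Convert an int to float when a float is expected (hypothetical helper).

    bool is a subclass of int in Python, so it is excluded explicitly.
    """
    if (expected_type is float and isinstance(value, int)
            and not isinstance(value, bool)):
        return float(value)
    return value


# JSON has a single number type; a value written without a decimal point
# comes back as int in Python. The field name "ratio" is made up.
parsed = json.loads('{"ratio": 4}')
ratio = coerce_int_to_float(parsed["ratio"], float)
```

After the conversion, a type check that insists on 'float' accepts the value.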
Align timestamps in gnt-job info
This patch aligns the timestamps output as a part of gnt-job info, and performs minor refactorings in the process.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Add alignment support to PrintGenericInfo
Aligning dictionary entries makes no difference to a YAML parser, but makes the output much easier to read and compare. This patch adds the possibility of specifying alignment groups to ordered dictionary entries....
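The effect of aligning entries can be illustrated with a short sketch. It is not the PrintGenericInfo implementation; `format_aligned` is a hypothetical function, and the sample keys and values are made up. It pads after the colon so that all values of one group start in the same column.

```python
def format_aligned(entries):
    """Render (key, value) pairs as 'key: value' lines with aligned values.

    Hypothetical helper; padding is inserted after the colon, which a YAML
    parser ignores.
    """
    if not entries:
        return []
    width = max(len(key) for key, _ in entries)
    return ["%s: %s%s" % (key, " " * (width - len(key)), value)
            for key, value in entries]


lines = format_aligned([("Received", "t1"), ("Processing start", "t2")])
```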
Make gnt-job info output valid YAML
This patch changes gnt-job info to use standard functions defined in cli.py, and output valid YAML.
Make PrintGenericInfo handle tuples better
The PrintGenericInfo function in cli.py did not handle tuples as containers of items, making it impossible for these to be deserialized automatically when a YAML parser is used. This patch adds separate handling of tuples, including inlining them for readability when...
Make gnt-debug delay interruptible
The gnt-debug delay command could be useful as a means of acquiring locks for testing purposes. In practice, to be useful it should be interruptible, otherwise we risk race conditions or long delays.
This patch follows the examples of the move-instance command and the...
Add the interruptible option to gnt-debug delay
This patch allows the opcode option to be used through the gnt-debug client.
Factor Unix domain socket creation into helper class
As the delay class will also have to start using domain sockets, extract the functionality into a helper class.
Fix minor accidental concatenation
Handle incorrect duration more elegantly
The previous version of the LUTestDelay opcode relied on the utility function complaining about the negative duration. As this function has been removed for now, do the check ourselves, and issue a more appropriate exception....
Make gnt-debug delay command run in parallel
The gnt-debug delay command executes the delay first on the master, and only then on all the other nodes, causing a significant delay. This patch makes the command treat the master as it would all other nodes....
Fix typo in RAPI client utility
Remove duplicated '_CheckOSVariant'
It seems '_CheckOSVariant' was moved from 'ganeti.cmdlib.instance' to 'ganeti.cmdlib.instance_utils' but the source was never deleted. This patch deletes the source copy of this function.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>...
Add listlocks to gnt-debug wconfd
So that wconfd's locking can be debugged directly.
Stop watcher from restarting down instances during an opcode
This patch changes the watcher to check whether an instance that is down is also locked by some LU before attempting to restart the instance. Without checking the lock status, the watcher could think...
Remove unused import in rpc/transport.py
.. which got there by mistake.
Retry luxi/wconfd RPC calls if the connection is closed
Since the daemon can decide to close a client connection after a timeout, the client needs to be able to automatically reconnect.
This patch introduces this functionality into the RPC client: If an attempt to send data fails on 'Broken pipe', it's retried one more...
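The reconnect-and-retry behaviour described above might look roughly like the sketch below. It is written under assumptions: the class `ReconnectingClient`, its `connect` callback, and the transport's `call` method are hypothetical names, not the actual RPC client API, and the sketch retries exactly once.

```python
class ReconnectingClient:
    """Retry a call once if the connection turns out to be closed.

    Hypothetical sketch: `connect` is a callable returning a transport
    object with a `call(data)` method.
    """

    def __init__(self, connect):
        self._connect = connect
        self._transport = None

    def _ensure_transport(self):
        # Open the connection lazily, and again after it was dropped.
        if self._transport is None:
            self._transport = self._connect()
        return self._transport

    def call(self, data):
        try:
            return self._ensure_transport().call(data)
        except BrokenPipeError:
            # The daemon closed the connection (e.g. after a timeout):
            # discard the stale transport, reconnect, and retry once.
            # A second failure propagates to the caller.
            self._transport = None
            return self._ensure_transport().call(data)
```

Limiting the retry to a single attempt keeps a daemon that is really down from causing an endless reconnect loop.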
Allow cluster mac prefix modification
Extend LUClusterSetParams to allow the modification of the cluster mac-prefix setting in 'gnt-cluster modify' command.
This fixes part of issue 239.
Signed-off-by: Dimitris Bliablias <bl.dimitris@gmail.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Show mac prefix setting in gnt-cluster info
Include mac-prefix setting in the output of 'gnt-cluster info' command.
Setting correct permissions of client cert (split-user)
This patch makes sure that the client certificate gets the right permissions and owner when created. Additionally it enhances the 'ensure_dirs' script to correct the permissions in case they are broken for whatever reason....
Add a command to gnt-debug to test various aspects of wconfd
For debugging purposes, support direct communication to WConfD from the command line for some of its commands. For the time being, support the echo command.
Add some whitespace to fix formatting
Some error messages were lacking spaces between lines; add them to make the messages more readable.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Consider old client cert only when available
This fixes a bug which occurred only after upgrading from 2.10 to 2.11. During the cluster renew-crypto operation, Ganeti tries to include the old certificate in the candidate map while it is providing new certificates. This failed when there was no certificate...
Fix return of 'Validate'
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Add reason for job pickup to the trail
Add a new entry in the reason trail when a job is picked up by MasterD from the hard drive, after LuxiD put it there.
Note that the signature of NameToReasonSrc is changed in an incompatible way, although it's a public method because in this commit we also change its only...
Make the AddReason method public
It will need to be accessed from outside the class too in one of the next commits.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Let config.py use WConfd for reading/writing the config
Currently it only relays the reads/writes to the file to WConfd; everything else still remains in config.py.
Also if the 'ConfigWriter' is opened in "offline" mode (like in bootstrap.py), it doesn't use WConfd and resorts to the original...
Start WConfd temporarily during master failover
.. in order to update the configuration and distribute ssconf, before starting the daemons by the scripts.
Merge branch 'origin/stable-2.10' into stable-2.11
Signed-off-by: Hrvoje Ribicic <riba@google.com>...
Merge branch 'origin/stable-2.9' into stable-2.10
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Make gnt-debug locks display fake job locks properly
When a job is dependent on other jobs, a fake lock is created whose pending entry contains a list of job ids waiting on the job. gnt-debug locks did not expect the job ids to be ints, crashing when encountering...
Make NiceSort treat integers well
NiceSort is invoked on arrays that may contain strings, but in other situations can contain ints as well. As this surprisingly makes sense, add a tiny modification to make NiceSort work in these conditions.
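A natural-order sort that tolerates integers among the strings can be sketched as below. This is not the NiceSort code itself; `nice_sort_sketch` and its key function are hypothetical, and the sketch simply converts every item to a string before splitting it into digit and non-digit chunks.

```python
import re

_DIGITS = re.compile(r"(\d+)")


def _nice_key(value):
    """Build a sort key from alternating text and number chunks."""
    # str() is what lets plain ints take part in the comparison.
    parts = _DIGITS.split(str(value))
    key = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd positions are the captured digit groups.
            key.append((0, int(part), ""))
        else:
            key.append((1, 0, part))
    return key


def nice_sort_sketch(items):
    """Sort strings and ints so that embedded numbers compare numerically."""
    return sorted(items, key=_nice_key)
```

Tagging each chunk as number or text keeps the comparison well defined in Python 3, where ints and strings cannot be compared directly.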
Merge branch 'stable-2.10' into stable-2.11
Merge branch 'stable-2.9' into stable-2.10
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Merge branch 'stable-2.8' into stable-2.9
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Fix expression describing optional parameters
The NIC's network and vlan are also newly added, hence need to be considered optional to remain backwards compatible.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Let the instance's tuple of nodes start with the primary
Before, the tuple of nodes of an instance was created from a set, listing the nodes in alphabetical order. This patch ensures that the primary node is always the first one in the list.
Signed-off-by: Petr Pudlak <pudlak@google.com>...
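The ordering change described above can be pictured with a small sketch. The function `instance_nodes` is hypothetical and only illustrates "primary first, then the rest"; sorting the remaining nodes is an assumption made to keep the result deterministic.

```python
def instance_nodes(primary, secondaries):
    """Return the instance's nodes as a tuple starting with the primary.

    Hypothetical helper: duplicates are dropped and the remaining nodes
    are sorted (an assumption of this sketch).
    """
    rest = sorted(set(secondaries) - {primary})
    return (primary,) + tuple(rest)


nodes = instance_nodes("node3", ["node1", "node3", "node2"])
```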
Check the existence of system users and groups at bootstrap
Before, if any of these were missing, the creation of a cluster failed and the cluster remained in an inconsistent state, without the possibility to destroy it or to re-create it (#603).
This patch calls 'GetEnts' during bootstrap, which tries to read all...
Conflicts: lib/cmdlib/instance.py: manually apply 0973f9ed on...
Improve job status assert affected by race condition
In the sliver of time between choosing a waiting job to be executed and trying to acquire locks for its execution, the status of the job can be changed to canceling. An assert checking the job status neglected to...
Export and import Disk/NIC name
The names of Disks/NICs were not exported during backup until now. Use the exported info during gnt-backup import.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
Fix backup import in case NIC is inside a network
Network UUID is written in .ini file during backup export but is not used by _ReadExportParams(). This patch fixes it.
Please note that in case a network is given, link and mode should not be included in NIC options....
Override get() method of ConfigParser
During backup import/export SafeConfigParser() is used to save/restore the instance's configuration. If an export is done with a different Ganeti version, it is possible for a specific value not to be saved during export (e.g. the NIC/Disk name) but still...
Smooth renewal of client certificates
This patch fixes another chicken-and-egg problem which occurred when the node certificates get renewed. When renewing a node certificate, the previous certificate has to be used to update the configuration. To address...
Constant for instance communication network mode
Create a new constant to hold the instance communication network mode as this constant will be necessary during the QA, and update the general documentation about the constants related to the instance communication mechanism....
Add '-c | --instance-communication' flag to instance modify
Enable/disable instance comm via 'gnt-instance modify'
This patch adds the logic necessary to enable/disable the instance communication in a running instance via 'gnt-instance modify'. With instance communication enabled, the instance gets a new NIC that is...
Refactor instance comm NIC name creation
Refactor name creation for the NICs used in instance communication. These names are generated based on a prefix and the instance name. Also, these names must be unique within a single instance.
Fix copy of NIC objects to be consistent with the other call
... which can be found just right below in the same module.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Fix whitespace and typos in comments
Use node UUID as client certificate serial number
It turns out that some implementations of OpenSSL are more pedantic in checking the certificates than others. In this particular case, the SSL connection could not be established when the serial number of the certificates...
Revert "Disabling client certificate usage"
This reverts commit 45f75526b848, which was introduced to temporarily disable the implementation of SSL client certificates. As this patch series fixes the reason for the disabling, we are rolling back the patch....
Fix an ambiguity in the documentation for GetNodesSshPorts
This ambiguity was introduced by adding the WConfd client.
Add the Python client for WConfD
The client combines the abstract client class and the WConfD stub to provide a Python interface to WConfD.
Add an RPC Python client for generated stub classes
The client provides _GenericInvoke(...) for a stub and uses its _GetSocketPath() for opening a Transport.
Add a Python directory for RPC generated stubs
Directory "lib/rpc/stubs" will contain RPC stubs generated from Haskell.
Let RPC clients handle their socket address
.. instead of AbstractClient itself. Also let every client call _InitTransport() as needed. This allows socket addresses to be determined later than during the initialization of a class.
Add the WConfD daemon itself
The daemon exposes the declared functions in Ganeti.WConfd.Core to RPC clients (currently just 'echo').
Add the WConfD daemon to build configuration files
Also list it in the Haskell datatype, constants, Python constants and test configuration.
Rename some functions not to collide with opcode names
Rename some functions related to instance communication not to collide with the naming convention used in the opcodes.
Refactor instance communication network add and connect
Factor out the opcodes 'OpNetworkAdd' and 'OpNetworkConnect' used in 'LUClusterSetParams' and 'LUGroupAdd' in order to reduce code duplication and keep the configuration of the instance communication...
Connect new groups to the instance communication network
When a new group is added, if the instance communication network is enabled, then this group must also be connected to this network.
'gnt-cluster modify' with '--instance-communication-network'
Extend CLI 'gnt-cluster modify' with '--instance-communication-network'. Given that the return type for 'OpClusterSetParams' changed to optionally return a list of jobs, it is also necessary to handle the result of this opcode accordingly....
'LUClusterSetParams' creates the instance communication net
Extend 'LUClusterSetParams' to create the user-supplied instance communication network in case this network does not exist. Note that if the user-supplied network already exists, nothing needs to be done...
Check prereq instance communication network in 'SetParams'
Later, the logical unit for 'OpClusterSetParams' will be responsible for creating the instance communication network in case it does not exist. For now, it is important to check whether the network the user...
Add helper to handle CLIs that optionally spawn several jobs
This helper function detects whether an opcode returned a list of jobs (i.e., a result of the type ht.TJobIdListOnly) and in this case it uses 'ganeti.cli.JobExecutor' to wait for the jobs and determine the...
Instance comm network from config instead of predefined
Add 'instance_communication_parameter' to 'Cluster'
Remove the HTOOLS configuration variable
.. and update the code that uses it.
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Conflicts:
  lib/client/gnt_node.py: trivial
  src/Ganeti/Query/Query.hs: import ALL the functions
Gracefully handle queries for non-existing nodes
When adding a node, Ganeti checks whether the node is already part of the cluster by querying for the node name. However, as queries are meant to return all nodes with the given name, it might well return the empty list when a new node is to be...
Fix default for luxi clients in python
As masterd is going away, set default for all clients to luxid's socket.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Remove query option from RAPI client
As all RAPI requests now go to luxid, and masterd is going away, remove the option from the RAPI client to choose a different socket.
Remove query option from GetClient
As all luxi clients talk to luxid now, and masterd will go away, remove the option to use a socket different from luxid's.
Remove explicit reference to the query socket
Now that luxid's socket is the default socket anyway, do not pass the "query=True" parameter to GetClient. This will allow us to get rid of this keyword argument, as masterd will go away.
Make watcher use luxid socket only
With luxid being feature-complete with respect to masterd, make the watcher use its socket exclusively. This is also necessary, as masterd will go away soon.
Fix instance create and import parameters
Move OS parameter related constants to 'ganeti.cli' so they are used both by instance create and instance import from the CLI.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>...
Fix compatibility issues
Signed-off-by: Santi Raffa <rsanti@google.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
luxid: give stern warnings about debug mode
Luxid as it is can leak private and secret parameters by logging all requests as they arrive, before any preprocessing is done.
Give the user stern warnings about this.
Signed-off-by: Santi Raffa <rsanti@google.com>...
OpCodes: modify InstanceReinstall for private, secret params
Modify InstanceReinstall to accept and process private and secret parameters.