CheckBEParams handle a bool BE_AUTO_BALANCE
This only happens at cluster init, if the value is not user-specified.
Reviewed-by: imsnah
A few fixes related to master candidates
This patch:
  - fixes cluster verify when all nodes are master candidates, but the candidate_pool_size is higher
  - warns when the master node is not marked as candidate
  - disables setting master node to regular node...
Fix cluster rename and known_hosts
This patch rewrites and distributes ganeti's known_hosts file in case of a cluster rename.
We also fix a problem in the node add (from where I copied the known_hosts file distribution).
Reviewed-by: ultrotter
Fix gnt-cluster verify w.r.t. rpc changes
This partially reorganizes the cluster verify LU:
  - introduce constants for the node verify rpc call
  - move from additional rpc calls to a single rpc call, the call_node_info, which gathers all data needed...
Fix cluster rename
With the recent configwriter/ssconf changes, cluster rename becomes trivial. This patch gets rid of the code and just updates the cluster object.
Convert rpc results to a custom type
For a long time we had the problem that both RPC-layer errors and results from the remote node share the same "valuespace". This is because we shouldn't raise an exception when only one node failed (and lose the results from the other nodes)....
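The problem described above can be illustrated with a small wrapper type that keeps RPC-layer errors apart from node payloads. This is only a sketch under stated assumptions, not the code from the patch: the names `NodeResult`, `raise_if_failed` and `collect`, and the `(ok, value)` input shape, are hypothetical.

```python
class NodeResult(object):
    """Holds either the payload returned by one node or an RPC-layer error.

    Hypothetical illustration; the names are not taken from the patch.
    """

    def __init__(self, node, data=None, error=None):
        self.node = node
        self.data = data
        self.error = error
        self.failed = error is not None

    def raise_if_failed(self):
        """Raise only when the caller decides this node's failure is fatal."""
        if self.failed:
            raise RuntimeError("RPC to node %s failed: %s"
                               % (self.node, self.error))


def collect(raw):
    """Wrap per-node outcomes so one failure does not hide other results.

    raw maps node name -> (ok, value); this input shape is an assumption.
    """
    results = {}
    for node, (ok, value) in raw.items():
        if ok:
            results[node] = NodeResult(node, data=value)
        else:
            results[node] = NodeResult(node, error=value)
    return results
```

With such a type the caller can inspect `failed` per node and still use the data from the nodes that answered.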
burnin: add instance reinstall and reboot
These two operations were missing from burnin. The reboot is done with all valid modes (a new constant is added), and the reinstall is done both with and without specifying the OS (to account for the two code paths in the LU)....
KVMHypervisor add two missing 'constants.'
Some calls to the HV parameters were missing them.
KVMHypervisor fix to case misspellings
Use the new utils.CheckBEParams function
Where we used/forgot to validate beparams we now use the new common function.
Add utils.CheckBEParams
This function will be used in LUCreateInstance, LUSetInstanceParams, LUSetClusterParams and InitCluster to check the backend parameters validity and convert the relevant values to integer, without duplicating code. It lives in utils as bootstrap.py is calling it too....
Add constants.VALUE_TRUE and VALUE_FALSE
Handle default/none values in hv/be params
When a value is set to constants.VALUE_DEFAULT we have to remove it from the specific instance dict, as this way it will be populated from the cluster before. If instead it's specified as constants.VALUE_NONE we'll...
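A minimal sketch of the default-value handling described above. The constant value, the function names and the parameter names are assumptions made for illustration. The VALUE_NONE behaviour is cut off in the message above, so it is not modelled here.

```python
VALUE_DEFAULT = "default"  # assumed stand-in for constants.VALUE_DEFAULT


def drop_defaults(instance_params, updates):
    """Return a copy of instance_params with updates applied.

    Keys set to VALUE_DEFAULT are removed from the instance-level dict so
    that the cluster-level value applies again.
    """
    result = dict(instance_params)
    for key, value in updates.items():
        if value == VALUE_DEFAULT:
            result.pop(key, None)
        else:
            result[key] = value
    return result


def fill_from_cluster(cluster_params, instance_params):
    """Overlay instance-level parameters on top of the cluster-level ones."""
    filled = dict(cluster_params)
    filled.update(instance_params)
    return filled
```

Removing the key, rather than storing a copy of the cluster value, means later changes to the cluster-level value still reach the instance.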
ImportExport: make src_node and src_path optional
If src_node is not there we'll default to using the currently exported instance name as src_path. Also, if src_path is not absolute we'll look for it in EXPORT_DIR.
Reviewed-by: iustinp
LUCreateInstance: handle import without src_node
If we get called with no source node we'll treat src_path as an instance name exported in EXPORT_DIR in one of the nodes and look for it with the export_list rpc call.
LUCreateInstance: keep src node lock on import
Currently the node lock also guards against removing the import at the wrong time, so if we're importing an instance image we want to keep the source node locked. In the future we might want to put export locks at a...
Fix master failover
The ssconf files were not updated by the master failover. We need to push them, and since we already have RPC initialized, we can use the standard ConfigWriter to do so - this will take care of both the config file and the ssconf files....
Adjust cluster-verify to check for candidate role
Currently cluster verify checks all nodes for the same set of files, even if the nodes are not master candidates.
This patch adds back checking of ssconf files for consistency and splits the checksum check into different error reporting messages based on...
Add candidate pool size checks in verify
Prevent demotion from candidate based on pool size
In gnt-cluster modify we prevent demotion from the candidate role if there are not enough master candidates left.
Add cluster candidate pool size parameter
This patch adds a new cluster parameter "candidate_pool_size" which tracks the desired size of the list of nodes with the master_candidate flag set.
Prevent master failover to a non candidate node
Add the list of master candidates to ssconf
Restrict job propagation to master candidates only
This patch restricts the job propagation to master candidates only, by not registering non-candidates in the job queue node lists.
Note that we do intentionally purge the job queue if a node is toggled to non-master status....
Restrict config replication to master candidates
This patch restricts the config data replication to master candidates only.
Add a gnt-node modify operation
This patch adds the OpCode, LogicalUnit and gnt-node command for modifying node parameters, more specifically the master candidate flag for a node.
Add master/master_candidate fields to node list
This patch adds listing of the master_candidate field (as Y/N) and of the master role (again Y/N) for nodes.
Introduce a new 'master_candidate' node attribute
The field is not yet used.
Simplify a little the ssconf update
We have (again) the KeyToFilename function, so we move the writing of the files to a method under SimpleStore.
Replicate the node list in ssconf
This patch adds node_list in the list of replicated values from ConfigWriter.
Revert "Get rid of ssconf"
This partially reverts the "Get rid of ssconf" patch.
It adds back a simpler version of the SimpleStore class, and drops the WritableSimpleStore class. The new version of the class also has node_list as a new key, and increases the size of the keys so that big...
Fix RpcRunner._StaticSingleNodeCall
Unfortunately, a rpc.Client object was passed as the first parameter, causing the function to always fail.
Found during QA testing.
InitCluster: initialize master node serial_no
Currently it was left alone, and thus its value was "null".
Fix errors when the node info RPC is incomplete
[Forward-port from the 1.2 branch]
If ganeti starts before xend, the node information will not have all the fields filled in. The patch changes so that missing keys will be treated as unknown (this applies to other cases as well, not only xend not...
RAPI: Fix root list and unittest for it.
RAPI: Switch from opcodes to no native 2.0 queries.
jqueue: Always print message for 100% when inspecting queue
jqueue: Allow jobs waiting for locks to be canceled
- Add new "canceling" status
- Notify clients when job is canceled
- Give a return value from CancelJob
- Handle it in the client library
Improve the node add operation
Currently, the node add operation uses a job to query the node name and the bootstrap function directly reads the config file for the cluster name.
This patch changes it so that both the cluster name and the verification of the node are done via queries to the master....
Fix logic bug in rev 2072
In revision 2072 "ConfigWriter: change cluster serial meaning" I misread the serial_no update logic: it was about updating the serial number on the object itself, not on the cluster.
So we don't actually have at all cluster serial number increase when a...
jqueue: fix a bug in an error path
Dictionaries raise KeyError, and not ValueError, when invalid keys are passed to del.
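The behaviour behind this fix can be shown in a few lines. The helper name `remove_key` is hypothetical and only serves the illustration.

```python
def remove_key(mapping, key):
    """Delete key from mapping; return True if it was present."""
    try:
        del mapping[key]
    except KeyError:  # del on a dict raises KeyError for a missing key
        return False
    return True
```

An error path that catches ValueError here would never trigger, and the KeyError would propagate to the caller instead.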
ConfigWriter: change cluster serial meaning
Currently, we increase the cluster serial number for instance additions, removals and renames. This is conforming with the REST paradigm, however it means that for each of these operations, we need to push ssconf...
Fix gnt-backup export
This patch fixes a bug in disk calculation for gnt-backup export, which completely broke one-disk instance export.
The patch also corrects some error messages and style issues.
Fix a message in LUExportInstance
We never verified the node name before, so this is most likely not a non-retrieve but a wrong name case.
Small change to job failure output
Currently, job failures are done by raising OpExecError(job result). For a one-opcode job that failed, this is very non-intuitive:
  Failure: command execution error:
  [u'Disk size change not possible, use grow-disk']
This patch changes the output in two ways:...
Fix file-based block devices
We changed a while ago the protocol for opening block devices, but FileStorage was not changed. This patch makes it work again.
Fix instance creation
This patch fixes the diskless and drbd/file based instances. Sorry :(
convert run dir mode to constant
ganeti-noded used to create all directories under /var/run with a hard-coded mode. Convert it to a constant.
jqueue: Log progress and load jobs one by one
By logging more information, a user can see how far it is in inspecting the queue. This can be useful with a large number of jobs. Also, instead of loading all jobs in one go, load only the list of job IDs and then...
jqueue: Shutdown workerpool in case of a problem
RAPI: Make calls safer
Reduce duplication of work in rpc.Client
This patch removes the duplicate serialization and calls to utils.GetNodeDaemonPort in rpc.Client, and instead moves them to calling functions (the _*NodeCall ones recently introduced).
Move the MASTER_SOCKET to SOCKET_DIR
Before it was in the abstract linux namespace, where unfortunately we couldn't easily check from python the credentials of the connecting clients. Now we also have to remove the file on exit and when starting.
Add SOCKET_DIR_MODE constant
We want the socket dir to have a restricted permission.
Add SOCKET_DIR constant
This new directory under /var/run will be used for file based unix sockets.
Implement support for multi devices changes
This big patch adds support for:
  - changing NIC/disks in the multi-device model
  - adding/removing NICs
  - adding/removing disks
The patch is big and not very nice; the error checking paths are not very clear....
Slight change to the LU initialization code
This patch adds support for a separate LU.CheckArguments() method which should do syntactic checks without holding locks and without polluting the ExpandNames which is a lock-related function. See for example the...
Fix a bug in LUSetInstanceParams
The wrong names were reused in a copy-paste.
Show disk access mode in gnt-instance info
The mode parameter needs to be exported and shown in the info output.
Change _GenerateDiskTemplate iv_name generation
Currently the _GenerateDiskTemplate assumes it does initial creation of disks (i.e. it starts with index 0).
For dynamic disk adds, we need to pass an additional offset. This patch adds this offset and modifies its sole current caller....
Pass ssconf values from master to node
Instead of parsing the configuration on the node, we pass the ssconf values from the master.
ganeti.http: Don't reuse key and cert objects
Reusing the private key and certificate objects gave us problems. This patch changes the code to only cache the PEM data, but not the objects themselves. For every socket, the private key and certificate objects are created again....
Fix unittests broken by rev 2015
Ssconf files shouldn't be updated when running unittests.
ganeti.rpc: Read SSL certificate and key only once per request
There's no need to read the SSL certificate and key for every node in a request. Also add a TODO for better error reporting.
Reviewed-by: amishchenko
Documentation updates for mcpu.py
This is the only change needed to make mcpu epydoc-compliant.
LUCreateInstance: Fix import mac AUTO mode
Previously on import LUCreateInstance used to recycle the mac if the instance name was the same as the one used at export time. Now we do the same, but apply the setting separately for each nic.
LUCreateInstance unlock all nodes mid-way
When creating a new instance, after saving the instance data to the config file and creating the disks, but before waiting for sync and installing the OS, we release the node locks, to allow for more instance creations to proceed in...
IAllocator: subtract down instances from free mem
Currently free_memory just reports the amount of free ram, as seen by the hypervisor. We adjust this amount by subtracting the memory for any instance which is down, and the difference for any instance which is configured to have...
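The first adjustment described above (subtracting the memory of instances that are down) can be sketched as follows. The function name and the `(configured_memory, is_running)` input shape are assumptions for the example. The second adjustment is cut off in the message above and is therefore left out.

```python
def adjusted_free_memory(reported_free, instances):
    """Subtract the configured memory of every down instance.

    instances: iterable of (configured_memory, is_running) pairs; this
    shape is an assumption made for the example.
    """
    reserved = sum(mem for mem, running in instances if not running)
    return reported_free - reserved
```

The point of the adjustment is that memory promised to a stopped instance should not be offered to the allocator as free.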
Correct GetAllInstancesInfo rtype
GetAllInstancesInfo, in the backend, returns just a dict, not a dict of dicts.
IAllocator: use the right hypervisor
Since the hypervisor is instance dependent we'll get one on instance creation, and use the one in the instance config on relocation.
IAllocator: fill i_list in a more proper way
- reuse the previously called cluster_info, rather than calling it again
- get all the instances from the config atomically, to prevent race conditions
- use a list comprehension, for simplicity
Parallelize instance operations on the same node
With static minors we don't have a race condition anymore when starting/stopping/rebooting/reinstalling more than one instance on the same node, so we'll drop node locking altogether.
Convert iallocator to the new _ComputeDiskSize
_ComputeDiskSize's API was changed for multidisk support in r2010, but iallocator's calls to it were not fixed. Converting them now.
Documentation updates for cmdlib.py
This makes cmdlib.py not throw epydoc errors anymore.
Only update ssconf on cluster serial change
There is no need to update ssconf if the cluster serial number has not changed.
Enable auto-unit formatting in script output
This patch enables by default the old 'human-readable' option, but in a slightly different model.
The option is now called "units" and takes either:
  - 'h' for automatic formatting
  - 'm', 'g' or 't' for mebi/gibi/tebibytes...
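One way such a "units" option could behave is sketched below. Everything beyond the option letters is an assumption: the function name, MiB as the base unit, the suffixes in automatic mode and the number of decimals are hypothetical. The rest of the option description is cut off in the message above.

```python
def format_size(mib, units="h"):
    """Format a size given in MiB (assumed base unit).

    units: 'h' picks a unit automatically and appends a suffix;
           'm', 'g' or 't' force mebi-, gibi- or tebibytes, no suffix.
    """
    if units == "m":
        return "%d" % mib
    if units == "g":
        return "%.1f" % (mib / 1024.0)
    if units == "t":
        return "%.1f" % (mib / (1024.0 * 1024.0))
    if units != "h":
        raise ValueError("unknown units: %r" % (units,))
    # Automatic mode: choose the largest unit that keeps the value >= 1.
    if mib < 1024:
        return "%dM" % mib
    if mib < 1024 * 1024:
        return "%.1fG" % (mib / 1024.0)
    return "%.1fT" % (mib / (1024.0 * 1024.0))
```

Fixed units without a suffix keep the output easy to parse in scripts, while automatic mode is meant for reading on a terminal.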
RAPI: Cancel a job
Make cli.py use FieldSet for matching fields
This changes cli.py to FieldSet usage so that gnt-instance list will format nicely the disk.size/*, and the count of disks/nics.
Move FieldSet class to utils.py
Since we can use the FieldSet class in cli.py to nicely format disk sizes and such, we move it to utils.py and also move its associated unittest. I didn't remove the cmdlib.py unittest file as that's not the good direction :)...
Change disk index validation to FindDisk
This patch replaces the hand validation of the disk index with the instance.FindDisk method (actually reverting to previous method, before the multi-disk, but now with indexes).
Change GrowDisk to work with multi-disk
This patch changes the instance.FindDisk method to take index arguments (instead of iv_names), and changes GrowDisk and list instances accordingly.
Use SSL for master/node RPC
This patch enables SSL between masterd and noded.
Get rid of node daemon password
With the new SSL client certificate stuff it's no longer needed.
ganeti.http: Add another class to contain SSL key and certificate
Otherwise we would read them for every request the HTTP client makes against a server and this is not needed.
Reuse HTTP client pool for RPC
ganeti-masterd: Add initialization and shutdown of RPC pool. It needs to be shut down before forking.
ganeti.cli: Add decorator function to initialize and shutdown RPC pool.
ganeti.rpc: Add functions to initialize and shutdown RPC pool. Throw...
Write ssconf files when updating configuration
Add RPC call to update ssconf files
Change replace secondary to work with multi-disk
Also fix an error in the CheckPrereq.
ganeti.ssconf: Add function to write ssconf files
This function will be used to write ssconf files from the node daemon. By creating a lock file, we synchronize different child processes of ganeti-noded to not overwrite each other's changes. Also, external...
Convert replace-disks (same nodes) to multi-disk
This patch changes the drbd8 replace disk only (no secondary change) to work with multi-disk. This mode of replace works correctly with replacing only a subset of disks.
Initial multi-disk/multi-nic support
This patch adds support for multi-disk/multi-nic in:
  - instance add
  - burnin
The start/stop/failover/cluster verify work as expected. Replace disk and grow disk are TODO.
There's also a change gnt-job to allow dictionaries to be listed in...
Add more disk/nic listing options in gnt-instance
This adds some more listing cases (useful for scripting/rapi):
  - disk.sizes for a list of all sizes
  - nic.(ips|macs|bridges)
Change Xen hypervisor to not use iv_name
Currently the iv_name is very linux-specific, and will break with the multi-disk changes.
The patch changes this to generate sdX names based on the disk index in the disks structure, instead of relying on the iv_name....
ganeti.rpc: Use central functions for actual RPC calls
Before we had lots, lots and lots of code duplication. This patch changes the code to use four central functions.
Make HttpClientManager threadsafe
This allows a single HttpClientManager to be used from more than one thread at the same time. We discussed having one HttpClientManager per job queue thread. Assuming there should be one HTTP thread per node, this would mean quadratic growth with the number of nodes. By...
HTTP server: Do not decode empty entity body
RAPI: Instance modify.
Split parameters filter to the separate function and reuse it in instance creation.
Allow querying of variable number of parameters
This patch adds support for querying in gnt-instance list of:
  - disk.count
  - nic.count
  - disk.size/$N
  - nic.(ip|mac|bridge)/$N
The patch also disables the exception raised when the header description...
Convert cmdlib.py to _FieldSet
This patch converts the current usage of _CheckOutputFields to the FieldSet class, but it doesn't start to use its variable matching features.
Add a FieldSet class for variable parameter sets
This patch adds a _FieldSet class that can be used for the new variable parameter sets: e.g. the sda_size will change to disk/0.size (or similar) and we need to both check validity and extract the index of the...
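A field set of this kind can be sketched with anchored regular expressions, which cover both needs named above: checking validity and extracting the index. This is an illustration under assumptions, not the class from the patch: the method names `matches` and `non_matching` and the pattern for `disk/N.size` are hypothetical.

```python
import re


class FieldSet(object):
    """Set of field-name patterns, each compiled as an anchored regex."""

    def __init__(self, *patterns):
        self._patterns = [re.compile("^%s$" % p) for p in patterns]

    def matches(self, field):
        """Return the first match object for field, or None."""
        for pattern in self._patterns:
            m = pattern.match(field)
            if m:
                return m
        return None

    def non_matching(self, fields):
        """Return the fields that no pattern accepts."""
        return [f for f in fields if self.matches(f) is None]


# Usage: validate a variable field name and pull out the disk index.
fields = FieldSet("name", r"disk/([0-9]+)\.size")
match = fields.matches("disk/3.size")
index = int(match.group(1))  # index == 3
```

Returning the match object lets one lookup serve both as the validity check and as the source of the captured index.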