Add constants for local disk status
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Handle None result from BlockdevFind
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
objects.BlockDevStatus: Remove ToLegacyStatus
Add master candidates IPs informations to ssconf
This will be used when querying confd, in order not to rely on DNS being available.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
ConfigObject.ToDict() only export non-None values
The method is changed to a normal loop, to avoid calling getattr() twice. Also getstate is changed to just use ToDict() by default.
This should also make getstate work for objects which have to override the ToDict function because they contain other objects....
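A minimal sketch of the loop described in this commit, assuming a slotted class. The class body and the field names (`name`, `size`, `mode`) are illustrative and not taken from the Ganeti source, and the code targets current Python rather than the interpreter versions of the time.

```python
class ConfigObject:
    """Hypothetical stand-in for a slotted configuration object."""
    __slots__ = ["name", "size", "mode"]

    def __init__(self, **kwargs):
        # Unset fields default to None.
        for key in self.__slots__:
            setattr(self, key, kwargs.get(key))

    def ToDict(self):
        """Export only the fields whose value is not None."""
        result = {}
        for name in self.__slots__:
            value = getattr(self, name, None)  # one getattr() call per field
            if value is not None:
                result[name] = value
        return result

    def __getstate__(self):
        # Reuse ToDict() so subclasses overriding it are pickled consistently.
        return self.ToDict()
```

With this shape, `ConfigObject(name="disk0", size=128).ToDict()` omits the unset `mode` field.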
Add nodes IPs informations to ssconf
Having a list of primary/secondary IPs of all the nodes in ssconf can be useful for scripts/hooks which need to automatically configure network properties for the whole cluster (e.g.: ipsec/netfilter rules) without relying on a...
serializer: fix a few docstrings
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Use objects for blockdev_getmirrorstatus RPC call result
This patch changes the return type for backend.BlockdevGetmirrorstatus from a list of tuples to a list of objects.BlockDevStatus instances.
Use object for blockdev_find RPC call result
This patch changes the return type for backend.BlockdevFind to an object (objects.BlockDevStatus). Previously a tuple was used. Adding more values to this tuple causes a lot of work. Converting the result to an object with...
cmdlib: Fix parameters for storage.FileStorage
It wants a list of directories, not a string.
cmdlib: Add opcode to modify storage unit fields
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Add RPC calls to modify storage fields
storage: Add function to modify fields
This allows the “allocatable” flag on LVM PVs to be changed.
Merge commit 'origin/branch-2.1' into feature/containers
Add new opcode to list physical volumes
storage: Use constants.py instead of local constants
storage: Fix semantics for directory size
The actual directory size is "used" space, not the total space on the filesystem.
Merge branch 'next' into branch-2.1
jqueue: Fix error when WaitForJobChange gets invalid ID
When JobQueue.WaitForJobChange gets an invalid or no longer existing job ID it tries to return job_info and log_entries, both of which aren't defined yet.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
jqueue: Update message for cancelling running job
cmdlib: Change tasklet logging to debug level
rapi: Add /2/nodes/[node_name]/migrate resource
cmdlib: Add new opcode to migrate node
It migrates all primary instances from the node to their secondaries.
rapi: Add default parameter to _checkIntVariable
cmdlib: Add logging for tasklets
cmdlib: Fix tasklets handling if no tasklets are added
If no tasklets are added, self.tasklets evaluates to None. The LU base class will throw an exception because it thinks the derived class doesn't implement the right methods.
rapi: Add /2/[node_name]/evacuate resource
This can be used to evacuate a node.
Add RPC calls for storage unit list
Add first implementation of generic storage unit framework
utils: Add functions to calc directory size and free space on filesystem
These will be used by the new storage unit framework.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
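The two helpers named in this commit could look roughly like the sketch below. The function names and the choice to count only file sizes (not directory entries) are assumptions for illustration. `os.statvfs` is available on Unix-like systems only.

```python
import os


def calculate_directory_size(path):
    """Return the summed size in bytes of all files below path ("used" space)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            # lstat() so that symlinks are not followed
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def get_free_filesystem_space(path):
    """Return the bytes available to unprivileged users on path's filesystem."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize
```

Keeping the two values separate matches the distinction drawn elsewhere in this log: the directory size is the used space, which is not the same as the total space on the filesystem.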
Collapse SSL key checking/overriding for daemons
Signed-off-by: Guido Trotter <ultrotter@google.com>
Collapse daemon's main function
With three ganeti daemons, and one or two more coming, the daemon's main function started becoming too much cut&pasted code. Collapsing most of it in a daemon.GenericMain function. Some more code could be collapsed between the two http-based daemons, but since the new daemons won't be...
Slightly abstract the daemon logfile lookup
The original LOG_<DAEMON_NAME> constants for daemon logfiles are gone. In their place there is a DAEMONS_LOGFILES dict, indexed by daemon name.
This is a minor change with the objective of unifying most of the daemons' main() function code, which is very similar from one to the other....
Remove <DAEMON>_PID constants
The <DAEMON>_PID constants were created to reference a daemon pid file, but actually contain a daemon's name, because the various functions that work with pidfiles abstract the filename from the daemon name themselves. Removing the constants and using the actual daemon name...
Move rapi to GetDaemonPort
Currently rapi is the only daemon which accepts a port option, rather than querying its own port from services, and falling back to the default if not found. Changing this to conform to what other daemons do.
Also update the ganeti-rapi(8) manpage...
Change GetNodeDaemonPort to GetDaemonPort in utils
GetNodeDaemonPort is used to look up the node daemon port in the services file, and if not found to return the default one. We make it a generic function, which accepts the daemon name in input, so that it can be used...
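The lookup-with-fallback behaviour described here can be sketched with the standard `socket.getservbyname` call. The function name, the `defaults` mapping, and the port number below are hypothetical and not taken from the Ganeti source.

```python
import socket

# Hypothetical defaults table, keyed by daemon name.
DEFAULT_PORTS = {"example-daemon": 12345}


def get_daemon_port(daemon_name, defaults=DEFAULT_PORTS):
    """Look up the daemon's TCP port in the services database.

    Falls back to the built-in default when the service is not listed.
    """
    try:
        return socket.getservbyname(daemon_name, "tcp")
    except OSError:
        # Raised when the service/protocol pair is not found.
        return defaults[daemon_name]
```

Taking the daemon name as a parameter is what lets a single function serve every daemon instead of only the node daemon.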
Get rid of constants.RAPI_ENABLE
This constant is unused, except in qa. Removing it since it's always True.
This patch also removes the unused qa_rapi.PrintRemoteAPIWarning function, and removes a comment about temporary constants "until we have cluster parameters"....
cmdlib: Add init to Tasklet class
Remove references to utils.debug
Various modules set it to True when called in debugging mode, but the utils module supports no such global.
cmdlib: Move LUMigrateInstance functionality to tasklet
Add new opcode to evacuate nodes
cmdlib: Convert _DiskReplacer to tasklet
cmdlib: Function to get all secondary instances on a certain node
noded: Abstract hard-coded sys.exit value
On machines without the ssl file noded exits with '5'. Changing this to constants.EXIT_NOTCLUSTER.
Also utils.GetNodeDaemonPort hasn't raised errors.ConfigurationError for a while, so removing that try/except block....
cmdlib: Add tasklet support to logical unit base class
cmdlib: Add tasklet base class
Generate a shared HMAC key at cluster init time
This key is shared on all nodes (via cmdlib._RedistributeAncillaryFiles) and will be used for HMAC authentication of confd messages.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
cmdlib: Move code doing disk replacements into separate class
This class will be used for a new opcode to evacuate nodes.
cmdlib: Pass config and rpc objects directly to IAllocator
Before IAllocator would access them using “self.lu.cfg” and “self.lu.rpc”. It shouldn't know about the internals of the LU.
Fix backend import errors from GetHypervisorClass
The merge of commit 360b0dc into branch-2.1 broke import of backend, since it uses hypervisor.GetHypervisor() which returns an instance of the hypervisor. Some of the hypervisors create directories at init time,...
Conflicts: lib/backend.py: non-trivial conflict but easy to solve
backend: Only build once the list of upload files
The list of upload files is built currently at every UploadFile() call. This patch moves it to a separate variable which is initialized only once.
This won't make much difference but I regard it as cleanup....
Merge commit 'origin/next' into branch-2.1
Conflicts: lib/cli.py: trivial extra empty line
Fix a couple of epydoc warnings
It seems epydoc needs fully-qualified references, and doesn't deal with relative ones (not even in the current module) if there are any ambiguities.
There are other epydoc warnings, in the rapi docstrings, but those are left as-is as they're removed in 2.1....
job queue: fix loss of finalized opcode result
Currently, unclean master daemon shutdown overwrites all of a job's opcode status and result with error/None. This is incorrect, since any already finished opcode(s) should have their status and result preserved, and only not-yet-processed opcodes should be marked as...
Switch gnt-debug submit-job to JobExecutor
Currently gnt-debug submits jobs individually, but in 2.1 JobExecutor uses the optimized SubmitManyJobs luxi call and as such should be used whenever multiple jobs need to be submitted.
This patch converts gnt-debug submit-job to use it and also removes an...
Modify cli.JobExecutor to use SubmitManyJobs
This patch changes the generic "multiple job executor" to use the many jobs submit model, which automatically makes all its users use the new model.
This makes, for example, startup/shutdown of a full cluster much more...
Add a luxi call for multi-job submit
As a workaround for the job submit timeouts that we have, this patch adds a new luxi call for multi-job submit; the advantage is that all the jobs are added in the queue and only afterwards can the workers start processing them....
job queue: fix interrupted job processing
If a job with more than one opcode is being processed, and the master daemon crashes between two opcodes, we have the first N opcodes marked successful, and the rest marked as queued. This means that the overall...
Fix an error path in job queue worker's RunTask
In case the job fails, we try to set the job's run_op_idx to -1. However, this is the wrong variable, which wasn't detected until the slots addition. The correct variable is run_op_index.
Signed-off-by: Iustin Pop <iustin@google.com>...
Add slots on objects in jqueue
Adding slots to _QueuedOpCode decreases memory usage (of these objects) by roughly four times. It is a lesser change for _QueuedJobs.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
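As an illustration of what `__slots__` changes, the sketch below uses a hypothetical class (not the real jqueue class) with an assumed attribute list. It shows that slotted instances carry no per-instance `__dict__`, and that assigning a misspelled attribute, as in the run_op_idx case mentioned in this log, fails immediately instead of silently creating a new attribute. The memory figures quoted in the commit are not reproduced here.

```python
class QueuedJobSketch:
    """Hypothetical slotted object; the attribute names are illustrative."""
    __slots__ = ["id", "run_op_index"]

    def __init__(self, job_id):
        self.id = job_id
        self.run_op_index = -1


def set_misspelled_index(job):
    """Try to assign to a name that is not declared in __slots__."""
    try:
        job.run_op_idx = -1  # typo: not in __slots__
    except AttributeError:
        return False
    return True
```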
Optimize OpCode loading
This patch converts the opcode loading to a pre-built map (at import time) instead of iteration over the globals dict at each call.
Microbenchmarks show that this should be around three times faster, and burnin still passes.
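A minimal sketch of the pre-built map idea follows. The class names, the `OP_ID` values, and the use of `__subclasses__()` to populate the map are assumptions made for illustration. The source only states that the map is built once at import time rather than by scanning globals on every call.

```python
class OpCode:
    """Hypothetical opcode base class."""
    OP_ID = "OP_ABSTRACT"

    def __init__(self, **kwargs):
        self.params = kwargs


class OpExampleStart(OpCode):
    OP_ID = "OP_EXAMPLE_START"


class OpExampleStop(OpCode):
    OP_ID = "OP_EXAMPLE_STOP"


# Built once, when the module is imported.
OP_MAPPING = {cls.OP_ID: cls for cls in OpCode.__subclasses__()}


def load_opcode(data):
    """Instantiate the opcode named by data["OP_ID"] via a dict lookup."""
    op_id = data["OP_ID"]
    try:
        cls = OP_MAPPING[op_id]
    except KeyError:
        raise ValueError("Unknown opcode ID %r" % op_id)
    params = {key: value for key, value in data.items() if key != "OP_ID"}
    return cls(**params)
```

Each load then costs one dictionary lookup, independent of how many names the module defines.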
Yet another fallout from the pylint fixes
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Merge branch 'master' into next
Fix another issue with hypervisor_name change
Add a few more checks to verify config
- Check that the enabled hypervisors list is valid
- Check that the master node is a valid node
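The two checks listed above might be sketched as follows. The function name, its parameters, the set of known hypervisor names, and the interpretation of "valid" (non-empty and containing only known names) are all assumptions for illustration.

```python
# Illustrative set of recognised hypervisor names (an assumption).
KNOWN_HYPERVISORS = frozenset(["xen-pvm", "xen-hvm", "kvm", "fake"])


def verify_config(enabled_hypervisors, master_node, node_names,
                  known=KNOWN_HYPERVISORS):
    """Return a list of human-readable problems found in the config."""
    errors = []
    if not enabled_hypervisors:
        errors.append("enabled hypervisors list is empty")
    invalid = sorted(set(enabled_hypervisors) - set(known))
    if invalid:
        errors.append("invalid hypervisor(s): %s" % ", ".join(invalid))
    if master_node not in node_names:
        errors.append("master node %s is not a known node" % master_node)
    return errors
```

Returning a list of messages rather than raising on the first problem lets a verify run report every issue at once.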
Make sure enabled_hypervisors list is valid
Get rid of the default_hypervisor slot
Currently we have both a default_hypervisor and an enabled_hypervisors list. The former is only settable at cluster init time, while the latter can be changed with cluster modify.
This becomes cumbersome in a few ways: at cluster init time for example...
cmdlib: Use dict.fromkeys instead of custom loop
Simplify InitConfig and remove SimpleConfigWriter
InitConfig currently creates the cluster config_data, then puts it into a dict, passes it to SimpleConfigWriter to load it from a dict (which just reuses the dict value) and then saves it. The SimpleConfigWriter is...
InitCluster, don't use SimpleConfigWriter
InitConfig returns a SimpleConfigWriter to InitCluster, which then passes it on to ssh.WriteKnownHostsFile, which extracts a couple of values from it. One line later the full ConfigWriter is initialized.
By initializing it one line before we can pass the full writer to...
Fix python 2.4 compatibility
I got overexcited and forgot we have to remain compatible with python 2.4. With this patch we move from sha256 to sha1 for hmac authenticated serialized messages, and we handle both newer and older python, by importing the right module for each....
Use full-stripe size in LVM growth
LVM has issues when growing striped volumes, so it's best to specify the growth in exact multiples of the full stripe size (as precise as possible). For this we need to do a couple of changes:
 - in LVM Attach(), we query additionally the VG extent size and the LV...
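The arithmetic behind "exact multiples of the full stripe size" can be sketched as below, taking the full stripe size to be the extent size multiplied by the number of stripes. The function name, the MiB units, and the choice to round up (rather than down) are assumptions, since the commit text is truncated.

```python
def round_growth_to_full_stripe(amount_mib, extent_size_mib, stripes):
    """Round a requested growth up to a multiple of the full stripe size."""
    full_stripe = extent_size_mib * stripes
    rest = amount_mib % full_stripe
    if rest:
        # Pad the request so it ends on a full-stripe boundary.
        amount_mib += full_stripe - rest
    return amount_mib
```

For example, with 4 MiB extents and 3 stripes the full stripe is 12 MiB, so a requested growth of 100 MiB becomes 108 MiB.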
Remove ConfigWriter.InitConfig
It's been replaced by a simpler bootstrap.InitConfig function, which does the same job, and is currently unused.
Conflicts:
daemons/ganeti-masterd...
Remove SimpleConfigWriter.SetMasterNode
This function is not used.
_GenerateDiskTemplate: use base_index in the name
Currently if a disk is added later the base_index is not considered, and all the disks are called disk0. This patch fixes it.
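The naming fix can be illustrated with a small helper. The function name and the exact "disk%d" format are assumptions for this sketch and not the real _GenerateDiskTemplate code.

```python
def generate_disk_names(disk_count, base_index=0):
    """Name new disks starting at base_index instead of always at 0."""
    return ["disk%d" % (base_index + offset) for offset in range(disk_count)]
```

Adding one disk to an instance that already has two would then use `base_index=2`, so the new disk does not collide with an existing name.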
HMAC authenticated json messages
This patch adds HMAC authenticated json messages to the serializer. The new interface works on any json-encodable data type, and can sign it with a private key and an optional salt. The same private key must be used upon message loading to verify the message....
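A rough sketch of signing and verifying a JSON message with a shared key and an optional salt, using the standard `hmac`, `hashlib`, and `json` modules. The function names and the envelope field names (`msg`, `salt`, `hmac`) are hypothetical. SHA-1 is used because this log mentions the move from sha256 to sha1 for compatibility, and the code targets current Python.

```python
import hashlib
import hmac
import json


def dump_signed_json(data, key, salt=""):
    """Serialize data and wrap it in an envelope carrying an HMAC-SHA1."""
    message = json.dumps(data, sort_keys=True)
    digest = hmac.new(key, (salt + message).encode("utf-8"),
                      hashlib.sha1).hexdigest()
    return json.dumps({"msg": message, "salt": salt, "hmac": digest})


def load_signed_json(text, key):
    """Verify the envelope with the same key and return (data, salt)."""
    envelope = json.loads(text)
    message, salt = envelope["msg"], envelope["salt"]
    expected = hmac.new(key, (salt + message).encode("utf-8"),
                        hashlib.sha1).hexdigest()
    # Constant-time comparison of the two hex digests.
    if not hmac.compare_digest(expected, envelope["hmac"]):
        raise ValueError("HMAC verification failed")
    return json.loads(message), salt
```

The key is passed as bytes. Loading with a different key raises `ValueError` instead of returning unverified data.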
rapi: Implement /2/nodes/[node_name]/role resource
This resource can be used to retrieve and set the role of a node.
rapi: Add generic “force” parameter
cmdlib: Fix typo in LUQueryClusterInfo
This was broken by my pylint fixes patch.
RAPI: implement instance reinstall
This patch adds instance reinstall to RAPI, with two optional parameters:
 - ‘os’, in order to change the OS on reinstall
 - ‘nostartup’, in order to leave the instance down after reinstall
The call will first shut down the instance, then reinstall it, and unless...
Extend call_node_start_master rpc with no_voting
When the parameter is set to True and start_daemons is also True, ganeti-masterd will be started with the new --no-voting --yes-do-it options.
This new option is set to True only on masterfailover, when no_voting is...
Create a new --no-voting option for masterfailover
This allows failing over in certain corner cases, such as a 2 node cluster with one node down. The man page is also updated to document this dangerous option and how to recover from this situation.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Fix pylint warnings
bootstrap: Don't leak file descriptor when generating SSL certificate
Fix problem with EAGAIN on socket connection in clients
If a user used ^Z to stop the program, poll() in socket.recv would return EAGAIN due to SIGSTOP. This patch changes luxi.Transport.Recv to ignore EAGAIN.
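One way to ignore EAGAIN on receive is to retry the call, as in the sketch below. The function name and the plain retry loop are assumptions for illustration and not the actual luxi.Transport.Recv code. The `sock` argument only needs a `recv(bufsize)` method.

```python
import errno


def recv_ignoring_eagain(sock, bufsize):
    """Call sock.recv(), retrying whenever it fails with EAGAIN."""
    while True:
        try:
            return sock.recv(bufsize)
        except OSError as err:
            if err.errno == errno.EAGAIN:
                continue  # spurious wake-up, try again
            raise
```

Any other socket error still propagates to the caller, so only the spurious EAGAIN case is masked.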
Fix some typos
Increase maximum accepted size for a DRBD meta dev
With the change to striped LVs, the actual size of a meta device (which is small) can be more than we expected (for non-striped LVs). This patch increases from 160MB to 1GB the accepted size, and updates the...
Cleanup config data when draining nodes
Currently, when draining nodes we reset their master candidate flag, but we don't instruct them to demote themselves. This leads to “ERROR: file '/var/lib/ganeti/config.data' should not exist on non master candidates...
Fix node readd issues
This patch fixes a few node readd issues.
Currently, the node readd consists of two opcodes:
 - OpSetNodeParms, which resets the offline/drained flags
 - OpAddNode (with readd=True), which reconfigures the node
The problem is that between these two, the configuration is inconsistent...
backend.DemoteFromMC: don't fail for missing files
If the config file is missing when the DemoteFromMC() function is called, it will raise a ProgrammerError. Instead of changing the utils.CreateBackup() file which is called from multiple places, for now we only change the DemoteFromMC() function to not call it if the file is...
Allow GetMasterCandidateStats to ignore some nodes
This patch modifies ConfigWriter.GetMasterCandidateStats to allow it to ignore some nodes in the calculation, so that we can use it to predict cluster state without some nodes (which we know we will modify, and thus...
Fix error message for extra files on non MC nodes
Currently the message for extraneous files on non master candidates is confusing, to say the least. This makes it hopefully more clear.