Pass ssconf values from master to node
Instead of parsing the configuration on the node, we pass the ssconf values from the master.
Reviewed-by: iustinp
Use SSL for master/node RPC
This patch enables SSL between masterd and noded.
Get rid of node daemon password
With the new SSL client certificate stuff it's no longer needed.
ganeti-masterd: Remove PID file at the end
Removing the PID file should be the last thing done. This patch makes sure it's also removed when master.server_cleanup() throws an exception.
Also initialize logging only after writing the PID file.
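The ordering described above can be sketched as follows. This is a minimal illustration, not the masterd code: run_with_pidfile and its arguments are hypothetical names, and cleanup_fn merely stands in for master.server_cleanup().

```python
import os


def run_with_pidfile(pidfile, main_fn, cleanup_fn):
    """Write the PID file, run the daemon body, remove the PID file last.

    Hypothetical helper: cleanup_fn stands in for master.server_cleanup().
    """
    with open(pidfile, "w") as fd:
        fd.write("%d\n" % os.getpid())
    try:
        main_fn()
    finally:
        try:
            cleanup_fn()
        finally:
            # Reached even if cleanup_fn() raised an exception.
            os.remove(pidfile)
```

The nested finally is what keeps the removal as the very last step while still letting the cleanup exception propagate to the caller.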
Reuse HTTP client pool for RPC
ganeti-masterd: Add initialization and shutdown of RPC pool. It needs to be shut down before forking.
ganeti.cli: Add decorator function to initialize and shut down RPC pool.
ganeti.rpc: Add functions to initialize and shut down RPC pool. Throw...
Add RPC call to update ssconf files
Abstract runtime creation of dirs into a function
Currently the dir creation in ganeti-noded is in the main function. This is not nice: we move it into a separate function and also add creation of the OS_LOG_DIR (with different permissions, but in the same way)....
Document HttpServer.__init__
At the same time, simplify the interface a bit by not using a tuple.
Reviewed-by: killerfoxi, ultrotter
Export the disk index in the import/export scripts
We want to export the disk index as some OSes will only want to export the first disk (or the second one, etc.), even if we have multiple disks.
The patch also updates the backend.ExportSnapshot docstring....
Convert ImportOSIntoInstance to OS API 10
- Change ImportOSIntoInstance not to get any "os_disk" and "swap_disk" arguments but to accept multiple target images to import, and to return a list of booleans with the result of each import
- Change the relevant rpc call and the only caller to conform...
Pass request headers in to RAPI handlers.
Convert the job queue rpcs to address-based
The two main multi-node job queue RPC calls (jobqueue_update, jobqueue_rename) are converted to address-based calls, in order to speed up queue changes. For this, we need to change the _nodes attribute on the jobqueue to be a dict {name: ip}, instead of a set....
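The set-to-dict change can be pictured with a small sketch. The class and method names below are hypothetical and only the {name: ip} layout comes from the message above; the point is that addresses are available without a name lookup per call.

```python
class JobQueueNodes:
    """Hypothetical holder for the job queue's node list."""

    def __init__(self):
        # Maps node name -> IP address (previously a plain set of names).
        self._nodes = {}

    def add_node(self, name, ip):
        self._nodes[name] = ip

    def remove_node(self, name):
        # Silently ignore nodes that were never registered.
        self._nodes.pop(name, None)

    def names_and_addresses(self):
        """Return parallel lists of names and addresses for a multi-node call."""
        names = sorted(self._nodes)
        return names, [self._nodes[name] for name in names]
```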
Remove the logger.py module
Since now we use only one function from the logger module (SetupLogging), we move it to utils.py (which is already imported by all users of this function), and we remove the module.
Reviewed-by: imsnah
Cleanup os_add/rename rpc for OS API 10
- remove now unused osdev and swapdev arguments from backend, noded, rpc, cmdlib
- convert docstrings to epydoc
ETag passing support.
rapi: Convert to new HTTP server class
Requests are no longer logged to a separate file.
Reviewed-by: amishchenko
Improvements to the master startup checks
In order to account for future improvements to master failover, we move the actual data gathering capabilities from ganeti-masterd into bootstrap.py, and we leave only the verification in masterd.
The verification procedure is then changed to retry multiple times (up...
Add an interface for the drain flag changes/query
This adds the set/reset in the jqueue and luxi modules, and a way to query it in OpQueryConfigValues, and also the command line interface for it:
$ gnt-cluster queue info
The drain flag is unset
$ gnt-cluster queue drain...
Add a rpc call for changing the drain flag
A new multi-node call is added that sets/resets the drain flag.
Implement transport of ganeti errors across luxi
This patch adds a generic method to identify the ganeti error given its class name, and implements this across the luxi protocol.
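One way to picture the class-name lookup is the sketch below. It is an assumption-laden illustration: the error classes are local stand-ins, and encode_error/decode_error are hypothetical names, not functions from the patch.

```python
class GenericError(Exception):
    """Local stand-in for a base error class."""


class OpPrereqError(GenericError):
    """Local stand-in for a more specific error."""


class OpExecError(GenericError):
    """Local stand-in for another specific error."""


# Lookup table from class name to class, used on the receiving side.
_ERRORS = {cls.__name__: cls
           for cls in (GenericError, OpPrereqError, OpExecError)}


def encode_error(err):
    """Turn an exception into a serializable (class name, args) pair."""
    return (err.__class__.__name__, list(err.args))


def decode_error(payload):
    """Rebuild an exception from a (class name, args) pair."""
    name, args = payload
    cls = _ERRORS.get(name)
    if cls is None:
        # Unknown names fall back to the base class in this sketch.
        return GenericError("Unknown error class %s" % name, *args)
    return cls(*args)
```

Sending only the class name and arguments keeps the wire format plain data, and the receiving side decides which classes it is willing to instantiate.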
rapi: Whitespace fixes
Reviewed-by: ultrotter
Export the hypervisor.ValidateParameters over RPC
The newly-added node-specific ValidateParams hypervisor method is exported over RPC, using the semi-standard (success, message) return value. Multi-node call, so that we call on both primary and secondary at...
Fix a few rpc-related errors
This fixes:
- whitespace change, double lines between methods
- duplication of call_upload_file, introduced by mistake in rev 1795 and which went undetected because of the many changes in that ref (only diff -b shows it clearly)...
Abstract checking own address into a function
Currently, we check if we have a given ip address (i.e. it's alive on one of our interfaces) by manually calling TcpPing(source=localhost). This works, but having it spread all over the code makes it hard to...
Convert ganeti-noded to new HTTP server class
Convert rpc module to RpcRunner
This big patch changes the call model used in internode-rpc from standalone function calls in the rpc module to calls via an RpcRunner class that holds all the methods. This can be used in the future to enable smarter processing in the RPC layer itself (some quick examples are not...
Move the hypervisor attribute to the instances
This (big) patch moves the hypervisor type from the cluster to the instance level; the cluster attribute remains as the default hypervisor, and will be renamed accordingly in a next patch. The cluster also gains...
rpc.call_instance_migrate: pass the whole instance
Currently the call_instance_migrate call only passes the instance name; we need to pass the whole object for the hypervisor_type changes (all the other individual instance rpc calls already pass the instance...
Implement job 'waiting' status
Background: when we have multiple jobs in the queue (more than just a few), many of the jobs (up to the number of threads) will be in state 'running', although many of them could be actually blocked, waiting for some locks. This is not good, as one cannot easily see what is...
Implement job auto-archiving
This patch adds a new luxi call that implements auto-archiving of jobs older than a certain age (or -1 for all completed jobs), and the gnt-job command that makes use of this (with 'all' for -1).
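The age rule, including the -1 special case, could look like this. It is only a sketch: the function name and the (job_id, finished, end_time) tuple layout are assumptions made for illustration, not the job queue's actual data model.

```python
def select_jobs_to_archive(jobs, age, now):
    """Return the IDs of finished jobs that should be archived.

    jobs: iterable of (job_id, finished, end_time) tuples (hypothetical layout)
    age:  maximum age in seconds, or -1 to select every finished job
    now:  current time in seconds
    """
    selected = []
    for job_id, finished, end_time in jobs:
        if not finished:
            # Unfinished jobs are never archived, whatever their age.
            continue
        if age == -1 or now - end_time > age:
            selected.append(job_id)
    return selected
```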
backend.py change to get cluster name from master
Currently there are three functions in backend that need the cluster name in order to instantiate an SshRunner. The patch changes these to get the cluster name from the master in the rpc call; once the multi-hypervisor...
Convert ganeti-master
Use simpleconfig instead of ssconf.
Convert ganeti-watcher
Use RPC calls instead of ssconf.
Convert ganeti-noded
Replace ssconf with utility functions.
Add new query to get cluster config values
This can be used to retrieve certain cluster config values from within clients.
OpDumpClusterConfig was not used anywhere, hence I'm just reusing it. The way ConfigWriter.DumpConfig returned the configuration was not thread-safe, anyway (no deepcopy)....
Fix the watcher with down nodes
The watcher didn't handle the down nodes, fix this by ignoring (in secondary node reboot checks) any node that doesn't return a boot id.
Fix the watcher not restarting instance bug
The watcher was using conflicting attributes of the instance:
- it queried the admin_/oper_state, which are booleans
- but it compared those to the status (which is a text field)
The code was changed to query the aggregated 'status' field, as that...
Remove last use of utils.RunCmd from the watcher
The watcher has one last use of ganeti commands as opposed to sending requests via luxi. The patch changes this to use the cli functions.
The patch also has two other changes:
- fix the docstring for OpVerifyDisks (found out while converting...
ganeti-noded: Add constant for queue lock timeout
Implement master startup safety check
This is an initial version of the master startup checks. It's a very rudimentary change, however in normal usage (an old master was started, the rest of the cluster is functioning normally) it will succeed in preventing wrong startups....
Export backend.GetMasterInfo over the rpc layer
We create a multi-node call so that querying all nodes for agreement will be fast.
Use lock timeout for queue updates in ganeti-noded
This helps to prevent complete deadlocks.
noded: Get job queue lock while purging queue content
Only one process should modify the queue at the same time.
Make WaitForJobChanges deal with long jobs
This patch alters the WaitForJobChanges luxi-RPC call to have a configurable timeout, so that the call behaves nicely with long jobs that have no update.
We do this by adding a timeout parameter in the RPC call, and returning...
Make sure that client programs get all messages
This is a large patch, but I can't figure out how to split it without breaking stuff. The old way of getting messages by always getting the last one didn't bring all messages to the client if they were added...
Use Linux-specific way to name master socket
By using this Linux-specific way we don't have to care about removing the socket file when quitting or starting (after an unclean shutdown). For a more detailed description, see the comment in the patch.
Reviewed-by: schreiberal
Add RPC call to wait for job changes
This way clients can react faster to status or message changes and don't have to poll anymore.
Add query function for exports
noded: Add RPC function to rename job queue files
This will be used to archive jobs.
noded: Add decorator for job queue lock
The lock will also be needed by another function.
Implement queue locking in node daemon
More logging for errors during noded RPC calls
Add job queue RPC functions
jobqueue_update: Uploads a job queue file's content to a node. The most common operation is to upload something that we already have in a string. Unlike in the upload_file function, the file is not read again when distributing changes, but content has to be passed...
Use API instead of command line utilities in watcher
Notify job queue about added/removed nodes
The job queue maintains its own node list and must be notified when nodes are added/removed.
Implement {Add,Readd,Remove}Node in GanetiContext
By doing this we've a central place which coordinates what needs to be done when adding or removing nodes. Another patch will add calls into the job queue.
Two log messages move to config.py.
When removing a node, node_leave_cluster is now called after it has...
jqueue: Don't pass the list of nodes to SubmitJob anymore
The job queue now maintains its own list and is updated when nodes are added or removed from the cluster.
masterd: Move job queue into context object
The job queue must be called from cmdlib when adding or removing nodes to the cluster. Moving it to the context object makes this possible.
Implement query for nodes
Implement query for instances
Queries don't create jobs and are more efficient. Log messages are not yet stored anywhere.
First write operation (add tag) for Ganeti RAPI
Add instance tag handling, improved error logging....oh, yes adopt instance listing for RAPI2!
Unify SetupDaemon/SetupLogging
The 'old-style' info, error, debug logs do not make much sense. This patch unifies the SetupLogging and SetupDaemon functions. As a result, all the commands log to a 'commands.log' file.
The patch also changes the log setup to keep going if there's an error...
Rework master startup/shutdown/failover
This (big) patch reworks the master startup/shutdown and fixes the master failover.
What does the patch do?
For master start/stop:
- remove the old ganeti-master script and its associated man page
- moves the ip start/stop directly into the backend.(Start|Stop)Master...
Implement checking for the master role in rapi
This patch moves the CheckMaster function from ganeti-masterd to ssconf (most logical place, it cannot go in utils since we would have recursive imports between ssconf and utils) and changes ganeti-rapi to also call...
Add a new parameter to backend.(Start|Stop)Master
This patch adds a new, unused for now, parameter to the start and stop master operations in backend. The idea behind it is that we need to be able to control whether the IP (de)activation is coupled with daemon...
Use constants for the pid file stems
Make the rapi daemon create a pidfile
This is needed for controlling it cleanly with start-stop daemon.
Implement signal handling in ganeti-rapi
Move ganeti-rapi core code to daemon
All other daemons have their main code in themselves and not in a module. This patch does the same to ganeti-rapi by moving the code from lib/rapi/RESTHTTPServer.py to daemons/ganeti-rapi.
Fix RPC parameters for {Cancel,Archive}Job
They aren't tuples on the client side.
ganeti-masterd: write and remove pidfile
ganeti-noded: write and remove pid file
Distribute the queue serial file after each update
This patch adds distribution of the queue serial file after each write to it (but before a new job is created and written with that ID, and before a response is returned, so we should be safe from crashes in...
Handle signals in node daemon
This also fixes a TODO added by ultrotter by killing the parent process when QuitGanetiException is raised.
Use new signal handler class in master daemon
Breathe life into RAPI for trunk
Fork ganeti-noded
Create a new ForkingHTTPServer in ganeti-noded by deriving both from NodeDaemonHttpServer and ForkingMixin. This will allow us to process concurrent requests.
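The same double-derivation pattern can be shown with the Python standard library. This is a sketch only: http.server.HTTPServer stands in for NodeDaemonHttpServer, socketserver.ForkingMixIn stands in for the mixin named above, and NodeRequestHandler is a hypothetical handler. ForkingMixIn is only available on platforms that support fork().

```python
import http.server
import socketserver


class NodeRequestHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


class ForkingHTTPServer(socketserver.ForkingMixIn, http.server.HTTPServer):
    """HTTP server that handles each request in a forked child process."""
```

The mixin is listed first so that its process_request() overrides the sequential one from the base server. A server could then be started with ForkingHTTPServer(("127.0.0.1", 0), NodeRequestHandler).serve_forever().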
Fix previous patch using workerpool in masterd
The function to stop a worker pool is TerminateWorkers(), not Shutdown().
Use workerpool in master daemon
Reusing threads instead of starting one for each request is more efficient.
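The efficiency argument can be illustrated with the standard library's ThreadPoolExecutor, used here as a stand-in for the workerpool module mentioned above; handle_request and serve are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(n):
    """Pretend to process a request; also report which thread ran it."""
    return n * 2, threading.current_thread().name


def serve(requests, workers=3):
    """Process all requests on a fixed pool of reusable worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, requests))
```

Ten requests served with workers=3 are handled by at most three distinct threads, rather than by ten short-lived ones.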
Use new HTTP server classes in ganeti-noded
Initial copy of RAPI filebase to the trunk
Remove more old job queue code
Apparently I forgot to remove this code when removing the rest.
Move watcher's LockFile function to utils
Fix double-logging in daemons
Currently, in debug mode, both the logfile handler and the stderr handler will log debug messages. Since the stderr is redirected to the same logfile (to catch non-logged errors), it means log entries are doubled.
The patch adds an extra parameter to the logger.SetupDaemon() function...
ganeti-noded logging improvements
The patch adds some more logging to the node daemon:
- log methods at the beginning, not only at the end
- log method parameters (they are very verbose, but useful)
A separate change is to initialize the global variable in the global...
Remove the old locking functions
This removes (hopefully) all traces of the old locking functions and uses.
Remove old job queue code
Change masterd/client RPC protocol
- Introduce abstraction class on client side
- Use constants for method names
- Adopt legacy function SubmitOpCode to use it
Make luxi RPC more flexible
- Use constants for dict entries
- Handle exceptions on server side
- Rename client function to CallMethod to match server side naming
Instantiate new job queue in master daemon
Create all SUB_RUN_DIRS in ganeti-noded
Rather than just creating BDEV_CACHE_DIR we loop through the SUB_RUN_DIRS list and create all its children.
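The loop described above might look like this sketch. The function name, the default mode and the directory names used to call it are hypothetical; only the idea of iterating over a list of subdirectories comes from the message.

```python
import os


def ensure_dirs(base, subdirs, mode=0o755):
    """Create every missing subdirectory of base; return the paths created."""
    created = []
    for name in subdirs:
        path = os.path.join(base, name)
        if not os.path.isdir(path):
            # The effective permissions are still subject to the umask.
            os.makedirs(path, mode=mode)
            created.append(path)
    return created
```

Calling it a second time with the same arguments creates nothing, which makes it safe to run on every daemon start.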
Fix some issues with the watcher
This patch fixes two bugs:
- the state file is not saved because we use the method for checking for updated data
- in two places 'Error' was used instead of 'Exception', which breaks error handling
Additionally:...
Add custom logging setup for daemons
It's better for daemons if:
- they log only to one log file
- the log level is included
- for debug runs, the filename/line number is included
This patch moves the custom formatter from the watcher to the logging...
ganeti-masterd: Remove unused locking code
Reviewed-by: iustinp, ultrotter
ganeti-masterd: Use logging module
Reviewed-by: ultrotter, iustinp
Context: s/GLM/glm/
Make the GanetiLockManager instance of GanetiContext lowercase
Increase the thread size to 5
Now that we use the locking library to make sure running opcodes cannot step on each other's toes we can have a bigger thread size, and potentially process many opcodes in a parallel manner.
Processor: pass context in and use it.
The processor used to create a new ConfigWriter when it was initialized. We now have one in the context, so we'll just recycle it. First of all we'll pass the context in when creating a new Processor object, then we'll just use context.cfg, which is guaranteed to be initialized, wherever...
ganeti-masterd: init and distribute common context
This patch creates a new GanetiContext class, which is used to hold context common to all ganeti worker threads. As for the GanetiLockingManager class, it is paramount that there is only one such class throughout the execution of Ganeti, so the class checks for that,...
ganeti-noded: Fix handling of QuitGanetiException
- s/GanetiQuitException/QuitGanetiException/
- Look for the arguments in err.args, not err itself
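The err.args point can be shown in a few lines. This is a sketch: the exception class is a local stand-in, the meaning given to its two arguments is an assumption, and handle is a hypothetical wrapper.

```python
class QuitGanetiException(Exception):
    """Local stand-in; the (shutdown, message) argument pair is assumed."""


def handle(fn):
    """Call fn and report whether it asked the daemon to quit."""
    try:
        fn()
    except QuitGanetiException as err:
        # The constructor arguments live in err.args, not in err itself.
        shutdown, message = err.args
        return shutdown, message
    return False, None
```

In current Python, trying to unpack the exception object directly (shutdown, message = err) raises a TypeError, since exceptions are not iterable.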