Remove the logger.py module
Since we now use only one function from the logger module (SetupLogging), we move it to utils.py (which is already imported by all users of this function), and we remove the module.
Reviewed-by: imsnah
Improvements to the master startup checks
In order to account for future improvements to master failover, we move the actual data gathering capabilities from ganeti-masterd into bootstrap.py, and we leave only the verification in masterd.
The verification procedure is then changed to retry multiple times (up...
Add an interface for the drain flag changes/query
This adds the set/reset in the jqueue and luxi modules, and a way to query it in OpQueryConfigValues, and also the command line interface for it:
$ gnt-cluster queue info
The drain flag is unset
$ gnt-cluster queue drain...
Implement transport of ganeti errors across luxi
This patch adds a generic method to identify the ganeti error given its class name, and implements this across the luxi protocol.
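As an illustration only, looking up an error by its class name could be sketched as below. This is not the code from the patch: the exception classes, the helper names and the dict-based wire format are hypothetical stand-ins chosen for the example.

```python
class GenericError(Exception):
    """Hypothetical base class standing in for the project's error hierarchy."""


class LockError(GenericError):
    """Hypothetical subclass used only for this illustration."""


# Registry mapping class names to classes (an assumption of this sketch).
_ERROR_CLASSES = {cls.__name__: cls for cls in (GenericError, LockError)}


def get_error_class(name):
    """Return the error class called `name`, or None if it is unknown."""
    return _ERROR_CLASSES.get(name)


def encode_error(err):
    """Serialize an exception as class name plus arguments for transport."""
    return {"class": type(err).__name__, "args": list(err.args)}


def decode_error(payload):
    """Rebuild the exception on the receiving side.

    Unknown class names fall back to the base class so the message survives.
    """
    cls = get_error_class(payload["class"]) or GenericError
    return cls(*payload["args"])
```

With such a mapping the client can re-raise the same class of error that the server reported, instead of a generic failure.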
Convert rpc module to RpcRunner
This big patch changes the call model used in internode-rpc from standalone function calls in the rpc module to calls via an RpcRunner class that holds all the methods. This can be used in the future to enable smarter processing in the RPC layer itself (some quick examples are not...
Implement job 'waiting' status
Background: when we have multiple jobs in the queue (more than just a few), many of the jobs (up to the number of threads) will be in state 'running', although many of them could be actually blocked, waiting for some locks. This is not good, as one cannot easily see what is...
Implement job auto-archiving
This patch adds a new luxi call that implements auto-archiving of jobs older than a certain age (or -1 for all completed jobs), and the gnt-job command that makes use of this (with 'all' for -1).
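The age rule can be sketched as follows. This is a minimal illustration, not the patch itself: the job representation (dicts with `id`, `finished` and `end_time` keys) and the function name are assumptions made for the example.

```python
import time


def select_archivable(jobs, age, now=None):
    """Return the ids of completed jobs that should be archived.

    `age` is in seconds; -1 selects every completed job regardless of age.
    """
    now = time.time() if now is None else now
    selected = []
    for job in jobs:
        if not job["finished"]:
            continue  # only completed jobs are candidates
        if age == -1 or now - job["end_time"] > age:
            selected.append(job["id"])
    return selected
```

For example, with `now=1000`, a job that ended at 100 is selected for `age=500`, while one that ended at 900 is only selected for `age=-1`.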
Convert ganeti-master
Use simpleconfig instead of ssconf.
Reviewed-by: iustinp
Add new query to get cluster config values
This can be used to retrieve certain cluster config values from within clients.
OpDumpClusterConfig was not used anywhere, hence I'm just reusing it. The way ConfigWriter.DumpConfig returned the configuration was not thread-safe, anyway (no deepcopy)....
Implement master startup safety check
This is an initial version of the master startup checks. It's a very rudimentary change; however, in normal usage (an old master was started, the rest of the cluster is functioning normally) it will succeed in preventing wrong startups....
Make WaitForJobChanges deal with long jobs
This patch alters the WaitForJobChanges luxi-RPC call to have a configurable timeout, so that the call behaves nicely with long jobs that have no update.
We do this by adding a timeout parameter in the RPC call, and returning...
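A wait with a timeout can be sketched with a condition variable, as below. This is only an illustration under stated assumptions: the class and method names are hypothetical, and returning None when the timeout expires is a choice made for this example (the description above is truncated and does not say what the real call returns).

```python
import threading


class JobState:
    """Hypothetical holder of a job's status, with a change counter."""

    def __init__(self):
        self._cond = threading.Condition()
        self._serial = 0
        self._status = "queued"

    def update(self, status):
        """Record a new status and wake up all waiters."""
        with self._cond:
            self._status = status
            self._serial += 1
            self._cond.notify_all()

    def wait_for_changes(self, prev_serial, timeout):
        """Block until the serial differs from `prev_serial` or `timeout` expires.

        Returns (serial, status) on a change, or None on timeout (assumption).
        """
        with self._cond:
            changed = self._cond.wait_for(
                lambda: self._serial != prev_serial, timeout)
            if not changed:
                return None
            return (self._serial, self._status)
```

A client would call `wait_for_changes` in a loop, passing the last serial it has seen, and simply retry whenever it gets the timeout value back.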
Make sure that client programs get all messages
This is a large patch, but I can't figure out how to split it without breaking stuff. The old way of getting messages by always getting the last one didn't bring all messages to the client if they were added...
Use Linux-specific way to name master socket
By using this Linux-specific way we don't have to care about removing the socket file when quitting or starting (after an unclean shutdown). For a more detailed description, see the comment in the patch.
Reviewed-by: schreiberal
Add RPC call to wait for job changes
This way clients can react faster to status or message changes and don't have to poll anymore.
Reviewed-by: ultrotter
Add query function for exports
Notify job queue about added/removed nodes
The job queue maintains its own node list and must be notified when nodes are added/removed.
Implement {Add,Readd,Remove}Node in GanetiContext
By doing this we have a central place which coordinates what needs to be done when adding or removing nodes. Another patch will add calls into the job queue.
Two log messages move to config.py.
When removing a node, node_leave_cluster is now called after it has...
jqueue: Don't pass the list of nodes to SubmitJob anymore
The job queue now maintains its own list and is updated when nodes are added or removed from the cluster.
masterd: Move job queue into context object
The job queue must be called from cmdlib when adding or removing nodes to the cluster. Moving it to the context object makes this possible.
Implement query for nodes
Implement query for instances
Queries don't create jobs and are more efficient. Log messages are not yet stored anywhere.
Unify SetupDaemon/SetupLogging
The 'old-style' info, error, debug logs do not make much sense. This patch unifies the SetupLogging and SetupDaemon functions. As a result, all the commands log to a 'commands.log' file.
The patch also changes the log setup to keep going if there's an error...
Rework master startup/shutdown/failover
This (big) patch reworks the master startup/shutdown and fixes the master failover.
What does the patch do?
For master start/stop:
 - remove the old ganeti-master script and its associated man page
 - moves the ip start/stop directly into the backend.(Start|Stop)Master...
Implement checking for the master role in rapi
This patch moves the CheckMaster function from ganeti-masterd to ssconf (most logical place, it cannot go in utils since we would have recursive imports between ssconf and utils) and changes ganeti-rapi to also call...
Use constants for the pid file stems
Fix RPC parameters for {Cancel,Archive}Job
They aren't tuples on the client side.
ganeti-masterd: write and remove pidfile
Distribute the queue serial file after each update
This patch adds distribution of the queue serial file after each write to it (but before a new job is created and written with that ID, and before a response is returned, so we should be safe from crashes in...
Use new signal handler class in master daemon
Fix previous patch using workerpool in masterd
The function to stop a worker pool is TerminateWorkers(), not Shutdown().
Use workerpool in master daemon
Reusing threads instead of starting one for each request is more efficient.
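The idea of reusing a fixed set of threads can be sketched as below. This is a generic illustration and not the project's workerpool module: the class and method names are hypothetical, and only the Python standard library is used.

```python
import queue
import threading


class SimpleWorkerPool:
    """Hypothetical pool: a fixed set of threads consuming a shared task queue."""

    def __init__(self, num_workers):
        self._tasks = queue.Queue()
        self._threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_workers)
        ]
        for thread in self._threads:
            thread.start()

    def _run(self):
        # Each worker loops over tasks instead of exiting after one request.
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel: stop this worker
                break
            func, args = task
            try:
                func(*args)
            finally:
                self._tasks.task_done()

    def add_task(self, func, *args):
        """Queue a callable; an idle worker will pick it up."""
        self._tasks.put((func, args))

    def terminate(self):
        """Wait for queued tasks to finish, then stop all workers."""
        self._tasks.join()
        for _ in self._threads:
            self._tasks.put(None)
        for thread in self._threads:
            thread.join()
```

However many requests are submitted, they are served by the same `num_workers` threads, which avoids paying the thread start-up cost per request.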
Remove more old job queue code
Apparently I forgot to remove this code when removing the rest.
Fix double-logging in daemons
Currently, in debug mode, both the logfile handler and the stderr handler will log debug messages. Since the stderr is redirected to the same logfile (to catch non-logged errors), it means log entries are doubled.
The patch adds an extra parameter to the logger.SetupDaemon() function...
Remove the old locking functions
This removes (hopefully) all traces of the old locking functions and uses.
Remove old job queue code
Change masterd/client RPC protocol
- Introduce abstraction class on client side
- Use constants for method names
- Adapt legacy function SubmitOpCode to use it
Make luxi RPC more flexible
- Use constants for dict entries
- Handle exceptions on server side
- Rename client function to CallMethod to match server side naming
Instantiate new job queue in master daemon
Add custom logging setup for daemons
It's better for daemons if:
 - they log only to one log file
 - the log level is included
 - for debug runs, the filename/line number is included
This patch moves the custom formatter from the watcher to the logging...
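The three points listed above can be sketched with the standard logging module as follows. This is an illustrative setup, not the function added by the patch: the function names, the logger name and the exact format strings are assumptions.

```python
import logging


def build_formatter(debug=False):
    """Return a formatter that includes the level, plus module:line in debug mode."""
    if debug:
        fmt = "%(asctime)s: %(levelname)s %(module)s:%(lineno)d %(message)s"
    else:
        fmt = "%(asctime)s: %(levelname)s %(message)s"
    return logging.Formatter(fmt)


def setup_daemon_logging(logfile, debug=False, name="daemon"):
    """Attach exactly one file handler to the named logger and return it."""
    handler = logging.FileHandler(logfile)
    handler.setFormatter(build_formatter(debug))
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # a single destination, so no doubled entries
    logger.propagate = False     # keep records away from root handlers
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    return logger
```

Keeping a single handler per daemon is what prevents the same record from being written twice to the same file.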
ganeti-masterd: Remove unused locking code
Reviewed-by: iustinp, ultrotter
ganeti-masterd: Use logging module
Reviewed-by: ultrotter, iustinp
Context: s/GLM/glm/
Make the GanetiLockManager instance of GanetiContext lowercase
Increase the thread size to 5
Now that we use the locking library to make sure running opcodes cannot step on each other's toes we can have a bigger thread size, and potentially process many opcodes in a parallel manner.
Processor: pass context in and use it.
The processor used to create a new ConfigWriter when it was initialized. We now have one in the context, so we'll just recycle it. First of all we'll pass the context in when creating a new Processor object, then we'll just use context.cfg, which is guaranteed to be initialized, wherever...
ganeti-masterd: init and distribute common context
This patch creates a new GanetiContext class, which is used to hold context common to all ganeti worker threads. As for the GanetiLockingManager class, it is paramount that there is only one such instance throughout the execution of Ganeti, so the class checks for that,...
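A check of this kind can be sketched as below. It is a minimal illustration of a class that refuses a second instantiation, not the code of the patch: the class name, the `cfg` argument and the use of RuntimeError are assumptions.

```python
class SharedContext:
    """Hypothetical context object that may be created only once per process."""

    _instance = None

    def __init__(self, cfg):
        # Refuse a second instance so that all threads share the same state.
        if SharedContext._instance is not None:
            raise RuntimeError("double SharedContext instance")
        self.cfg = cfg
        SharedContext._instance = self
```

The first call succeeds and every worker thread receives that object; any later attempt to build another one fails loudly instead of silently creating diverging state.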
Handle any exception in ganeti-masterd
If an uncaught exception is thrown currently it destroys the calling thread. This patch changes the behaviour to failing the current job, logging a message, but trying to keep the daemon up.
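The behaviour described here can be sketched as a wrapper around job execution. This is an illustration only: the dict-based job, the status strings and the function name are assumptions, not the daemon's actual data model.

```python
import logging


def run_job(job, execute):
    """Run `execute(job)`; on any exception fail the job instead of the thread."""
    try:
        job["result"] = execute(job)
        job["status"] = "success"
    except Exception as err:
        # Log with traceback, mark the job as failed, and keep the worker alive.
        logging.exception("Error while processing job %s", job["id"])
        job["status"] = "error"
        job["result"] = str(err)
    return job
```

Because the exception is caught at the job boundary, the worker thread returns normally and can pick up the next job.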
Fail job on ganeti exceptions
When a Job raises a ganeti exception a message is printed but nothing is reported in the job itself. It's better to update the job status, thus notifying the client, possibly polling for the job result, of what went wrong.
ganeti-masterd: Some docstrings work
- Add a docstring to IOServer's constructor
- Add argument description to PoolWorker's and JobRunner's ones
Disable forking in the master daemon
This patch adds a mechanism to disable utils.RunCmd in selected programs. This is needed in the master daemon unless we confirm threading doesn't pose any problems.
This makes cluster init fail, but creating new trunk clusters is anyway...
Move the 'cmd' lock from cli.py to ganeti-masterd
This patch removes the lock and the lock options from cli.py and moves them to the master.
Later during development we can remove it completely, but for now it's good to protect any other tool that uses the lock directly....
Convert cli.SubmitOpCode to use the master
This patch converts the cli.py SubmitOpCode method to use the unix protocol and thus execute the opcodes via the master.
The patch allows a partial burnin to work with the master. Currently the query opcodes, since they are executed via the SubmitOpCode, are...
Add per-opcode results to job processing
This patch changes the definition of a job and introduces per-opcode results.
First, the result and status fields of a job are condensed into a single 'status' attribute. Then, we introduce an opcode status and one result...
Implement forking/master role checking in masterd
This patch adds checks for the master role and daemonize support to ganeti-masterd.
The patch modifies the startup/shutdown of the server because:
 - we want bind()/listen() to the master socket to occur before forking...
Add a simple gnt-job script
This patch adds a very basic gnt-job script that allows job querying. This goes on top of the previous master daemon patches.
Currently, because of the not-changed cmd lock, you can't query the jobs as long as a job is running - you have to rm the cmd lock and then you...
Initial tests with ganeti-masterd
This patch adds a very in-progress master daemon. This needs to be launched manually, does not background itself, but can be used for opcode execution.
Also parts of this code should be moved to luxi.py.