Fix the watcher with down nodes
The watcher didn't handle down nodes; fix this by ignoring (in secondary node reboot checks) any node that doesn't return a boot id.
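The fix described above can be sketched roughly as follows; the function and parameter names are illustrative, not Ganeti's actual code:

```python
# Hypothetical sketch: when checking secondary nodes for reboots, skip
# any node that did not report a boot id (i.e. a down node), instead of
# misinterpreting the missing value as a reboot.

def CheckForNodeReboots(node_boot_ids, known_boot_ids):
    """Return the names of nodes whose boot id changed.

    node_boot_ids: mapping of node name -> boot id reported now,
      with None for nodes that did not answer (down nodes).
    known_boot_ids: mapping of node name -> last recorded boot id.
    """
    rebooted = []
    for name, boot_id in node_boot_ids.items():
        if boot_id is None:
            # Down node: no boot id returned, so we cannot conclude
            # anything about a reboot. Ignore it.
            continue
        if known_boot_ids.get(name) != boot_id:
            rebooted.append(name)
    return rebooted
```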
Reviewed-by: imsnah
Fix the watcher not restarting instance bug
The watcher was using conflicting attributes of the instance:
- it queried the admin_/oper_state, which are booleans
- but it compared those to the status (which is a text field)
The code was changed to query the aggregated 'status' field, as that...
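A minimal illustration of the bug and the fix, with made-up field values (the real Ganeti field names and status strings may differ):

```python
# Before: the watcher compared a boolean state field to a text status.
instance = {"admin_state": True, "oper_state": False, "status": "ERROR_down"}

# Buggy check: a boolean can never equal a text status string, so this
# condition was always False and the instance was never restarted.
buggy_needs_restart = instance["oper_state"] == "ERROR_down"

# Fixed check: query the aggregated text 'status' field directly.
fixed_needs_restart = instance["status"] == "ERROR_down"
```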
Remove last use of utils.RunCmd from the watcher
The watcher has one last use of ganeti commands as opposed to sending requests via luxi. The patch changes this to use the cli functions.
The patch also has two other changes:
- fix the docstring for OpVerifyDisks (found out while converting...
ganeti-noded: Add constant for queue lock timeout
Reviewed-by: iustinp
Implement master startup safety check
This is an initial version of the master startup checks. It's a very rudimentary change, however in normal usage (an old master was started, the rest of the cluster is functioning normally) it will succeed in preventing wrong startups....
Export backend.GetMasterInfo over the rpc layer
We create a multi-node call so that querying all nodes for agreement will be fast.
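A hedged sketch of such an agreement check: ask every node which master it knows (as the multi-node call would), then only proceed if a strict majority agrees with us. All names here are illustrative:

```python
# Illustrative master-agreement check, not Ganeti's actual implementation.

def CheckMasterAgreement(my_name, votes):
    """Decide whether it is safe to start as master.

    my_name: the name this node believes it has as master candidate.
    votes: mapping of node name -> master name that node reports,
      or None for nodes that are down (they didn't answer).
    """
    tally = {}
    for master in votes.values():
        if master is None:
            continue  # down nodes cast no vote
        tally[master] = tally.get(master, 0) + 1
    # Require a strict majority of all queried nodes; unreachable
    # nodes effectively count against us, which is the safe choice.
    return tally.get(my_name, 0) * 2 > len(votes)
```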
Use lock timeout for queue updates in ganeti-noded
This helps to prevent complete deadlocks.
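One common way to implement such a bounded lock acquisition, sketched here with a polling flock() loop; the constant name, timeout value, and polling approach are illustrative, not Ganeti's actual code:

```python
import errno
import fcntl
import time

# Hypothetical constant for the queue lock timeout (seconds).
QUEUE_LOCK_TIMEOUT = 10.0

def AcquireWithTimeout(fd, timeout=QUEUE_LOCK_TIMEOUT, interval=0.1):
    """Try to flock() fd, polling until the timeout expires.

    Raises TimeoutError instead of blocking forever, so a stuck lock
    holder cannot deadlock the whole daemon.
    """
    deadline = time.time() + timeout
    while True:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return
        except OSError as err:
            if err.errno not in (errno.EACCES, errno.EAGAIN):
                raise
        if time.time() >= deadline:
            raise TimeoutError("could not acquire queue lock")
        time.sleep(interval)
```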
noded: Get job queue lock while purging queue content
Only one process should modify the queue at the same time.
Make WaitForJobChanges deal with long jobs
This patch alters the WaitForJobChanges luxi-RPC call to have a configurable timeout, so that the call behaves nicely with long jobs that have no update.
We do this by adding a timeout parameter in the RPC call, and returning...
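The shape of such a timeout-aware wait can be sketched with a condition variable: the call blocks until the job changes or the timeout expires, and in the latter case returns a "no change" marker so the client can simply retry. All names are illustrative, not Ganeti's actual API:

```python
import threading
import time

JOB_NOTCHANGED = "nochange"  # hypothetical sentinel for "timeout, retry"

class JobWatcher:
    """Toy server-side model of a job whose changes clients wait on."""

    def __init__(self):
        self._cond = threading.Condition()
        self._serial = 0       # bumped on every job change
        self._messages = []    # accumulated log messages

    def AddMessage(self, msg):
        with self._cond:
            self._messages.append(msg)
            self._serial += 1
            self._cond.notify_all()

    def WaitForChanges(self, prev_serial, timeout):
        """Block until the job changed or timeout; never block forever."""
        deadline = time.time() + timeout
        with self._cond:
            while self._serial == prev_serial:
                remaining = deadline - time.time()
                if remaining <= 0:
                    return JOB_NOTCHANGED  # client calls again later
                self._cond.wait(remaining)
            return (self._serial, list(self._messages))
```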
Make sure that client programs get all messages
This is a large patch, but I can't figure out how to split it without breaking stuff. The old way of getting messages by always getting the last one didn't bring all messages to the client if they were added...
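The core of the fix as described can be reduced to this sketch: instead of fetching only the last message, the client tracks how many log entries it has already seen and the server returns every entry past that point. Purely illustrative names:

```python
# Hypothetical sketch: return all messages the client has not yet seen,
# so entries added between two polls are never skipped.

def GetNewMessages(all_messages, seen_count):
    """Return the log entries past the client's last-seen position."""
    return all_messages[seen_count:]
```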