ganeti-confd: don't depend on the os log dir
ganeti-confd doesn't need to log anything related to os installations.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Make ganeti-watcher use the standard debug option
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Remove RpcResult.RemoteFailMsg completely
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
ganeti-confd: remove partial imports
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
ConfdAsyncUDPServer: handle signals at read time
Currently if a signal is delivered during an attempted read, an exception is logged in the logfile. There is no need for this, so we handle this case explicitly.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
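Illustration for the read-time signal handling described above: a minimal Python 3 sketch, not the actual ConfdAsyncUDPServer code. The function name recv_ignoring_signals and the default buffer size are assumptions made for this example. It treats an interrupted read (EINTR) as "nothing to read right now" instead of letting the exception propagate into the log.

```python
import errno


def recv_ignoring_signals(sock, bufsize=4096):
    """Read one datagram; return None if a signal interrupted the read.

    Hypothetical helper for illustration only.
    """
    try:
        return sock.recvfrom(bufsize)
    except OSError as err:
        if err.errno == errno.EINTR:
            # A signal arrived during the read: not an error, just retry
            # on the next readable event.
            return None
        raise
```

On Python 3.5 and later the interpreter retries interrupted system calls itself (PEP 475), so the explicit check matters mainly on older interpreters.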
ConfdAsyncUDPServer: defer handling writes
Currently if we fail writing to the socket (perhaps because a signal was delivered) we lose the data we were sending. Although this is not too bad (it's UDP, and data may get lost anyway) we try to avoid this by...
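One way to avoid losing a datagram on a failed write is to queue it and only drop it from the queue once the send succeeded. The sketch below assumes that approach; the class name DeferredUDPSender and its methods are hypothetical and are not taken from the confd sources.

```python
import collections
import errno


class DeferredUDPSender:
    """Queue outgoing datagrams and send them when the socket is writable."""

    def __init__(self, sock):
        self._sock = sock
        self._out_queue = collections.deque()

    def enqueue(self, payload, address):
        # Nothing is sent here; the event loop calls handle_write() later.
        self._out_queue.append((payload, address))

    def writable(self):
        # Tell the event loop whether we have anything left to send.
        return bool(self._out_queue)

    def handle_write(self):
        if not self._out_queue:
            return
        payload, address = self._out_queue[0]
        try:
            self._sock.sendto(payload, address)
        except OSError as err:
            if err.errno in (errno.EINTR, errno.EAGAIN):
                # Keep the datagram queued and try again next time.
                return
            raise
        # Only forget the datagram once it was handed to the kernel.
        self._out_queue.popleft()
```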
Abstract AsyncUDPSocket to daemon
This allows this extended asyncore+udp module to be used also in other daemons, and in the confd client library
ConfdAsyncUDPServer: fix a docstring
It refers to an older input variable
Add a magic fourcc code to confd packets
This will make it easier to change the protocol later on
ganeti-confd: explicitly log failed big sends
Make sure that if we try to send packets which are too big (which shouldn't happen) this gets properly logged in the config file.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>
Move fourcc packing/unpacking to main confd module
This way it can be used by the client as well
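A fourcc is a four-byte tag placed at the start of each packet, so that sender and receiver can share one pair of pack/unpack helpers. The sketch below shows the idea only: the magic value b"demo" and the function names are made up for this example and are not the values used by confd.

```python
MAGIC_FOURCC = b"demo"  # hypothetical value, chosen for this example only


def pack_magic(payload):
    """Prefix a payload (bytes) with the four-byte magic code."""
    return MAGIC_FOURCC + payload


def unpack_magic(packet):
    """Check and strip the magic code; raise ValueError on a mismatch."""
    if len(packet) < len(MAGIC_FOURCC):
        raise ValueError("packet too short to contain a fourcc code")
    if packet[:len(MAGIC_FOURCC)] != MAGIC_FOURCC:
        raise ValueError("unexpected fourcc code")
    return packet[len(MAGIC_FOURCC):]
```

A future protocol revision could then be told apart by using a different four-byte code.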
Keep lock status with every job
This can be useful for debugging locking problems.
Move OpCode processor callbacks into separate class
There are two major arguments for this:
- There will be more callbacks (e.g. for lock debugging) and extending the parameter list is a lot of work.
- In the jqueue module this allows us to keep per-job or per-opcode variables in...
Wrap lines over 80 characters
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>
gnt-cluster watcher: Show more information
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
confd: avoid spamming the logfile
When confd is disabled we don't want to be notified at every timer interval.
Merge branch 'next' into branch-2.1
Move SimpleConfigReader creation to ConfdProcessor
This will be useful to make ConfdProcessor aware of a config failure, without quitting confd.
ConfdProcessor: add disabled state
This is the state the processor enters if it fails to load the config.
confd: start in polling mode
This allows us not to enable the inotify handler immediately, which makes things easier for us should the config file not exist at all.
Confd: don't fail if the config doesn't load
Rather than quitting we'll just continue to poll the config at a slow rate, hoping that sooner or later we'll get it back. This also allows working on non-MC nodes, and smoothly transitioning from MC to non-MC,...
confd: s/confd_event_handler/inotify_handler/
In a case we don't encounter frequently (file modified but not overwritten) the notify handler we use is called with a wrong name.
Add script to clean archived jobs after 21 days
Implement timers in confd
Timers are used both for checking for inotify failures, and for polling, should inotify notices become too frequent.
ConfdInotifyEventHandler.enable: use InotifyError
Rather than raising ConfdFatalError directly, ConfdInotifyEventHandler.enable raises InotifyError should it not be able to configure inotify, allowing the caller to decide what to do.
ConfdInotifyEventHandler, move to a callback
ConfdInotifyEventHandler used to reload the config whenever a notification arrived. It now moves to a callback system, so that ConfdConfigurationReloader can be responsible for that functionality.
Additionally the inotify class no longer reenables itself automatically,...
Move creation of inotify handler to a new class
This class will be responsible for managing inotify notifications, timers, and rate-limiting reloads. For now none of these features is implemented. :)
ConfdInotifyEventHandler: add enable/disable
Make it possible to enable and disable the inotify event handler. The inotify handler will remain enabled, unless explicitly told to disable itself.
Move the luxi error handling into errors.py
Currently the luxi error handling is hardcoded as special encoding on the masterd-side and special decoding on the client side. This patch moves it to errors.py such that other parts of the code can reuse the same encoding....
ganeti-watcher: Don't run if paused
Add file to pause watcher for a certain duration
This can be used during maintenance work.
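A pause file for maintenance windows could be checked along the following lines. This is a sketch under stated assumptions: it assumes the file holds a single Unix timestamp marking the end of the pause, and the function name watcher_paused_until is hypothetical. The actual file format and path used by ganeti-watcher are not given in the log above.

```python
import time


def watcher_paused_until(path, now=None):
    """Return the end of the pause as a float, or None if not paused.

    Assumes (for this example) that the file contains one Unix timestamp.
    """
    if now is None:
        now = time.time()
    try:
        with open(path) as fh:
            end_time = float(fh.read().strip())
    except (OSError, ValueError):
        # Missing or unparsable file: treat the watcher as not paused.
        return None
    # An expired timestamp means the pause is over.
    return end_time if end_time > now else None
```

A watcher run would call this first and exit early when the result is not None.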
ganeti-masterd: Master voting in separate process
One shouldn't fork a Python process after using threads. Master voting is done before forking (utils.Daemonize), but it also uses threads. Hence it's now called from a separate process.
This patch also fixes the check function to actually exit if...
ganeti-masterd: Add helper to run function in separate process
This will be used to do the master voting.
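Running a function in a separate process keeps any threads it starts out of the parent, which can then fork safely afterwards. The sketch below is a generic POSIX illustration of that technique; the name run_in_separate_process and the convention of mapping the function's truth value to the exit code are assumptions for this example, not the helper added to ganeti-masterd.

```python
import os


def run_in_separate_process(fn):
    """Run fn() in a forked child; return True if it returned a true value."""
    pid = os.fork()
    if pid == 0:
        # Child: report the outcome through the exit code only.
        exit_code = 1
        try:
            if fn():
                exit_code = 0
        finally:
            # os._exit skips cleanup handlers inherited from the parent
            # and also ends the child if fn() raised.
            os._exit(exit_code)
    # Parent: wait for the child and translate its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```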
Style fixes for ganeti-*
Add disk copy support at backend and the rpc level
This uses a simple 'dd if=… | ssh $target dd of=…' method, like the ExportSnapshot (which uses the OS export; here we want full disk-level copy and not any FS-level changes).
Signed-off-by: Iustin Pop <iustin@google.com>...
Convert ganeti-confd to Mainloop
Now that mainloop is asyncore-enabled we can easily do that.
Convert ganeti-masterd to @utils.SignalHandled
Add RPC call for storage operations
Merge commit 'origin/next' into branch-2.1
Remove unused imports from confd files
confd.server and daemons/ganeti-confd import a few modules they don't actually use. Clean them up.
Remove a few unused imports from noded/masterd
Signed-off-by: Guido Trotter <ultrotter@google.com>
Initial confd implementation
ganeti-confd is a simple asynchronous daemon, which listens on a UDP port, passes each packet to a processor, and sends back to the client the result.
It also listens on an inotify socket, in order to reload its configuration when the ganeti config file changes....
Merge branch 'master' into next
Handle None result from BlockdevFind
Use objects for blockdev_getmirrorstatus RPC call result
This patch changes the return type for backend.BlockdevGetmirrorstatus from a list of tuples to a list of objects.BlockDevStatus instances.
Use object for blockdev_find RPC call result
This patch changes the return type for backend.BlockdevFind to an object (objects.BlockDevStatus). Before, a tuple was used. Adding more values to this tuple causes a lot of work. Converting the result to an object with...
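The motivation for moving from a tuple to an object can be shown with a small example. The class below only borrows the name BlockDevStatus from the commit message; its fields and both functions are illustrative assumptions, not the definitions in objects.py or backend.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockDevStatus:
    """Illustrative status object; the field names are assumptions."""
    dev_path: str
    sync_percent: Optional[float] = None
    is_degraded: bool = False


def find_device_old_style():
    # Tuple result: every caller unpacks by position, so adding a value
    # means touching all of them.
    return ("/dev/example0", None, False)


def find_device_new_style():
    # Object result: callers read named attributes, and a new field with
    # a default value does not break existing callers.
    return BlockDevStatus(dev_path="/dev/example0")
```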
rpc: add rpc call for getting disk size
Note that this exports the disk size as bdev returns it, in bytes. The value will be converted to MiB in cmdlib.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Add RPC calls to modify storage fields
Add RPC calls for storage unit list
Extend call_node_start_master rpc with no_voting
When the parameter is set to True and start_daemons is also True, ganeti-masterd will be started with the new --no-voting --yes-do-it options.
This new option is set to True only on masterfailover, when no_voting is...
Collapse SSL key checking/overriding for daemons
Collapse daemon's main function
With three ganeti daemons, and one or two more coming, the daemon's main function was becoming too much cut&pasted code. This collapses most of it into a daemon.GenericMain function. Some more code could be collapsed between the two http-based daemons, but since the new daemons won't be...
Slightly abstract the daemon logfile lookup
The original LOG_<DAEMON_NAME> constants for daemon logfiles are gone. In their place there is a DAEMONS_LOGFILES dict, indexed by daemon name.
This is a minor change intended to unify most of the daemons' main() function code, which is very similar from one to the other....
Remove <DAEMON>_PID constants
The <DAEMON>_PID constants were created to reference a daemon pid file, but actually contain a daemon's name, because the various functions that work with pidfiles derive the filename from the daemon name themselves. Removing the constants and using the actual daemon name...
Move rapi to GetDaemonPort
Currently rapi is the only daemon which accepts a port option, rather than querying its own port from services, and falling back to the default if not found. Changing this to conform to what other daemons do.
Also update the ganeti-rapi(8) manpage...
Change GetNodeDaemonPort to GetDaemonPort in utils
GetNodeDaemonPort is used to look up the node daemon port in the services file, and if not found to return the default one. We make it a generic function, which accepts the daemon name as input, so that it can be used...
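The lookup-with-fallback behaviour described above can be sketched with the standard library. This is not the utils.GetDaemonPort code: the function name get_daemon_port, the DEFAULT_PORTS table, and the daemon name and port in it are hypothetical and exist only for this example.

```python
import socket

# Hypothetical defaults for this example; not Ganeti's real port table.
DEFAULT_PORTS = {
    "zz-example-daemon": 12345,
}


def get_daemon_port(daemon_name):
    """Return the port for daemon_name from the services database,
    falling back to a built-in default when there is no entry."""
    try:
        return socket.getservbyname(daemon_name, "tcp")
    except OSError:
        # No matching entry (for example in /etc/services).
        return DEFAULT_PORTS[daemon_name]
```

Taking the daemon name as a parameter is what lets every daemon share the same lookup.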
Remove references to utils.debug
Various modules set it to True when called in debugging mode, but the utils module supports no such global.
ganeti-rapi, replace hardcoded exit value
Substitute exit(1) with exit(constants.EXIT_FAILURE). Also fix a wrongly indented line.
Add the bind-address option to ganeti-rapi
noded: Abstract hard-coded sys.exit value
On machines without the ssl file noded exits with '5'. Changing this to constants.EXIT_NOTCLUSTER.
Also utils.GetNodeDaemonPort hasn't raised errors.ConfigurationError for a while, so removing that try/except block....
Add a luxi call for multi-job submit
As a workaround for the job submit timeouts that we have, this patch adds a new luxi call for multi-job submit; the advantage is that all the jobs are added to the queue and only afterwards can the workers start processing them....
Conflicts:
daemons/ganeti-masterd...
ganeti-masterd: avoid SimpleConfigReader
SimpleStore is a lot less heavyweight than SimpleConfigReader, and to just get the master name we can use that. This is the only usage of SimpleConfigReader currently, but we're not going to delete the class, as new usages will come in for ganeti-confd (in 2.1). Using it there,...
ganeti-masterd: allow non-interactive --no-voting
This will be used by ganeti-noded to start ganeti-masterd in a --no-voting masterfailover.
Fix problem with EAGAIN on socket connection in clients
If a user used ^Z to stop the program, poll() in socket.recv would return EAGAIN due to SIGSTOP. This patch changes luxi.Transport.Recv to ignore EAGAIN.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Rename the volume_list RPC call to lv_list
There are volume-related rpc calls. This patch renames the ‘volume_list’ call to ‘lv_list’ to make its purpose clearer.
Simplify the RPC result framework in backend.py
Since now all functions fail via _Fail, the return True, … is redundant as all normal return paths have it, and thus the True value can be added in the ganeti-noded handler.
This means that all functions can now forget about the special result...
Implement result-type restriction in ganeti-noded
Since all rpc calls were converted, we can now:
- enforce result type to (status, data)
- convert all unhandled exceptions to (False, str(err))
This makes sure that all unhandled errors are reported to rpc users....
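The two rules from this commit (a fixed (status, data) result shape, and unhandled exceptions turned into (False, str(err))) can be expressed as a thin wrapper around each handler. The sketch below assumes that shape; call_rpc_handler is a hypothetical name, and the real checks in ganeti-noded may differ.

```python
def call_rpc_handler(handler, *args):
    """Call an RPC handler and always return a (status, data) tuple."""
    try:
        result = handler(*args)
    except Exception as err:
        # Any unhandled error is reported to the RPC user, not dropped.
        return (False, str(err))
    if not (isinstance(result, tuple) and len(result) == 2):
        # Enforce the result type instead of passing odd values along.
        return (False, "Invalid result from handler: %r" % (result,))
    return result
```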
Big rewrite of the OS-related functions
Currently the OSes have a special, customized error handling: the OS object can represent either a valid OS, or an invalid OS. The associated functions, instead of raising an exception or failing, create custom OS objects representing failed OSes....
Convert the jobqueue rpc to new style result
This patch converts the job queue rpc calls to the new style result. It's done in a single patch as there are helper functions (in both jqueue and backend) that are used by multiple rpcs and need synchronized change....
Convert os_diagnose rpc to new style result
This also removes custom post-processing from rpc.py; since this call has only one user, it was simple to move it back to the caller.
Convert call_version rpc to new style result
This also cleans up its single use in cmdlib.py.
Convert node_leave_cluster rpc to new style result
This patch converts this rpc call to the new style result, and in the process also changes the meaning of the QuitGanetiException's arguments and the node daemon rpc call exception handler.
The problem with the exception handler is that we used a two-stage one,...
Convert node_start_master to new style result
This is used in multiple places outside cmdlib.py, so it's a more interesting patch.
Convert node_has_ip_address rpc to new style
This should actually have a function in backend, but it's fine for now.
Convert instance_list rpc to new style result
Since backend.GetInstanceList() is used both as RPC endpoint and as internal function, it can't return (status, value). Instead it returns only valid instance info, and failures are denoted by exceptions; and...
Convert volume_list rpc to new style result
This is a big change, because we need to cleanup its users too.
The call, and thus the LUVerifyDisks LU, used to differentiate between failure at node level and failure at LV level, by returning different types in the RPC result. This is way too complicated for our needs....
Convert export_info rpc to new style result
This also removes some code from ganeti-noded and rpc.py, which should not do such processing of data (and be simply glue code). (Or alternatively they could, if we had better infrastructure).
rpc: Add a simple failure reporting framework
This patch adds a simple failure reporting tool, similar to bdev's _ThrowError. In backend, we move towards the new-style RPC results (of type (status, payload)) and thus functions which use this style can very...
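A failure-reporting helper of this kind typically formats a message, logs it, and raises one dedicated exception that the RPC layer can turn into a failed result. The sketch below illustrates that pattern only; the names RPCFail, fail and remove_volume are assumptions for this example and are not copied from backend.py.

```python
import logging


class RPCFail(Exception):
    """Signals a failed backend operation to the RPC layer."""


def fail(msg, *args):
    """Format msg with args, log it, and raise RPCFail."""
    if args:
        msg = msg % args
    logging.error(msg)
    raise RPCFail(msg)


def remove_volume(name, known_volumes):
    # Backend functions return only their payload; every error path goes
    # through fail() instead of returning a (False, message) tuple.
    if name not in known_volumes:
        fail("Volume %s not found", name)
    known_volumes.remove(name)
    return name
```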
Add a node powercycle command
This (somewhat big) patch adds support for remotely rebooting the nodes via whatever support the hypervisor has for such a concept.
For KVM/fake (and containers in the future) this just uses sysrq plus a ‘reboot’ call if the sysrq method failed. For Xen, it first tries the...
watcher: automatically restart noded/rapi
This patch makes the watcher automatically restart the node and rapi daemons, if they are not running (as per the PID file).
This is not an exhaustive test; a better one would be TCP connect to the port, and an even better one a simple protocol ping (e.g. get / for rapi...
watcher: handle full and drained queue cases
Currently the watcher is broken when the queue is full, thus not fulfilling its job as a queue cleaner. It also doesn't handle the queue drained status nicely.
This patch does a few changes:
- first archive jobs, and only after submit jobs; this fixes the case...
Merge branch 'master' into branch-2.1
watcher: write the instance status to a file
This patch modifies the watcher to keep on-disk a file with the instance status; this can be used from outside of ganeti to react to instances being down (when the watcher cannot restart them).
watcher: try to restart the master if down
Bugs in either our code or in associated libraries can bring the master daemon down, and this (due to the 2.0 architecture) stops all work on the cluster.
Since the watcher already does periodic checks on the cluster, we modify...
Inform the OS create script of reinstalls
Sometimes reinstalls are slightly different than new installs. For example certain partitions may need to be preserved across reinstalls. In order to do that on a per-os basis we pass in the INSTANCE_REINSTALL variable to inform the create script about when a reinstall is...
ganeti-noded: add bind address option
This allows ganeti-noded to bind only on one interface rather than all the ones on the machine. The default behaviour doesn't change.
Fix luxi serialization in ganeti-masterd
Currently, lib/luxi.py uses lib/serializer.py for encoding/decoding messages, but the master daemon uses the simplejson module directly. This is wrong as any non-trivial change to serializer.py will break the master daemon....
Disable synchronous (locking) queries
This patch raises an error in the master daemon in case the user requests a locking query; accordingly, all clients were modified to send only lockless queries. This is a short-term fix; for a proper fix the clients should be modified to submit a job when the user request a...
Fix the output of watcher on non-master nodes
Currently the watcher spews error messages on non-master nodes. This cleans it up.
Reviewed-by: imsnah
Change the watcher to use jobs instead of queries
As per the mailing list discussion, this patch changes the watcher to use a single job (two opcodes) for getting the cluster state (node list and instance list); it will then compute the needed actions based on...
Add some more debugging info to masterd
This patch will log data about queries, which are today completely invisible (at the default log level) in the master log file.