Fix execution group of NodeD
The node daemon was executed with the wrong gid (gnt-daemons) instead of the one assigned to it by configure.ac.
Fixes Issue 707.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Start-master/stop-master always fail if confd is disabled
In 'daemons/daemon-util.in', 'start-master' and 'stop-master' always fail if confd is disabled.
Fixes issue 685.
Signed-off-by: Jose A. Lopes <jabolopes@gmail.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
daemon-util: handle luxid in {start,stop}_master()
Luxid was not handled in start_master() and stop_master() at all. As a result, during a master-failover, luxid would be left running on the old master and would not start on the new master, leaving the cluster without management until...
Merge branch 'stable-2.7' into stable-2.8
Conflicts: NEWS: trivial...
daemon-util: pass --oknodo at rotate_logs
daemon-util's rotate_logs() did not pass --oknodo to start-stop-daemon while HUPing the daemon processes. As a result, rotate_logs would fail for a non-running daemon causing rotate_all_logs to exit prematurely....
daemon-util: provide rotate_logs and rotate_all_logs actions
Modify daemon-util to allow sending SIGHUP to one or all daemons. This is meant as a utility function to be used in logrotate definitions.
Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com>...
Fix permission problem related to Issue 477
Commit 91525dee856951ace940c78b6254a1c7344b4803 fixed Issue 477 but broke "gnt-cluster info".
This commit offers a solution to both problems, by changing the permission of the socket instead of changing the permission the confd process is run...
Rename queryd to luxid
As queryd will, in the future, handle all LUXI requests, queue jobs and most likely perform various other tasks, it is renamed to luxid already. This will save some headaches when upgrading Ganeti installations, as we don't have to deal with a daemon rename....
Add queryd daemon (split from confd)
queryd is added as a new daemon which handles configuration queries over LUXI. This functionality was removed from confd, which now only queries over the network.
The queryd user is added to the master group such that it can access...
Set the correct group for confd
Starting confd as a member of the daemons group allows the RAPI daemon to access the LUXI socket.
Fixes Issue 477.
Conflicts: doc/iallocator.rst...
Adjusting permissions after confd start
This is a workaround for issue 477. Confd resets the permissions of the query socket in a wrong way. This patch fixes them after the start of confd.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Add the core of the monitoring daemon
This commit adds the core infrastructure of the monitoring daemon, and integrates it in the build and test systems.
The actual functionality of the monitoring daemon is still completely missing.
Signed-off-by: Michele Tartara <mtartara@google.com>...
Fixes to pass pep8 (make lint)
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Reviewed-by: Iustin Pop <iustin@google.com>
Merge ganeti-master-cleaner back into ganeti-cleaner
As I wrote during/after the review on commit 2958c56, “ganeti-cleaner: Separate queue cleaning code”, while I appreciated the permission separation, I didn't like too much the file-based approach:
- it is a very simple script, and lots of the code is duplicated...
ganeti-cleaner: Separate queue cleaning code
This code does not need to run as root, therefore it's better to split it out. It is now run with the same permissions as the master daemon.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Cleanup of build-time shell variable replacements
Instead of having a different set in (almost) every shell script, this inserts the most commonly used variables at build time. This way the code for injecting a root directory for virtual clusters also is just...
daemon-util: Use function to determine if confd is enabled
… instead of comparing with two different values in two places.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
daemon-util: Support virtual clusters
GANETI_ROOTDIR contains the root directory for the current “virtual node”.
Fix daemon-util with non-root user models
Commit 4b42c3d6 broke non-root user mode: as part of a cleanup, it moved all local variable definitions to the start of the function. However, the plain_name var is only defined later, so this actually doesn't work....
Add support to daemon-util for distributions without start-stop-daemon
This adds support to daemon-util for Red Hat based distributions that do not have a start-stop-daemon. If /sbin/start-stop-daemon is not available, daemon-util will source /etc/rc.d/init.d/functions....
Merge branch 'devel-2.5'
ganeti.initd: Add “status” action
Eric Rostetter sent a patch adding a “status” action, but unfortunately his code was apparently specific to Red Hat. I hope this implementation is more distribution-agnostic; after all “status_of_proc” is part of LSB. Example output:...
serializer: Remove JSON indentation and dict key sorting
Serializing to JSON using “simplejson” is significantly slower when indentation and/or sorting of dictionary keys is used. In simplejson 1.x the difference isn't that big, but with simplejson 2.x the difference...
Adapt daemon-util to ENABLE_CONFD
We still allow explicit shutdown of confd, but we prevent manual or automatic start-up.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
DeprecationWarning fixes for pylint
In version 0.21, pylint unified all the disable-* (and enable-*) directives to disable (resp. enable). This leads to a lot of DeprecationWarning being emitted even if one uses the recommended version of pylint (0.21.1, as stated in devnotes.rst)....
PEP8 style fixes
Identified using the “pep8” utility.
cleaner: Remove watcher's instance status file after 21 days
ganeti-cleaner: Remove old watcher state files
Watcher state files can stay around if node groups are removed. With this patch they're removed after 21 days.
Remove old ensure-dirs (no longer needed)
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Merge branch 'devel-2.3'
Conflicts: lib/cmdlib.py: Trivial...
import-export: Improve timeout error reporting
When the source cluster takes too long to create a snapshot, the destination would time out. Unfortunately no good error message was written unless debug logging was enabled, not even to the log file. This will be improved with this patch....
Conflicts: lib/cmdlib.py: Trivial qa/ganeti-qa.py: Trivial...
ensure-dirs: Speed up when using big queues
The “ensure-dirs” script as included in Ganeti 2.3 is very slow when working with big queues requiring a change of permissions on many or all files.
$ find /var/lib/ganeti/queue/ | wc -l
52354
Before this change:...
Move “rapi_users” file into separate directory
This reduces the number of notifications in “ganeti-rapi”. Until now it was notified for every change in …/lib/ganeti and had to check whether the users file was affected. A symlink is always created in cfgupgrade...
impexpd: Implement support for IPv6
Support timeouts in RunCmd
Further investigations have to be done for merging some of these bits together with import-export daemon which uses similar logic.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Move locking.RunningTimeout to utils
As we need this functionality in other places than just locking, it makes sense to move it to utils rather than keeping it in locking.
Move ganeti-noded to ganeti.server.noded
Move ganeti-rapi to ganeti.server.rapi
Make *.in non-executable
Move ganeti-masterd to ganeti.server.masterd
Move ganeti-confd to ganeti.server.confd
Move ganeti-watcher to ganeti.watcher
Add support and checks for version in LUXI
A new constant, LUXI_VERSION, is used to verify the peer's version. The version is optional, so old(er) clients and servers talking to peers not supporting it won't break. Example with mismatching library:
$ gnt-instance list...
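An optional version check of the kind described in this commit could look roughly like the sketch below. The value of LUXI_VERSION, the "version" key name, the message layout and both helper functions are assumptions made for illustration only; they are not taken from the Ganeti source.

```python
import json

LUXI_VERSION = 1          # hypothetical value, for illustration only
KEY_VERSION = "version"   # hypothetical field name


def format_request(method, args, version=None):
    """Serialize a request; the version field is only added when given."""
    request = {"method": method, "args": args}
    if version is not None:
        request[KEY_VERSION] = version
    return json.dumps(request)


def parse_request(data):
    """Parse a request and reject it only if a version is present and differs."""
    request = json.loads(data)
    version = request.get(KEY_VERSION)
    # Peers that do not send a version at all are accepted unchanged.
    if version is not None and version != LUXI_VERSION:
        raise ValueError("version mismatch: expected %s, received %s" %
                         (LUXI_VERSION, version))
    return request["method"], request["args"]
```

Because the field is optional on both sides, a peer that never sends it still passes the check, which matches the compatibility goal stated in the message.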
LUClusterVerify: Complain if disk is marked faulty
This will show a warning if, for example, one side of a DRBD disk becomes unavailable. The data is collected separately from the other verification data.
Example output:
Adding RPC call for blockdev_wipe
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add a new watcher option --ignore-pause
During cluster maintenance, when the watcher is disabled, it's useful to run it just once. This is inconvenient to do currently, as the watcher needs to be unpaused, then run, then paused again.
This patch adds an option “--ignore-pause” that can be used to ignore...
Fix compatibility with Pyinotify 0.8
I didn't know why the code previously used “pyinotify.EventsCodes.ALL_FLAGS” instead of using the flags from “pyinotify.EventsCodes” directly. Turns out that Pyinotify 0.8 has them in “pyinotify”, not “pyinotify.EventsCodes”....
ganeti-rapi: Watch directory, not file for user file changes
We noticed several issues when just watching the file, among them race conditions upon replacing the file using rename(2) (the new watcher would be created too soon). By just watching the directory for events on...
http.auth.ReadPasswordFile: Don't read file directly
Reading the file before this function allows for better error reporting.
"Fix" handling of old software versions on startup
Currently, masterd startup with old software versions is very confusing for users: we present two tracebacks, with a message in the middle about "version mismatch". This can lead to users believing that all that needs...
Convert ganeti daemons to the three-stage startup
This makes almost all of the daemons show error messages, and not return until they finished listening on the appropriate sockets.
Masterd is the only one "special", as it doesn't do enough initialization in the server creation, only later....
Change utils.GenericMain protocol
Currently, GenericMain does a two-staged workflow:
- Check, before forking
- then Exec, after forking
This means we don't have any possibility to treat preparation work (before the daemon is ready for work) different from the actual work....
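A staged startup that separates preparation from the actual work can be sketched as follows. This is only an illustration: the function name, the name of the middle stage and the log list are assumptions, and the fork itself is left out so the example stays runnable in a single process.

```python
def generic_main(check_fn, prepare_fn, exec_fn, log):
    """Run a daemon's startup in three stages and return an exit code."""
    # Stage 1: cheap sanity checks, run before the daemon would fork.
    check_fn()
    log.append("check")

    # (A real daemon would fork here; this sketch stays in one process.)

    # Stage 2: preparation, e.g. opening listening sockets. A failure here
    # can still be reported, because startup has not been declared finished.
    try:
        prepared = prepare_fn()
    except Exception as err:
        log.append("prepare failed: %s" % err)
        return 1
    log.append("prepare")

    # Stage 3: the actual work, using whatever stage 2 set up.
    exec_fn(prepared)
    log.append("exec")
    return 0
```

With a separate middle stage, a problem such as a socket that cannot be bound can be reported before the caller is told that the daemon is up.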
jqueue: Use timeout when acquiring locks
As already noted in the design document, an opcode's priority is increased when the lock(s) can't be acquired within a certain amount of time, except at the highest priority, where in such a case a blocking acquire is used....
RAPI server: Move user file watching out, update documentation
This patch moves the code watching the users file into a separate class to not mix it with HTTP serving. The users file is now driven from outside the HTTP server class.
Also the documentation is updated to mention the automatic...
Update the authentication mapping in RAPI if users file has been updated
Please note: This only works if the file existed upon startup. If the file was created later, ganeti-rapi has to be restarted.
Modify daemon-util to support launching daemons under different user/groups
Remove utils.EnsureDir as this is done by ensure-dirs.in now
Partial Revert "Let ganeti-rapi run under a different user/group"
This partially reverts commit 8b72b05c51208190796b558233d69dae7643c7f7.
Basically it removes the user-related changes.
Merge branch 'devel-2.2'
Merge branch 'stable-2.2' into devel-2.2
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Allow ensure-dirs to run partially and skip big file chunks
The startup of the daemons would take a lot of time otherwise. Also, it's not needed to set the permissions of those files over and over again, because once the daemons are migrated to the user they will keep creating the files for that user....
Adapt ensure-dirs to accommodate the additional permissions and files
Please note that this can and will be improved over time. There are discussions about automated file generation of ensure-dirs so we can really keep all the permissions and file ownerships in one place. Because right now they are all...
Disable the RAPI CA checks in watcher
Since the RAPI certificate is not necessarily self-signed, and we currently don't have any configuration variable for the real CA file, we disable for now the CA checks. This fixes the 'restart RAPI every 5 minutes' problem with non-self-signed certs....
hansmi helped me with merging the conflict. Thanks
Conflicts: lib/workerpool.py
Add simple lock monitor
This patch adds an initial implementation of a lock monitor, accessible for the user through “gnt-debug locks”. It currently shows all resource locks: BGL, nodes and instances. Config and job queue locks could be shown too, but wouldn't be of much help. The current owner(s) and mode...
Add RPC calls to update /etc/hosts
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Removing all ssh setup code from the core
Support for resolving hostnames to IPv6 addresses
This patch enables IPv6 name resolution by using socket.getaddrinfo instead of socket.gethostbyname_ex.
It renames the HostInfo class to Hostname and unifies its use throughout the code. This is achieved by using static calls where no object is...
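The reason for the switch is that socket.gethostbyname_ex only handles IPv4, while socket.getaddrinfo can return addresses of either family. A minimal sketch of family-aware resolution follows; the helper name resolve_addresses is an assumption and is not part of the Hostname class mentioned in the message.

```python
import socket


def resolve_addresses(hostname, family=socket.AF_UNSPEC):
    """Return the unique IP addresses getaddrinfo reports for hostname.

    With AF_UNSPEC both IPv4 and IPv6 results may be returned; pass
    socket.AF_INET or socket.AF_INET6 to restrict the lookup.
    """
    results = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in results:
        # sockaddr is (address, port) for IPv4 and
        # (address, port, flowinfo, scope_id) for IPv6.
        address = sockaddr[0]
        if address not in addresses:
            addresses.append(address)
    return addresses
```

Passing an explicit family lets a caller ask only for the address type it is prepared to use.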
Introduce new IPAddress classes
This patch unifies the netutils functions dealing with IP addresses to three classes:
- IPAddress: Common IP address functionality
- IPv4Address: IPv4-specific functionality
- IPv6Address: IPv6-specific functionality
Furthermore it adds methods to check whether an address is a loopback...
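The split into a common base class and two family-specific subclasses can be illustrated with the sketch below. The class names follow the list in the commit message; the method names Pack, IsValid and IsLoopback and their implementations are assumptions for illustration and may differ from the actual netutils code.

```python
import socket


class IPAddress(object):
    """Functionality shared by both address families."""
    family = None  # set by the subclasses

    @classmethod
    def Pack(cls, address):
        # inet_pton raises an error for strings that are not valid
        # addresses of the given family.
        return bytearray(socket.inet_pton(cls.family, address))

    @classmethod
    def IsValid(cls, address):
        try:
            cls.Pack(address)
        except (socket.error, ValueError, TypeError):
            return False
        return True


class IPv4Address(IPAddress):
    family = socket.AF_INET

    @classmethod
    def IsLoopback(cls, address):
        # The IPv4 loopback network is 127.0.0.0/8.
        return cls.Pack(address)[0] == 127


class IPv6Address(IPAddress):
    family = socket.AF_INET6

    @classmethod
    def IsLoopback(cls, address):
        # The only IPv6 loopback address is ::1.
        return cls.Pack(address) == bytearray([0] * 15 + [1])
```

Validation lives in the base class and depends only on the family attribute, so each subclass adds just the checks that differ between IPv4 and IPv6.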
workerpool: Change signature of AddTask function to not use *args
By changing it to a normal parameter, which must be a sequence, we can start using keyword parameters.
Before this patch all arguments to “AddTask(self, *args)” were passed as arguments to the worker's “RunTask” method. Priorities, which should be...
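The effect of the signature change can be sketched with a toy queue. The TaskQueue class, the RunAll helper and the priority keyword are hypothetical stand-ins chosen for illustration; they are not the real workerpool API.

```python
class TaskQueue(object):
    """Toy stand-in for a worker pool, used only to show the signature."""

    def __init__(self):
        self._tasks = []

    def AddTask(self, args, priority=0):
        # args is one sequence holding everything meant for the worker,
        # which leaves keyword parameters free for the queue itself.
        if not isinstance(args, (list, tuple)):
            raise TypeError("args must be a sequence")
        self._tasks.append((priority, tuple(args)))

    def RunAll(self, run_task):
        """Run queued tasks, lowest priority value first."""
        results = []
        # sorted() is stable, so tasks with equal priority keep their
        # submission order.
        for _priority, args in sorted(self._tasks, key=lambda t: t[0]):
            results.append(run_task(*args))
        self._tasks = []
        return results
```

With the old “*args” form, a keyword such as the hypothetical priority above could not be told apart from the values destined for the worker.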
masterd: move the IP activation from Exec to Check
Currently, the master IP activation is done in the Exec function. Since the original masterd process returns after forking, and Exec is run in the (grand)child process, this means that after 'ganeti-masterd' has...
Move the UsesRPC decorator from cli to rpc
This is needed because not just the cli scripts need this decorator, but the master daemon too (and it already duplicated the code once).
In cli.py we just leave a stub, so that we don't have to modify all the scripts to import rpc.py....
watcher: smarter handling of instance records
This patch implements a few changes to the instance handling. First, old instances which no longer exist on the cluster are removed from the state file, to keep things clean.
Second, the instance restart counters are reset every 8 hours, since...
Convert RPC client to PycURL
Instead of using our custom HTTP client, using PycURL's multi interface allows us to get rid of the HTTP client threadpool. The majority of the code is still in the ganeti.http.client module.
A simple per-thread HTTP client pool gives cURL a chance to...
Confd IPv6 support
This patch series basically adds a new parameter 'family' to the constructors of daemon.AsyncUDPSocket and confd.client.ConfdUDPClient. This enables the users of these two classes to support IPv6.
In ganeti-confd.ConfdAsyncUDPClient a method to check the address families of...
Introduce lib/netutils.py
This patch moves network utility functions to a dedicated module.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Merge branch 'devel-2.1'
Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Mlockall: decrease warnings if ctypes module is not present
The node daemon prints a lot of warnings if the --no-mlock option is not specified and the ctypes module is not present.
With the following patch the warning is printed only at noded startup.
Signed-off-by: Luca Bigliardi <shammash@google.com>...
Add drbd_helper rpc call
Fix ganeti-rapi version string
This was "broken" for almost a year :)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
RAPI client: Switch to pycURL
Currently the RAPI client uses the urllib2 and httplib modules from Python's standard library. They're used with pyOpenSSL in a very fragile way, and there are known issues when receiving large responses from a RAPI server....
Rename some constants to facilitate IPv6 support
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Add missing pylint disable for "except:"
Why it's needed here but not a few lines above is a mystery that only pylint understands.
Also fix an indentation error in another disable, for the same function.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
masterd: use AsyncTerminatedMessageStream for luxi
Each luxi connection now creates an asyncore MasterClientHandler (which is an AsyncTerminatedMessageStream subclass, sending each message to a client worker). This makes it harder to DOS the master daemon by just...
Introduce an RPC call for OS parameters validation
While we only support the 'parameters' check today, the RPC call is generic enough that it will be able to support other checks in the future. The backend function will both validate the parameters list (so as to...
import/export daemon: Add support for a magic prefix
This “magic” value will be used to ensure that we don't accidentally connect to the wrong daemon (e.g. due to a bug), comparable to DRBD's per-disk secret. Just depending on the SSL certificate isn't enough...
import/export: Validate remote host/port
The hostname and port received from the remote cluster should be validated, just in case.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Handle ESRCH when sending signals
Upon sending signals, ESRCH can be reported when the target no longer exists.
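Tolerating ESRCH when signalling a process can be sketched as follows. The helper name send_signal_ignoring_gone is hypothetical and this is not the code from the commit; it only shows the general pattern with os.kill.

```python
import errno
import os


def send_signal_ignoring_gone(pid, signum):
    """Send signum to pid; return False if the process no longer exists."""
    try:
        os.kill(pid, signum)
    except OSError as err:
        if err.errno == errno.ESRCH:
            # The target exited between the caller's check and the kill
            # call; treat this as "nothing left to signal".
            return False
        # Other errors, such as EPERM, are still real problems.
        raise
    return True
```

Only ESRCH is swallowed, so a permission problem still surfaces as an exception instead of being mistaken for a process that has already exited.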
Remove the job queue drain rpc call
This call was introduced but never used. In two years. Since it's just creating/removing a file it can also be done in simpler ways, without a special rpc call, if/when we need it again. In the meantime, let's give it to history....
Add unittest for ganeti-cleaner
import/export: Allow script to predict size
Once we have a size for an export (in the context of the import/export daemon), we can provide the user with a percentage and ETA.
import/export daemon: Record amount of data transferred
This reports the amount of data transferred and the throughput (averaged over 60 seconds) to the master daemon. While not yet fully implemented, once the export scripts report the expected data size, we can even provide...
ensure-dirs: don't fail if no rapi log is present
Sometimes a node has never been a master or run rapi. In that case we need to create the file (because if later rapi gets started, it won't be able to create it itself).
Let daemon-utils fix the owners for ganeti-rapi
This is a workaround until we have fully switched to user separation and fixes the owners of directories/log files so ganeti-rapi will start flawlessly. This is right now run for every daemon but as it operates on a relatively small subset...
Modify ganeti-masterd to set permission and owner of masterd-socket
Let ganeti-rapi run under a different user/group
Convert ganeti-masterd's main thread to mainloop
Not much changes with this patch. The main loop for the IOServer is replaced by mainloop.Run() and the main thread now uses asyncore to handle connections to the master socket. Once it accepts them, though,...
ganeti-watcher should attempt to fix ganeti-rapi
Update ganeti-watcher so that it tests the master's RAPI port with a simple test (in this case GetVersion). If it fails, make one attempt at restarting ganeti-rapi and retest.
- daemons/ganeti-watcher: Test rapi and make one attempt at restarting it....