Convert LUTestDelay to concurrent usage
In order to do so: - We set REQ_BGL to False - We implement ExpandNames
That's it, really.
Reviewed-by: iustinp
Processor: Acquire locks before executing an LU
If we're running in a "new style" LU we may need some locks, as requiredby the ExpandNames function, to be able to run. We'll walk up the locklevels present in the needed_locks dictionary and acquire them, then run...
LogicalUnit: add ExpandNames function
New concurrent LUs will need to call ExpandNames so that any namespassed in by the user are canonicalized, and can be used by hooks,locking and other parts of the code. This was done in CheckPrereqbefore, but it's now splitted out, as it's needed for locking, which in...
Processor: Move LU execution to its own method
This makes the try...finally code simplier, and helps adding a morecomplex locking structure before the actual execution. It also fixes aconcurrency bug caused by the fact that write_count was read beforeacquiring the BGL, and thus spurious config update hooks run could have...
Add a new LockSet unittest
This test checks the LockSet behaviour when an empty list is passed.The current behaviour is expected, but since this is a corner case,we're safer to keep it under a check, and if we need a different onemonitor that everything is as we expect it to be....
constants: Add job and opcode status strings
workerpool: Don't notify if there was no task
Workers have to notify their pool if they finished a task to makethe WorkerPool.Quiesce function work. This is done in the finally:clause to notify even in case of an exception. However, beforewe notified on each run, even if there was no task, thereby creating...
Create all SUB_RUN_DIRS in ganeti-noded
Rather than just creating BDEV_CACHE_DIR we loop through theSUB_RUN_DIRS list and create all its childs.
Add a top level RUN_GANETI_DIR constant
This patch creates a base RUN_GANETI_DIR and then moves the other rundir constants to use that (even if just setting BDEV_CACHE_DIR as equalto it, rather than putting it deeper, for now).
Also we create a constant list of all the subdirs we need in RUN_DIR to...
symlinks: Add DISK_LINKS_DIR constant
The DISK_LINKS_DIR points to the RUN_DIR/ganeti/instance-disksdirectory, which will contain symlinks to the instances' disks. Theseprovide a stable name accross all nodes for them, and permitlive-migration to happen....
luxi: Use serializer module instead of simplejson
serializer.DumpJson: Control indentation by parameter
If the simplejson module supports indentation, it's always used. Thereare cases where we might not want to use it or enable it only fordebugging purposes, such as in RPC.
Add a missing import to cmdlib
cmdlib uses some constants from locking (ie. locking levels) but doesn'timport it. This patch fixes the issue.
Fix an error accessing the cfg
Since the context is passed to LogicalUnit, rather than the cfg, we canonly access the cfg as self.cfg, self.context.cfg, or context.cfg (inthe constructor). cfg is not valid anymore.
Add and remove instance/node locks
Whenever we add an instance or node to the cluster (i.e. to the configand whenever we remove them we should add/remove locks as well). In thefuture we may want to optimize this so that the configwriter does it, orit's handled at the context level, but till we're adding/removing...
Pass context to LUs
Rather than passing a ConfigWriter to the LUs we'll pass the wholecontext, from which a ConfigWriter can be extracted, but we can alsoaccess the GanetiLockManager. This also fixes the places where a FakeLUis created.
mocks: create a FakeContext object
This will be passed to FakeLUs
Fix a typo in LUTestDelay docstring
Locking: remove LEVEL_CONFIG lockset
Since the ConfigWriter now handles its own locking it's not necessary tohave a specific level for the config in the Locking Manager anymore.This patch thus removes it, and all the unittest calls that used it, ordepended on it being present....
ConfigWriter: synchronize access
Since we share the ConfigWriter we need somehow to make sure thataccessing it is properly synchronized. We'll do it using thelocking.ssynchronized decorator and a module-private shared lock.
This patch also renames a few functions, which were called inside the...
Locking: add ssynchronized decorator
This patch creates a new decorator function ssynchronized in the lockinglibrary, which takes as input a SharedLock, and synchronizes access tothe decorated functions using it. The usual SharedLock semantics apply,so it's possible to call more than one synchronized function at the same...
ConfigWriter: remove _ReleaseLock
Remove empty function _ReleaseLock and all its calls. Since we onlyhave one configwriter per cluster the locking needs to cover all thedata in the object, and not just the file contents. Locking inConfigWriter will be handled using the ganeti locking library....
Fix some issues with the watcher
This patch fixes two bugs: - the state file is not saved because we use the method for checking for udpated data - in two places 'Error' was used instead of 'Exception', which breaks error handling
Additionally:...
Add generic worker pool implementation
Reviewed-by: ultrotter
Reuse the luxi client in cli.SubmitOpCode
By a mistake, we don't reuse the luxi client. As such, we open and closethe connection at each poll cycle and spam the server logs.
Add custom logging setup for daemons
It's better for daemons if: - they log only to one log file - the log level is included - for debug runs, the filename/line number is included
This patch moves the custom formatter from the watcher to the logging...
Remove custom locking code from gnt-instance
The gnt-instance script doesn't run in the same process anymore, so wecan't and don't have to unlock.
ganeti-masterd: Remove unused locking code
Reviewed-by: iustinp, ultrotter
ganeti-masterd: Use logging module
Reviewed-by: ultrotter, iustinp
Context: s/GLM/glm/
Make the GanetiLockManager instance of GanetiContext lowercase
Reviewed-by: imsnah
Set locale when using docbook programs
At least docbook2man inserts a date formatted using the currentlocale into its output.
Update .gitignore
Reviwed-by: imsnah
Add a FirstFree function to utils.py
This function will return the first unused integer based on a list ofused integers (e.g. [0, 1, 3] will return 2).
Increase the thread size to 5
Now that we use the locking library to make sure running opcodes cannotstep on each other toes we can have a bigger thread size, andpotentially process many opcodes in a parallel manner.
Processor: acquire the BGL for LUs requiring it
If a LU required the BGL (all LUs do, right now, by default) we'llacquire it in the Processor before starting them. For LUs that don'twe'll still acquire it, but in a shared fashion, so that they cannot run...
Processor: pass context in and use it.
The processor used to create a new ConfigWriter when it was initialized.We now have one in the context, so we'll just recycle it. First of allwe'll pass the context in when creating a new Processor object, thenwe'll just use context.cfg, which is granted to be initialized, wherever...
Add REQ_BGL LogicalUnit run requirement
When logical units have REQ_BGL set (it is currently the default) theyneed to be the only ganeti operation run on the cluster, and we'llguarantee it at the master daemon level. Currently only one thread isrunning at a time, so this requirement is never broken....
Burnin doesn't need a Processor
In 2.0 burnin submits job to the master daemon, so it doesn't need tocreate an internal Processor anymore. Even if the processor is not usedanywhere in the burnin code it was still initialized as a leftover ofhow burnin used to work. Fixing this....
Implement “gnt-job list -o +...”
This adds the same “-o +...” functionality in gnt-job as in the node andinstance scripts.
Fix sstore handling in Processor
- no need to keep the sstore as an object member, remove it- don't reinitialize sstore only if self.cfg is None This is not an issue, as the Processor is recycled for every opcode, but in general we know that (a) we might need a different type of...
Remove duplicate code in hooks unittests
All the tests there used to creare a cfg, a sstore, an opcode and a LU.Put all the duplicate code in the setUp function.
ganeti-masterd: init and distribute common context
This patch creates a new GanetiContext class, which is used to holdcontext common to all ganeti worker threads. As for theGanetiLockingManager class it is paramount that there is only one suchclass throughout the execution of Ganeti, so the class checks for that,...
AddNode: move the initial setup to boostrap
From the master node we can't start ssh and connect to the remote node,nor we can do it from ganeti-noded as this ssh section will possibly askfor key confirmation and password. So the code to copy the ganeti-noded...
AddNode: Check for node existance
In the "new world" we'll need to setup ganeti-noded via ssh on the nodebefore calling the AddNode opcode. Before doing it we'll check that thenode is not already in the cluster, if --readd was not passed. Thisguarantees we're not going to restart ganeti-noded on a running node....
LUAddNode: use node-verify to check node hostname
As we can't use ssh.VerifyNodeHostname directly, we'll set up a mininode-verify to do checking between the master and the new node. In thefuture networking checks, or more nodes, can be added as well.
LUAddNode: use self.sstore, not a local ss
Since we're inside a LU we have access to self.sstore.No need to use ss, which separate instantiation will disappear in a fewpatches! ;)
LUAddNode: upload files via rpc, not scp
We used to scp all the ssconf files, and the vnc password file to thenew node. With this patch we use the upload_file rpc, specifying justthe new node as a destination. All the files previously copied by scpare already allowed by the backend....
Allow VNC_PASSWORD_FILE to be rpc-uploaded
What could possibly go wrong?
Change fping to TcpPing in two LUs
Two LUs are using RunCmd to call fping, in order to check for an IPpresence on the network. Substituting it with TcpPing will get rid ofit, which makes it not break in the new world order, where the mastercannot fork....
raise QuitGanetiException in LeaveCluster
ganeti-noded: Fix handling of QuitGanetiException
- s/GanetiQuitException/QuitGanetiException/- Look for the arguments in err.args, not err itself
Simplify QuitGanetiException instantiation
Rather than packing all the arguments in a tuple, let's pass themplainly. The superclass won't complain.
logger: Set formatter for stderr
Having a timestamp on log messages is very useful. The defaultformat string doesn't include a timestamp.
When removing a node don't ssh to it
Even in 1.2 this behaviour is broken, as the rpc call will remove thessh keys before we get a chance to log in. Now the rpc takes care ofshutting down the node daemon as well, so we definitely can avoid this.
This makes the LURemoveNode operation work again with the threaded...
ganeti-noded: quit on QuitGanetiException
Accoring to the usage documented in the QuitGanetiException docstring,if we receive such an exception we'll set the global _EXIT_GANETI_NODEDvariable to True, and then return either a valid value or an errormessage to the user. This will be the last request we serve, though,...
Add errors.QuitGanetiException
This exception does not signal an error but serves the purpose of makingthe ganeti daemon shut down after handling a request. Currently it willbe used by ganeti-noded but in the future ganeti-masterd might make useof it as well. Its usage is documented in the docstring....
ganeti-noded: serve not quite forever
Rather than calling httpd.serve_forever() in ganeti-noded we'll callhttpd.handle_request() but just while a global variable, which we'llcall _EXIT_GANETI_NODED, remains false.
Add missing empty line in SshKeyError's docstring
Remove spurious check during LUAddNode
There is no point in checking whether the cluster VNC password fileexists as a prerequisite for AddNode, considering the check happens onthe master node, not the target one. Removing this check.
Improve LURemoveNode BuildHooksEnv docstring
devel/upload: Add --no-restart option
If --no-restart is passed to devel/upload, it'll not run"/etc/init.d/ganeti restart" (which kills processes), makingdebugging on a terminal a bit easier.
Cleanup old DRBD 0.7.x code
Apparently there were still some leftovers. While removing an instance,I got the message "unhandled exception 'module' object has no attribute'LD_MD_R1'".
Cleanup LV status computation
Currently, when seeing if a LV is degraded or not (i.e. virtual volume),we first attach to the device (which does an lvdisplay), then do a lvsin order to display the lv_attr. This generates two external commands todo (almost) the same thing....
Add a .gitignore file
This makes it easier to setup new git repositories, and makes it morelikely all people have the same ignore rules.
Add unittests for ganeti.serializer
Remove lib/Makefile.libcommon
Fix gnt-cluster “command” and “copyfile”
Since the disabling of forking in the master daemon, the two ssh-basedsubcommands were not working anymore. However, there is no need at allfor the commands to be run from the master daemon (permissions to readthe cluster private ssh key notwithstanding), they can be run directly...
Handle any exception in ganeti-masterd
If an uncaught exception is thrown currently it destroys the callingthread. This patch changes the behaviour to failing the current job,logging a message, but trying to keep the daemon up.
cfgupgrade: Implement upgrading to Ganeti 2.0 configuration
Makefile.am: Don't create "--" directory
Automake automatically appends "--" to mkdir_p. In case you havea directory named "--" in your source tree, you can remove it usingthe command "rm rf - --".
mkdir_p
objects: Remove config_version from cluster configuration
cfgupgrade: Add main() function
cfgupgrade: Add logging module
Fix the zombie process unittest
The failure is because in high load, the parent gets to run before thechild has the chance to os._exit(), and therefore it is still runningwhen the parent does the check.
The fix removes the chance of this happening by waiting to receive a SIGCHLD...
Bump version to 2.0.0~alpha0
We decided to bump the major number to 2 a few weeks ago due to the huge numberof changes going into it.
Add functions to calculate version number to constants.py
In cfgupgrade, we need to extract parts of and build new version numbers.
utils.WriteFile: Remove optional check_abspath parameter
cfgupgrade will not work with relative paths at all, but rather get themfrom constants.py.
Add a ‘tags’ field to instance and node listing
Currently there isn't any easy way to list all nodes or instance andtheir tags; you have to query each node in turn, or list all the tagsvia something like “gnt-cluster search-tags '.*'”. Of course, this is...
Implement handling of luxi errors in cli.py
Currently the generic handling of ganeti errors in cli.py (GenericMainand FormatError) only handles the core ganeti errors, and not the clientprotocol errors (which live in a separate hierarchy).
This patch adds handling of luxi errors too, and also adds another luxi...
Remove twisted checks from configure.ac
Currently we don't use twisted, so we remove the twisted checks from theconfigure stage.
Reviewed-by: amishchenko
Add a rpc call for BlockDev.Close()
This patch adds rpc layer calls (in rpc.py and the equivalent inganeti-noded) to close a list of block devices, and the wrapper inbackend.py that takes a list of Disk objects, identifies them andreturns correctly formatted results....
Check for docbook2{man,pdf,html}
docbook2{man,pdf,html} are mandatory. "configure" aborts if oneof them isn't found.
Small typo in gnt-instance manpage
Reviewed-by: manuel.franceschini
Use a single Makefile.am instead of many
This change allows us to use cleaner dependencies betweendirectories. The build system is basically rewritten in large partsand may contain bugs.
Fix bdev unittest when run under distcheck
The path to the filename for drbd8 proc data is not correctly computedwhen using distcheck. The patch duplicates it from the other drbd tests.
Rework the DRBD8 device status computation
Currently, compute the status of a drbd8 device in GetSyncStatus andreturn only the values that we need (and fit in the framework ofGetSyncStatus). However, the full status details are useful (and needed)in other places, so the patch attempts to improve this situation....
ganeti-watcher: Replace custom exceptions with ganeti.error.*
ganeti-watcher: Don't write file if data didn't change
This is the safest way to detect changes and the amount of datais small, so keeping a copy around is cheap enough.
ganeti-watcher: Rename WatcherState.data to WatcherState._data
Cleanup: _data is private and should not be modified from outsideof this class.
Don't log SystemExit exception in ganeti-watcher
Replace watcher state file atomically
- Lock it before renaming- Code cleanup; close() automatically unlocks it
Write ganeti-watcher status file even if something failed
Add more parameters to utils.WriteFile
- Make closing file optional: Required by ganeti-watcher to keep file open after writing it. Changes return value of utils.WriteFile if "close" parameter evaluates to True.- Pre- and post-write functions: Can be used to lock files. This...
Use ganeti.serializer module in ganeti-watcher
Replace custom logging code in watcher with logging module
- Log timestamp for all messages- Write everything to logfile and optionally to stderr- Log messages are no longer buffered, allowing a user to see progress
Make sure serialized data ends with EOL character
Also fix the regular expression to not remove newlines. The simplejsonmodule puts whitespace at line endings when using indentation. Removeunnecessary import of ConfigParser module.
Allow disk object to set their own physical ID
Currently, the way to customize a DRBD disk from (node name 1, node name2, port) to (ip1, port, ip2, port) is to use the ConfigWriter methodSetDiskID. However, since this needs a ConfigWriter object, it can be...
Fix an error-handling case
There is a mistake in handling grow-disk for an invalid disk. This patchfixes it.
Manpage updates for the new grow-disk command
The patch documents the steps needed to complete a user-visible grow(i.e. not only grow-disk, but also filesystem resize is needed, etc.)
Implement gnt-instance grow-disk
This patch exposes at command line level the grow-disk operation.