Fix serializer unittests
Commit d22b29997cd broke the serializer unittests with certain versions of simplejson. This patch removes sort_keys again and implements a slightly more efficient way of detecting simplejson functionality. The serializer unittests no longer...
cfgupgrade: Implement upgrade to 2.1.0
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
bootstrap: Factorize HMAC key generation
Make bootstrap._GenerateSelfSignedSslCert public
cfgupgrade: Remove Ganeti 1.2 support
This also fixes a few typos.
serializer: Sort keys in JSON
Bump version to 2.1.0~beta0
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
mcpu: Use new timeout class for timeout
locking: Convert pipe condition to new timeout class
locking.LockSet: Move timeout calculation to separate class
This class can also be used by mcpu.
locking, mcpu: Ensure timeout is always >= 0.0
locking.LockSet: Improve assertions
locking: Factorize LockSet.acquire
By moving the main code of LockSet.acquire to its own function we reduce the code complexity a bit and clarify the exception handling.
This also fixes a case where a lock acquire timeout wasn't handled correctly, leading to obscure error messages....
mcpu: Make sure added locks are released on errors
Test LockSet.acquire return value for timeout
opcodes: Add missing shutdown_timeout to OpRemoveInstance
luxi: Pass socket path directly to exception, not in tuple
Update NEWS for instance shutdown timeout
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
gnt-* use the correct opcode slot to build opcodes
gnt-* scripts were building wrong opcodes for commands which had the shutdown_timeout slot (due to missing testing after renaming). Fixing.
Also change SHUTDOWN_TIMEOUT_OPT dest field name to "shutdown_timeout":...
Update documentation for recreate-disks
This also clarifies the UUIDs NEWS entry.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
rapi: fix tag operations
This patch fixes the tag PUT/DELETE operations, and additionally changes the Tags* functions to take only positional and not keyword arguments (the defaults do not make any sense at all, and they are always called with all arguments)....
Update NEWS for Ganeti 2.1
Convert NEWS to ASCII
cli: add SHUTDOWN_TIMEOUT_OPT
Add timeout options to other LUs
All the LUs that shut down the instance need to be able to pass the timeout parameter as well.
Update manpages for --shutdown-timeout
mcpu: Change lock attempt timeout calculation
With this patch all timeouts are pre-calculated. The interface of the _LockTimeoutStrategy class is also changed a bit; NextAttempt now returns a new instance.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Code and docstring style fixes
Found using pylint and epydoc.
mcpu: Improve lock reporting with timeouts
mcpu: Implement lock timeouts
The timeout is always between ~0.1 and ~10.0 seconds. A small variation of ±5% is added to prevent different jobs from fighting each other. After 10 attempts to acquire the locks with a timeout, a blocking acquire is made.
Lock status reporting will be improved in a separate patch....
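The timeout schedule described above can be sketched roughly as follows. This is only an illustration: the function name, its parameters and the geometric growth between the bounds are assumptions, since the log only states the bounds, the ±5% variation and the switch to a blocking acquire after 10 attempts.

```python
import random


def lock_attempt_timeouts(attempts=10, lowest=0.1, highest=10.0,
                          jitter=0.05, rng=random):
    """Yield one timeout per attempt, then None for a blocking acquire.

    Hypothetical helper; the geometric growth curve is an assumption.
    """
    for i in range(attempts):
        # Grow from `lowest` to `highest` over the given number of attempts
        base = lowest * (highest / lowest) ** (i / (attempts - 1))
        # Vary by +/- 5% so concurrent jobs do not retry in lockstep
        yield base * (1.0 + rng.uniform(-jitter, jitter))
    # After the timed attempts, fall back to a blocking acquire
    yield None
```

A caller would iterate over the generator, passing each value as the acquire timeout and treating None as "block until acquired".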
mcpu: Remove unused exclusive_BGL attribute
locking.LockSet: Implement acquire timeouts
The timeout passed to LockSet.acquire() is measured over all lock acquires. If LockSet.acquire fails to acquire all requested locks within the specified amount of time, all locks are released again and the acquire fails....
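A minimal sketch of this "one deadline for several acquires" behaviour, using plain threading.Lock objects rather than Ganeti's own lock classes. The function name acquire_all is hypothetical and the real LockSet code is more involved.

```python
import threading
import time


def acquire_all(locks, timeout):
    """Acquire every lock within one overall timeout, or none at all."""
    deadline = time.monotonic() + timeout
    held = []
    for lock in locks:
        # Remaining time is shared by all acquires and never goes below 0.0
        remaining = max(0.0, deadline - time.monotonic())
        if not lock.acquire(timeout=remaining):
            # Timed out: release what we already hold and report failure
            for owned in reversed(held):
                owned.release()
            return False
        held.append(lock)
    return True
```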
Update gnt-instance(8) for shutdown --timeout
Accept shutdown timeout from the user
Using the new --timeout option:
- gnt-instance shutdown is changed to accept a timeout
- the opcode is changed to hold one
- the LU is changed to optionally get one
- the rpc is changed to carry one
- the backend is changed to take it as a parameter rather than...
ChrootManager: clean StopInstance
Currently it has lots of duplicated code, and internal retries. Clean it up with the following assumptions:
We'll probably be called more than once.
It is ok to fail to stop, unless we're called with force=True.
If we're called only once, and with force=True it's ok not to run the...
cli: add a timeout option
KVMHypervisor: use the StopInstance retry feature
Since we know StopInstance is going to be called more than once (at least twice, once with force and once without, but normally quite a lot more) we don't need our own sleep/loop, and we can just send one monitor...
backend.InstanceShutdown: small cleanup
1) unhardcode the timeout, abstracting it in a constant
2) Use time.time() rather than hiding the timeout in a range()
3) call hyper.StopInstance multiple times -- currently all hypervisors just ignore all calls but once...
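The loop shape described in points 2 and 3 might look roughly like this. The function and its stop/is_running callables are hypothetical stand-ins for the backend and hypervisor code, and the retry flag follows the "retry=" commit further down in this log.

```python
import time


def shutdown_instance(stop, is_running, timeout, poll_interval=0.01):
    """Call stop() repeatedly until the instance is down or time runs out.

    `stop` and `is_running` are hypothetical callables, not Ganeti APIs.
    """
    deadline = time.time() + timeout
    tried_once = False
    while time.time() < deadline:
        # Tell the hypervisor whether this is the first attempt or a retry
        stop(retry=tried_once)
        tried_once = True
        if not is_running():
            return True
        time.sleep(poll_interval)
    return False
```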
Add default instance shutdown timeout constant
It reflects the "current" two minutes we give to the instance.
Hypervisors: Add retry= to StopInstance
Currently some hypervisors need the stop operations to be retried more than once, while other ones only do it in one pass. With this change we'll handle retries outside the hypervisor code, but telling whether this is the first try or not....
Get rid of utils.CommaJoin
- We never remember to use it (5 uses vs 21 " ,".join())
- It's longer to write than " ,".join()
- The added value of the apostrophe in the string is not very much
ethers hook: allow more than one daemon pidfile
burnin: skip instance moves on single node
If we have only one node, instance moves fail, because it tries to move the instance to itself. Skipping the operation, because in that case it doesn't make sense.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Match instance and node names case insensitively
Since DNS cannot contain two names with different cases anyway, this should be ok.
Add case_sensitive keyword to MatchNameComponent
Now featuring unit testing, and more deterministic results on some corner cases.
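As an illustration of what a case_sensitive switch on name matching could look like, here is a small sketch. It is not the Ganeti implementation: the function name, the "match the leading name component" rule and returning None on ambiguity are assumptions made for the example.

```python
import re


def match_name_component(key, name_list, case_sensitive=True):
    """Return the single name matching `key` fully or by leading component.

    Hypothetical sketch; returns None if nothing or several names match.
    """
    if key in name_list:
        return key
    flags = 0 if case_sensitive else re.IGNORECASE
    # Match "key" alone or "key" followed by further dot-separated parts
    pattern = re.compile(r"^%s(\..*)?$" % re.escape(key), flags)
    matches = [name for name in name_list if pattern.match(name)]
    if len(matches) == 1:
        return matches[0]
    return None
```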
VNC password: move to hv param and use in kvm
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Implement strict mode for devel/review
This should prevent typos in aliases from going unnoticed.
Check the OS name for variants
If an OS supports variants, unless --force-variant is specified a valid variant must be passed.
Update ganeti-os-interface(7) for API 15
Add force_variant slot to Create/ReinstallInstance
These two opcodes need to know whether an unknown variant must be forced through or not.
Allow --force-variant for instance add/reinstall
Passing this option makes an undeclared variant be passed to the os "as is", hoping it'll be able to figure it out (as per the design doc).
Update client os lists to name+variant format
List of OSes are displayed by gnt-os list, rapi, and gnt-instance reinstall --select-os, and checked by burnin. In all of these show the list with name+variant, if the os has variants.
Add slot and constant for supported OS variants
The slot will contain a list of variants, and the variants file constant contains the file in the os dir which is supposed to hold the list.
Populate OS variants if an api >= 15 is present
Adding the file name to the os_files dict will fill in the full path and get it checked; if present we also read it and split into lines, one per declared variant.
OSEnvironment: populate OS_VARIANT
According to the design on api_version >= 15 the OS variant is the part of the OS name after the "+" sign. If none is found, we just pass in the first variant an OS declares (which is bound to exist, as we check for it in _TryOSFromDisk)....
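The naming rule above (variant is whatever follows the "+", otherwise the first declared variant) can be sketched as below. The helper name split_os_name is hypothetical and only illustrates the rule, not how OSEnvironment is actually written.

```python
def split_os_name(os_name, declared_variants):
    """Return (real OS name, variant) for a "name+variant" style string.

    Hypothetical helper; assumes declared_variants is non-empty.
    """
    name, _, variant = os_name.partition("+")
    if not variant:
        # No variant given: fall back to the first one the OS declares
        variant = declared_variants[0]
    return name, variant
```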
OSFromDisk: handle variants when loading os
When we load an OS from disk, we need _TryOSFromDisk to get the real name, without any variant. This allows any functionality that uses the instance OS to handle a name with a variant.
Add per-node variants list to OS diagnose output
Add "variants" field to LUDiagnoseOS
If selected this field will contain a list of os variants supported on all nodes.
cli.CalculateOSNames
Given an os and its variants, return a list of "full" os names.
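A sketch of such a helper, assuming the "name+variant" format used elsewhere in this log and assuming that an OS without variants is listed under its plain name. The snake_case function name is illustrative and not the actual cli.CalculateOSNames code.

```python
def calculate_os_names(os_name, os_variants):
    """Return the list of "full" OS names for an OS and its variants."""
    if os_variants:
        return ["%s+%s" % (os_name, variant) for variant in os_variants]
    # Assumption: an OS without variants is shown under its plain name
    return [os_name]
```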
gnt-os diagnose: show os variants
We already show the per-node os variants, also show the global ones.
Fix rpc.call_os_get to actually return the OS
Since nobody ever read the actual OS object, this bug was introduced in the rpc conversion.
Convert os api version file name to a constant
TryOSFromDisk: s/os_scripts/os_files/
We'll be using this dict/loop to check more than just scripts, so we're renaming the variables appropriately.
TryOSFromDisk: only check actual os scripts for +x
Currently all checked files in the loop are os scripts, so nothing will change, but in the future we only want the +x bit on actual os scripts, not necessarily all files.
Add support for using the bootloader in xen-pvm
This patch adds three optional parameters:
- 'use_bootloader', whether or not to use the bootloader
- 'bootloader_path', absolute path to the bootloader
- 'bootloader_args', extra arguments to the bootloader...
Disallow "xrange" function
Replace all xrange() with range()
More locking tests race conditions fixes
There were more race conditions. By adding a notify function to SharedLock.acquire we can prevent them.
check-python-code: Show line number for problems
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
LUSetNodeParams: autopromote self when needed
If we're de-offlining or de-draining a node we need to promote it to MC if we don't have enough, or the config will be corrupt.
Abstract self-promotion decision
During node add we decide whether to self promote to an MC. Abstract this decision making to a separate function.
Fix master candidate removal
Currently during a master candidate removal, when it's possible to promote another node, the removal operation fails because of a corrupt config before it's even possible to do the promotion. Fixing this by doing the promotion before, excluding the current node....
LUSetNodeParams: Don't break config on mc demotion.
If --force is used to demote an MC, but then there are not enough MCs in the cluster, the configuration gets corrupted until a node is promoted.
In order to avoid that we only allow demotion with --force if the node...
Master candidate stats, return one more value
Other than returning the current number of candidates, and the number of desired and possible candidates, we also return the maximum possible number, even if greater than our desires. All callers for now ignore...
SingleActionPipeCondition =~ s/Action/Notify/
With this patch we simplify usage on the SingleActionCondition (which wasn't a condition at all) by making it a real condition. This way we can just wait() on it, or notifyAll() as we would on a normal one. The...
testNotification: add more checking about order
Abstract base condition test cases
This way they can be used to test different condition classes.
Move the "done" queue inside _ThreadedTestCase
All (ok, all but one) _ThreadedTestCase users have a done Queue, so we move its building in the _ThreadedTestCase setUp
Abstract "base" condition code in a separate class
Each condition has an underlying lock, the acquire and release methods, and a few helper methods to check that it's called in the proper way.
Abstract them to a separate class so we can have more than one without...
locking.SharedLock: Fix bug in delete function
SharedLock.__acquire_unlocked uses keyword parameters. Just passing the timeout would set the “shared” parameter.
Rename LockSet.acquire parameter “blocking” to “timeout”
Also remove the “blocking” parameter from LockSet.remove and GanetiLockManager.remove. There's no point in implementing timeouts on removal unless we need them.
Try to fix locking unittests
Our automated test system found a few problems in the new locking unittests. This patch should fix them, although I wasn't able to reproduce the problem. All are race conditions.
Change SharedLock to new pipe(2)-based condition
Add _PipeCondition class
_PipeCondition is a condition implemented using pipe(2) and poll(2). It allows the implementation of timeouts without using a busy-wait loop with time.sleep.
Unlike Python's built-in threading.Condition class and to save file descriptors and an internal queue, it can only be used to notify...
Add _SingleActionPipeCondition class
This class will be used as a basic block for pipe(2)-based conditions. Upon initialization it creates a pipe and can be notified once (hence the “single action” in the name). A callable helper class is used to wait for notifications....
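The pipe(2)/poll(2) idea behind these classes can be sketched as follows. This is a simplified, Unix-only illustration with hypothetical names; it leaves out the underlying lock and the waiter helper class, and signalling by writing a single byte is an assumption made for the example.

```python
import os
import select


class SingleNotifyPipe:
    """One-shot notification that waiters poll on with a timeout."""

    def __init__(self):
        # The pipe is created once, at initialization
        self._read_fd, self._write_fd = os.pipe()

    def notify_all(self):
        # The byte is never read, so every waiter sees the pipe as readable
        os.write(self._write_fd, b"x")

    def wait(self, timeout):
        """Return True if notified within `timeout` seconds, without busy-waiting."""
        poller = select.poll()
        poller.register(self._read_fd, select.POLLIN)
        # poll() takes milliseconds and returns an empty list on timeout
        return bool(poller.poll(int(timeout * 1000)))

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)
```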
SharedLock: implement timeouts
This patch greatly simplifies the SharedLock code and implements timeouts for the acquire() and delete() functions. A wrapper around Python's threading.Condition class must be used to ensure thread safety when checking whether there are any waiters left....
Extend confd instances ips query
The query now accepts a link parameter.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Merge remote branch 'origin/master' into mogu
Improve description of migrate/failover post hooks env
Update env vars for instances in hooks documentation
Remove variables which are listed at the beginning of the section and variables which are not declared when building hooks env.
Fix burnin's verbose mode
The timestamp needs special formatting, which was done for the internal buffer storage but not for the messages logged in verbose mode. This patch unifies the formatting for these two cases.
Signed-off-by: Iustin Pop <iustin@google.com>...
Final NEWS update and version increase for 2.0.4
QA passed successfully, let's try to have a release.
Add initial confd client unittests
Some basic tests for the confd client library
confd/client: make it possible to update peer list
Until now the peers have to be the same all the time. Adding a new function to update the list, and call it from the constructor to avoid duplicating code.
Remove ‘-u’ from masterd shebang
This is not needed anymore - the original change was more than a year ago when masterd was in its incipient phase.
confd/client: pass self to upcalls
It may be handy for upcalls to know which client called them, and call it back. So we create a new "client" field in the upcall target, containing the current client instance
devel/upload.in: make it more project generic
Only install ganeti specific files if they exist. This way we can call ganeti's devel/upload in other sub-projects (eg. nbma) and have it uploaded to a host as well, without having to create a new script there....
ConfdFilterCallback: fix a bug in expire
The HandleExpire function takes the whole "up" structure, and not just the salt.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
ganeti-confd: cleanup imports
Many functionalities of confd have been moved to other classes/modules, and the main confd daemon doesn't reference these modules directly anymore.
ganeti-confd: don't depend on the os log dir
ganeti-confd doesn't need to log anything related to os installations.