History | View | Annotate | Download (98.8 kB)
Add drbd_helper rpc call
Signed-off-by: Luca Bigliardi <shammash@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
VerifyNode: add usermode helper reply
Rename some constants to facilitate IPv6 support
Signed-off-by: Manuel Franceschini <livewire@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Two more fixes for OS params and opcode defaults
If the OS is not using API v20, the parameter verification should beentirely skipped.
The second change is a simple typo.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Fix breakage due to OS parameters
I was using wrong python installation path (thanks Guido!), so I was notactually testing the new backend.py module. Two immediate things bugsare fixed, and after these burnin passes again…
Signed-off-by: Iustin Pop <iustin@google.com>...
Add OS verification support to cluster verify
For this, we needed to extend the NodeImage class with a few extravariables, and we do a trick in the node verification where we pick thefirst node that returned valid OS data as the reference node, and then...
Add OS parameters to cluster and instance objects
The patch also modifies the instance RPC calls to fill the osparameterscorrectly with the cluster defaults, and exports the OS parameters inthe instance/OS environment.
Add support for OS parameters during import/export
Nothing special here, just copy/adjust the beparams code.
LUDiagnoseOS: add more fields, cleanup
This patch exports all the way from backend a new field ‘api_version’which holds the list of support API versions, and exposes the (alreadycomputed) ‘parameters’ field.
The patch also reworks (again) the field calculation in its Exec()...
Add reading of OS parameters from disk
The patch also modifies the internal methods in LUDiagnoseOS and gnt-osto deal with the format change of call_os_diagnose.
Introduce an RPC call for OS parameters validation
While we only support the 'parameters' check today, the RPC call isgeneric enough that will be able to support other checks in the future.The backend function will both validate the parameters list (so as to...
ListVisibleFiles: do not sort output
Among all users, turns out just one may need the output to be sorted.All the others can cope without.
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Split the core-OS and instance-specific env
Since we'll need to be able to generate the OS-specific environmentseparately from the instance one, we move it to a separate function. Wealso add a new OS_NAME env. var which is identical to the INSTANCE_OSone (which won't exist for OS-only environments)....
Merge branch 'devel-2.1' into master
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Balazs Lecz <leczb@google.com>
Fix unsafe variant initializer in _TryOSFromDisk
In case an OS has inconsistent declarations, we might get into a casewhere one node reports a valid variants list (with OS API >=15), andanother node has OS API < 15, in which case its supported_variants gets...
backend: Add support for import/export magic
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Handle ESRCH when sending signals
Upon sending signals, ESRCH can be reported when the target nolonger exists.
Remove the job queue drain rpc call
This call was introduced but never used. In two years.Since it's just creating/removing a file it can also be in simpler ways,without a special rpc call, if/when we need it again. In the meantime,let's give it to history....
backend: Enable export size prediction
Distribute cluster domain secret
The cluster domain secret file was not distributed to other nodes.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Allow control of import/export compression method
For exports to/imports from the same machine, compression willnot be used anymore.
Put common import/export daemon options into object
The X509 key name and CA are passed from cmdlib all the way tothe backend import/export daemon. With the addition of an optionto choose the compression method, another parameter would haveto be passed all the way. By moving these options to a separate...
Merge branch 'devel-2.1'
Add checks for master IP in cluster verify
This also updates a comment in the unittest for utils.py. We unittestthe new function for two things: correct reporting on real case (forlocalhost), and correct reporting with a mocked-out TcpPing that returns...
Conflicts: daemons/ganeti-noded lib/daemon.py lib/rapi/baserlib.py lib/rapi/rlib2.py lib/utils.py
Signed-off-by: Luca Bigliardi <shammash@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix some pylint warnings
Disable warnings for:- except Exception,- use of __errno_location,- redeclaration of handleError()
Signed-off-by: Luca Bigliardi <shammash@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Lock PowercycleNode child in memory
Fix import/export
63bcea2a5 added file checks for import/export, but unfortunately theywere broken.
backend: remove a couple of useless mkdir calls
Those directories must exist for the node daemon to run (it's in thenode daemon's list of ensured directories) and those functions are onlycalled by the node daemon, so there's no point in those checks+mkdir...
backend: Check paths and always write CA file for import/export daemon
Once the import/export daemon uses separate users, the node daemon file (whichis used for intra-cluster transfers) might not be readable anymore. Alwayswriting it to a daemon-specific file will make this easier....
Remove two unused RPC functions
Both of these functions, “snapshot_export” and “instance_os_import”,have been replaced by the instance import/export daemon.
Add RPC call to send SIGTERM to import/export daemon
This will be used to stop the daemon without doing complete cleanup (yet).
utils: Add function to read locked PID file
This is useful in combination with utils.StartDaemon and will be used forreading the import/export daemon's PID file.
Conflicts: doc/security.rst trivial lib/cli.py trivial
Signed-off-by: Balazs Lecz <leczb@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Add CleanupInstance hypervisor call
Currently some hypervisors (namely kvm) need to do some cleanup aftermaking sure an instance is stopped. With the moving of the retry cyclein backend those cleanups were never done. In order to solve this we adda new optional hypervisor function, CleanupInstance, which gets called...
Add RPC calls to import and export instance data
These RPC calls can be used to start, monitor and stop the instance dataimport/export daemon.
backend: Consolidate code opening real block device
Merge remote branch 'devel-2.1'
Export more instance parameters in instance export
Currently the backend parameters are not exported automatically, butonly a few directly in the '[instance]' section. Hypervisor type andhypervisor parameters are not exported at all.
This patch creates two separate sections for the be and hv parameters,...
Export the nicparams too during instance export
The patch tries to export all params (based on the dict defined inconstants), using None for missing keys.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix backend.VerifyNode behaviour for VG problems
In case LVM is broken, backend.GetVolumeList will raise an RPC exception(as expected since it's a function exposed over RPC). Therefore we mustbe prepared to catch any such exceptions, so that we don't fail the...
Add RPC calls to create and remove X509 certificates
Certificates and keys generated using these functions will be used forinter-cluster instance moves. As per design, the private key should neverleave the node.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
backend: Two small style fixes
- Pass keyword parameter as such- Replace “not x == y” with “x != y”
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Rename SSL_CERT_FILE to NODED_CERT_FILE
To be consistent with RAPI_CERT_FILE, the rather generic named“SSL_CERT_FILE” constant is renamed to “NODED_CERT_FILE”. The actual filename is not changed.
Rightname confd's HMAC key
Currently, the ganeti-confd's HMAC key is called “cluster HMAC key” orsimply “HMAC key” everywhere. With the implementation of inter-clusterinstance moves, another HMAC key will be introduced for signing criticaldata. They can not be the same, so this patch clarifies the purpose of the...
utils.CreateBackup: Use human-readable instead of seconds since Epoch
Seconds since the Epoch are not easily readable by a human. Using aformatted timestamp makes it easier (e.g.“….backup-2010-03-12_14_02_43.…”). This patch also makes OS logfiles usethis formatted timestamp....
Improve cluster verify with hypervisor errors
In case the hypervisor has issues on one node, currentlybackend.VerifyNode will exit via an exception (two exit paths possible,one via HypervisorError from hypervisor.Verify(), and one via RPCFailfrom GetInstanceList). This is bad as it invalidates all other checks of...
Fix node volumes list for stripped volumes
Currently backend.NodeVolumes() drops everything except the first PV,thus we get a truncated result. The patch is not the nicest, as Pythondoesn't have a simple `concat' function, so I had to change the listcomprehension to an explicit loop....
Switch more code to PathJoin
This should remove most of the remaining constructs which can bereplaced by PathJoin.
Add caller-validation on Disk.StaticDevPath
Since in objects we don't have access to utils.py, we add a warning thatthe result value from objects.Disk.StaticDevPath might not be a validpath, and change its only caller to validate the path.
Implement disabling of file-based storage
Rationale: the file-based storage backend can add/remove files under acertain directory. However, the master node is also controlling thesetting of the file-based root directory, so basically it means we can'tprevent arbitrary modifications by the master of the node's filesystem....
Replace os.path.sep.join(seq) with utils.PathJoin
This is a no-op change, but at least we concentrate the calls to pathjoins into a single function.
A use in utils.FindFile is left as-is (don't want to raise exceptionsthere, at least for now).
Abstract OS log names computation
The various OS operations create log files in a specific directory(constants.LOG_OS_DIR). The construction of the log names is howeverspread and duplicated across multiple functions.
This patch abstracts this into a separate function that also validates...
Remove superfluous warnings in HooksRunner
For non-existing hooks (the majority of cases probably), logging awarning every time is not helpful. So we first check if we have a validdirectory.
Switch from os.path.join to utils.PathJoin
This passes a full burnin with lots of instances, and should be safe aswe mostly to join a known root (various constants) to a run-timevariable.
Add an extra safety layer to _CleanDirectory
In order to protect from accidental use of _CleanDirectory on a randomdirectory, we add a list of allowed clean directories, somewhat similarto _ALLOWED_UPLOAD_FILES (but statically computed).
Implement utils.RunParts and use it for hooks
This function is a generic pythonic version of runparts. We currentlyuse it in the backend HooksRunner, but we'll use it for runningdifferent directories as well.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Change backend hooks runner to use RunCmd
And save lots of lines of code, in the process
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Merge branch 'stable-2.1' into devel-2.1
Implement debug level across OS-related RPC calls
This doesn't implement the full functionality, we need to add the debuglevel to the opcodes too, but at least won't require changing the RPCcalls during the 2.1 series.
Make the snapshot decision based on disk type
… instead of disk size, which is not as reliable. This actuallysimplifies the code; but it still leaves the possibility of stackoverflows if the disk data structure is corrupted.
Merge branch 'devel-2.0' into devel-2.1
Conflicts: lib/backend.py - trivial merge...
Ensure all int/float conversions are handled right
int()/float() can raise either ValueError (in case of int("a")), orTypeError (in case of int(None)). We had many bugs over time due tothis, and a recent one was just diagnosed, so we go over the codebase...
backend._OSOndiskAPIVersion: remove obsolete arg
The 'name' argument is not used anymore, probably since before 2.0.Since this is an internal function, we can just remove it (from itscaller too).
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Olivier Tharan <olive@google.com>
Convert to static methods (where appropriate)
Many methods are simple pure functions, and not depending on the objectstate. We convert these to staticmethods.
Add targeted pylint disables
This patch should have only:
- pylint disables- docstring changes- whitespace changes
Remove many 'Unused variable' warnings
Note there are some cases left which need extra cleanup.
Add targetted pylint disables
This patch adds targeted pylint disables, where it makes sense (eitherdue to limitations in pylint or due to historical usage), and also a fewblanket ones in rapi where all the names are… “different”.
Merge branch 'stable-2.0' into stable-2.1
Move the hooks file mask into constants.py
This will allow reuse of the same mask for multiple validations.
Security issue: add validation of script names
This patch unifies the search for external script to always go throughutils.FindFile and implements in that function a restriction on validchars in file names and (additionally) that the passed name is the...
gnt-cluster verify: Warn if node time diverges too far
The warning will be generated if the clocks diverge by morethan 150 seconds. Due to the way the RPC system works, wecannot get exact time differences, e.g. if one of thequeried nodes is broken. The comparision is done using a...
Remove quotes from CommaJoin and convert to it
This patch removes the quotes from CommaJoin and converts most of thecallers (that I could find) to it. Since CommaJoin does str(i) for i inparam, we can remove these, thus simplifying slightly a few calls....
Use “daemon-util” to reload SSH keys
Fix pylint 'E' (error) codes
This patch adds some silences and tweaks the code slightly so that“pylint --rcfile pylintrc -e ganeti” doesn't give any errors.
The biggest change is in jqueue.py, the move of _RequireOpenQueue out ofthe JobQueue class. Since that is actually a function and not a method...
Add new “daemon-util” script to start/stop Ganeti daemons
Until now, Ganeti started and stopped its own daemons using custom functions.To start, the daemon was just executed and then sent the appropriate signals tostop it again. Init scripts would have to pay attention to the PID file and...
hypervisors: change MigrateInstance API
Currently the $hypervisor.MigrateInstance takes the instance name. Thispatch changes it to take the instance object, such that other instanceproperties (especially hvparams) are available to it.
Workaround fake failures in drbd+live migration
This patch is an attempt to fix the ugly issue during migration: Cannot resync disks on node …: [True, 100]
If my understanding is correct, sometimes we poll the /proc/drbd file atan inoportune moment, while it's being updated, or while the DRBD device...
Another round of pylint-related style fixes
A newer version of pylint, more warnings…
Implement cluster verify checks for wrong PV names
Since ':' is not a valid character in PV names (for the way Ganeti usesLVM), we need to check this and warn the user. This patch adds a newNV_PVLIST cluster verify check and verifies the PV names returned from...
backend: Convert to utils.Retry
Epydoc fixes
backend: Don't overwrite function parameter with loop variable
Adding '--no-ssh-init' option to 'gnt-cluster init'.
Allows the initialization of a cluster without the creation or distributionof SSH key pairs. Includes changes for LeaveCluster and RPC.
Signed-off-by: Ken Wehr <ksw@google.com>Signed-off-by: Guido Trotter <ultrotter@google.com>...
Try to reduce wrong errors in InstanceShutdown
In backend.InstanceShutdown(), there is a race condition betweenchecking that the instance exists and trying to shut it down whichtranslates sometime in error messages like:
Tue Oct 20 20:08:30 2009 - WARNING: Could not shutdown instance: Failed...
Revert breakage introduced in e4e9b80
Commit e4e9b8064787df01a79846a40f49c8ae06a8eb0e introduced two problemsin backend.InstanceShutdown():
- first, it reduced the check interval significantly (especially for the first few checks); there are very few production VMs that shutdown in...
Introduce checks for /sys and /proc
This patch adds checks for /proc and /sys in cluster verify, sinceGaneti relies on these special filesystems to be mounted.
Add timeout options to other LUs
All the LUs that shut down the instance need to be able too pass thetimeout parameter as well.
Code and docstring style fixes
Found using pylint and epydoc.
Accept shutdown timeout from the user
Using the new --timeout option:
- gnt-instance shutdown is changed to accept a timeout- the opcode is changed to hold one- the LU is changed to optionally get one- the rpc is changed to carry one- the backend is changed to take it as a parameter rather than...
backend.InstanceShutdown: small cleanup
1) unhardcode the timeout, abstracting it in a constant2) Use time.time() rather than hiding the timeout in a range()3) call hyper.StopInstance multiple times -- currently all hypervisors just ignore all calls but once...
Populate OS variants if an api >= 15 is present
Adding the file name to the os_files dict will fill in the full path andget it checked, if present we also read it and split into lines, one perdeclared variant.
OSEnvironment: populate OS_VARIANT
According to the design on api_version >= 15 the OS variant is the partof the OS name after the "+" sign. If none is found, we just pass in thefirst variant an OS declares (which is bound to exist, as we check forit in _TryOSFromDisk)....
OSFromDisk: handle variants when loading os
When we load an OS from disk, we need _TryOSFromDisk to get the realname, without any variant. This allows any functionality that uses theinstance OS to handle a name with a variant.
Add per-node variants list to OS diagnose output
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Olivier Tharan <olive@google.com>
TryOSFromDisk: only check actual os scripts for +x
Currently all checked files in the loop are os scripts, so nothing willchange, but in the future we only want the +x bit on actual os scripts,not necessarily all files.