Pass ssconf values from master to node
Instead of parsing the configuration on the node, we pass the ssconf values from the master.
Reviewed-by: iustinp
Correct GetAllInstancesInfo rtype
GetAllInstancesInfo, in the backend, returns just a dict, not a dict of dicts.
Add RPC call to update ssconf files
Convert trunk to posix-compatibility
We change two functions to use RunCmd without shell, and the other (which needs an ssh command line) is changed to the '>... 2>&1' syntax.
Reviewed-by: imsnah
Update backend.py docstrings
This patch converts all of backend.py to epydoc formatting.
Fix another error handling case
The return from this error path is a dict, but the actual return value (on the non-error path) is a list of dicts. Change accordingly.
Fix an error handling case
Found while reviewing documentation.
Reviewed-by: ultrotter
OSFromDisk remove superfluous empty line
Some documentation updates
This fixes a few doc issues and converts a few docstrings to epydoc.
Export the disk index in the import/export scripts
We want to export the disk index as some OSes will only want to export the first disk (or the second one, etc.), even if we have multiple disks.
The patch also updates the backend.ExportSnapshot docstring....
Convert ImportOSIntoInstance to OS API 10
- Change ImportOSIntoInstance not to get any "os_disk" and "swap_disk" arguments but to accept multiple target images to import, and to return a list of booleans with the result of each import
- Change the relevant rpc call and the only caller to conform...
Convert ExportSnapshot to OS API 10
We pass the data via the environment rather than on the command line, as API 10 says. All the rest remains the same, and we export just one disk, as the master calls this function for every snapshotted disk.
LUExportInstance: snapshot all disks
Rather than just snapshotting the "sda" disk, we'll snapshot all of the instance disks. If we can't snapshot a disk for any reason we'll log an error and proceed anyway: in this case the resulting export will miss a disk. This also changes all the warning messages to self.LogWarning()....
Convert SnapshotBlockDevice's docstring to epydoc
Cleanup os_add/rename rpc for OS API 10
- remove now unused osdev and swapdev arguments from backend, noded, rpc, cmdlib
- convert docstrings to epydoc
Temporarily explicitly break import/export
Since they're not converted to API 10 yet, we temporarily disable the import/export functions.
AddOSToInstance: convert to api10
RunRenameInstance: convert to api10
Add new OSEnvironment function
This function calculates the basic environment for OS scripts in API version 10.
OSFromDisk: use script names from constants
Change OSFromDisk's docstring to epydoc
Plus update it with the real variable name
Add a rpc call for changing the drain flag
A new multi-node call is added that sets/resets the drain flag.
Change the backend to use the beparams
The backend.FinalizeExport function is changed to use the beparams instead of the instance attributes. Future enhancements should be done in order to export and import/reuse the whole be/hv params.
Temporary fix for dual hvm/pvm instances
We have a problem with the current model of combining instance lists from multiple hypervisors: we don't allow duplicates, but "xm list" gives the same output for both pvm and hvm. This is a shortcoming in the actual xen hypervisor implementation/split between pvm and hvm, but for now we...
Export the hypervisor.ValidateParameters over RPC
The newly-added node-specific ValidateParams hypervisor method is exported over RPC, using the semi-standard (success, message) return value. Multi-node call, so that we call on both primary and secondary at...
Abstract checking own address into a function
Currently, we check if we have a given ip address (i.e. it's alive on one of our interfaces) by manually calling TcpPing(source=localhost). This works, but having it spread all over the code makes it hard to...
OS API: support for multiple versions in an OS
Allow multiple api versions in an OS. This is according to the OS API changes design doc, by which an OS can support multiple versions of the Ganeti API and if one is supported by Ganeti it will work. Since up to...
Move the hypervisor attribute to the instances
This (big) patch moves the hypervisor type from the cluster to the instance level; the cluster attribute remains as the default hypervisor, and will be renamed accordingly in a next patch. The cluster also gains...
rpc.call_instance_migrate: pass the whole instance
Currently the call_instance_migrate call only passes the instance name; we need to pass the whole object for the hypervisor_type changes (all the other individual instance rpc calls already pass the instance...
backend.py change to get cluster name from master
Currently there are three functions in backend that need the cluster name in order to instantiate an SshRunner. The patch changes these to get the cluster name from the master in the rpc call; once the multi-hypervisor...
Fix SshRunner breakage from the changed API
More places actually use the SshRunner than just the gnt-cluster commands.
Convert ssh.py
Get rid of ssconf and convert to configuration instead.
Convert hypervisor
Replacing ssconf with configuration.
Convert backend.py
Replacing ssconf with simpleconfig.
Never remove job queue lock in node daemon
Otherwise, corruption could occur in some corner cases. E.g. when LeaveNode is running in a child and is in the process of removing queue files, the main process gets killed, started again and gets a request to update the queue. This is a rather extreme corner case,...
Change backend._GetMasterInfo to return more data
The _GetMasterInfo() function needs to export the master name too to be useful in master safety checks. This patch makes it a public (no _) function and adds a third element in the return tuple. Its callers are...
Pass hypervisor type to the OS scripts
It's handy to make the os scripts know which hypervisor the instance is going to run under. In order not to change the os API we pass this information in the environment, where the os scripts can access it if they're hypervisor-aware....
Don't always remove queue lock when queue is purged
The lock should only be removed if ganeti-noded is going to quit. Otherwise it needs to be kept to prevent another process from creating it again while we're still holding the (removed) lock. This is due to...
backend: Add optional exclusion list to _CleanDirectory
The code cleaning the queue will make use of it.
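A minimal sketch of what a directory cleaner with an exclusion list could look like. The name clean_directory, its signature and its return value are assumptions made for illustration; this is not the actual backend._CleanDirectory code.

```python
import os


def clean_directory(path, exclude=()):
    """Remove regular files directly under `path`, skipping excluded ones.

    Hypothetical helper: `exclude` holds full paths that must be kept
    (for example a lock file). Returns the sorted list of removed paths.
    """
    if not os.path.isdir(path):
        return []
    # Normalize both sides so that differently spelled paths still match.
    excluded = {os.path.normpath(name) for name in exclude}
    removed = []
    for rel_name in os.listdir(path):
        full_name = os.path.normpath(os.path.join(path, rel_name))
        if full_name in excluded:
            continue
        # Only plain files are removed; symlinks and subdirectories stay.
        if os.path.isfile(full_name) and not os.path.islink(full_name):
            os.remove(full_name)
            removed.append(full_name)
    return sorted(removed)
```

Passing the path of a lock file in `exclude` keeps it in place while the other files in the directory are removed.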
noded: Add RPC function to rename job queue files
This will be used to archive jobs.
backend: Add function to check whether file is in queue dir
Another function will need to check whether its parameters are job queue files.
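One way such a check might be written is sketched below. The function name, the default queue path and the normalization approach are assumptions for illustration only and are not taken from backend.py.

```python
import os

# Assumed location, used only as a default for this illustration.
QUEUE_DIR = "/var/lib/ganeti/queue"


def is_in_queue_dir(file_name, queue_dir=QUEUE_DIR):
    """Return True if `file_name` names a path inside `queue_dir`.

    Both paths are normalized first, so "../" components cannot be
    used to point outside the queue directory.
    """
    queue_dir = os.path.normpath(queue_dir)
    candidate = os.path.normpath(file_name)
    # The trailing separator keeps "/queue-old/x" from matching "/queue".
    return candidate.startswith(queue_dir + os.sep)
```

Comparing against the directory plus a trailing separator, rather than the bare prefix, avoids accepting sibling directories whose names merely start with the queue directory name.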
Disallow uploading job queue files through upload_file
The job queue is now updated through its own RPC functions.
Add job queue RPC functions
jobqueue_update: Uploads a job queue file's content to a node. The most common operation is to upload something that we already have in a string. Unlike in the upload_file function, the file is not read again when distributing changes, but content has to be passed...
Move function cleaning directory to module level
JobQueuePurge() will be used by an RPC function.
Clean job queue directories when leaving cluster
Old job files shouldn't be left on nodes removed from a cluster.
Allow job queue files to be uploaded through ganeti-noded
This is needed for job queue replication.
Fix pylint-detected issues
This is mostly:
- whitespace fix (space at EOL in some files, not all, broken indentation, etc)
- variable names overriding others (one is a real bug in there)
- too-long-lines
- cleanup of most unused imports (not all)...
Fix some errors detected by pylint
Rework master startup/shutdown/failover
This (big) patch reworks the master startup/shutdown and fixes the master failover.
What does the patch do?
For master start/stop:
- remove the old ganeti-master script and its associated man page
- moves the ip start/stop directly into the backend.(Start|Stop)Master...
Add a new parameter to backend.(Start|Stop)Master
This patch adds a new, unused for now, parameter to the start and stop master operations in backend. The idea behind it is that we need to be able to control whether the IP (de)activation is coupled with daemon...
Distribute the queue serial file after each update
This patch adds distribution of the queue serial file after each write to it (but before a new job is created and written with that ID, and before a response is returned, so we should be safe from crashes in...
Convert backend.py to the logging module
The patch also switches some of the exception logs to use logging.exception (and therefore the log message will have a different format).
(Note that this might not be a good choice in all cases, though)
Fix backend.NodeVolumes handling of LVM output
This is the same fix as for GetVolumeList.
I've checked manually and all other places that call lvm commands are already checking the output validity in terms of correct number of fields.
Fix backend.GetVolumeList handling of LVM output
Sometimes ‘lvs’ can spit error messages on stdout, even when one wants to parse the output: ...Inconsistent metadata copies found - updating to use version 2776...
So we need to validate the output to guard against such cases....
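The kind of validation described here can be sketched as follows. The separator, the three-field layout (name, size, attributes) and the function name are assumptions chosen for illustration; they are not the actual GetVolumeList implementation.

```python
def parse_lvs_output(output, sep="|"):
    """Parse separator-delimited `lvs` output, skipping malformed lines.

    Hypothetical layout: each valid line is "name|size|attr". Lines
    with a different number of fields, or with a non-numeric size
    (such as stray error messages on stdout), are ignored.
    """
    volumes = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(sep)
        if len(fields) != 3:
            # Not a data line, e.g. an LVM warning mixed into stdout.
            continue
        name, size, attr = fields
        try:
            size_mb = float(size)
        except ValueError:
            continue
        volumes[name] = (size_mb, attr)
    return volumes
```

With this check, a message such as the "Inconsistent metadata copies found" line is dropped instead of being unpacked as if it were volume data.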
Allow VNC_PASSWORD_FILE to be rpc-uploaded
What could possibly go wrong?
raise QuitGanetiException in LeaveCluster
Add a rpc call for BlockDev.Close()
This patch adds rpc layer calls (in rpc.py and the equivalent in ganeti-noded) to close a list of block devices, and the wrapper in backend.py that takes a list of Disk objects, identifies them and returns correctly formatted results....
Expose block device grow in backend.py
This patch adds a wrapper over the block device grow operation that converts the input and output parameters as needed for the rpc layer.
Add migration support at the rpc layer
This patch adds the migration rpc call and its implementation in the backend. The patch does not deal with the correct activation of disks.
Because of the new RPC, the protocol version is increased.
Implement node daemon connectivity tests
This patch adds to gnt-cluster verify checks for inter-node tcp communication on the node daemon port for both the primary and (if defined) secondary networks.
The output looks like (4-node cluster, one with the secondary interface...
Reduce chance of ssh failures in verify cluster
The cluster verify builds a sorted list of nodes and passes that to all the nodes (in parallel) for ssh checks. This means that for a cluster with N nodes, there will be approximately N simultaneous connections to...
Remove non-existing arguments from some docstrings
A few docstrings in the HooksRunner backend class list arguments the relevant functions do not take. Clean them up.
Move iallocator script execution to ganeti-noded
Currently the iallocator execution takes place in the master, which is a violation of the current architecture, and will create problems with a threaded master daemon.
This patch moves the execution to the backend, similar to the hooks...
backend.FinalizeExport: safely initialize some vars
This patch initializes nic_count and disk_count with 0. This prevents some reference errors if the snap_disks block device list is empty.
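A small sketch of the failure mode and the fix, assuming the counters are set by loops over the nics and the snapshotted disks. The function name and the key names in the output are hypothetical and do not reflect the real export file format.

```python
def summarize_export(snap_disks, nics):
    """Build summary lines for an export (illustrative format only)."""
    # Start from 0 so both names exist even if a loop body never runs;
    # without this, an empty list would leave the counter unbound.
    nic_count = 0
    disk_count = 0
    lines = []
    for nic_count, mac in enumerate(nics, start=1):
        lines.append("nic%d_mac=%s" % (nic_count - 1, mac))
    for disk_count, dump in enumerate(snap_disks, start=1):
        lines.append("disk%d_dump=%s" % (disk_count - 1, dump))
    lines.append("nic_count=%d" % nic_count)
    lines.append("disk_count=%d" % disk_count)
    return lines
```

Without the two initial assignments, calling the function with an empty snap_disks list would raise an UnboundLocalError when disk_count is read after the loop.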
Move the OS search code into an abstract function
Based on the previous OS search code changes, we can now move the OS search code into a generic look-for-file function in utils.py. This means that the allocator code can use the same function.
Change backend._OSSearch return values
Currently, the function backend._OSSearch() returns the (first) base dir in which this OS can be found. Thereafter the full actual path to the OS dir is built in the backend.OSFromDisk() function.
This patch changes this so that _OSSearch() always returns the full path...
Backend directory functions for file backend
Add _[Create,Remove,Rename]FileStorageDir functions which are needed for file-based instance management. These functions check whether the given directory to operate on is under the cluster-wide defined default file...
Move SSH functions into a class
This renames some functions and does some minor codestyle cleanup.
Replace custom file writing code with utils.WriteFile
Fix master role stop on cluster destroy
Currently the cluster destroy doesn't remove the master role, which means that the IP address of the cluster remains assigned to the master node.
This patch fixes this and also a docstring in backend.StopMaster()....
Small comment fix.
Fixes small spelling mistakes and comments
Alter the device activation code
This tiny patch fixes the breakage that the previous patch about activation did by removing the Close() call after activation.
The initial reason for that call was that if the device is already active and open, but we need it closed, we close it automatically....
Export bridge information too
gnt-backup export used to export the ip and mac of each nic, but not which bridge it was connected to. Adding this information.
Fix VG listing broken by r510
LVM code sometimes adds an extra separator at the end of the field list. Make the code strip it if it exists.
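Stripping an optional trailing separator before splitting can be sketched like this. The function name, the ":" separator and the sample field values are assumptions for illustration, not the actual VG listing code.

```python
def split_vgs_line(line, sep=":"):
    """Split one line of separator-delimited output into fields.

    A single trailing separator, if present, is dropped first so it
    does not produce a spurious empty last field.
    """
    line = line.strip()
    if line.endswith(sep):
        line = line[:-len(sep)]
    return line.split(sep)
```

Only one trailing separator is removed, so a genuinely empty last field followed by the extra separator is still preserved.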
Make backend._GetVGInfo check the validity of 'vgs'
Currently, the function backend._GetVGInfo only checks for errors via the exit code of the 'vgs' command. However, there are other ways of failure so we need to also check for valid output before parsing....
Change a hardcoded path into its proper constant
The function backend.UploadFile still uses "/etc/hosts" directly instead of the existing constant; this patch fixes this.
Two small style fixes
This is a merge from the 1.2 branch
Improve verify-disks: broken/missing LV detection
This patch improves the ‘gnt-cluster verify-disks’ command by adding support for detecting broken volume groups and missing logical volume names.
As such, we don't try anymore to activate disks for instances that are...
Return more data in rpc.call_volume_list
Currently, the volume_list call returns only the volume size. However, it is useful to also have two other things: the 'inactive' state of the volume (which might trigger a ‘vgchange -a y’ on the volume group) and...
On OS creation errors, write logfile path to ganeti-noded's logfile.
Reviewed-by: schreiberal
Output reading fix for backend.NodeVolumes()
Use result.stdout instead of result.output to avoid potential confusion caused by error messages merged in from stderr.
Modify GetVolumeList so output on stderr from lvs doesn't break it.
Various code style fixes for strings.
- When line wrapping is needed, move spaces to the next line.
- Remove embedded line breaks from error messages.
When an assembly error occurs log it too
Right now an assembly error produces an exception but not a log message. This is bad because the exception suggests looking at the log, but the log itself has a lot of errors which are not really a problem and only some which really...
Fix a wrong comparison in _RecursiveAssembleBD
We want to prevent sending too many 'None' children to a device. However, the test as it is today is wrong, as we want to test the situation after adding a new child, and not before. This patch fixes this by testing greater-or-equal instead of just greater....
Use new functions to modify /etc/hosts.
Enhance secondary node replace for drbd8
This (big) patch does two things:
- add "local disk status" to the block device checks (BlockDevice.GetSyncStatus and the rpc calls that call this function, and therefore cmdlib._CheckDiskConsistency)
- improve the drbd8 secondary replace operation using the above...
Allow DRBD8 operation without backing storage
This patch adds the following functionality:
- DRBD8 devices can assemble without local storage (done by allowing None in the list of children, and making DRBD8 ignore all children if any is None)...
Change the way remove children is called in bdev
For some cases, we don't have to have access to the children of a device in order to remove them (e.g. md over lvs, or drbd over lvs). In order to ease the removal process, skip over finding the child if it provides...
Fix an unhandled error case in device creation
The block device creation process is the following:
- device create
- device assembly (on primary or depending on dev_type, on secondary too)
- set sync speed
- return
The problem is that device assembly after creation was not checked for...
Miscellaneous style fixes
This patch fixes some minor pylint warnings (unused variables, wrong indentation, etc.) and a real bug in the recovery for drbd8 rename procedure.
Make DiagnoseOS use the modified OS objects
Modify backend.py so that DiagnoseOS only returns OS objects rather than InvalidOS errors, and make sure gnt-os understands the new objects. Also delete the deprecated helper functions from gnt-os.
Reviewed-By: iustinp
Fix two typos in a doc string
Remove a wrong "i" and add a missing ")" to the DiagnoseOS function doc string.
Implement device to instance mapping cache
Currently, troubleshooting DRBD problems involves a manual process of going backwards from the DRBD device to the instance that owns it.
This patch adds a weak (i.e. not guaranteed to be correct or up-to-date) cache of device to instance. The cache should be, in normal operation,...
Whitespace fixes
Fix a non-clear error message
Implement replace-disks for drbd8 devices
This patch adds three modes of disk replacement for drbd8:
- replace the disk on the primary node
- replace the disk on the secondary node
- replace the secondary node
It also adds some debugging code to backend.py and increments the...
Implement block device renaming
This patch adds code for renaming a device; more precisely, for changing the unique_id of the device. This means:
- logical volumes, rename the volume
- drbd8, change the remote peer
This is needed to be able to replace disks for drbd8....
Modify two mirror-device related rpc calls
The two calls mirror_addchild and mirror_removechild take only one child for addition/removal. While this is enough for our md usage, for local disk replacement in drbd8, we need to be able to specify both the data...