Switch call_blockdev_create call to (status, data)
This allows errors to be visible at the user level instead of just nodedaemon logs.
Reviewed-by: ultrotter
Block device creation cleanup
Currently, when creating LVM-based instances, we always get the extremely confusing message "ERROR Can't find LV /dev/xenvg/..." which is actually expected. This behaviour was introduced before we had UUID-style LV names, since at that point it was not unexpected to have...
Forward port the live migration from 1.2 branch
This is a forward port via copy (and not individual patches cherry-pick) of the latest code on the 1.2 branch related to the migration.
The changes compared to 1.2 are the fact that we don't need the IdentifyDisks step anymore (the drbd rpc calls are independent now), and...
Forward-port DrbdNetReconfig
This is a modified forward-port of DrbdNetReconfig and their associated RPCs. In Ganeti 2.0, these functions will be used for two things: - live migration (as in 1.2) - and for other network reconfiguration tasks, since DRBD8.Attach()...
backend: rename AttachOrAssemble to Assemble
Since the Assemble function is now different from Attach, we rename this backend function to show that the intent is to fully assemble the device (and it's always allowed to modify the device).
Add an instance_migratable rpc call
This is a forward-port of commit 1194 on the 1.2 branch:
This call will check whether an instance is up on its primary, and that it has been started with symlinks. We currently have no on-secondary checks, nor any hypervisor specific call....
backend: Remove symlinks by disk name
This is a modified forward-port of commit 1184 on the 1.2 branch:
backend: Remove symlinks by disk name, not using a wildcard
The changes to the original patch are related to the docstring style and...
Pass instance name to rpc call blockdev_close
This is an extract of commit 1166 on the 1.2 branch (Add a rpc call for drbd network reconfiguration), but only the blockdev_close part.
The patch changes the blockdev_close call to take the instance so that...
Fix the _RemoveBlockDevLinks() function
This is a forward-port of commit 1163 on the 1.2 branch: This fixes the removal of the instance symlinks (probably breakage from the glob changes).
Reviewed-by: imsnah
Remove instance's symlinks
This is a forward-port of commits 1150 and 1151 on the 1.2 branch: Add _RemoveBlockDevLinks auxiliary function, called when an instance fails to start and when it is shut down.
Reviewed-by: iustinp
and: Fix cut&paste error when removing symlinks...
Catch BlockDeviceError when starting instance
This is a forward-port of commit 1149 on the 1.2 branch: _GatherAndLinkBlockDevs used to raise the errors.BlockDeviceError exception when it failed to create a block device, and with this patch set it does so also when it fails to create a symlink to it....
Create symlinks to instances' block devices
This is a forward-port of commit 1148 on the 1.2 branch: Change the _GatherBlockDevs private function, called only one time by StartInstance, to _GatherAndLinkBlockDevs, and make it transform the device returned even more by calling the new _SimlinkBlockDev auxiliary...
Simplify hypervisor block_devices structure
This is a partial forward-port of commit 1136 on the 1.2 branch:
The hypervisor doesn't need to be passed the whole block device structure, so we'll just give it the block device name on the local node, and the name as seen by the instance. This will make it easier to...
Use subdirectories for job queue archive
As it turned out, having many files in a single directory can be very painful. With this patch, only 10'000 files are stored in a directory for the job queue archive. With 10'000 directories, this allows for up to 100 million jobs to be archived without having large...
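The bucketing described in this entry can be sketched as follows. This is a minimal illustration, assuming numeric job IDs and integer division by the per-directory limit; the names `archive_subdir`, `archive_path` and `JOBS_PER_ARCHIVE_DIRECTORY`, as well as the "job-<id>" file naming, are hypothetical and not taken from the Ganeti source.

```python
import os

# Limit quoted in the commit message: 10'000 files per archive directory.
JOBS_PER_ARCHIVE_DIRECTORY = 10000


def archive_subdir(job_id):
    """Return the (hypothetical) archive subdirectory name for a job ID."""
    return str(int(job_id) // JOBS_PER_ARCHIVE_DIRECTORY)


def archive_path(archive_root, job_id):
    """Build the archived file path; the "job-<id>" naming is an assumption."""
    return os.path.join(archive_root, archive_subdir(job_id),
                        "job-%s" % job_id)
```

With 10'000 directories of 10'000 files each, such a scheme covers 100 million jobs, which matches the figure given in the message.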
cleanup: fix export NIC count the same way as disk
For safety, we use the same algorithm as in disk count.
Reviewed-by: amishchenko
cleanup: fix backend._RecursiveFindBD
_RecursiveFindBD takes a parameter that isn't used; moreover, nowhere in the SVN history can I find a case where it has been used.
As such, remove this parameter and fix its callers.
cleanup: more unused vars
cleanup: sanitize a default parameter
Instead of relying on the usage of the parameter being OK with mutable default parameters, let's just make it safer.
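The pitfall behind this cleanup is a general Python one and can be illustrated with a short sketch. The function names below are hypothetical and are not the ones touched by the commit.

```python
def append_unsafe(item, bucket=[]):
    # The default list is created once, at function definition time,
    # so every call without an explicit bucket shares the same object.
    bucket.append(item)
    return bucket


def append_safe(item, bucket=None):
    # Using None as the sentinel gives each call its own fresh list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

The unsafe variant only behaves correctly as long as no caller mutates the default, which is the kind of implicit reliance the commit message wants to avoid.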
Fix epydoc format warnings
This patch should fix all outstanding epydoc parsing errors; as such, we switch epydoc into verbose mode so that any new errors will be visible.
ganeti.backend: Improve compression check
RPC: Compress file upload data
Adding compression to larger amounts of data is more efficient than transferring it (len(nodes) - 1) times over the network without compression. We were able to compress an 800 KB config file to about 30 KB, which is about 40 KB with Base64 encoding (required due to...
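A minimal sketch of the compress-then-encode idea follows. It assumes zlib as the compressor and a fixed compression level, neither of which is stated in the entry; the function names are hypothetical.

```python
import base64
import zlib


def compress_payload(data):
    """Compress bytes with zlib, then Base64-encode them for transport."""
    # Compression level 3 is an arbitrary choice for this sketch.
    return base64.b64encode(zlib.compress(data, 3))


def decompress_payload(payload):
    """Reverse compress_payload: Base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(payload))
```

Base64 turns every 3 bytes into 4 characters, which is consistent with the roughly 30 KB to 40 KB growth mentioned in the message.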
Cleanup the config file on demotion from candidate
This patch adds a simple rpc which makes a backup of the config file and then removes it. This is done so that cluster verify doesn't complain immediately after demoting a node.
Fix gnt-cluster verify w.r.t. rpc changes
This partially reorganizes the cluster verify LU: - introduce constants for the node verify rpc call - move from additional rpc calls to a single rpc call, the call_node_info, which gathers all data needed...
Simplify a little the ssconf update
We have (again) the KeyToFilename function, so we move the writing of the files to a method under SimpleStore.
Revert "Get rid of ssconf"
This partially reverts the "Get rid of ssconf" patch.
It adds back a simpler version of the SimpleStore class, and drops the WritableSimpleStore class. The new version of the class also has node_list as a new key, and increases the size of the keys so that big...
Fix gnt-backup export
This patch fixes a bug in disk calculation for gnt-backup export, which completely broke one-disk instance export.
The patch also corrects some error messages and style issues.
Pass ssconf values from master to node
Instead of parsing the configuration on the node, we pass the ssconf values from the master.
Correct GetAllInstancesInfo rtype
GetAllInstancesInfo, in the backend, returns just a dict, not a dict of dicts.
Add RPC call to update ssconf files
Convert trunk to posix-compatibility
We change two functions to use RunCmd without shell, and the other (which needs a ssh command line) is changed to the '>... 2>&1' syntax.
Update backend.py docstrings
This patch converts all of backend.py to epydoc formatting.
Fix another error handling case
The return from this error path is a dict, but the actual return value (on the non-error handling) is a list of dicts. Change accordingly.
Fix an error handling case
Found while reviewing documentation.
OSFromDisk remove superfluous empty line
Some documentation updates
This fixes a few doc issues and converts a few docstrings to epydoc.
Export the disk index in the import/export scripts
We want to export the disk index as some OSes will only want to export the first disk (or the second one, etc.), even if we have multiple disks.
The patch also updates the backend.ExportSnapshot docstring....
Convert ImportOSIntoInstance to OS API 10
- Change ImportOSIntoInstance not to get any "os_disk" and "swap_disk" arguments but to accept multiple target images to import, and to return a list of booleans with the result of each import
- Change the relevant rpc call and the only caller to conform...
Convert ExportSnapshot to OS API 10
We pass the data via the environment rather than on the command line, as API 10 says. All the rest remains the same, and we export just one disk, as the master calls this function for every snapshotted disk.
LUExportInstance: snapshot all disks
Rather than just snapshotting the "sda" disk, we'll snapshot all of the instance disks. If we can't snapshot a disk for any reason we'll log an error and proceed anyway: in this case the resulting export will miss a disk. This also changes all the warning messages to self.LogWarning()....
Convert SnapshotBlockDevice's docstring to epydoc
Cleanup os_add/rename rpc for OS API 10
- remove now unused osdev and swapdev arguments from backend, noded, rpc, cmdlib
- convert docstrings to epydoc
Temporarily explicitly break import/export
Since they're not converted to API 10 yet, we temporarily disable the import/export functions.
AddOSToInstance: convert to api10
RunRenameInstance: convert to api10
Add new OSEnvironment function
This function calculates the basic environment for OS scripts in API version 10.
OSFromDisk: use script names from constants
Change OSFromDisk's docstring to epydoc
Plus update it with the real variable name
Add a rpc call for changing the drain flag
A new multi-node call is added that sets/resets the drain flag.
Change the backend to use the beparams
The backend.FinalizeExport function is changed to use the beparams instead of the instance attributes. Future enhancements should be done in order to export and import/reuse the whole be/hv params.
Temporary fix for dual hvm/pvm instances
We have a problem with the current model of combining instance lists from multiple hypervisors: we don't allow duplicates, but "xm list" gives the same output for both pvm and hvm. This is a lack in the actual xen hypervisor implementation/split between pvm and hvm, but for now we...
Export the hypervisor.ValidateParameters over RPC
The newly-added node-specific ValidateParams hypervisor method is exported over RPC, using the semi-standard (success, message) return value. Multi-node call, so that we call on both primary and secondary at...
Abstract checking own address into a function
Currently, we check if we have a given ip address (i.e. it's alive on one of our interfaces) by manually calling TcpPing(source=localhost). This works, but having it spread all over the code makes it hard to...
OS API: support for multiple versions in an OS
Allow multiple api versions in an OS. This is according to the OS API changes design doc, by which an OS can support multiple versions of the Ganeti API and if one is supported by Ganeti it will work. Since up to...
Move the hypervisor attribute to the instances
This (big) patch moves the hypervisor type from the cluster to the instance level; the cluster attribute remains as the default hypervisor, and will be renamed accordingly in a next patch. The cluster also gains...
rpc.call_instance_migrate: pass the whole instance
Currently the call_instance_migrate call only passes the instance name; we need to pass the whole object for the hypervisor_type changes (all the other individual instance rpc calls already pass the instance...
backend.py change to get cluster name from master
Currently there are three functions in backend that need the cluster name in order to instantiate an SshRunner. The patch changes these to get the cluster name from the master in the rpc call; once the multi-hypervisor...
Fix SshRunner breakage from the changed API
More places actually use the SshRunner than just the gnt-cluster commands.
Convert ssh.py
Get rid of ssconf and convert to configuration instead.
Convert hypervisor
Replacing ssconf with configuration.
Convert backend.py
Replacing ssconf with simpleconfig.
Never remove job queue lock in node daemon
Otherwise, corruption could occur in some corner cases. E.g. when LeaveNode is running in a child and is in the process of removing queue files, the main process gets killed, started again and gets a request to update the queue. This is a rather extreme corner case,...
Change backend._GetMasterInfo to return more data
The _GetMasterInfo() function needs to export the master name too to be useful in master safety checks. This patch makes it a public (no _) function and adds a third element in the return tuple. Its callers are...
Pass hypervisor type to the OS scripts
It's handy to make the os scripts know which hypervisor the instance is going to run under. In order not to change the os API we pass this information in the environment, where the os scripts can access it if they're hypervisor-aware....
Don't always remove queue lock when queue is purged
The lock should only be removed if ganeti-noded is going to quit. Otherwise it needs to be kept to prevent another process from creating it again while we're still holding the (removed) lock. This is due to...
backend: Add optional exclusion list to _CleanDirectory
The code cleaning the queue will make use of it.
noded: Add RPC function to rename job queue files
This will be used to archive jobs.
backend: Add function to check whether file is in queue dir
Another function will need to check whether its parameters are job queue files.
Disallow uploading job queue files through upload_file
The job queue is now updated through its own RPC functions.
Add job queue RPC functions
jobqueue_update: Uploads a job queue file's content to a node. The most common operation is to upload something that we already have in a string. Unlike in the upload_file function, the file is not read again when distributing changes, but content has to be passed...
Move function cleaning directory to module level
JobQueuePurge() will be used by an RPC function.
Clean job queue directories when leaving cluster
Old job files shouldn't be left on nodes removed from a cluster.
Allow job queue files to be uploaded through ganeti-noded
This is needed for job queue replication.
Fix pylint-detected issues
This is mostly: - whitespace fix (space at EOL in some files, not all, broken indentation, etc) - variable names overriding others (one is a real bug in there) - too-long-lines - cleanup of most unused imports (not all)...
Fix some errors detected by pylint
Rework master startup/shutdown/failover
This (big) patch reworks the master startup/shutdown and fixes the master failover.
What does the patch do?
For master start/stop: - remove the old ganeti-master script and its associated man page - moves the ip start/stop directly into the backend.(Start|Stop)Master...
Add a new parameter to backend.(Start|Stop)Master
This patch adds a new, unused for now, parameter to the start and stop master operations in backend. The idea behind it is that we need to be able to control whether the IP (de)activation is coupled with daemon...
Distribute the queue serial file after each update
This patch adds distribution of the queue serial file after each write to it (but before a new job is created and written with that ID, and before a response is returned, so we should be safe from crashes in...
Convert backend.py to the logging module
The patch also switches some of the exception logs to use logging.exception (and therefore the log message will have a different format).
(Note that this might not be a good choice in all cases, though)
Fix backend.NodeVolumes handling of LVM output
This is the same fix as for GetVolumeList.
I've checked manually and all other places that call lvm commands are already checking the output validity in terms of correct number of fields.
Fix backend.GetVolumeList handling of LVM output
Sometimes ‘lvs’ can spit error messages on stdout, even when one wants to parse the output: ... Inconsistent metadata copies found - updating to use version 2776...
So we need to validate the output to guard against such cases....
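The kind of validation described here, checking that each line has the expected number of fields before using it, can be sketched as below. The separator, the field count and the function name are assumptions made for illustration; the entry does not show the actual parsing code.

```python
def parse_lvs_output(output, separator="|", expected_fields=4):
    """Parse lvs-style lines, skipping any that lack the expected fields."""
    volumes = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(separator)
        if len(fields) != expected_fields:
            # Probably a stray diagnostic printed on stdout; ignore it.
            continue
        volumes.append(tuple(field.strip() for field in fields))
    return volumes
```

A line such as the "Inconsistent metadata copies found" message contains no separator, so it fails the field-count check and is dropped instead of being misread as a volume.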
Allow VNC_PASSWORD_FILE to be rpc-uploaded
What could possibly go wrong?
raise QuitGanetiException in LeaveCluster
Add a rpc call for BlockDev.Close()
This patch adds rpc layer calls (in rpc.py and the equivalent in ganeti-noded) to close a list of block devices, and the wrapper in backend.py that takes a list of Disk objects, identifies them and returns correctly formatted results....
Expose block device grow in backend.py
This patch adds a wrapper over the block device grow operation that converts the input and output parameters as needed for the rpc layer.
Add migration support at the rpc layer
This patch adds the migration rpc call and its implementation in the backend. The patch does not deal with the correct activation of disks.
Because of the new RPC, the protocol version is increased.
Implement node daemon connectivity tests
This patch adds to gnt-cluster verify checks for inter-node tcp communication on the node daemon port for both the primary and (if defined) secondary networks.
The output looks like (4-node cluster, one with the secondary interface...
Reduce chance of ssh failures in verify cluster
The cluster verify builds a sorted list of nodes and passes that to all the nodes (in parallel) for ssh checks. This means that for a cluster with N nodes, there will be approximately N simultaneous connections to...
Remove non-existing arguments from some docstrings
A few docstrings in the HooksRunner backend class list arguments the relevant functions do not take. Clean them up.
Move iallocator script execution to ganeti-noded
Currently the iallocator execution takes place in the master, which is a violation of the current architecture, and will create problems with a threaded master daemon.
This patch moves the execution to the backend, similar to the hooks...
backend.FinalizeExport: safely initialize some vars
This patch initializes nic_count and disk_count with 0. This prevents some reference errors if the snap_disks block device list is empty.
Move the OS search code into an abstract function
Based on the previous OS search code changes, we can now move the OS search code into a generic look-for-file function in utils.py. This means that the allocator code can use the same function.
Change backend._OSSearch return values
Currently, the function backend._OSSearch() returns the (first) base dir in which this OS can be found. Thereafter the full actual path to the OS dir is built in the backend.OSFromDisk() function.
This patch changes this so that _OSSearch() always returns the full path...
Backend directory functions for file backend
Add _[Create,Remove,Rename]FileStorageDir functions, which are needed for file-based instance management. These functions check whether the given directory to operate on is under the cluster-wide defined default file...
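A containment check of the kind this entry describes might look like the sketch below. It is an assumption about the general technique, not the code from the patch: the function name is hypothetical, paths are assumed to be POSIX-style, and symlinks are not resolved.

```python
import os.path


def is_under_base_dir(path, base_dir):
    """Return True if the normalized path is base_dir or lies below it."""
    if not os.path.isabs(path):
        # Relative paths are rejected in this sketch.
        return False
    path = os.path.normpath(path)
    base_dir = os.path.normpath(base_dir)
    # Comparing against base_dir plus a separator avoids treating
    # "/srv/fs2" as if it were inside "/srv/fs".
    return path == base_dir or path.startswith(base_dir + os.sep)
```

Normalizing first means that a path containing ".." components is judged by where it actually points, not by its textual prefix.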
Move SSH functions into a class
This renames some functions and does some minor codestyle cleanup.
Replace custom file writing code with utils.WriteFile
Fix master role stop on cluster destroy
Currently the cluster destroy doesn't remove the master role, which means that the IP address of the cluster remains assigned to the master node.
This patch fixes this and also a docstring in backend.StopMaster()....
Small comment fix.
Fixes small spelling mistakes and comments
Alter the device activation code
This tiny patch fixes the breakage that the previous patch about activation did by removing the Close() call after activation.
The initial reason for that call was that if the device is already active and open, but we need it closed, we close it automatically....
Export bridge information too
gnt-backup export used to export the ip and mac of each nic, but not which bridge it was connected to. Adding this information.