Relax replace_disks_all meaning for drbd8
In order for the replace-secondary action to be done via the same opcode parameters for both remote_raid1 and drbd, we must allow LUReplaceDisks to change replace_disks_all for drbd with a non-empty remote_node into replace_disks_sec....
Use new functions to modify /etc/hosts.
Reviewed-by: schreiberal
Changes related to logging
This patch modifies:
- mcpu.Processor.LogWarning to have its 'hint' parameter as optional and only log it if not None
- cmdlib._WaitForSync to not log directly to stdout/stderr but via the proc.Log(Info|Warning) methods...
Enhance secondary node replace for drbd8
This (big) patch does two things:
- add "local disk status" to the block device checks (BlockDevice.GetSyncStatus and the rpc calls that call this function, and therefore cmdlib._CheckDiskConsistency)
- improve the drbd8 secondary replace operation using the above...
Check whether init.d script is executable.
Enhance DRBD8 disk replacement (same nodes)
This patch adds enhanced reporting and many more checks to the disk replacement (when not switching the secondary).
Reviewed-by: imsnah
Handle missing init script at cluster init
This patch adds a check in the prereq of LUInitCluster for the existence of the init script. This allows a clean abort instead of a stack dump.
Based on a report by admin@steibei.net
Reviewed-by: ultrotter
Miscellaneous style fixes
This patch fixes some minor pylint warnings (unused variables, wrong indentation, etc.) and a real bug in the recovery for drbd8 rename procedure.
Convert os_get to use OS rather than InvalidOS
In order to do this, for simplicity we leave the OSFromDisk function as-is and we convert the eventual exception to an OS object in ganeti-noded. The unmangling gets simplified and so does the code for checking whether the OS is...
Make call_os_get a single node function
call_os_get is never called with a real list of nodes, so there's no point in it being multi-node. Making it single-node until a usage for a multi-node call is found.
Reviewed-By: iustinp
Implement tag searching
This patch adds a search command for locating tags on all objects of the cluster using a regex pattern.
Reviewed-by: aat
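A regex-based tag search of this kind can be sketched roughly as follows. The function name `search_tags`, the object paths and the dict layout are assumptions made for illustration only; they are not Ganeti's actual opcode or LU.

```python
import re


def search_tags(pattern, tagged_objects):
    """Return sorted (path, tag) pairs whose tag matches the regex pattern.

    tagged_objects maps a hypothetical object path (e.g. "/instances/web1")
    to an iterable of tag strings.
    """
    regex = re.compile(pattern)
    matches = []
    for path, tags in tagged_objects.items():
        for tag in tags:
            # re.search matches anywhere in the tag; anchor the pattern
            # with ^ or $ to restrict it.
            if regex.search(tag):
                matches.append((path, tag))
    return sorted(matches)
```

Returning the owning object's path alongside each tag lets the caller see at a glance whether a match belongs to the cluster, a node or an instance.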
Implement device to instance mapping cache
Currently, troubleshooting DRBD problems involves a manual process of going backwards from the DRBD device to the instance that owns it.
This patch adds a weak (i.e. not guaranteed to be correct or up-to-date) cache of device to instance. The cache should be, in normal operation,...
More sane handling of errors during failover
Currently we ignore errors on instance shutdown (on the source node) during instance failover. We should do this only if the user gave a command line option allowing this, as it's a dangerous thing to do.
This patch fixes this by using the same "--ignore-consistency" option...
Fix bridge checking in instance failover
The current code checks the bridge on the primary node of the instance,but we need to check it on the destination node.
This was caught by testing failover with a down primary node.
Fix _UpdateEtcHosts to understand empty lines.
Change the signature of some methods of mcpu.Processor
This patch moves the passing of the feedback_fn argument from the (Exec|Chain)OpCode to the initialization of the Processor instance.
Implement replace-disks for drbd8 devices
This patch adds three modes of disk replacement for drbd8:
- replace the disk on the primary node
- replace the disk on the secondary node
- replace the secondary node
It also adds some debugging code to backend.py and increments the...
Modify two mirror-device related rpc calls
The two calls mirror_addchild and mirror_removechild take only one child for addition/removal. While this is enough for our md usage, for local disk replacement in drbd8, we need to be able to specify both the data...
Initial implementation of drbd8 template type
This is a partially working drbd8 template type. It does:
- add/remove
- startup/failover/shutdown
Not working is replace disks, which needs custom code for this template.
Fix a disk handling bug triggered by failover
This leaves an instance's disks configured for the primary node as after disk activation we want to start the instance anyway. As such, _GatherBlockDevs in backend.py will need the disks configured for the primary....
Abstract more string values into constants
Currently, the disk types are defined using literal strings in the code. Convert those into constants so that we can easily find them and check their usage.
Note that we don't rename the values of the constants as they are used...
Patch series for reboot feature, part 2
This patch series implements the reboot command for gnt-instance. It supports three types of reboot: soft (hypervisor reboot), hard (instance config rebuild and reboot) and full (full instance shutdown and startup again)....
Make “gnt-cluster verify” exit 0 if there's no problem with instances.
Add the number of VCPUs in gnt-instance info
Allow force removal of instances
This patch adds a new option to the instance removal command, "--ignore-failures", that forces the removal of the instance from the configuration even if the removal process encounters errors.
In order to be able to do this when the remote node(s) is(are) down, we...
Replace more ssh paths with proper constants
The node's ssh key filenames are now provided as constants; this should allow easier customization.
Also, the user's ssh key computing has been abstracted into ssh.py
Some small improvements to the hooks environment
For the configuration update hook, it's useful to have a consistent name for the target of the operation. As such, the LU code is modified to include a GANETI_OP_TARGET that points either to the cluster (name),...
Split the hooks env building in two parts
This patch moves some of the environment processing from _BuildEnv to a new _RunWrapper command which does the stringification and adds the sstore variables.
The reasoning is that the sstore can be fresher than before the...
Remove fping as a dependency for Ganeti.
This patch completely gets rid of fping:
- replace all fping invocations with TcpPing calls
- update documentation accordingly
- associated cleanups (use constant for localhost IP, use more sensible defaults for TcpPing and use those)...
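The general idea of replacing an ICMP ping with a TCP connection attempt can be sketched as below. This is a minimal illustration under stated assumptions: the helper name `tcp_ping`, its parameters and its default timeout are hypothetical and do not reproduce the signature of Ganeti's TcpPing.

```python
import socket


def tcp_ping(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; not Ganeti's TcpPing.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

Unlike fping, a check like this needs no external binary and no raw-socket privileges, at the cost of requiring a known listening port on the target.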
Remove the shebang from modules
Since modules are not directly executable, remove the shebang from them. This helps with lintian warnings.
Also make the autogenerated _autoconf.py contain two comment lines at the beginning, like the other modules.
Add boot id to “gnt-node list”.
Reviewed-by: iustinp
Change tags add/remove to process multiple tags
This patch changes the tags opcodes to work with multiple tags at once instead of only one. As such, the opcodes and some parameters are renamed.
Fix tags operations for instances
Remove requirement that host names are FQDN
We currently require that hostnames are FQDN, not short names (node1.example.com instead of node1). We can allow short names as long as:
- we always resolve the names as returned by socket.gethostname()
- we rely on having a working resolver...
Allow 'add instance' to not start the new instance
This patch allows 'gnt-instance add' to not start the newly-created instance. It also allows 'gnt-instance add' and 'gnt-backup import' to not check for IP conflicts (only when not starting the instance)....
Change resolved hostname from dict to a class
The current result of utils.LookupHostname() is a dict, but this does not allow static checkers to check the correctness of the code. This patch introduces a new class named HostInfo and changes LookupHostname...
Implement cluster rename operation
This patch adds a new OpCode (and corresponding LU) that implements the cluster rename functionality.
This is done by shutting down the master role, making the needed sstore modifications and distributing the changed files to all nodes, and then...
Implement instance rename operation
This patch adds support for instance rename operation at all remaining layers: RPC, OpCode/LU and CLI.
Change OpQueryNodes nodes attribute to names
Change this to have the exact same parameters as OpQueryInstances.
Also fix burnin which is broken since r146.
Enable LUQueryInstances to work with a given list of instances
As per the changes to LUQueryNodes, the QueryInstances LU is modified to accept a list of instances for which to compute and return information.
Remove OpQueryNodeData and LUQueryNodeData
Now that LUQueryNodes supports all the functionality of LUQueryNodeData, let's migrate gnt-node.ShowNodeConfig to use it and remove all traces of OpQueryNodeData and LUQueryNodeData.
Change LUQueryNodes to return raw values and support selective listing
LUQueryNodes is very similar to LUQueryNodeData, but it lacks two features:
- instance list (it has count though), both primary and secondary
- selective node listing
In order to support these features, we change it to return raw values...
Change _GetWanted* to return names instead of objects
On closer look, all except one of the current users of _GetWantedNodes are using only the name of the nodes and throw away the other attributes. It makes sense to make this function return only the name list (as in the future this...
Move string formatting out of LUQueryInstances
Currently, LUQueryInstances will provide strings for its results. This makes it hard for other consumers than "gnt-instance list" to use the OpQueryInstances opcode for whatever they wish to.
The change moves the formatting in five of the six cases where this happens to...
Clone cmdlib._GetWantedNodes into _GetWantedInstances
This duplicates _GetWantedNodes to _GetWantedInstances, after doing some changes to it:
- fix an indentation error that should result in only the last node name passed being chosen
- change the function to have a single return statement...
- Check for secondary node before doing a failover.
- Replace magic values by constants.
Add one more check on cluster init.
This adds a check that the initial node's IP name doesn't resolve to a loopback address (127.x.y.z).
Also remove an unused variable.
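The loopback test itself can be sketched with Python's standard `ipaddress` module. The helper name `is_loopback_address` is an assumption for illustration, and the sketch only classifies an already-resolved address; it does not reproduce the actual check in the cluster init code.

```python
import ipaddress


def is_loopback_address(ip):
    """Return True if ip (a string) lies in a loopback range such as 127.0.0.0/8.

    Hypothetical helper for illustration only.
    """
    return ipaddress.ip_address(ip).is_loopback
```

A cluster init routine could refuse to proceed when the resolved address of the initial node passes this test, since other nodes could never reach the master through it.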
Refuse nodes with non-FQDN hostnames.
This changes the cluster init and node join to refuse a node that has a different hostname than what the resolver returns.
Rework ssh known-hosts handling.
This changes:
- cluster setup, we no longer edit /etc/ssh/ssh_known_hosts but our own file
- node add, we no longer remove root's known_hosts (twice)
- gnt-instance console, both the LU and the script: since now the ssh...
Improve LURunClusterCommand
This function used a hand-coded ssh call to remote nodes. Fix it to use the ssh.SSHCall function, and in the process drop the command field from the results, as it's too verbose and we can use (in gnt-cluster) what we passed in....
Fix one wrong usage of _GetWantedNodes
_GetWantedNodes is used wrongly by the LUClusterCopyFile. This fixes that.
Add support for listing instance disk sizes.
A CheckPrereq method had one unconverted "return 1" statement. Change it to the appropriate raise.
Some small fixes.
It fixes the main Makefile.am to create $localstatedir/{lib,log}/ganeti. It fixes the testing Makefile.am after the rename fake_config.py -> mocks.py. It strips the output of "ip link show" to have a nicer output if the master netdev does not exist.
Style changes for pep-8 and python-3000 compliance.
This changes the raising of exceptions from:
  raise Exception, value
to:
  raise Exception(value)
as the first form will be removed in python-3000 and the second form is preferred now.
The changes also involve a few cases of changing from raising standard...
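The preferred call-style raise can be illustrated with a small sketch. The exception class `PrereqError` and the function `check_positive` are hypothetical names invented for this example; they are not taken from the patch.

```python
class PrereqError(Exception):
    """Hypothetical error type used only for this illustration."""


def check_positive(value):
    """Return value unchanged, or raise PrereqError if it is not positive."""
    # Old form (a syntax error in Python 3):
    #   raise PrereqError, "value must be positive"
    if value <= 0:
        # Call-style form: instantiate the exception, then raise it.
        raise PrereqError("value must be positive, got %r" % (value,))
    return value
```

The call-style form works unchanged on both Python 2 and Python 3.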
Implement tag support for cluster, nodes and instances.
This is only the backend part, from the command line the tags can't beread/modified yet.
Don't bail out if node isn't there on “gnt-node volumes”.
Add instance name to LVM volume as a tag.
Change logical volume names to not be based on the instance's name, but instead use a UUID prefix and a suffix denoting the disk iv_name (sda/sdb) and possibly its type (data/meta).
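A naming scheme along these lines could be sketched as follows. The function name `generate_lv_name` and the exact `<uuid>.<iv_name>[_<type>]` layout are assumptions for illustration; the source only states that names use a UUID prefix plus a suffix for the disk and, possibly, its type.

```python
import uuid


def generate_lv_name(iv_name, kind=None):
    """Build a volume name from a fresh UUID plus a disk suffix.

    The "<uuid>.<iv_name>[_<kind>]" layout is an assumption made for
    illustration, not the format used by the actual patch.
    """
    suffix = iv_name if kind is None else "%s_%s" % (iv_name, kind)
    return "%s.%s" % (uuid.uuid4(), suffix)
```

Because the prefix no longer depends on the instance's name, a later instance rename would not require renaming the underlying volumes.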
Fix issues reported by pylint.
Unify environment variables for instance related hooks.
Check for instances on “gnt-cluster destroy”.
- Implement “gnt-instance reinstall --os-type=…”
- Add the command to gnt-instance.sgml
Fix the "gnt-cluster getmaster" command by making the LuQueryClusterInfo runnable on non-master nodes (and remove the list of instances and nodes returned by it, that information can be retrieved by other opcodes).
Also, remove the node/instance list from "gnt-cluster info" as it...
Move the cluster name from ConfigWriter to SimpleStore.
Reason: if left in ConfigWriter, nodes don't know to which cluster they belong. This will bite us later when we'll revisit node join operation.
Cons: we lose the cluster name from the config file, which means a...
- Move --force option to cli.py
- Implement “gnt-instance reinstall”
- Fix two typos
Comment formatting updates.
It seems the _CheckNodesDirs function is no longer used. Let's remove it.
Since the watcher can run on all nodes, let's get rid of the cron file handling, as it can be static and outside of ganeti.
This also means we can get rid of a lot of infrastructure too:
- the master/node config files checkers
- one rpc function
Add description, fix indentation.
- Implement “gnt-node volumes”
- Create all --output options using a constant
- Put node checking code from opcodes into a single function
- Do the same for output fields
Big change/cleanup in relation to the master startup:
- move the master node name from the ConfigWriter to SimpleStore (all nodes need this, and it was the only thing pulled in from the ConfigWriter on nodes)
- fix mcpu.py and the testing w.r.t. this change; for testing, rename...
Fix a typo in an error message, and actually pass it its parameters.
Reviewed-By: imsnah
Output instance name in error message instead of object representation.
Fix calls to _GenerateMDDRBDBranch.
Check for memory size requirements before failing over an instance.
Implement space requirement checking before creating/importing an instance.
Initial commit.