Stop all daemons precautiously before trying to start ganeti-noded again
Please note that if the pid file is broken or missing we'll not catch the process (if any is running) and it's up to the user to fix this state.
Signed-off-by: René Nussbaumer <rn@google.com>...
InitConfig: create nodegroups as well
This patch also ensures that the initial configuration has all the needed UUIDs and that they are unique, by using a TemporaryReservationManager inside InitConfig to generate them.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Change bootstrap.SetupDaemonNode to use scp as we can assume SSH is setup
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Switch to the RPC call to update /etc/hosts in LUAddNode and LURemoveNode
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
gnt-node add: add error msg when using IPv6
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Use family in backend.StartMaster
This patch changes the StartMaster method to consult the cluster primary ip version when deciding whether to use arping or ndisc6 after activating the master ip.
Signed-off-by: Manuel Franceschini <livewire@google.com>...
Support IPv6 node add
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add new cluster parameter primary_ip_version
We expose the ip_version (4, 6) to the external interface and internally we convert it to ip_family (AF_INET=2, AF_INET6=10). This makes the code more concise as all functions deal with family rather than version....
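A minimal sketch of the version-to-family conversion described in this commit message. The helper and table names are hypothetical and not taken from the Ganeti source. The values 2 and 10 quoted above are the Linux values of AF_INET and AF_INET6, so the sketch uses the socket constants instead of literals.

```python
import socket

# Hypothetical lookup tables; the names used in Ganeti itself may differ.
_VERSION_TO_FAMILY = {4: socket.AF_INET, 6: socket.AF_INET6}
_FAMILY_TO_VERSION = {fam: ver for ver, fam in _VERSION_TO_FAMILY.items()}


def version_to_family(ip_version):
    """Convert an externally visible IP version (4 or 6) to an address family."""
    try:
        return _VERSION_TO_FAMILY[ip_version]
    except KeyError:
        raise ValueError("unsupported IP version: %r" % (ip_version,))


def family_to_version(ip_family):
    """Convert an address family back to the IP version shown to users."""
    try:
        return _FAMILY_TO_VERSION[ip_family]
    except KeyError:
        raise ValueError("unsupported address family: %r" % (ip_family,))
```

Converting once at the boundary lets internal functions pass the family straight to socket calls.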
Fix some small newline style issues
Support for resolving hostnames to IPv6 addresses
This patch enables IPv6 name resolution by using socket.getaddrinfo instead of socket.gethostbyname_ex.
It renames the HostInfo class to Hostname and unifies its use throughout the code. This is achieved by using static calls where no object is...
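The difference between the two resolver calls can be sketched as follows. socket.gethostbyname_ex only returns IPv4 addresses, while socket.getaddrinfo accepts an address family. The function name resolve is an assumption made for illustration and is not the Hostname class from the patch.

```python
import socket


def resolve(hostname, family=socket.AF_UNSPEC):
    """Return the unique addresses for hostname, in resolver order.

    Pass socket.AF_INET or socket.AF_INET6 to restrict the lookup to
    one family; AF_UNSPEC returns whatever the resolver offers.
    """
    infos = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        # sockaddr is (host, port) for IPv4 and
        # (host, port, flowinfo, scope_id) for IPv6.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses
```

A failed lookup raises socket.gaierror, which the caller has to handle.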
cluster init: Write ssconf before noded starts
This change is needed as we will need to read the primary ip version cluster parameter before we start the node daemon. The reason is that we need to know in advance if we bind to the IPv4 or IPv6 any address....
Introduce new IPAddress classes
This patch unifies the netutils functions dealing with IP addresses into three classes:
- IPAddress: Common IP address functionality
- IPv4Address: IPv4 specific functionality
- IPv6address: IPv6-specific functionality
Furthermore it adds methods to check whether an address is a loopback...
Always set commonName in X509 certificates
Due to the current switch of the RPC client to PycURL, a bug with newer versions of libcurl surfaced. When the 'Subject' or 'Issuer' of 'server.pem' were empty, SSL handshake failed.
This patch changes the certificate generation functions such that they...
Introduce lib/netutils.py
This patch moves network utility functions to a dedicated module.
Add default_iallocator cluster parameter
Add a cluster parameter to hold the iallocator that will be used by default when required and no alternative (manually-specified iallocator or manually-specified node(s)) is given.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>...
Add a delay in master failover
I have seen some very rare errors where (it seems) the address is still live for a short while after removing it from the old master, thus the new master will fail in startup/adding its own IP address.
To guard against this, we add a delay/retry before we proceed, if the...
Check and set drbd helper during bootstrap
Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Introduce utils.IsValidIP{4,6}()
This patch introduces functions to check for valid IPv4 and IPv6 addresses and converts IsValidIP() to return True if it is either an IPv4 or an IPv6 address.
For now we do not change the functional behavior and replace IsValidIP...
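One way to write such checks is with socket.inet_pton, which rejects anything that is not a well-formed address of the given family. This is a sketch only; the commit message does not say how the Ganeti functions are implemented, and the lower-case names are placeholders.

```python
import socket


def _is_valid_for_family(address, family):
    """Return True if address parses as a literal of the given family."""
    try:
        socket.inet_pton(family, address)
    except (OSError, ValueError):
        # OSError: malformed address; ValueError: embedded NUL character.
        return False
    return True


def is_valid_ip4(address):
    """Check for a dotted-quad IPv4 address."""
    return _is_valid_for_family(address, socket.AF_INET)


def is_valid_ip6(address):
    """Check for an IPv6 address."""
    return _is_valid_for_family(address, socket.AF_INET6)


def is_valid_ip(address):
    """Return True if address is either a valid IPv4 or IPv6 address."""
    return is_valid_ip4(address) or is_valid_ip6(address)
```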
cfgupgrade: Local variable for cluster-domain-secret filename
This is necessary to allow cfgupgrade to work on a non-standard directory.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Merge branch 'devel-2.1'
Conflicts:
- doc/security.rst: trivial
- lib/cli.py: trivial
Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Add --uid-pool option to gnt-cluster init
Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Fix cluster behaviour with disabled file storage
There are a few issues with disabled file storage:
- cluster initialization is broken by default, as it uses the 'no' setting which is not a valid path
- some other parts of the code require the file storage dir to be a...
Fix cfgupgrade with non-default DATA_DIR
Commit 43575108 added bootstrap.GenerateclusterCrypto and commit 7506a7f1 changed cfgupgrade to use it. However, this lost the functionality of upgrading in non-default DATA_DIR.
To fix this, we enhance bootstrap.GenerateclusterCrypto to accept custom...
Add a new cluster parameter maintain_node_health
This will be used to conditionally enable the watcher node maintenance feature.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Add cluster domain secret
Information exchanged between different clusters via untrusted third parties (e.g. for remote instance import/export) must be signed with a secret shared between all involved clusters to ensure the third party doesn't modify the information....
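Signing with a shared secret, as described above, is commonly done with an HMAC. The sketch below assumes SHA-256 and hex-encoded signatures. Neither choice is stated in the commit message, and the function names are hypothetical.

```python
import hashlib
import hmac


def sign(secret, message):
    """Return a hex HMAC of message under the shared secret (bytes)."""
    # SHA-256 is an assumption made for this sketch.
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret, message, signature):
    """Check a signature without leaking timing information."""
    return hmac.compare_digest(sign(secret, message), signature)
```

A third party that relays the message cannot alter it unnoticed, because it cannot produce a matching signature without the secret.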
Merge remote branch 'origin/devel-2.1'
Conflicts:
- lib/bootstrap.py: Trivial
- lib/constants.py: Trivial
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Rename SSL_CERT_FILE to NODED_CERT_FILE
To be consistent with RAPI_CERT_FILE, the rather generically named “SSL_CERT_FILE” constant is renamed to “NODED_CERT_FILE”. The actual filename is not changed.
Rightname confd's HMAC key
Currently, the ganeti-confd's HMAC key is called “cluster HMAC key” or simply “HMAC key” everywhere. With the implementation of inter-cluster instance moves, another HMAC key will be introduced for signing critical data. They can not be the same, so this patch clarifies the purpose of the...
bootstrap: Add new function to create cluster certs and keys
The code to generate cluster certificates, keys and secrets is currently spread over several places. It makes sense to move it to a separate function as we want to provide the user with the ability to automatically...
Validate the hostnames at creation time
This patch adds validation of new names used, i.e. at cluster init time,node add time, and instance creation.
For instances, especially when using «--no-name-check» (which skips DNS checks), we should validate the given name, and also normalize it...
bootstrap: Wait for node daemon when adding new node
Until now this was only done for the master node, though the problem originally fixed in 8f215968 also occurs for other node daemons.
Move function generating SSL certs into utils
Also add unittest.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Generate hmac file with a newline at the end
This makes it slightly easier to cut&paste its content.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Add new “daemon-util” script to start/stop Ganeti daemons
Until now, Ganeti started and stopped its own daemons using custom functions. To start, the daemon was just executed and then sent the appropriate signals to stop it again. Init scripts would have to pay attention to the PID file and...
Introduce a wrapper for hostname resolving
Currently a few of the LU's CheckPrereq use utils.HostInfo which raises a resolver error in case of failure. This is an exception from the standard that CheckPrereq should raise an OpPrereqError if the error is in the 'pre' phase (so that it can be retried)....
Another round of pylint-related style fixes
A newer version of pylint, more warnings…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
bootstrap: Convert to utils.Retry
Convert the rest of the OpPrereqError users
This finishes the conversion of OpPrereqError creation to two-argument style. Any leftovers as one-argument are not breaking anything, just losing information about the errors.
Signed-off-by: Iustin Pop <iustin@google.com>...
Make cluster initialization more reliable
There was a race condition between starting the node daemon and sending requests to write the ssconf files. With this patch, the initialization waits up to ten seconds for the node daemon to become responsive.
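Waiting with a deadline, as this commit describes, can be sketched as a polling loop. The function wait_until and its parameters are hypothetical and do not reproduce Ganeti's own retry helper. The check callable stands in for whatever probe tells whether the node daemon answers.

```python
import time


def wait_until(check, timeout=10.0, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns a true value or timeout seconds pass.

    Returns True on success and False if the deadline was reached.
    clock and sleep are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The probe always runs at least once, even with a zero timeout, so a daemon that is already up is never reported as unresponsive.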
Provide feedback from redistributing configuration
This is particularly useful for “gnt-cluster redist-conf”, but also for all other cases where the configuration files are rewritten on other nodes.
$ gnt-cluster redist-conf… Copy of file /var/lib/ganeti/config.data to node … failed: Error while...
Adding '--no-ssh-init' option to 'gnt-cluster init'.
Allows the initialization of a cluster without the creation or distribution of SSH key pairs. Includes changes for LeaveCluster and RPC.
Signed-off-by: Ken Wehr <ksw@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>...
bootstrap: Factorize HMAC key generation
Make bootstrap._GenerateSelfSignedSslCert public
Add uuid on node/instance add and cluster init
This patch does a little bit of cleanup first, since we want to call GenerateUniqueID without reacquiring the lock.
Note that we don't necessarily need to do this for the cluster, since at first startup ConfigWriter will do it anyway. But it's better to...
Node init: copy hmac key as well
Without this confd will not start when a node is added to the cluster.
Remove RpcResult.RemoteFailMsg completely
Fix authorized_keys generation at cluster init
Copy pub_key in authorized_keys.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Use ReadFile/WriteFile in more places
This survived QA, burnin and unittests.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>
Add ctime/mtime support to the main ConfigObjects
This patch adds ctime/mtime support to the “main” config objects - the config data itself, and the cluster/nodes/instances objects.
These are not added on auto-upgrade, but rather should be migrated if it...
Generate a shared HMAC key at cluster init time
This key is shared on all nodes (via cmdlib._RedistributeAncillaryFiles) and will be used for HMAC authentication of confd messages.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Merge branch 'next' into branch-2.1
Make sure enabled_hypervisors list is valid
Get rid of the default_hypervisor slot
Currently we have both a default_hypervisor and an enabled_hypervisors list. The former is only settable at cluster init time, while the latter can be changed with cluster modify.
This becomes cumbersome in a few ways: at cluster init time for example...
Simplify InitConfig and remove SimpleConfigWriter
InitConfig currently creates the cluster config_data, then puts it into a dict, passes it to SimpleConfigWriter to load it from a dict (which just reuses the dict value) and then saves it. The SimpleConfigWriter is...
InitCluster, don't use SimpleConfigWriter
InitConfig returns a SimpleConfigWriter to InitCluster, which then passes it on to ssh.WriteKnownHostsFile, which extracts a couple of values from it. One line later the full ConfigWriter is initialized.
By initializing it one line before we can pass the full writer to...
Extend call_node_start_master rpc with no_voting
When the parameter is set to True and start_daemons is also True, ganeti-masterd will be started with the new --no-voting --yes-do-it options.
This new option is set to True only on masterfailover, when no_voting is...
Merge branch 'master' into next
Create a new --no-voting option for masterfailover
This allows failing over in certain corner cases, such as a 2 node cluster with one node down. The man page is also updated to document this dangerous option and how to recover from this situation.
bootstrap: Don't leak file descriptor when generating SSL certificate
Fix some typos
Convert master_info rpc to new style result
This was more tricky as the backend function is used by other functions in backend.py. As such, it must be handled specially - it must always raise an exception and not simply return False, err.
Convert node_leave_cluster rpc to new style result
This patch converts this rpc call to the new style result, and also changes in the process the meaning of the QuitGanetiException's arguments and the node daemon rpc call exception handler.
The problem with the exception handler is that we used a two-stage one,...
Convert node_stop_master rpc to new style result
Convert node_start_master to new style result
This is used in multiple places outside cmdlib.py, so it's a more interesting patch.
InitCluster: don't set default_bridge
And remove the -b option, as default nic parameters can be used instead. We could support the option, but that would add more code, and since cluster init is not a frequent operation, it's better to keep the code clean....
Allow setting NIC parameters at gnt-cluster init
Change BEGR_DEFAULT to PP_DEFAULT
This way the same constant can represent the default profile also for nic, disk and OS parameters.
Fix a typo in InitCluster
Add cluster-init --no-etc-hosts parameter
If --no-etc-hosts is passed in at cluster init time we set a new parameter in the cluster's object to false, and avoid adding nodes to the hosts file. The UpgradeConfig function is used to set the value to True, when upgrading from an old configuration version....
Remove some superfluous imports
This is for Python 2.6 compatibility.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Fix gnt-cluster getmaster on non-master nodes
The current implementation of “gnt-cluster getmaster” doesn't work on non-master nodes, which is a regression from 1.2. This patch implements it (again) via ssconf.
Create runtime dir in bootstrap
Some hypervisors (KVM) need RUN_GANETI_DIR to exist even at cluster init time. This patch creates it in InitCluster just before hv parameter checking. Since the code to make a list of directories is already repeated twice in the code, and this would be the third time, we abstract it into...
Fix some epydoc style issues
99% of the epydoc return tags are "@return:", but each of the modified files had one "@returns:" line. We fix this for consistency.
Reviewed-by: imsnah
Instance parameters: force typing
We want all the hv/be parameters to have a known type, rather than a random mix of empty string, boolean values, and None, so we declare the type of each variable and we enforce/convert it.
- Add some new constants for enforceable value types...
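Declaring a type per parameter and converting values on input can be sketched as below. The VTYPE_* tags, the force_types helper and the sample parameter names are all hypothetical stand-ins. The commit message mentions new constants for value types but does not list them.

```python
# Hypothetical type tags standing in for the constants the patch adds.
VTYPE_BOOL = "bool"
VTYPE_INT = "int"
VTYPE_STRING = "string"


def force_types(params, types):
    """Return a copy of params with every value converted to its declared type."""
    result = {}
    for name, value in params.items():
        kind = types[name]  # KeyError signals an undeclared parameter
        if kind == VTYPE_BOOL:
            if isinstance(value, str):
                lowered = value.lower()
                if lowered in ("true", "yes"):
                    value = True
                elif lowered in ("false", "no"):
                    value = False
                else:
                    raise ValueError("%s: not a boolean: %r" % (name, value))
            else:
                value = bool(value)
        elif kind == VTYPE_INT:
            value = int(value)
        elif kind == VTYPE_STRING:
            # Normalize None to the empty string so only one "unset" form exists.
            value = "" if value is None else str(value)
        else:
            raise ValueError("%s: unknown type %r" % (name, kind))
        result[name] = value
    return result
```

For example, force_types({"memory": "512"}, {"memory": VTYPE_INT}) returns {"memory": 512}.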
Add a ‘drained’ attribute to node objects
This attribute will be used to prevent any allocation on the node (any of replace-disks with new secondary this node, failover to the node, migration to the node).
The patch adds the attribute and initializes it correctly in cluster...
ganeti.bootstrap: Set permissions on newly uploaded files
Reviewed-by: amishchenko
ganeti.bootstrap: Upload remote API certificate to new nodes
ganeti.bootstrap: Prepare for remote API certificate
ganeti.bootstrap: Write SSL key to temporary file and set permissions
Previously, we set the permissions only after writing the key. This gave other users on the system a small window during which they could read the key.
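The window described above closes if the file already has restrictive permissions before any key material is written. The sketch below relies on tempfile.mkstemp, which creates the file readable and writable only by the owner. The function name write_private_file is an assumption, not the name used in ganeti.bootstrap.

```python
import os
import tempfile


def write_private_file(path, data):
    """Write data (bytes) to path without ever exposing it to other users."""
    directory = os.path.dirname(path) or "."
    # mkstemp creates the file with mode 0600, so the permissions are
    # already restrictive before the first secret byte is written.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # Move the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Creating the temporary file in the target directory keeps the final rename on one filesystem.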
ganeti.bootstrap: Generate SSL certificate for remote API
ganeti.bootstrap: Move SSL certificate generation into separate function
ganeti.bootstrap: Whitespace fix
Reviewed-by: iustinp
cleanup: fix GatherMasterVotes
Remove unused vars
cleanup: _InitSSHSetup doesn't need its argument
Fix epydoc format warnings
This patch should fix all outstanding epydoc parsing errors; as such, we switch epydoc into verbose mode so that any new errors will be visible.
Add a new node parameter 'offline'
This patch adds a new node parameter called offline that will be used to mark nodes which should not be touched by commands.
We also add this flag at cluster init, node add, and export it to iallocator scripts.
Reviewed-by: ultrotter
InitCluster force a config file update
After the cluster is ready we'll load the ConfigWriter and force a writeout of all config files.
Make sure the initial node is a master candidate
gnt-cluster init, handle candidate_pool_size
- Add a new command line option, defaulting to the constant value
- Pass the value to bootstrap.InitCluster
- Use it to init the new Cluster object
Convert rpc results to a custom type
For a long time we had the problem that both RPC-layer errors and results from the remote node share the same "valuespace". This is because we shouldn't raise an exception when only one node failed (and lose the results from the other nodes)....
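The idea of keeping transport errors apart from remote results can be sketched with a small result type. NodeResult and split_results are hypothetical names for illustration and are not the class this patch introduces.

```python
class NodeResult(object):
    """Hypothetical per-node RPC outcome that keeps errors apart from data."""

    def __init__(self, node, payload=None, error=None):
        self.node = node
        self.payload = payload  # value returned by the remote function
        self.error = error      # RPC-layer failure message, or None

    @property
    def failed(self):
        return self.error is not None

    def raise_if_failed(self):
        """Return the payload, or raise if the call itself failed."""
        if self.failed:
            raise RuntimeError("RPC to %s failed: %s" % (self.node, self.error))
        return self.payload


def split_results(results):
    """Partition results into (payload by node, error by node)."""
    ok = dict((r.node, r.payload) for r in results if not r.failed)
    bad = dict((r.node, r.error) for r in results if r.failed)
    return ok, bad
```

With this split, a remote function that legitimately returns False is no longer confused with a node that could not be reached, and one failed node does not discard the answers from the others.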
Use the new utils.CheckBEParams function
Where we used/forgot to validate beparams we now use the new common function.
Fix master failover
The ssconf files were not updated by the master failover. We need to push them, and since we already have RPC initialized, we can use the standard ConfigWriter to do so - this will take care of both the config file and the ssconf files....
Prevent master failover to a non candidate node
InitCluster: initialize master node serial_no
Currently it was left alone, and thus its value was "null".
Improve the node add operation
Currently, the node add operation uses a job to query the node name and the bootstrap function directly reads the config file for the cluster name.
This patch changes that so that both the cluster name and the verification of the node are done via queries to the master....
Get rid of node daemon password
With the new SSL client certificate stuff it's no longer needed.