Ganeti Node OOB Management Framework
====================================

Extend Ganeti with Out of Band (:term:`OOB`) Cluster Node Management

Ganeti currently has no support for Out of Band management of the nodes
in a cluster. It relies on the OS running on the nodes and therefore has
limited options when the OS is not responding. The command
``gnt-node powercycle`` can be issued to attempt a reboot of a node that
crashed, but there is no means to power a node off and power it back
on. Supporting this is useful in the following situations:

* **Emergency Power Off**: During emergencies, time is critical and
  manual tasks just add latency which can be avoided through
  automation. If a server room overheats, halting the OS on the nodes
  is not enough. The nodes need to be powered off cleanly to prevent
* **Repairs**: In most cases, repairing a node means that the node has
* **Crashes**: Software bugs may crash a node. Having an OS
  independent way to power-cycle a node helps to recover the node
  without human intervention.

Ganeti will be extended with OOB capabilities by adding a new
**Cluster Parameter** (``--oob-program``), a new **Node Property**
(``--oob-program``), a new **Node State (powered)** and support in
``gnt-node`` for invoking an **External Helper Command** which executes
the actual OOB command (``gnt-node <command> nodename ...``). The
supported commands are: ``power on``, ``power off``, ``power cycle``,
``power status`` and ``health``.

The new **Node State (powered)** is a **State of Record**
(:term:`SoR`), not a **State of World** (:term:`SoW`). The maximum
execution time of the **External Helper Command** will be limited to
60s to prevent the cluster from getting locked for an undefined amount
of time.

New ``gnt-cluster`` Parameter
+++++++++++++++++++++++++++++

| Program: ``gnt-cluster``
| Command: ``modify|init``
| Parameters: ``--oob-program``
| Options: ``--oob-program``: executable OOB program (absolute path)

New ``gnt-cluster epo`` Command
+++++++++++++++++++++++++++++++

| Program: ``gnt-cluster``
| Command: ``epo``
| Parameter: ``--on`` ``--force`` ``--groups`` ``--all``
| Options: ``--on``: By default epo turns off, with ``--on`` it tries to get the
| ``--force``: To force the operation without asking for confirmation
| ``--groups``: To operate on groups instead of nodes
| ``--all``: To operate on the whole cluster

This is a convenience command to allow easy emergency power off of a
whole cluster or part of it. It takes care of all steps needed to get
the cluster into a sane state to turn off the nodes.

With ``--on`` it does the reverse and tries to bring the rest of the

The master node is not able to shut itself down cleanly. Therefore,
this command will not do all the work on single node clusters. On
multi node clusters the command tries to find another master or, if
that is not possible, prepares everything to the point where the user
only has to shut down the master node itself. This also applies to the
single node cluster configuration.

New ``gnt-node`` Property
+++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``modify|add``
| Parameters: ``--oob-program``
| Options: ``--oob-program``: executable OOB program (absolute path)

If ``--oob-program`` is set to ``!``, the node has no OOB
capabilities. Otherwise, the node inherits the node group value or, if
none is set, the cluster-wide value. In other words, nodes have to opt
out from OOB management.
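
The inheritance rule above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not Ganeti code:
the function name ``resolve_oob_program`` and the use of ``None`` for
"not set" are hypothetical, and the lookup order (node, then node group,
then cluster) is one reading of the paragraph above.

```python
# Hypothetical sketch of the --oob-program inheritance described above.
# Names and the None-means-unset convention are assumptions.

OPT_OUT = "!"  # marker from the design: the node has no OOB capabilities


def resolve_oob_program(node_value, group_value, cluster_value):
    """Return the effective OOB program path for a node, or None."""
    if node_value == OPT_OUT:
        # The node explicitly opted out of OOB management.
        return None
    # Assumed lookup order: node value, then node group, then cluster.
    for value in (node_value, group_value, cluster_value):
        if value:
            return value
    return None


if __name__ == "__main__":
    print(resolve_oob_program(None, None, "/usr/bin/oob"))
    print(resolve_oob_program("!", None, "/usr/bin/oob"))
```

With a cluster-wide value set, a node that sets nothing is OOB capable
by default, which matches the opt-out model described above.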

Addition to ``gnt-cluster verify``
++++++++++++++++++++++++++++++++++

| Program: ``gnt-cluster``
| Command: ``verify``

1. existence and execution flag of OOB program on all Master
   Candidates if the cluster parameter ``--oob-program`` is set or at
   least one node has the property ``--oob-program`` set. The OOB
   helper is just invoked on the master
2. check if node state powered matches actual power state of the
   machine for those nodes where ``--oob-program`` is set
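
As a rough illustration of the two checks, the following Python sketch
tests a path for existence and the execute flag, and compares recorded
and actual power states. The function names and the dictionaries
mapping node names to booleans are assumptions made for this example;
they are not part of Ganeti.

```python
import os

# Hypothetical sketch of the two verify checks described above.


def check_oob_program(path):
    """Return a list of problems found with the OOB program path."""
    problems = []
    if not os.path.isabs(path):
        problems.append("%s is not an absolute path" % path)
    elif not os.path.isfile(path):
        problems.append("%s does not exist" % path)
    elif not os.access(path, os.X_OK):
        problems.append("%s is not executable" % path)
    return problems


def powered_mismatches(recorded, actual):
    """Return the nodes whose recorded state (SoR) differs from the
    actual power state (SoW). Nodes without a known actual state are
    skipped."""
    return sorted(node for node, powered in recorded.items()
                  if node in actual and actual[node] != powered)


if __name__ == "__main__":
    print(check_oob_program("/usr/bin/oob"))
    print(powered_mismatches({"node1": True, "node2": True},
                             {"node1": True, "node2": False}))
```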

Ganeti supports the following two boolean states related to the nodes:

drained
  The cluster still communicates with drained nodes but excludes them
  from allocation operations

offline
  if offline, the cluster does not communicate with offline nodes;
  useful for nodes that are not reachable in order to avoid delays

This list will be extended with the following boolean state:

powered
  if not powered, the cluster does not communicate with not powered
  nodes. If the node property ``--oob-program`` is not set, the state
  powered is not displayed

Additionally, the meaning of the offline state is modified as follows:

offline
  if offline, the cluster does not communicate with offline nodes
  (**with the exception of OOB commands for nodes where**
  ``--oob-program`` **is set**); useful for nodes that are not reachable
  in order to avoid delays

The corresponding command extensions are:

| Program: ``gnt-node``
| Command: ``info``
| Parameter: [ ``nodename`` ... ]

Additional Output (:term:`SoR`, omitted if node property
``--oob-program`` is not set):
powered: ``[True|False]``

| Program: ``gnt-node``
| Command: ``modify``
| Parameter: nodename
| Option: [ ``--powered=yes|no`` ]
| Reasoning: sometimes you will need to sync the :term:`SoR` with the :term:`SoW` manually
| Caveat: ``--powered`` can only be modified if ``--oob-program`` is set for
  the node in question

New ``gnt-node`` commands: ``power [on|off|cycle|status]``
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``power [on|off|cycle|status]``
| Parameters: [ ``nodename`` ... ]

* If no nodenames are passed to ``power [on|off|cycle]``, the user
  will be prompted with ``"Do you really want to power [on|off|cycle]
  the following nodes: <display list of OOB capable nodes in the
  cluster>"``
* For ``power-status``, nodename is optional; if omitted, we list the
  power-status of all OOB capable nodes in the cluster (:term:`SoW`)
* The user is warned and needs to confirm with yes when trying to
  ``power [off|cycle]`` a node with running instances.

Error Handling
^^^^^^^^^^^^^^

+-----------------------------+----------------------------------------------+
| Exception                   | Error Message                                |
+=============================+==============================================+
| OOB program return code != 0| OOB program execution failed ($ERROR_MSG)    |
+-----------------------------+----------------------------------------------+
| OOB program execution time  | OOB program execution timeout exceeded, OOB  |
| exceeds 60s                 | program execution aborted                    |
+-----------------------------+----------------------------------------------+

+----------------+---------------+----------------+--------------------------+
| State before   |Command        | State after    | Comment                  |
| execution      |               | execution      |                          |
+================+===============+================+==========================+
| powered: False |``power off``  | powered: False | FYI: IPMI will complain  |
|                |               |                | if you try to power off  |
|                |               |                | a machine that is already|
|                |               |                | powered off              |
+----------------+---------------+----------------+--------------------------+
| powered: False |``power cycle``| powered: False | FYI: IPMI will complain  |
|                |               |                | if you try to cycle a    |
|                |               |                | machine that is already  |
|                |               |                | powered off              |
+----------------+---------------+----------------+--------------------------+
| powered: False |``power on``   | powered: True  |                          |
+----------------+---------------+----------------+--------------------------+
| powered: True  |``power off``  | powered: False |                          |
+----------------+---------------+----------------+--------------------------+
| powered: True  |``power cycle``| powered: True  |                          |
+----------------+---------------+----------------+--------------------------+
| powered: True  |``power on``   | powered: True  | FYI: IPMI will complain  |
|                |               |                | if you try to power on   |
|                |               |                | a machine that is already|
|                |               |                | powered on               |
+----------------+---------------+----------------+--------------------------+

* If the command fails, the Node State remains unchanged.
* We will not prevent the user from trying to power off a node that is
  already powered off since the powered state represents the
  :term:`SoR` only and not the :term:`SoW`. This can however create
  problems when the cluster administrator wants to bring the
  :term:`SoR` in sync with the :term:`SoW` without actually having to
  touch the node(s). For this case, we allow direct modification
  of the powered state through the ``gnt-node modify
  --powered=[yes|no]`` command as long as the node has OOB
  capabilities (i.e. ``--oob-program`` is set).
* All node power state changes will be logged
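
The transition table and the first rule above can be condensed into a
small function. This is a minimal sketch, assuming the recorded state
is a plain boolean and the command is given as one of the strings
``on``, ``off``, ``cycle`` or ``status``; the function name is
hypothetical and not taken from Ganeti.

```python
# Hypothetical sketch of the State of Record (SoR) transitions above.


def next_powered_state(powered, command, succeeded):
    """Return the recorded powered flag after running a power command."""
    if not succeeded:
        # A failed command leaves the Node State unchanged.
        return powered
    if command == "on":
        return True
    if command == "off":
        return False
    if command in ("cycle", "status"):
        # Per the table, a cycle keeps the recorded state; a status
        # query only reads the State of World.
        return powered
    raise ValueError("unknown power command: %r" % command)


if __name__ == "__main__":
    for before in (False, True):
        for cmd in ("off", "cycle", "on"):
            print(before, cmd, next_powered_state(before, cmd, True))
```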

Node Power Status Listing (:term:`SoW`)
+++++++++++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``power-status``
| Parameters: [ ``nodename`` ... ]

Example output (represents :term:`SoW`)::

  gnt-node oob power-status
  node2.example.com off
  node4.example.com unknown

* We use ``unknown`` in case the Helper Program could not determine
  the power state.
* If no nodenames are provided, we will list the power state of all
  nodes which are not opted out from OOB management.
* Only nodes which are not opted out from OOB management will be
  listed. Invoking the command on a node that does not meet this
  condition will result in an error message "Node X does not support

Node Power Status Listing (:term:`SoR`)
+++++++++++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``info``
| Parameter: [ ``nodename`` ... ]

Example output (represents :term:`SoR`)::

  gnt-node info node1.example.com
  Node name: node1.example.com
    primary ip: 192.168.1.1
    secondary ip: 192.168.2.1
    master candidate: True
    primary for instances:
    secondary for instances:

Only nodes which are not opted out from OOB management will report the
powered state.

New ``gnt-node`` oob subcommand: ``health``
+++++++++++++++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``health``
| Parameters: [ ``nodename`` ... ]
| Example: ``/usr/bin/oob health node5.example.com``

* If no nodename(s) are provided, we will report the health of all
  nodes in the cluster which have ``--oob-program`` set.
* Only nodes which are not opted out from OOB management will report
  their health. Invoking the command on a node that does not meet this
  condition will result in an error message "Node does not support OOB

For error handling see `Error Handling`_

OOB Program (Helper Program) Parameters, Return Codes and Data Format
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

| Program: executable OOB program (absolute path)
| Parameters: command nodename
| Command: [power-{on|off|cycle|status}|health]
| Example: ``/usr/bin/oob power-on node1.example.com``
| Caveat: maximum runtime is limited to 60s

+-------------+-------------------------+
| Return code | Meaning                 |
+=============+=========================+
| 0           | Command succeeded       |
+-------------+-------------------------+
| 1           | Command failed          |
+-------------+-------------------------+
| others      | Unsupported/undefined   |
+-------------+-------------------------+

Error messages are passed from the helper program to Ganeti through
:manpage:`stderr(3)` (return code == 1). On :manpage:`stdout(3)`, the
helper program will send data back to Ganeti (return code == 0). The
format of the data is JSON.
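
To make the calling convention concrete, here is a minimal Python
sketch of how a caller might invoke the helper and interpret its
result. It is an illustration only, not the Ganeti implementation: the
names ``OobError``, ``parse_helper_result`` and ``run_oob_helper`` are
hypothetical, empty output is assumed to mean "no data", and every
non-zero return code is treated as a failure. The error strings follow
the Error Handling table.

```python
import json
import subprocess

OOB_TIMEOUT = 60  # seconds, the runtime limit stated in this design


class OobError(Exception):
    """Raised when the helper program fails or times out."""


def parse_helper_result(returncode, stdout, stderr):
    """Map return code, stdout and stderr to data or an OobError."""
    if returncode != 0:
        # Error messages arrive on stderr.
        raise OobError("OOB program execution failed (%s)" % stderr.strip())
    if not stdout.strip():
        # power-on, power-off and power-cycle return no data.
        return None
    return json.loads(stdout)


def run_oob_helper(program, command, nodename, timeout=OOB_TIMEOUT):
    """Run '<program> <command> <nodename>' and return the parsed data."""
    try:
        proc = subprocess.run([program, command, nodename],
                              capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising.
        raise OobError("OOB program execution timeout exceeded, "
                       "OOB program execution aborted")
    return parse_helper_result(proc.returncode, proc.stdout, proc.stderr)
```

A call such as ``run_oob_helper("/usr/bin/oob", "power-status",
"node1.example.com")`` would then return a dictionary like
``{"powered": True}`` if the helper reports success.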

+-----------------+------------------------------+
| Command         | Expected output              |
+=================+==============================+
| ``power-on``    | None                         |
+-----------------+------------------------------+
| ``power-off``   | None                         |
+-----------------+------------------------------+
| ``power-cycle`` | None                         |
+-----------------+------------------------------+
| ``power-status``| ``{ "powered": true|false }``|
+-----------------+------------------------------+
| ``health``      | ::                           |
|                 |                              |
|                 |   [[item, status],           |
|                 |    [item, status],           |
|                 |    ...]                      |
+-----------------+------------------------------+

For the health output, the fields are:

+--------+------------------------------------------------------------------+
| Field  | Meaning                                                          |
+========+==================================================================+
| item   | String identifier of the item we are querying the health of,     |
|        | for example:                                                     |
|        |                                                                  |
|        | * PS Redundancy                                                  |
+--------+------------------------------------------------------------------+
| status | String; can take one of four values, among them WARNING and      |
|        | CRITICAL                                                         |
+--------+------------------------------------------------------------------+

* The item output list is defined by the Helper Program. It is up to
  the author of the Helper Program to decide which items should be
  monitored and what each corresponding return status is.
* Ganeti will currently not take any actions based on the item
  status. It will however create log entries for items with status
  WARNING or CRITICAL for each run of the ``gnt-node oob health
  nodename`` command. Automatic actions (regular monitoring of the
  item status) are considered a new service and will be treated in a
  separate design document.
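
The handling of health output described above could look roughly like
the following Python sketch. It assumes the helper output is the JSON
list of ``[item, status]`` pairs shown earlier; the function names, the
logger name and the sample items other than ``PS Redundancy`` are made
up for illustration.

```python
import json
import logging

# Only these statuses lead to log entries, per the notes above.
LOGGED_STATUSES = ("WARNING", "CRITICAL")


def items_to_log(health_json):
    """Return the (item, status) pairs that need a log entry."""
    items = json.loads(health_json)
    return [(item, status) for item, status in items
            if status in LOGGED_STATUSES]


def log_health(nodename, health_json, logger=None):
    """Log every WARNING or CRITICAL item; take no other action."""
    logger = logger or logging.getLogger("oob.health")  # assumed name
    for item, status in items_to_log(health_json):
        level = logging.CRITICAL if status == "CRITICAL" else logging.WARNING
        logger.log(level, "%s: health item %s is %s", nodename, item, status)


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # "Fan 1" and "Disk 0", and the status "OK", are assumed sample data.
    sample = ('[["PS Redundancy", "CRITICAL"], ["Fan 1", "WARNING"], '
              '["Disk 0", "OK"]]')
    log_health("node5.example.com", sample)
```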

The ``gnt-node power-[on|off]`` (power state changes) commands will
create log entries following current Ganeti logging practices. In
addition, health items with status WARNING or CRITICAL will be logged
for each run of ``gnt-node health``.

.. vim: set textwidth=72 :