Ganeti Node OOB Management Framework
====================================

Extend Ganeti with Out of Band Cluster Node Management Capabilities.

Ganeti currently has no support for Out of Band management of the nodes in a
cluster. It relies on the OS running on the nodes and therefore has limited
possibilities when the OS is not responding. The command ``gnt-node powercycle``
can be issued to attempt a reboot of a node that crashed, but there are no means
to power a node off and power it back on. Supporting this is very handy in the
following situations:

* **Emergency Power Off**: During emergencies, time is critical and manual
  tasks just add latency which can be avoided through automation. If a server
  room overheats, halting the OS on the nodes is not enough. The nodes need
  to be powered off cleanly to prevent damage to equipment.
* **Repairs**: In most cases, repairing a node means that the node has to be
  powered off.
* **Crashes**: Software bugs may crash a node. Having an OS independent way to
  power-cycle a node helps to recover the node without human intervention.

Ganeti will be extended with OOB capabilities through adding a new **Cluster
Parameter** (``--oob-program``), a new **Node Property** (``--oob-program``), a
new **Node State (powered)** and support in ``gnt-node`` for invoking an
**External Helper Command** which executes the actual OOB command (``gnt-node
<command> nodename ...``). The supported commands are: ``power on``,
``power off``, ``power cycle``, ``power status`` and ``health``.

The new **Node State (powered)** is a **State of Record
(SoR)**, not a **State of World (SoW)**. The maximum execution time of the
**External Helper Command** will be limited to 60s to prevent the cluster from
getting locked for an undefined amount of time.
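
The 60s limit on the helper could be enforced roughly as follows. This is a
minimal sketch and not Ganeti's actual implementation: the function name
``run_oob_helper`` and its ``(succeeded, text)`` return value are assumptions
made for illustration, while the error strings follow the error table of this
design.

.. code-block:: python

  import subprocess

  OOB_TIMEOUT = 60  # seconds; the limit stated in this design


  def run_oob_helper(program, command, node, timeout=OOB_TIMEOUT):
      """Run the OOB helper; return (succeeded, stdout or error message).

      Hypothetical wrapper, shown only to illustrate the timeout rule.
      """
      try:
          proc = subprocess.run([program, command, node],
                                capture_output=True, text=True,
                                timeout=timeout)
      except subprocess.TimeoutExpired:
          # subprocess.run() kills the child before raising TimeoutExpired.
          return (False, "OOB program execution timeout exceeded, "
                         "OOB program execution aborted")
      if proc.returncode != 0:
          return (False,
                  "OOB program execution failed (%s)" % proc.stderr.strip())
      return (True, proc.stdout)

With this shape, a call such as
``run_oob_helper("/usr/bin/oob", "power-status", "node1.example.com")`` either
returns the helper's output or an error message, and never blocks for longer
than the configured limit.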

New ``gnt-cluster`` Parameter
+++++++++++++++++++++++++++++

| Program: ``gnt-cluster``
| Command: ``modify|init``
| Parameters: ``--oob-program``
| Options: ``--oob-program``: executable OOB program (absolute path)

New ``gnt-cluster epo`` Command
+++++++++++++++++++++++++++++++

| Program: ``gnt-cluster``
| Command: ``epo``
| Parameter: ``--on`` ``--force`` ``--groups`` ``--all``
| Options: ``--on``: By default epo turns off, with ``--on`` it tries to get the
|          cluster back online
|          ``--force``: To force the operation without asking for confirmation
|          ``--groups``: To operate on groups instead of nodes
|          ``--all``: To operate on the whole cluster

This is a convenience command to allow easy emergency power off of a whole
cluster or part of it. It takes care of all steps needed to get the cluster into
a sane state to turn off the nodes.

With ``--on`` it does the reverse and tries to bring the rest of the cluster back
online.

The master node is not able to shut itself down cleanly. Therefore, this
command will not do all the work on single-node clusters. On multi-node
clusters the command tries to find another master or, if that is not possible,
prepares everything to the point where the user only has to shut down the
master node manually. The same applies to the single-node cluster
configuration.

New ``gnt-node`` Property
+++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``modify|add``
| Parameters: ``--oob-program``
| Options: ``--oob-program``: executable OOB program (absolute path)

If ``--oob-program`` is set to ``!`` then the node has no OOB capabilities.
Otherwise, the node inherits the node group value or, failing that, the
cluster-wide value. In other words, nodes have to opt out of OOB capabilities.
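
The inheritance rule can be sketched as a small lookup. The function name
``resolve_oob_program`` and the use of ``None`` for "not set" are assumptions
for illustration. Treating ``!`` at the group or cluster level as an opt-out as
well is also an assumption, since the text above only defines it for nodes.

.. code-block:: python

  OPT_OUT = "!"  # marker from this design: no OOB capabilities


  def resolve_oob_program(node_value, group_value, cluster_value):
      """Return the effective OOB program path, or None if there is none.

      The most specific level that has a value wins: node, then node
      group, then cluster. Unset levels are passed as None (assumption).
      """
      for value in (node_value, group_value, cluster_value):
          if value:
              # "!" means opted out; anything else is the program path.
              return None if value == OPT_OUT else value
      return None

For example, ``resolve_oob_program("!", None, "/usr/bin/oob")`` yields ``None``
because the node opted out, while ``resolve_oob_program(None, None,
"/usr/bin/oob")`` falls back to the cluster-wide program.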

Addition to ``gnt-cluster verify``
++++++++++++++++++++++++++++++++++

| Program: ``gnt-cluster``
| Command: ``verify``

Additional checks:

1. existence and execution flag of the OOB program on all Master Candidates if
   the cluster parameter ``--oob-program`` is set or at least one node has
   the property ``--oob-program`` set. The OOB helper is only invoked on the
   master
2. check if node state powered matches actual power state of the machine for
   those nodes where ``--oob-program`` is set

Ganeti supports the following two boolean states related to the nodes:

drained
  The cluster still communicates with drained nodes but excludes them from
  allocation operations

offline
  if offline, the cluster does not communicate with offline nodes; useful for
  nodes that are not reachable in order to avoid delays

This design extends the list with the following boolean state:

powered
  if not powered, the cluster does not communicate with not powered nodes. If
  the node property ``--oob-program`` is not set, the state powered is not
  displayed

Additionally modify the meaning of the offline state as follows:

offline
  if offline, the cluster does not communicate with offline nodes (**with the
  exception of OOB commands for nodes where** ``--oob-program`` **is set**);
  useful for nodes that are not reachable in order to avoid delays

The corresponding command extensions are:

| Program: ``gnt-node``
| Command: ``info``
| Parameter: [ ``nodename`` ... ]

Additional Output (SoR, omitted if node property ``--oob-program`` is not set):
powered: ``[True|False]``

| Program: ``gnt-node``
| Command: ``modify``
| Parameter: nodename
| Option: [ ``--powered=yes|no`` ]
| Reasoning: sometimes you will need to sync the SoR with the SoW manually
| Caveat: ``--powered`` can only be modified if ``--oob-program`` is set for
|         the node in question

New ``gnt-node`` commands: ``power [on|off|cycle|status]``
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``power [on|off|cycle|status]``
| Parameters: [ ``nodename`` ... ]

* If no nodenames are passed to ``power [on|off|cycle]``, the user will be
  prompted with ``"Do you really want to power [on|off|cycle] the following
  nodes: <display list of OOB capable nodes in the cluster>? (y/n)"``
* For ``power-status``, nodename is optional. If omitted, we list the
  power status of all OOB capable nodes in the cluster (SoW)
* The user should be warned and needs to confirm with yes when trying to
  ``power [off|cycle]`` a node with running instances.

Error Handling
^^^^^^^^^^^^^^

+------------------------------+-----------------------------------------------+
| Exception                    | Error Message                                 |
+==============================+===============================================+
| OOB program return code != 0 | OOB program execution failed ($ERROR_MSG)     |
+------------------------------+-----------------------------------------------+
| OOB program execution time   | OOB program execution timeout exceeded, OOB   |
| exceeds 60s                  | program execution aborted                     |
+------------------------------+-----------------------------------------------+

+----------------+-----------------+----------------+--------------------------+
| State before   | Command         | State after    | Comment                  |
| execution      |                 | execution      |                          |
+================+=================+================+==========================+
| powered: False | ``power off``   | powered: False | FYI: IPMI will complain  |
|                |                 |                | if you try to power off  |
|                |                 |                | a machine that is already|
|                |                 |                | powered off              |
+----------------+-----------------+----------------+--------------------------+
| powered: False | ``power cycle`` | powered: False | FYI: IPMI will complain  |
|                |                 |                | if you try to cycle a    |
|                |                 |                | machine that is already  |
|                |                 |                | powered off              |
+----------------+-----------------+----------------+--------------------------+
| powered: False | ``power on``    | powered: True  |                          |
+----------------+-----------------+----------------+--------------------------+
| powered: True  | ``power off``   | powered: False |                          |
+----------------+-----------------+----------------+--------------------------+
| powered: True  | ``power cycle`` | powered: True  |                          |
+----------------+-----------------+----------------+--------------------------+
| powered: True  | ``power on``    | powered: True  | FYI: IPMI will complain  |
|                |                 |                | if you try to power on   |
|                |                 |                | a machine that is already|
|                |                 |                | powered on               |
+----------------+-----------------+----------------+--------------------------+

* If the command fails, the Node State remains unchanged.
* We will not prevent the user from trying to power off a node that is
  already powered off since the powered state represents the **SoR** only and
  not the **SoW**. This can however create problems when the cluster
  administrator wants to bring the **SoR** in sync with the **SoW** without
  actually having to mess with the node(s). For this case, we allow direct
  modification of the powered state through the ``gnt-node modify
  --powered=[yes|no]`` command as long as the node has OOB capabilities
  (i.e. ``--oob-program`` is set).
* All node power state changes will be logged.
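
The transition table and the notes above can be condensed into a single
function over the recorded state. This is only an illustrative sketch: the
name ``next_powered_state`` and the plain-string command names are assumptions,
and the function models the State of Record only, never the real machine.

.. code-block:: python

  def next_powered_state(powered, command, succeeded):
      """Return the recorded (SoR) powered state after a power command.

      powered:   recorded state before execution (True or False)
      command:   "on", "off" or "cycle"
      succeeded: whether the OOB helper reported success
      """
      if not succeeded:
          # A failed command leaves the Node State unchanged.
          return powered
      if command == "on":
          return True
      if command == "off":
          return False
      if command == "cycle":
          # In the table, cycling leaves the recorded state as it was.
          return powered
      raise ValueError("not a state-changing power command: %r" % command)

Keeping this logic free of side effects makes it easy to check every row of
the table in a unit test.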

Node Power Status Listing (SoW)
+++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``power-status``
| Parameters: [ ``nodename`` ... ]

Example output (represents **SoW**)::

  gnt-node oob power-status
  node2.example.com off
  node4.example.com unknown

* We use ``unknown`` in case the Helper Program could not determine the power
  state.
* If no nodenames are provided, we will list the power state of all nodes
  which are not opted out from OOB management.
* Only nodes which are not opted out from OOB management will be listed.
  Invoking the command on a node that does not meet this condition will
  result in an error message "Node X does not support OOB commands".

Node Power Status Listing (SoR)
+++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``info``
| Parameter: [ ``nodename`` ... ]

Example output (represents **SoR**)::

  gnt-node info node1.example.com
  Node name: node1.example.com
    primary ip: 192.168.1.1
    secondary ip: 192.168.2.1
    master candidate: True
    primary for instances:
    secondary for instances:

Only nodes which are not opted out from OOB management will
report the powered state.

New ``gnt-node`` oob subcommand: ``health``
+++++++++++++++++++++++++++++++++++++++++++

| Program: ``gnt-node``
| Command: ``health``
| Parameters: [ ``nodename`` ... ]
| Example: ``/usr/bin/oob health node5.example.com``

* If no nodename(s) are provided, we will report the health of all nodes in
  the cluster which have ``--oob-program`` set.
* Only nodes which are not opted out from OOB management will report their
  health. Invoking the command on a node that does not meet this condition
  will result in an error message "Node does not support OOB commands".

For error handling see `Error Handling`_.

OOB Program (Helper Program) Parameters, Return Codes and Data Format
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

| Program: executable OOB program (absolute path)
| Parameters: command nodename
| Command: [power-{on|off|cycle|status}|health]
| Example: ``/usr/bin/oob power-on node1.example.com``
| Caveat: maximum runtime is limited to 60s

+---------------+--------------------------+
| Return code   | Meaning                  |
+===============+==========================+
| 0             | Command succeeded        |
+---------------+--------------------------+
| 1             | Command failed           |
+---------------+--------------------------+
| others        | Unsupported/undefined    |
+---------------+--------------------------+

Error messages are passed from the helper program to Ganeti through StdErr
(return code == 1). On StdOut, the helper program will send data back to
Ganeti (return code == 0). The format of the data is JSON.
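
On the Ganeti side, the return code decides which stream to read. The sketch
below illustrates that rule under stated assumptions: the function name
``parse_helper_result`` and the shape of the returned dictionary are
hypothetical, and empty output is mapped to ``None`` for commands that send no
data.

.. code-block:: python

  import json


  def parse_helper_result(returncode, stdout, stderr):
      """Interpret a finished helper run according to the return code table."""
      if returncode == 0:
          # Success: StdOut carries JSON, or nothing for commands without data.
          data = json.loads(stdout) if stdout.strip() else None
          return {"ok": True, "data": data}
      if returncode == 1:
          # Failure: the error message arrives on StdErr.
          return {"ok": False, "error": stderr.strip()}
      # Any other return code is unsupported/undefined.
      return {"ok": False,
              "error": "unsupported return code %d" % returncode}

For a ``power-status`` call that prints ``{"powered": true}`` and exits with 0,
the ``data`` entry is the dictionary ``{"powered": True}``.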

+------------------+-------------------------------+
| Command          | Expected output               |
+==================+===============================+
| ``power-on``     | None                          |
+------------------+-------------------------------+
| ``power-off``    | None                          |
+------------------+-------------------------------+
| ``power-cycle``  | None                          |
+------------------+-------------------------------+
| ``power-status`` | ``{ "powered": true|false }`` |
+------------------+-------------------------------+
| ``health``       | ::                            |
|                  |                               |
|                  |   [[item, status],            |
|                  |    [item, status],            |
|                  |    ...]                       |
+------------------+-------------------------------+

For the health output, the fields are:

+--------+--------------------------------------------------------------------+
| Field  | Meaning                                                            |
+========+====================================================================+
| item   | String identifier of the item we are querying the health of,       |
|        | for example:                                                       |
|        |                                                                    |
|        | * PS Redundancy                                                    |
+--------+--------------------------------------------------------------------+
| status | String; can take one of four values, among them WARNING and        |
|        | CRITICAL                                                           |
+--------+--------------------------------------------------------------------+

* The item output list is defined by the Helper Program. It is up to the
  author of the Helper Program to decide which items should be monitored and
  what each corresponding return status is.
* Ganeti will currently not take any actions based on the item status. It
  will however create log entries for items with status WARNING or CRITICAL
  for each run of the ``gnt-node oob health nodename`` command. Automatic
  actions (regular monitoring of the item status) are considered a new service
  and will be treated in a separate design document.
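
The logging rule for health items can be sketched as a filter over the
helper's JSON output. The function name ``log_health_items``, the logger name
and the log message format are assumptions for illustration, not Ganeti's
logging code.

.. code-block:: python

  import json
  import logging

  # Statuses that this design says must produce a log entry.
  LOGGED_STATUSES = frozenset(["WARNING", "CRITICAL"])


  def log_health_items(node, stdout, logger=None):
      """Log WARNING/CRITICAL health items; return the flagged pairs.

      stdout is the helper's JSON output: a list of [item, status] pairs.
      """
      logger = logger or logging.getLogger("oob")  # hypothetical logger name
      flagged = [(item, status) for item, status in json.loads(stdout)
                 if status in LOGGED_STATUSES]
      for item, status in flagged:
          logger.warning("node %s: health item %s is %s", node, item, status)
      return flagged

No action is taken on the flagged items beyond logging, in line with the note
above.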

The ``gnt-node power-[on|off]`` (power state changes) commands will create log
entries following current Ganeti logging practices. In addition, health items
with status WARNING or CRITICAL will be logged for each run of ``gnt-node
health``.

.. vim: set textwidth=72 :