1 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
2 ]>
3   <article class="specification">
4   <articleinfo>
5     <title>Ganeti customisation using hooks</title>
6   </articleinfo>
7   <para>Documents ganeti version 1.2</para>
8   <section>
9     <title>Introduction</title>
10
11     <para>
12       In order to allow customisation of operations, ganeti will run
13       scripts under <filename
14       class="directory">/etc/ganeti/hooks</filename> based on certain
15       rules.
16     </para>
17
18       <para>This is similar to the <filename
19       class="directory">/etc/network/</filename> structure present in
20       Debian for network interface handling.</para>
21
22     </section>
23
24
25     <section>
26       <title>Organisation</title>
27
28       <para>For every operation, two sets of scripts are run:
29
30       <itemizedlist>
31           <listitem>
32             <simpara>pre phase (for authorization/checking)</simpara>
33           </listitem>
34           <listitem>
35             <simpara>post phase (for logging)</simpara>
36           </listitem>
37         </itemizedlist>
38       </para>
39
40       <para>Also, for each operation, the scripts are run on one or
41       more nodes, depending on the operation type.</para>
42
43       <para>Note that, even though we call them scripts, we are
44       actually talking about any executable.</para>
45
46       <section>
47         <title><emphasis>pre</emphasis> scripts</title>
48
49         <para>The <emphasis>pre</emphasis> scripts have a definite
50         target: to check that the operation is allowed given the
51         site-specific constraints. You could have, for example, a rule
52         that says every new instance is required to exist in a
53         database; to implement this, you could write a script that
54         checks the new instance parameters against your
55         database.</para>
56
57         <para>The essential result of these scripts is their return
58         code (zero for success, non-zero for failure). However, if
59         they modify the environment in any way, they should be
60         idempotent, as failed executions could be restarted and thus
61         the script(s) run again with exactly the same
62         parameters.</para>
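
        <para>A minimal sketch of such a check, written in Python.
        The set of approved names is a hypothetical stand-in for a
        site database, and the function name is illustrative only;
        the variable name assumes the <constant>GANETI_</constant>
        prefix described in this document. A real hook would pass
        the returned value to <function>sys.exit</function>.</para>

        <programlisting><![CDATA[
```python
import sys

# Hypothetical stand-in for a site database of approved instance names.
APPROVED_INSTANCES = {"web1.example.com", "db1.example.com"}


def check_instance(environ, approved=APPROVED_INSTANCES):
    """Return 0 if the instance is registered, 1 otherwise.

    The check only reads its inputs, so running it again with the
    same parameters gives the same result (it is idempotent).
    """
    name = environ.get("GANETI_INSTANCE_NAME")
    if name in approved:
        return 0
    sys.stderr.write("instance %r is not registered\n" % (name,))
    return 1

# A real hook would end with: sys.exit(check_instance(os.environ))
```
]]></programlisting>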
63
64       </section>
65
66       <section>
67         <title><emphasis>post</emphasis> scripts</title>
68
69         <para>These scripts should do whatever you need as a reaction
70         to the completion of an operation. Their return code is not
71         checked (but logged), and they should not depend on the fact
72         that the <emphasis>pre</emphasis> scripts have been
73         run.</para>
74
75       </section>
76
77       <section>
78         <title>Naming</title>
79
80         <para>As with
81         <citerefentry> <refentrytitle>run-parts</refentrytitle>
82         <manvolnum>8</manvolnum> </citerefentry>, script names may
83         contain only upper- and lower-case letters, digits, underscores
84         and hyphens; in other words, they must match the regexp
85         <computeroutput>^[a-zA-Z0-9_-]+$</computeroutput>. Also,
86         non-executable scripts will be ignored.
87         </para>
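
        <para>To see which names the rule accepts, the regexp can
        be tried directly in Python. This is only an illustration of
        the pattern quoted above, not the code ganeti itself uses;
        the function name is hypothetical.</para>

        <programlisting><![CDATA[
```python
import re

# The pattern quoted in the naming rule.
HOOK_NAME_RE = re.compile(r"^[a-zA-Z0-9_-]+$")


def is_valid_hook_name(name):
    """Return True if the name uses only the allowed characters."""
    return HOOK_NAME_RE.match(name) is not None
```
]]></programlisting>

        <para>Under this rule a name such as
        <filename>10-check_db</filename> is accepted, while
        <filename>check.sh</filename> is rejected because of the
        dot.</para>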
88       </section>
89
90       <section>
91         <title>Order of execution</title>
92
93         <para>On a single node, the scripts in a directory are run in
94         lexicographic order (more exactly, the Python string
95         comparison order). It is advisable to implement the usual
96         <emphasis>NN-name</emphasis> convention where
97         <emphasis>NN</emphasis> is a two-digit number.</para>
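
        <para>The effect of string ordering can be sketched with
        Python's <function>sorted</function>. The script names are
        made up for illustration; the point is that a one-digit
        prefix does not sort numerically, which is why a fixed
        two-digit prefix is advisable.</para>

        <programlisting><![CDATA[
```python
def execution_order(names):
    """Return the names in plain Python string comparison order."""
    return sorted(names)


# Hypothetical script names; "9-late" sorts after "20-notify"
# because the character "9" compares greater than "2".
ordered = execution_order(["10-check-db", "9-late", "20-notify", "05-quota"])
```
]]></programlisting>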
98
99         <para>For an operation whose hooks are run on multiple nodes,
100         there is no specific ordering of nodes with regard to hooks
101         execution; you should assume that the scripts are run in
102         parallel on the target nodes (keeping on each node the above
103         specified ordering).  If you need any kind of inter-node
104         synchronisation, you have to implement it yourself in the
105         scripts.</para>
106
107       </section>
108
109       <section>
110         <title>Execution environment</title>
111
112         <para>The scripts will be run as follows:
113           <itemizedlist>
114           <listitem>
115             <simpara>no command line arguments</simpara>
116           </listitem>
117             <listitem>
118               <simpara>no controlling <acronym>tty</acronym></simpara>
119             </listitem>
120             <listitem>
121               <simpara><varname>stdin</varname> is
122               actually <filename>/dev/null</filename></simpara>
123             </listitem>
124             <listitem>
125               <simpara><varname>stdout</varname> and
126               <varname>stderr</varname> are directed to
127               files</simpara>
128             </listitem>
129           <listitem>
130             <simpara>the <varname>PATH</varname> is reset to
131             <literal>/sbin:/bin:/usr/sbin:/usr/bin</literal></simpara>
132           </listitem>
133           <listitem>
134             <simpara>the environment is cleared, and only
135             ganeti-specific variables will be left</simpara>
136           </listitem>
137           </itemizedlist>
138
139         </para>
140
141       <para>All information about the cluster is passed using
142       environment variables. Different operations will have slightly
143       different environments, but most of the variables are
144       common.</para>
145
146     </section>
147
148
149     <section>
150       <title>Operation list</title>
151       <table>
152         <title>Operation list</title>
153         <tgroup cols="7">
154           <colspec>
155           <colspec>
156           <colspec>
157           <colspec>
158           <colspec>
159           <colspec colname="prehooks">
160           <colspec colname="posthooks">
161           <spanspec namest="prehooks" nameend="posthooks"
162             spanname="bothhooks">
163           <thead>
164             <row>
165               <entry>Operation ID</entry>
166               <entry>Directory prefix</entry>
167               <entry>Description</entry>
168               <entry>Command</entry>
169               <entry>Supported env. variables</entry>
170               <entry><emphasis>pre</emphasis> hooks</entry>
171               <entry><emphasis>post</emphasis> hooks</entry>
172             </row>
173           </thead>
174           <tbody>
175             <row>
176               <entry>OP_INIT_CLUSTER</entry>
177               <entry><filename class="directory">cluster-init</filename></entry>
178               <entry>Initialises the cluster</entry>
179               <entry><computeroutput>gnt-cluster init</computeroutput></entry>
180               <entry><constant>CLUSTER</constant>, <constant>MASTER</constant></entry>
181               <entry spanname="bothhooks">master node, cluster name</entry>
182             </row>
183             <row>
184               <entry>OP_MASTER_FAILOVER</entry>
185               <entry><filename class="directory">master-failover</filename></entry>
186               <entry>Changes the master</entry>
187               <entry><computeroutput>gnt-cluster master-failover</computeroutput></entry>
188               <entry><constant>OLD_MASTER</constant>, <constant>NEW_MASTER</constant></entry>
189               <entry>the new master</entry>
190               <entry>all nodes</entry>
191             </row>
192             <row>
193               <entry>OP_ADD_NODE</entry>
194               <entry><filename class="directory">node-add</filename></entry>
195               <entry>Adds a new node to the cluster</entry>
196               <entry><computeroutput>gnt-node add</computeroutput></entry>
197               <entry><constant>NODE_NAME</constant>, <constant>NODE_PIP</constant>, <constant>NODE_SIP</constant></entry>
198               <entry>all existing nodes</entry>
199               <entry>all existing nodes plus the new node</entry>
200             </row>
201             <row>
202               <entry>OP_REMOVE_NODE</entry>
203               <entry><filename class="directory">node-remove</filename></entry>
204               <entry>Removes a node from the cluster</entry>
205               <entry><computeroutput>gnt-node remove</computeroutput></entry>
206               <entry><constant>NODE_NAME</constant></entry>
207               <entry spanname="bothhooks">all existing nodes except the removed node</entry>
208             </row>
209             <row>
210               <entry>OP_INSTANCE_ADD</entry>
211               <entry><filename class="directory">instance-add</filename></entry>
212               <entry>Creates a new instance</entry>
213               <entry><computeroutput>gnt-instance add</computeroutput></entry>
214               <entry><constant>INSTANCE_NAME</constant>, <constant>INSTANCE_PRIMARY</constant>, <constant>INSTANCE_SECONDARIES</constant>, <constant>DISK_TEMPLATE</constant>, <constant>MEM_SIZE</constant>, <constant>DISK_SIZE</constant>, <constant>SWAP_SIZE</constant>, <constant>VCPUS</constant>, <constant>INSTANCE_IP</constant>, <constant>INSTANCE_ADD_MODE</constant>, <constant>SRC_NODE</constant>, <constant>SRC_PATH</constant>, <constant>SRC_IMAGE</constant></entry>
215               <entry spanname="bothhooks" morerows="4">master node, primary and
216                    secondary nodes</entry>
217             </row>
218             <row>
219               <entry>OP_BACKUP_EXPORT</entry>
220               <entry><filename class="directory">instance-export</filename></entry>
221               <entry>Exports the instance</entry>
222               <entry><computeroutput>gnt-backup export</computeroutput></entry>
223               <entry><constant>INSTANCE_NAME</constant>, <constant>EXPORT_NODE</constant>, <constant>EXPORT_DO_SHUTDOWN</constant></entry>
224             </row>
225             <row>
226               <entry>OP_INSTANCE_START</entry>
227               <entry><filename class="directory">instance-start</filename></entry>
228               <entry>Starts an instance</entry>
229               <entry><computeroutput>gnt-instance start</computeroutput></entry>
230               <entry><constant>INSTANCE_NAME</constant>, <constant>INSTANCE_PRIMARY</constant>, <constant>INSTANCE_SECONDARIES</constant>, <constant>FORCE</constant></entry>
231             </row>
232             <row>
233               <entry>OP_INSTANCE_SHUTDOWN</entry>
234               <entry><filename class="directory">instance-shutdown</filename></entry>
235               <entry>Stops an instance</entry>
236               <entry><computeroutput>gnt-instance shutdown</computeroutput></entry>
237               <entry><constant>INSTANCE_NAME</constant>, <constant>INSTANCE_PRIMARY</constant>, <constant>INSTANCE_SECONDARIES</constant></entry>
238             </row>
239             <row>
240               <entry>OP_INSTANCE_MODIFY</entry>
241               <entry><filename class="directory">instance-modify</filename></entry>
242               <entry>Modifies the instance parameters.</entry>
243               <entry><computeroutput>gnt-instance modify</computeroutput></entry>
244               <entry><constant>INSTANCE_NAME</constant>, <constant>MEM_SIZE</constant>, <constant>VCPUS</constant>, <constant>INSTANCE_IP</constant></entry>
245             </row>
246             <row>
247               <entry>OP_INSTANCE_FAILOVER</entry>
248               <entry><filename class="directory">instance-failover</filename></entry>
249               <entry>Fails over an instance</entry>
250               <entry><computeroutput>gnt-instance failover</computeroutput></entry>
251               <entry><constant>INSTANCE_NAME</constant>, <constant>INSTANCE_PRIMARY</constant>, <constant>INSTANCE_SECONDARIES</constant>, <constant>IGNORE_CONSISTENCY</constant></entry>
252             </row>
253             <row>
254               <entry>OP_INSTANCE_REMOVE</entry>
255               <entry><filename class="directory">instance-remove</filename></entry>
256               <entry>Removes an instance</entry>
257               <entry><computeroutput>gnt-instance remove</computeroutput></entry>
258               <entry><constant>INSTANCE_NAME</constant>, <constant>INSTANCE_PRIMARY</constant>, <constant>INSTANCE_SECONDARIES</constant></entry>
259             </row>
260             <row>
261               <entry>OP_INSTANCE_ADD_MDDRBD</entry>
262               <entry><filename class="directory">mirror-add</filename></entry>
263               <entry>Adds a mirror component</entry>
264               <entry><computeroutput>gnt-instance add-mirror</computeroutput></entry>
265               <entry><constant>INSTANCE_NAME</constant>, <constant>NEW_SECONDARY</constant>, <constant>DISK_NAME</constant></entry>
266             </row>
267             <row>
268               <entry>OP_INSTANCE_REMOVE_MDDRBD</entry>
269               <entry><filename class="directory">mirror-remove</filename></entry>
270               <entry>Removes a mirror component</entry>
271               <entry><computeroutput>gnt-instance remove-mirror</computeroutput></entry>
272               <entry><constant>INSTANCE_NAME</constant>, <constant>OLD_SECONDARY</constant>, <constant>DISK_NAME</constant>, <constant>DISK_ID</constant></entry>
273             </row>
274             <row>
275               <entry>OP_INSTANCE_REPLACE_DISKS</entry>
276               <entry><filename class="directory">mirror-replace</filename></entry>
277               <entry>Replaces all mirror components</entry>
278               <entry><computeroutput>gnt-instance replace-disks</computeroutput></entry>
279               <entry><constant>INSTANCE_NAME</constant>, <constant>OLD_SECONDARY</constant>, <constant>NEW_SECONDARY</constant></entry>
280
281             </row>
282           </tbody>
283         </tgroup>
284       </table>
285     </section>
286
287     <section>
288       <title>Environment variables</title>
289
290       <para>Note that all variables listed here are actually prefixed
291       with <constant>GANETI_</constant> in order to provide a
292       different namespace.</para>
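
      <para>A hook that wants to use the short names from this
      document can strip the prefix itself. The following is a
      small sketch; the helper name is hypothetical.</para>

      <programlisting><![CDATA[
```python
PREFIX = "GANETI_"


def ganeti_vars(environ):
    """Return the GANETI_-prefixed variables with the prefix removed."""
    return {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }

# Typical use inside a hook: settings = ganeti_vars(os.environ)
```
]]></programlisting>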
293
294       <section>
295         <title>Common variables</title>
296
297         <para>This is the list of environment variables supported by
298         all operations:</para>
299
300         <variablelist>
301           <varlistentry>
302             <term>HOOKS_VERSION</term>
303             <listitem>
304               <para>Documents the hooks interface version. In case this
305             doesn't match what the script expects, it should not
306             run. This document conforms to version
307             <literal>1</literal>.</para>
308             </listitem>
309           </varlistentry>
310           <varlistentry>
311             <term>HOOKS_PHASE</term>
312             <listitem>
313               <para>one of <constant>PRE</constant> or
314               <constant>POST</constant> denoting which phase we are
315               in.</para>
316             </listitem>
317           </varlistentry>
318           <varlistentry>
319             <term>CLUSTER</term>
320             <listitem>
321               <para>the cluster name</para>
322             </listitem>
323           </varlistentry>
324           <varlistentry>
325             <term>MASTER</term>
326             <listitem>
327               <para>the master node</para>
328             </listitem>
329           </varlistentry>
330           <varlistentry>
331             <term>OP_ID</term>
332             <listitem>
333               <para>one of the <constant>OP_*</constant> values from
334               the table of operations</para>
335             </listitem>
336           </varlistentry>
337           <varlistentry>
338             <term>OBJECT_TYPE</term>
339             <listitem>
340               <para>one of <simplelist type="inline">
341                   <member><constant>INSTANCE</constant></member>
342                   <member><constant>NODE</constant></member>
343                   <member><constant>CLUSTER</constant></member>
344                 </simplelist>, showing the target of the operation.
345              </para>
346             </listitem>
347           </varlistentry>
348           <!-- commented out since it causes problems in our rpc
349                multi-node optimised calls
350           <varlistentry>
351             <term>HOST_NAME</term>
352             <listitem>
353               <para>The name of the node the hook is run on as known by
354             the cluster.</para>
355             </listitem>
356           </varlistentry>
357           <varlistentry>
358             <term>HOST_TYPE</term>
359             <listitem>
360               <para>one of <simplelist type="inline">
361                   <member><constant>MASTER</constant></member>
362                   <member><constant>NODE</constant></member>
363                 </simplelist>, showing the role of this node in the cluster.
364              </para>
365             </listitem>
366           </varlistentry>
367           -->
368         </variablelist>
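
        <para>A hook can use the common variables to decide
        whether it should act at all. The sketch below assumes that
        the version is passed as the string <literal>1</literal>
        and that the phase value is spelled exactly as listed
        above; both are assumptions to verify on a live cluster,
        and the function name is hypothetical.</para>

        <programlisting><![CDATA[
```python
# Assumption: the interface version arrives as the string "1".
SUPPORTED_VERSION = "1"


def should_run(environ, phase):
    """Return True if the version matches and we are in the given phase."""
    if environ.get("GANETI_HOOKS_VERSION") != SUPPORTED_VERSION:
        # Unknown interface version: the script should not run.
        return False
    return environ.get("GANETI_HOOKS_PHASE") == phase
```
]]></programlisting>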
369       </section>
370
371       <section>
372         <title>Specialised variables</title>
373
374         <para>This is the list of variables which are specific to one
375         or more operations.</para>
376         <variablelist>
377           <varlistentry>
378             <term>INSTANCE_NAME</term>
379             <listitem>
380               <para>The name of the instance which is the target of
381               the operation.</para>
382             </listitem>
383           </varlistentry>
384           <varlistentry>
385             <term>INSTANCE_DISK_TYPE</term>
386             <listitem>
387               <para>The disk type for the instance.</para>
388             </listitem>
389           </varlistentry>
390           <varlistentry>
391             <term>INSTANCE_DISK_SIZE</term>
392             <listitem>
393               <para>The (OS) disk size for the instance.</para>
394             </listitem>
395           </varlistentry>
396           <varlistentry>
397             <term>INSTANCE_OS</term>
398             <listitem>
399               <para>The name of the instance OS.</para>
400             </listitem>
401           </varlistentry>
402           <varlistentry>
403             <term>INSTANCE_PRIMARY</term>
404             <listitem>
405               <para>The name of the node which is the primary for the
406               instance.</para>
407             </listitem>
408           </varlistentry>
409           <varlistentry>
410             <term>INSTANCE_SECONDARIES</term>
411             <listitem>
412               <para>Space-separated list of secondary nodes for the
413               instance.</para>
414             </listitem>
415           </varlistentry>
416           <varlistentry>
417             <term>NODE_NAME</term>
418             <listitem>
419               <para>The target node of this operation (not the node on
420               which the hook runs).</para>
421             </listitem>
422           </varlistentry>
423           <varlistentry>
424             <term>NODE_PIP</term>
425             <listitem>
426               <para>The primary IP of the target node (the one over
427               which inter-node communication is done).</para>
428             </listitem>
429           </varlistentry>
430           <varlistentry>
431             <term>NODE_SIP</term>
432             <listitem>
433               <para>The secondary IP of the target node (the one over
434               which drbd replication is done). This can be equal to
435               the primary ip, in case the cluster is not
436               dual-homed.</para>
437             </listitem>
438           </varlistentry>
439           <varlistentry>
440             <term>OLD_MASTER</term>
441             <term>NEW_MASTER</term>
442             <listitem>
443               <para>The old, respectively the new master for the
444               master failover operation.</para>
445             </listitem>
446           </varlistentry>
447           <varlistentry>
448             <term>FORCE</term>
449             <listitem>
450               <para>This is provided by some operations when the user
451               gave this flag.</para>
452             </listitem>
453           </varlistentry>
454           <varlistentry>
455             <term>IGNORE_CONSISTENCY</term>
456             <listitem>
457               <para>The user has specified this flag. It is used when
458               failing over instances in case the primary node is
459               down.</para>
460             </listitem>
461           </varlistentry>
462           <varlistentry>
463             <term>MEM_SIZE, DISK_SIZE, SWAP_SIZE, VCPUS</term>
464             <listitem>
465               <para>The memory, disk, swap size and the number of
466               processors selected for the instance (in
467               <command>gnt-instance add</command> or
468               <command>gnt-instance modify</command>).</para>
469             </listitem>
470           </varlistentry>
471           <varlistentry>
472             <term>INSTANCE_IP</term>
473             <listitem>
474               <para>If defined, the instance IP in the
475               <command>gnt-instance add</command> and
476               <command>gnt-instance modify</command> commands. If not
477               defined, it means that no IP has been defined.</para>
478             </listitem>
479           </varlistentry>
480           <varlistentry>
481             <term>DISK_TEMPLATE</term>
482             <listitem>
483               <para>The disk template type when creating the instance.</para>
484             </listitem>
485           </varlistentry>
486           <varlistentry>
487             <term>INSTANCE_ADD_MODE</term>
488             <listitem>
489               <para>The mode of the create: either
490               <constant>create</constant> for create from scratch or
491               <constant>import</constant> for restoring from an
492               exported image.</para>
493             </listitem>
494           </varlistentry>
495           <varlistentry>
496             <term>SRC_NODE, SRC_PATH, SRC_IMAGE</term>
497             <listitem>
498               <para>In case the instance has been added by import,
499               these variables are defined and point to the source
500               node, source path (the directory containing the image
501               and the config file) and the source disk image
502               file.</para>
503             </listitem>
504           </varlistentry>
505           <varlistentry>
506             <term>DISK_NAME</term>
507             <listitem>
508               <para>The disk name (either <filename>sda</filename> or
509               <filename>sdb</filename>) in mirror operations
510               (add/remove mirror).</para>
511             </listitem>
512           </varlistentry>
513           <varlistentry>
514             <term>DISK_ID</term>
515             <listitem>
516               <para>The disk id for mirror remove operations. You can
517               look this up using <command>gnt-instance
518               info</command>.</para>
519             </listitem>
520           </varlistentry>
521           <varlistentry>
522             <term>NEW_SECONDARY</term>
523             <listitem>
524               <para>The name of the node on which the new mirror
525               component is being added. This can be the name of the
526               current secondary, if the new mirror is on the same
527               secondary.</para>
528             </listitem>
529           </varlistentry>
530           <varlistentry>
531             <term>OLD_SECONDARY</term>
532             <listitem>
533               <para>The name of the old secondary. This is used in
534               both <command>replace-disks</command> and
535               <command>remove-mirror</command>. Note that this can be
536               equal to the new secondary (only
537               <command>replace-disks</command> has both variables) if
538               the secondary node hasn't actually changed.</para>
539             </listitem>
540           </varlistentry>
541           <varlistentry>
542             <term>EXPORT_NODE</term>
543             <listitem>
544               <para>The node on which the exported image of the
545               instance was created.</para>
546             </listitem>
547           </varlistentry>
548           <varlistentry>
549             <term>EXPORT_DO_SHUTDOWN</term>
550             <listitem>
551               <para>This variable tells if the instance has been
552               shut down or not while doing the export. In the "was
553               shutdown" case, it's likely that the filesystem is
554               consistent, whereas in the "did not shutdown" case, the
555               filesystem would need a check (journal replay or full
556               fsck) in order to guarantee consistency.</para>
557             </listitem>
558           </varlistentry>
559         </variablelist>
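
        <para>Since <constant>INSTANCE_SECONDARIES</constant> is a
        space-separated list, a hook usually needs to split it
        before use. A minimal sketch follows; the helper name is
        hypothetical and the variable names assume the
        <constant>GANETI_</constant> prefix.</para>

        <programlisting><![CDATA[
```python
def instance_nodes(environ):
    """Return (primary, secondaries) for the target instance.

    The primary is None when the variable is not set; the list of
    secondaries is empty when the variable is unset or blank.
    """
    primary = environ.get("GANETI_INSTANCE_PRIMARY")
    secondaries = environ.get("GANETI_INSTANCE_SECONDARIES", "").split()
    return primary, secondaries
```
]]></programlisting>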
560
561       </section>
562
563     </section>
564
565   </section>
566   </article>