ganeti-watcher(8) Ganeti | Version @GANETI_VERSION@
===================================================

Name
----

ganeti-watcher - Ganeti cluster watcher

Synopsis
--------

**ganeti-watcher** [``--debug``]
[``--job-age=``*age*]
[``--ignore-pause``]

DESCRIPTION
-----------

**ganeti-watcher** is a script, run periodically, that is responsible
for keeping instances in the correct status. It has two separate
functions: one for the master node and one that runs on every node.

If the watcher is disabled at cluster level (via the
**gnt-cluster watcher pause** command), it exits without doing
anything. The cluster-level pause can be overridden with the
``--ignore-pause`` option, for example when the watcher must stay
disabled during maintenance but the administrator wants to run it
just once.

The ``--debug`` option increases the verbosity of the watcher and
also activates logging to standard error.
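
As an illustration of how the options combine, the following minimal
Python sketch assembles a command line from the three options listed in
the synopsis. The helper name ``build_watcher_argv`` is hypothetical and
not part of Ganeti, and the age value is passed through unchanged
because its accepted format is not described here.

```python
def build_watcher_argv(debug=False, job_age=None, ignore_pause=False):
    """Return an argv list for ganeti-watcher using only the synopsis options."""
    argv = ["ganeti-watcher"]
    if debug:
        argv.append("--debug")
    if job_age is not None:
        # The accepted format of the age value is not specified here,
        # so the caller's value is passed through unchanged.
        argv.append("--job-age=%s" % job_age)
    if ignore_pause:
        argv.append("--ignore-pause")
    return argv


if __name__ == "__main__":
    # One-off verbose run while the cluster-level pause is in effect.
    print(" ".join(build_watcher_argv(debug=True, ignore_pause=True)))
```

The resulting list can be handed to a process launcher such as
``subprocess.call`` on a node where the watcher is installed.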

Master operations
~~~~~~~~~~~~~~~~~

On the master node, the watcher's primary function is to keep running
all instances that are marked as *up* in the configuration file, by
trying to start them a limited number of times.
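
The "limited number of times" rule can be sketched roughly as follows.
This is not the watcher's actual code: the limit of 3, the function
name ``instances_to_start`` and the choice to clear the counter once an
instance is running again are all assumptions made for illustration.

```python
MAX_START_ATTEMPTS = 3  # hypothetical limit; the real value is not stated here


def instances_to_start(marked_up, running, attempts, limit=MAX_START_ATTEMPTS):
    """Pick instances to start and update their attempt counters in place.

    marked_up: names of instances marked as up in the configuration
    running:   names of instances that are currently running
    attempts:  dict mapping instance name -> start attempts made so far
    """
    to_start = []
    for name in sorted(marked_up):
        if name in running:
            # Assumption: a running instance has its counter cleared.
            attempts.pop(name, None)
            continue
        if attempts.get(name, 0) >= limit:
            # Limit reached: stop retrying this instance.
            continue
        attempts[name] = attempts.get(name, 0) + 1
        to_start.append(name)
    return to_start
```

Calling the function once per watcher run with the same ``attempts``
dictionary models counters that persist between runs; starting from an
empty dictionary models counters that have been reset to zero.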

Another function is to "repair" DRBD links by reactivating the
block devices of instances that have secondaries on nodes that
have been rebooted.

The watcher also archives old jobs (older than the age given via
the ``--job-age`` option, which defaults to 6 hours) in order to
keep the job queue manageable.
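
The age cutoff can be illustrated with a short sketch. Only the 6-hour
default comes from the text above; the ``(job_id, end_timestamp)`` job
representation, the function name ``jobs_to_archive`` and the choice to
measure age from the end of a job are assumptions.

```python
DEFAULT_JOB_AGE = 6 * 60 * 60  # six hours, in seconds


def jobs_to_archive(jobs, now, max_age=DEFAULT_JOB_AGE):
    """Return the IDs of jobs that ended more than max_age seconds before now.

    jobs: iterable of (job_id, end_timestamp) pairs; this shape is hypothetical
    """
    cutoff = now - max_age
    return [job_id for job_id, ended in jobs if ended < cutoff]
```

A job that ended exactly ``max_age`` seconds ago is kept in this sketch;
only strictly older jobs are selected.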

Node operations
~~~~~~~~~~~~~~~

The watcher restarts any daemons that are down and that are
appropriate for the current node.

In addition, it executes any scripts that exist under the
"watcher" directory in the Ganeti hooks directory
(``@SYSCONFDIR@/ganeti/hooks``). This should be used for lightweight
actions, such as starting extra daemons.
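
A rough sketch of running the scripts in the "watcher" hooks directory
is shown below. It is not the watcher's implementation: the function
name ``run_watcher_hooks``, the name-sorted execution order and the rule
that only regular executable files are run are assumptions, and the
hooks directory is passed in as a parameter because its real location
depends on ``@SYSCONFDIR@``.

```python
import os
import subprocess


def run_watcher_hooks(hooks_dir):
    """Run each executable file in <hooks_dir>/watcher, in name order.

    Returns a list of (script_name, exit_code) pairs.
    """
    watcher_dir = os.path.join(hooks_dir, "watcher")
    if not os.path.isdir(watcher_dir):
        return []
    results = []
    for name in sorted(os.listdir(watcher_dir)):
        path = os.path.join(watcher_dir, name)
        # Assumption: only regular, executable files count as scripts.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            results.append((name, subprocess.call([path])))
    return results
```

Because every script runs to completion before the next one starts,
a slow script delays the rest of the run in this sketch, which is one
reason to keep such scripts lightweight.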

If the cluster parameter ``maintain_node_health`` is enabled, the
watcher also shuts down instances and DRBD devices if the node is
declared offline by known master candidates.

The watcher performs its queries synchronously but submits jobs to
execute the changes. Because of locking, these jobs may execute
much later than the watcher submits them.

FILES
-----

The command has a state file located at
``@LOCALSTATEDIR@/lib/ganeti/watcher.data`` (only used on the master)
and a log file at ``@LOCALSTATEDIR@/log/ganeti/watcher.log``. Removing
either file does not affect correct operation; removing the state
file only causes the restart counters for the instances to reset to
zero.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: