Revision 6328fea3 man/ganeti-watcher.sgml
b/man/ganeti-watcher.sgml | ||
---|---|---|
48 | 48 |
<para> |
49 | 49 |
The <command>&dhpackage;</command> is a periodically run script |
50 | 50 |
which is responsible for keeping the instances in the correct |
51 |
status. |
|
51 |
status. It has two separate functions, one for the master node |
|
52 |
and another one that runs on every node. |
|
52 | 53 |
</para> |
53 | 54 |
|
54 |
<para> |
|
55 |
Its primary function is to try to keep running all instances |
|
56 |
which are marked as <emphasis>up</emphasis> in the configuration |
|
57 |
file, by trying to start them a limited number of times. |
|
58 |
</para> |
|
55 |
<refsect2> |
|
56 |
<title>Master operations</title> |
|
59 | 57 |
|
60 |
<para> |
|
61 |
Its other function is to <quote>repair</quote> DRBD links by
|
|
62 |
reactivating the block devices of instances which have
|
|
63 |
secondaries on nodes that have been rebooted.
|
|
64 |
</para> |
|
58 |
<para>
|
|
59 |
Its primary function is to try to keep running all instances
|
|
60 |
which are marked as <emphasis>up</emphasis> in the configuration
|
|
61 |
file, by trying to start them a limited number of times.
|
|
62 |
</para>
|
|
65 | 63 |
|
66 |
<para> |
|
67 |
The watcher does synchronous queries but will submit jobs for |
|
68 |
executing the changes. Due to locking, it could be that the jobs |
|
69 |
execute much later than the watcher executes them. |
|
70 |
</para> |
|
64 |
<para> |
|
65 |
Its other function is to <quote>repair</quote> DRBD links by |
|
66 |
reactivating the block devices of instances which have |
|
67 |
secondaries on nodes that have been rebooted. |
|
68 |
</para> |
|
69 |
|
|
70 |
</refsect2> |
|
71 |
|
|
72 |
<refsect2> |
|
73 |
|
|
74 |
<title>Node operations</title> |
|
75 |
|
|
76 |
<para> |
|
77 |
The watcher will restart any down daemons that are appropriate |
|
78 |
for the current node. |
|
79 |
</para> |
|
80 |
|
|
81 |
<para> |
|
82 |
In addition, it will execute any scripts which exist under the |
|
83 |
<quote>watcher</quote> directory in the ganeti hooks directory |
|
84 |
(@SYSCONFDIR@/ganeti/hooks). This should be used for |
|
85 |
lightweight actions, like starting any extra daemons. |
|
86 |
</para> |
|
87 |
|
|
88 |
<para> |
|
89 |
If the cluster |
|
90 |
parameter <literal>maintain_node_health</literal> is enabled, |
|
91 |
then the watcher will also shutdown instances and DRBD devices |
|
92 |
if the node is declared as offline by known master candidates. |
|
93 |
</para> |
|
94 |
|
|
95 |
<para> |
|
96 |
The watcher does synchronous queries but will submit jobs for |
|
97 |
executing the changes. Due to locking, it could be that the jobs |
|
98 |
execute much later than the watcher executes them. |
|
99 |
</para> |
|
100 |
|
|
101 |
</refsect2> |
|
102 |
|
|
103 |
|
|
104 |
</refsect1> |
|
105 |
|
|
106 |
<refsect1> |
|
107 |
<title>FILES</title> |
|
71 | 108 |
|
72 | 109 |
<para> |
73 | 110 |
The command has a state file located at |
74 |
<filename>@LOCALSTATEDIR@/lib/ganeti/watcher.data</filename> and a log
|
|
75 |
file at |
|
111 |
<filename>@LOCALSTATEDIR@/lib/ganeti/watcher.data</filename> |
|
112 |
(only used on the master) and a log file at
|
|
76 | 113 |
<filename>@LOCALSTATEDIR@/log/ganeti/watcher.log</filename>. Removal of |
77 | 114 |
either file will not affect correct operation; the removal of |
78 | 115 |
the state file will just cause the restart counters for the |