48 |
48 |
<para>
|
49 |
49 |
The <command>&dhpackage;</command> is a periodically run script
|
50 |
50 |
which is responsible for keeping the instances in the correct
|
51 |
|
status.
|
|
51 |
status. It has two separate functions, one for the master node
|
|
52 |
and another one that runs on every node.
|
52 |
53 |
</para>
|
53 |
54 |
|
54 |
|
<para>
|
55 |
|
Its primary function is to try to keep running all instances
|
56 |
|
which are marked as <emphasis>up</emphasis> in the configuration
|
57 |
|
file, by trying to start them a limited number of times.
|
58 |
|
</para>
|
|
55 |
<refsect2>
|
|
56 |
<title>Master operations</title>
|
59 |
57 |
|
60 |
|
<para>
|
61 |
|
Its other function is to <quote>repair</quote> DRBD links by
|
62 |
|
reactivating the block devices of instances which have
|
63 |
|
secondaries on nodes that have been rebooted.
|
64 |
|
</para>
|
|
58 |
<para>
|
|
59 |
Its primary function is to try to keep running all instances
|
|
60 |
which are marked as <emphasis>up</emphasis> in the configuration
|
|
61 |
file, by trying to start them a limited number of times.
|
|
62 |
</para>
|
65 |
63 |
|
66 |
|
<para>
|
67 |
|
The watcher does synchronous queries but will submit jobs for
|
68 |
|
executing the changes. Due to locking, it could be that the jobs
|
69 |
|
execute much later than the watcher executes them.
|
70 |
|
</para>
|
|
64 |
<para>
|
|
65 |
Its other function is to <quote>repair</quote> DRBD links by
|
|
66 |
reactivating the block devices of instances which have
|
|
67 |
secondaries on nodes that have been rebooted.
|
|
68 |
</para>
|
|
69 |
|
|
70 |
</refsect2>
|
|
71 |
|
|
72 |
<refsect2>
|
|
73 |
|
|
74 |
<title>Node operations</title>
|
|
75 |
|
|
76 |
<para>
|
|
77 |
The watcher will restart any down daemons that are appropriate
|
|
78 |
for the current node.
|
|
79 |
</para>
|
|
80 |
|
|
81 |
<para>
|
|
82 |
In addition, it will execute any scripts which exist under the
|
|
83 |
<quote>watcher</quote> directory in the Ganeti hooks directory
|
|
84 |
(@SYSCONFDIR@/ganeti/hooks). These hooks should be used for
|
|
85 |
lightweight actions, like starting any extra daemons.
|
|
86 |
</para>
|
|
87 |
|
|
88 |
<para>
|
|
89 |
If the cluster
|
|
90 |
parameter <literal>maintain_node_health</literal> is enabled,
|
|
91 |
then the watcher will also shut down instances and DRBD devices
|
|
92 |
if the node is declared as offline by known master candidates.
|
|
93 |
</para>
|
|
94 |
|
|
95 |
<para>
|
|
96 |
The watcher does synchronous queries but will submit jobs for
|
|
97 |
executing the changes. Due to locking, it could be that the jobs
|
|
98 |
execute much later than the watcher submits them.
|
|
99 |
</para>
|
|
100 |
|
|
101 |
</refsect2>
|
|
102 |
|
|
103 |
|
|
104 |
</refsect1>
|
|
105 |
|
|
106 |
<refsect1>
|
|
107 |
<title>FILES</title>
|
71 |
108 |
|
72 |
109 |
<para>
|
73 |
110 |
The command has a state file located at
|
74 |
|
<filename>@LOCALSTATEDIR@/lib/ganeti/watcher.data</filename> and a log
|
75 |
|
file at
|
|
111 |
<filename>@LOCALSTATEDIR@/lib/ganeti/watcher.data</filename>
|
|
112 |
(only used on the master) and a log file at
|
76 |
113 |
<filename>@LOCALSTATEDIR@/log/ganeti/watcher.log</filename>. Removal of
|
77 |
114 |
either file will not affect correct operation; the removal of
|
78 |
115 |
the state file will just cause the restart counters for the
|