<!-- Please adjust the date whenever revising the manpage. -->
- <!ENTITY dhdate "<date>June 20, 2007</date>">
+ <!ENTITY dhdate "<date>February 11, 2009</date>">
<!-- SECTION should be 1-8, possibly with a subsection; other parameters
     are allowed: see man(7), man(1). -->
<!ENTITY dhsection "<manvolnum>8</manvolnum>">
<refentryinfo>
<copyright>
<year>2007</year>
+ <year>2008</year>
+ <year>2009</year>
<holder>Google Inc.</holder>
</copyright>
&dhdate;
&dhucpackage;
&dhsection;
- <refmiscinfo>ganeti 1.2</refmiscinfo>
+ <refmiscinfo>ganeti 2.0</refmiscinfo>
</refmeta>
<refnamediv>
<refname>&dhpackage;</refname>
</para>
<para>
- Its function is to try to keep running all instances which are
- marked as <emphasis>up</emphasis> in the configuration file, by
- trying to start them a limited number of times.
+ Its primary function is to try to keep running all instances
+ which are marked as <emphasis>up</emphasis> in the configuration
+ file, by trying to start them a limited number of times.
</para>
- <para>In order to prevent piling up commands, all the
- <emphasis>gnt-*</emphasis> commands executed by ganeti-watcher are
- run with a timeout of 15 seconds.
+ <para>
+ Its other function is to <quote>repair</quote> DRBD links by
+ reactivating the block devices of instances which have
+ secondaries on nodes that have been rebooted.
+ </para>
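+
+   <para>
+     This repair corresponds to what an administrator would otherwise
+     do by hand; a sketch, with a placeholder instance name:
+   </para>

```shell
# Manual equivalent of the watcher's DRBD repair: re-activate the
# disks of an affected instance ("instance1.example.com" is a
# placeholder, not a name taken from this page).
gnt-instance activate-disks instance1.example.com
```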
+
+ <para>
    The watcher performs synchronous queries but submits jobs to
    carry out any changes. Due to locking, these jobs may execute
    much later than the time the watcher submits them.
</para>
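    <para>
      The watcher is intended to be run periodically rather than as a
      long-lived daemon; a typical setup (the five-minute interval
      shown is an assumption, not mandated by this page) is a cron
      entry such as:
    </para>

```shell
# Illustrative system crontab line: run the cluster watcher every
# five minutes as root.
*/5 * * * * root ganeti-watcher
```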
<para>
The command has a state file located at
- <filename>/var/lib/ganeti/restart_state</filename> and a log
+ <filename>@LOCALSTATEDIR@/lib/ganeti/watcher.data</filename> and a log
file at
- <filename>/var/log/ganeti/watcher.log</filename>. Removal of
+ <filename>@LOCALSTATEDIR@/log/ganeti/watcher.log</filename>. Removal of
    either file will not affect correct operation; removing the
    state file will merely reset the restart counters for the
    instances to zero.
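
    <para>
      A reset of the restart counters can thus be forced by deleting
      the state file; a minimal sketch using a stand-in path (the
      real file lives at
      <filename>@LOCALSTATEDIR@/lib/ganeti/watcher.data</filename>
      on an actual node):
    </para>

```shell
# Demonstrate the reset with a stand-in path; on a real node you
# would remove the watcher.data file named above instead.
STATE="$(mktemp -d)/watcher.data"
touch "$STATE"                 # pretend the watcher has recorded state
rm -f "$STATE"                 # counters reset; the watcher recreates the file
test ! -e "$STATE" && echo "state file removed"
```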
</refsect1>
- <refsect1>
- <title>KNOWN BUGS</title>
-
- <para>
- Due to the way we initialize DRBD peers, restarting a secondary
- node for an instance will cause the DRBD endpoints on that node
- to disappear, thus all instances which have that node as a
- secondary will lose redundancy. The watcher does not detect this
- situation. The workaround is to manually run
- <command>gnt-instance activate-disks</command> for all the
- affected instances.
- </para>
- </refsect1>
-
&footer;
</refentry>