ganeti-watcher(8) Ganeti | Version @GANETI_VERSION@
===================================================

Name
----

ganeti-watcher - Ganeti cluster watcher

Synopsis
--------

**ganeti-watcher** [``--debug``]
[``--job-age=``*age*]
[``--ignore-pause``]

DESCRIPTION
-----------

    
**ganeti-watcher** is a periodically run script that is responsible for
keeping instances in the correct state. It has two separate functions:
one for the master node and another that runs on every node.

    
If the watcher is disabled at cluster level (via the
**gnt-cluster watcher pause** command), it will exit without doing
anything. The cluster-level pause can be overridden via the
``--ignore-pause`` option, for example when the watcher must stay
disabled in general during maintenance but the administrator wants to
run it just once.

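As a rough sketch of the maintenance scenario described above, the
following functions pause the watcher, run it once by hand, and resume
it. The ``1h`` duration is an arbitrary example value, and the ``run``
helper, the ``DRY_RUN`` variable and the function name are hypothetical
names introduced here so that the sequence can be previewed without
touching the cluster. It is assumed that your Ganeti version provides
**gnt-cluster watcher continue** and accepts a duration with a unit
suffix:

.. code-block:: bash

    #!/bin/bash
    # Hypothetical helper: print each command, and execute it unless
    # DRY_RUN is set to a non-empty value.
    run() {
      echo "+ $*"
      [ -n "${DRY_RUN:-}" ] || "$@"
    }

    # Illustrative sequence; in practice the maintenance work happens
    # between the pause and the continue.
    watcher_once_during_maintenance() {
      run gnt-cluster watcher pause 1h     # disable the watcher cluster-wide
      run ganeti-watcher --ignore-pause    # one-off run despite the pause
      run gnt-cluster watcher continue     # resume normal operation
    }

Calling ``DRY_RUN=1 watcher_once_during_maintenance`` only prints the
three commands, which is a cheap way to check the sequence first.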
    
The ``--debug`` option increases the verbosity of the watcher and also
enables logging to standard error.

    
Master operations
~~~~~~~~~~~~~~~~~

    
The watcher's primary function is to keep all instances that are marked
as *up* in the configuration file running, by trying to start them a
limited number of times.

    
Another function is to "repair" DRBD links by reactivating the
block devices of instances which have secondaries on nodes that
have been rebooted.

    
The watcher will also archive old jobs (older than the age given
via the ``--job-age`` option, which defaults to 6 hours), in order
to keep the job queue manageable.

    
Node operations
~~~~~~~~~~~~~~~

    
The watcher will restart any daemons that are down and that are
appropriate for the current node.

    
In addition, it will execute any scripts which exist under the
"watcher" directory in the Ganeti hooks directory
(``@SYSCONFDIR@/ganeti/hooks``). This should be used for lightweight
actions, like starting any extra daemons.

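A minimal sketch of such a hook follows, assuming a hypothetical extra
daemon called ``my-extra-daemon`` that writes its PID to
``/var/run/my-extra-daemon.pid``. Neither name comes from Ganeti, and
the function names are likewise only illustrations. The script starts
the daemon only when it is not already running, which keeps each
watcher run cheap:

.. code-block:: bash

    #!/bin/bash
    # Hypothetical watcher hook, meant to be saved as an executable file
    # under @SYSCONFDIR@/ganeti/hooks/watcher/ (daemon path and PID file
    # are assumptions made for this example).
    DAEMON="${DAEMON:-/usr/local/sbin/my-extra-daemon}"
    PIDFILE="${PIDFILE:-/var/run/my-extra-daemon.pid}"

    # Succeeds if PIDFILE is readable and names a live process.
    daemon_running() {
      [ -r "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
    }

    # Start the daemon only if it is not running already.
    ensure_daemon() {
      if daemon_running; then
        echo "already running"
      else
        echo "starting $DAEMON"
        "$DAEMON"
      fi
    }

When installed as a hook, the script would finish by calling
``ensure_daemon``; the call is left out here so that the functions can
be sourced and tried out on their own.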
    
If the cluster parameter ``maintain_node_health`` is enabled, then the
watcher will also shut down instances and DRBD devices if the node is
declared as offline by known master candidates.

    
The watcher performs its queries synchronously but submits jobs to
carry out the changes. Because of locking, the jobs may execute much
later than the watcher submits them.

    
FILES
-----

    
The command has a set of state files (one per group) located at
``@LOCALSTATEDIR@/lib/ganeti/watcher.GROUP-UUID.data`` (only used on the
master) and a log file at
``@LOCALSTATEDIR@/log/ganeti/watcher.log``. Removing any of these files
will not affect correct operation; removing a state file will just
reset the restart counters for the instances to zero and mark nodes as
freshly rebooted (so, for example, DRBD minors will be re-activated).

    
In some cases it is even desirable to reset the watcher state, for
example after maintenance actions or when you want to simulate the
reboot of all nodes. In that case, you can remove all state files:

    
.. code-block:: bash

    rm -f @LOCALSTATEDIR@/lib/ganeti/watcher.*.data
    rm -f @LOCALSTATEDIR@/lib/ganeti/watcher.*.instance-status
    rm -f @LOCALSTATEDIR@/lib/ganeti/instance-status

    
Then re-run the watcher.

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: