<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
<!-- Please adjust the date whenever revising the manpage. -->
<!ENTITY dhdate "<date>February 11, 2009</date>">
<!-- SECTION should be 1-8, maybe w/ subsection other parameters are
allowed: see man(7), man(1). -->
<!ENTITY dhsection "<manvolnum>8</manvolnum>">
<!ENTITY dhucpackage "<refentrytitle>ganeti-masterd</refentrytitle>">
<!ENTITY dhpackage "ganeti-masterd">
<!ENTITY debian "<productname>Debian</productname>">
<!ENTITY gnu "<acronym>GNU</acronym>">
<!ENTITY gpl "&gnu; <acronym>GPL</acronym>">
<!ENTITY footer SYSTEM "footer.sgml">
<holder>Google Inc.</holder>
<refmiscinfo>ganeti 2.0</refmiscinfo>
<refname>&dhpackage;</refname>
<refpurpose>ganeti master daemon</refpurpose>
<command>&dhpackage; </command>
<arg>--no-voting</arg>
<title>DESCRIPTION</title>
<command>&dhpackage;</command> is the daemon responsible for
overall cluster coordination. Without it, no change can be made
to the cluster.
For testing purposes, give the <option>-f</option> option to
keep the program from detaching from its controlling terminal.
Debug-level messages can be activated by giving the
<option>-d</option> option.
The role of the master daemon is to coordinate all the actions
that change the state of the cluster. Things like accepting
new jobs, coordinating the changes on nodes (via RPC calls to
the respective node daemons), maintaining the configuration
and so on are done via this daemon.
The only action that can be done without the master daemon is
the failover of the master role to another node in the
cluster, via the <command>gnt-cluster masterfailover</command>
If the master daemon is stopped, the instances are not
affected, but they won't be restarted automatically in case of
<title>STARTUP</title>
At startup, the master daemon confirms with the node
daemons that the node it is running on is indeed the master node
of the cluster. It aborts unless it receives at least half plus
one positive answers (offline nodes are queried too, in case
the local configuration is stale).
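The "half plus one" rule above can be illustrated with a minimal
Python sketch. It is not the daemon's actual code: the function
names are hypothetical, and it assumes that "half" is rounded down
and that the answer of the node running the daemon counts as one
of the votes.

```python
def votes_needed(total_nodes: int) -> int:
    """Return "half plus one" of the nodes.

    Assumption: "half" uses integer division, so a 5-node cluster needs 3.
    """
    if total_nodes < 1:
        raise ValueError("a cluster has at least one node")
    return total_nodes // 2 + 1


def has_quorum(positive_answers: int, total_nodes: int) -> bool:
    """True if enough nodes agreed that this node is the master."""
    return positive_answers >= votes_needed(total_nodes)


if __name__ == "__main__":
    # (total nodes, positive answers) pairs chosen for illustration only.
    for total, positive in [(2, 1), (3, 2), (5, 2)]:
        verdict = "start" if has_quorum(positive, total) else "abort"
        print(f"{total} nodes, {positive} positive answers: {verdict}")
```

Under these assumptions a two-node cluster needs both nodes to
answer positively, so losing one node prevents a normal startup.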
For small clusters with a number of nodes down, and especially
for two-node clusters where the other node has gone down, this
creates a problem. In this case the
<option>--no-voting</option> option can be used to skip this
process. The option requires interactive confirmation, as
having two masters on the same cluster is a very dangerous
situation and will most likely lead to data loss.
<title>JOB QUEUE</title>
The master daemon maintains a job queue (located under
<filename
class="directory">@LOCALSTATEDIR@/lib/ganeti/queue</filename>) in
which all current jobs are stored, one job per file serialized
in JSON format; in this directory a subdirectory called
<filename class="directory">archive</filename> holds archived
Jobs are moved from the current queue to the archive directory
via a request to the master; this can be accomplished
from the command line with the <command>gnt-job
archive</command> or <command>gnt-job autoarchive</command>
commands. In case of problems with the master, a job file can
simply be moved away or deleted (but this might leave the
cluster inconsistent).
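For read-only inspection of the queue layout described above (one
JSON file per current job, with archived jobs in a subdirectory),
a small Python sketch follows. The function name is hypothetical,
the queue path must be supplied by the caller because
@LOCALSTATEDIR@ is substituted at build time, and nothing is
assumed about the fields inside a job file.

```python
import json
from pathlib import Path


def load_current_jobs(queue_dir):
    """Parse every regular file directly under queue_dir as JSON.

    Subdirectories (such as "archive") are skipped, and files that
    cannot be read or parsed are ignored. Returns a dict mapping
    file name to the decoded JSON value.
    """
    jobs = {}
    for entry in sorted(Path(queue_dir).iterdir()):
        if not entry.is_file():
            continue  # skip the archive subdirectory and other non-files
        try:
            jobs[entry.name] = json.loads(entry.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            continue  # not valid JSON, or unreadable
    return jobs
```

The sketch only reads files; it does not move or delete them, so
it does not carry the consistency risk mentioned above.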
<title>COMMUNICATION PROTOCOL</title>
The master accepts commands over a Unix socket, using JSON
serialized messages separated by a specific byte sequence. For
more details, see the design documentation supplied with
<!-- Keep this comment at the end of the file
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-parent-document:nil
sgml-default-dtd-file:nil
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil