<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
]>
  <article class="specification">
  <articleinfo>
    <title>Ganeti administrator's guide</title>
  </articleinfo>
  <para>Documents Ganeti version 1.2</para>
  <sect1>
    <title>Introduction</title>

    <para>Ganeti is virtualization cluster management software. You are
    expected to be a system administrator familiar with your Linux distribution
    and the Xen virtualization environment before using it.
    </para>

    <para>The various components of Ganeti all have man pages and interactive
    help. This manual, however, will help you become familiar with the system
    by explaining the most common operations, grouped by related use.
    </para>

    <para>After a terminology glossary and a section on the prerequisites
    needed to use this manual, the rest of this document is divided into three
    main sections, which group different features of Ganeti:
      <itemizedlist>
        <listitem>
          <simpara>Instance Management</simpara>
        </listitem>
        <listitem>
          <simpara>High Availability Features</simpara>
        </listitem>
        <listitem>
          <simpara>Debugging Features</simpara>
        </listitem>
      </itemizedlist>
    </para>

    <sect2>
      <title>Ganeti terminology</title>

      <para>
        This section provides a small introduction to Ganeti terminology, which
        may be useful when reading the rest of the document.

        <glosslist>
          <glossentry>
            <glossterm>Cluster</glossterm>
            <glossdef>
              <simpara>
                A set of machines (nodes) that cooperate to offer a
                coherent, highly available virtualization service.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Node</glossterm>
            <glossdef>
              <simpara>
                A physical machine which is a member of a cluster.
                Nodes are the basic cluster infrastructure, and they are
                not fault tolerant.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Master node</glossterm>
            <glossdef>
              <simpara>
                The node which controls the cluster, and from which all
                Ganeti commands must be given.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Instance</glossterm>
            <glossdef>
              <simpara>
                A virtual machine which runs on a cluster. It can be a
                fault-tolerant, highly available entity.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Pool</glossterm>
            <glossdef>
              <simpara>
                A pool is a set of clusters sharing the same network.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Meta-Cluster</glossterm>
            <glossdef>
              <simpara>
                Anything that concerns more than one cluster.
              </simpara>
            </glossdef>
          </glossentry>
        </glosslist>
      </para>
    </sect2>

    <sect2>
      <title>Prerequisites</title>

      <para>
        You need to have your Ganeti cluster installed and configured before
        you try any of the commands in this document. Please follow the
        <emphasis>Ganeti installation tutorial</emphasis> for instructions on
        how to do that.
      </para>
    </sect2>

  </sect1>

  <sect1>
    <title>Managing Instances</title>

    <sect2>
      <title>Adding/Removing an instance</title>

      <para>
        Adding a new virtual instance to your Ganeti cluster is really easy.
        The command is:

        <synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable> -o <replaceable>OS_TYPE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        The instance name must be resolvable (e.g. exist in DNS) and
        of course map to an address in the same subnet as the cluster
        itself. Options you can give to this command include:

      <itemizedlist>
        <listitem>
          <simpara>The disk size (<option>-s</option>)</simpara>
        </listitem>
        <listitem>
          <simpara>The swap size (<option>--swap-size</option>)</simpara>
        </listitem>
        <listitem>
          <simpara>The memory size (<option>-m</option>)</simpara>
        </listitem>
        <listitem>
          <simpara>The number of virtual CPUs (<option>-p</option>)</simpara>
        </listitem>
        <listitem>
          <simpara>The instance IP address (<option>-i</option>) (use the value
            <literal>auto</literal> to make Ganeti record the address from
            DNS)</simpara>
        </listitem>
        <listitem>
          <simpara>The bridge to connect the instance to (<option>-b</option>),
            if you don't want to use the default one</simpara>
        </listitem>
      </itemizedlist>
      </para>
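
      <para>
        As a concrete illustration (the node, OS and instance names below are
        only examples, and the exact option values depend on your setup; see
        <citerefentry><refentrytitle>gnt-instance</refentrytitle>
        <manvolnum>8</manvolnum></citerefentry> for details), creating a
        non-redundant instance could look like:
        <screen>
gnt-instance add -n node1.example.com -o debian-etch -t plain \
  -s 10240 -m 512 instance1.example.com
        </screen>
      </para>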

      <para>There are four types of disk template you can choose from:</para>

      <variablelist>
        <varlistentry>
          <term>diskless</term>
          <listitem>
            <para>The instance has no disks. Only used for special purpose
              operating systems or for testing.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>plain</term>
          <listitem>
            <para>The instance will use LVM devices as the backend for its
              disks. No redundancy is provided.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>local_raid1</term>
          <listitem>
            <para>A local mirror is set up between LVM devices to back the
              instance. This provides some redundancy for the instance's
              data.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>remote_raid1</term>
          <listitem>
            <simpara><emphasis role="strong">Note:</emphasis> This is only
              valid for multi-node clusters using drbd 0.7.</simpara>
            <simpara>
              A mirror is set up between the local node and a remote one, which
              must be specified as the second value of the
              <option>--node</option> option. Use this option to obtain a
              highly available instance that can be failed over to a remote
              node should the primary one fail.
            </simpara>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>drbd</term>
          <listitem>
            <simpara><emphasis role="strong">Note:</emphasis> This is only
              valid for multi-node clusters using drbd 8.0.</simpara>
            <simpara>
              This is similar to the
              <replaceable>remote_raid1</replaceable> option, but uses
              new features in drbd 8 to simplify the device
              stack. From a user's point of view, this will improve
              the speed of the <command>replace-disks</command>
              command and (in future versions) provide more
              functionality.
            </simpara>
          </listitem>
        </varlistentry>

      </variablelist>

      <para>
        For example, if you want to create a highly available instance, use
        the remote_raid1 or drbd disk templates:
        <synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable><optional>:<replaceable>SECONDARY_NODE</replaceable></optional> -o <replaceable>OS_TYPE</replaceable> -t remote_raid1 \
  <replaceable>INSTANCE_NAME</replaceable></synopsis>
      </para>

      <para>
        To see which operating systems your cluster supports, you can use
        <synopsis>gnt-os list</synopsis>
      </para>

      <para>
        Removing an instance is even easier than creating one. This operation
        is irreversible and destroys all the contents of your instance. Use
        with care:

        <synopsis>gnt-instance remove <replaceable>INSTANCE_NAME</replaceable></synopsis>
      </para>
    </sect2>

    <sect2>
      <title>Starting/Stopping an instance</title>

      <para>
        Instances are automatically started at instance creation time. To
        manually start one which is currently stopped, you can run:

        <synopsis>gnt-instance startup <replaceable>INSTANCE_NAME</replaceable></synopsis>

        while the command to stop one is:

        <synopsis>gnt-instance shutdown <replaceable>INSTANCE_NAME</replaceable></synopsis>

        The command to see all the configured instances and their status is:

        <synopsis>gnt-instance list</synopsis>

      </para>
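
      <para>
        For example, to stop and then restart a (purely hypothetical) instance
        named <literal>instance1.example.com</literal>, you would run:
        <screen>
gnt-instance shutdown instance1.example.com
gnt-instance startup instance1.example.com
        </screen>
      </para>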

      <para>
        Do not use the Xen commands to stop instances. If you run, for
        example, <command>xm shutdown</command> or <command>xm destroy</command>
        on an instance, Ganeti will automatically restart it (via the
        <citerefentry><refentrytitle>ganeti-watcher</refentrytitle>
        <manvolnum>8</manvolnum></citerefentry>).
      </para>

    </sect2>

    <sect2>
      <title>Exporting/Importing an instance</title>

      <para>
        You can create a snapshot of an instance's disk and Ganeti
        configuration, which you can then back up, or import into
        another cluster. The way to export an instance is:

        <synopsis>gnt-backup export -n <replaceable>TARGET_NODE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        The target node can be any node in the cluster with enough
        space under <filename class="directory">/srv/ganeti</filename>
        to hold the instance image. Use the
        <option>--noshutdown</option> option to snapshot an instance
        without rebooting it. Any previous snapshots of the same
        instance existing cluster-wide under <filename
        class="directory">/srv/ganeti</filename> will be removed by
        this operation: if you want to keep them, move them out of the
        Ganeti exports directory.
      </para>

      <para>
        Importing an instance is similar to creating a new one. The command is:

        <synopsis>gnt-backup import -n <replaceable>TARGET_NODE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> --src-node=<replaceable>NODE</replaceable> --src-dir=<replaceable>DIR</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        Most of the options available for the command
        <emphasis>gnt-instance add</emphasis> are supported here too.

      </para>
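
      <para>
        Putting the two together, a backup-and-restore cycle might look like
        the sketch below; the node and instance names are placeholders, and
        the source directory is assumed to be the default exports location
        under <filename class="directory">/srv/ganeti</filename>:
        <screen>
gnt-backup export -n node2.example.com instance1.example.com
gnt-backup import -n node3.example.com -t plain --src-node=node2.example.com \
  --src-dir=/srv/ganeti/export instance1.example.com
        </screen>
      </para>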
    </sect2>

  </sect1>

  <sect1>
    <title>High availability features</title>

    <note>
      <simpara>This section only applies to multi-node clusters.</simpara>
    </note>

    <sect2>
      <title>Failing over an instance</title>

      <para>
        If an instance is built in highly available mode, you can at
        any time fail it over to its secondary node, even if the
        primary has somehow failed and is not up anymore. Doing so
        is really easy; on the master node you can just run:

        <synopsis>gnt-instance failover <replaceable>INSTANCE_NAME</replaceable></synopsis>

        That's it. After the command completes, the secondary node is
        now the primary, and vice versa.
      </para>
    </sect2>

    <sect2>
      <title>Replacing an instance's disks</title>

      <para>
        What if, instead, the secondary node for an instance has
        failed, or you plan to remove a node from your cluster, have
        failed over all its instances, but it is still the secondary
        for some? The solution here is to replace the instance's disks,
        changing the secondary node. This is done in two ways, depending
        on the disk template type. For <literal>remote_raid1</literal>:

        <synopsis>gnt-instance replace-disks <option>-n <replaceable>NEW_SECONDARY</replaceable></option> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        and for <literal>drbd</literal>:
        <synopsis>gnt-instance replace-disks <option>-s</option> <option>-n <replaceable>NEW_SECONDARY</replaceable></option> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        This process is a bit longer, but involves no instance
        downtime, and at the end of it the instance has changed its
        secondary node, to which it can be failed over if necessary.
      </para>
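
      <para>
        For example, with placeholder names, moving the secondary of a
        <literal>drbd</literal>-based instance to another node would be:
        <screen>
gnt-instance replace-disks -s -n node3.example.com instance1.example.com
        </screen>
      </para>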
    </sect2>
    <sect2>
      <title>Failing over the master node</title>

      <para>
        This is all fine as long as the Ganeti master node is
        up. Should it go down, or should you wish to decommission it,
        just run the following command on any other node:

        <synopsis>gnt-cluster masterfailover</synopsis>

        and the node you ran it on is now the new master.
      </para>
    </sect2>
    <sect2>
      <title>Adding/Removing nodes</title>

      <para>
        And of course, now that you know how to move instances around,
        it's easy to free up a node, and then you can remove it from
        the cluster:

        <synopsis>gnt-node remove <replaceable>NODE_NAME</replaceable></synopsis>

        and maybe add a new one:

        <synopsis>gnt-node add <optional><option>--secondary-ip=<replaceable>ADDRESS</replaceable></option></optional> <replaceable>NODE_NAME</replaceable></synopsis>
      </para>
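
      <para>
        For example, with placeholder names and addresses, retiring one node
        and adding another (on a cluster configured with a separate secondary
        network) could look like:
        <screen>
gnt-node remove node4.example.com
gnt-node add --secondary-ip=192.0.2.10 node5.example.com
        </screen>
      </para>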
    </sect2>
  </sect1>

  <sect1>
    <title>Debugging Features</title>

    <para>
      At some point you might need to do some debugging operations on
      your cluster or on your instances. This section will help you
      with the most commonly used debugging functionality.
    </para>

    <sect2>
      <title>Accessing an instance's disks</title>

      <para>
        From an instance's primary node you have access to its
        disks. Never mount the underlying logical volume manually
        on a fault-tolerant instance, or you risk breaking
        replication. The correct way to access them is to run the
        command:

        <synopsis>gnt-instance activate-disks <replaceable>INSTANCE_NAME</replaceable></synopsis>

        and then access the device that gets created. After you've
        finished, you can deactivate the disks with the
        <command>deactivate-disks</command> command, which works in the
        same way.
      </para>
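
      <para>
        A typical inspection session might therefore look like the sketch
        below; the instance name is a placeholder, and the device path you
        access in between depends on your disk template:
        <screen>
gnt-instance activate-disks instance1.example.com
# examine the activated device(s) here, then clean up:
gnt-instance deactivate-disks instance1.example.com
        </screen>
      </para>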
    </sect2>

    <sect2>
      <title>Accessing an instance's console</title>

      <para>
        The command to access a running instance's console is:

        <synopsis>gnt-instance console <replaceable>INSTANCE_NAME</replaceable></synopsis>

        Use the console normally and then type
        <userinput>^]</userinput> when done, to exit.
      </para>
    </sect2>

    <sect2>
      <title>Instance OS definitions debugging</title>

      <para>
        Should you have any problems with operating system support,
        the command to run to see a complete status for all your nodes
        is:

        <synopsis>gnt-os diagnose</synopsis>

      </para>

    </sect2>

    <sect2>
      <title>Cluster-wide debugging</title>

      <para>
        The <command>gnt-cluster</command> command offers several options to
        run tests or execute cluster-wide operations. For example:

      <screen>
gnt-cluster command
gnt-cluster copyfile
gnt-cluster verify
gnt-cluster getmaster
gnt-cluster version
      </screen>

        See the man page <citerefentry>
        <refentrytitle>gnt-cluster</refentrytitle>
        <manvolnum>8</manvolnum> </citerefentry> to learn more about
        their usage.
      </para>
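
      <para>
        As a quick, illustrative example (the command and file arguments are
        arbitrary), you could check the cluster, run a shell command on every
        node, and distribute a file to all nodes with:
        <screen>
gnt-cluster verify
gnt-cluster command uptime
gnt-cluster copyfile /etc/hosts
        </screen>
      </para>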
    </sect2>

  </sect1>

  </article>