<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
]>
  <article class="specification">
  <articleinfo>
    <title>Ganeti administrator's guide</title>
  </articleinfo>
  <para>Documents Ganeti version 1.2</para>
  <sect1>
    <title>Introduction</title>

    <para>Ganeti is virtualization cluster management software. Before using
    it, you are expected to be a system administrator familiar with your Linux
    distribution and the Xen virtualization environment.
    </para>

    <para>The various components of Ganeti all have man pages and interactive
    help. This manual will nevertheless help you get familiar with the system
    by explaining the most common operations, grouped by related use.
    </para>

    <para>After a terminology glossary and a section on the prerequisites
    needed to use this manual, the rest of this document is divided into three
    main sections, which group different features of Ganeti:
      <itemizedlist>
        <listitem>
          <simpara>Instance Management</simpara>
        </listitem>
        <listitem>
          <simpara>High Availability Features</simpara>
        </listitem>
        <listitem>
          <simpara>Debugging Features</simpara>
        </listitem>
      </itemizedlist>
    </para>
    <sect2>
      <title>Ganeti Terminology</title>

      <para>This section provides a short introduction to Ganeti terminology,
      which may be useful when reading the rest of the document.

      <variablelist>
        <varlistentry>
          <term>Cluster</term>
          <listitem><para>A set of machines (nodes) that cooperate to offer a
          coherent, highly available virtualization service.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>Node</term>
          <listitem><para>A physical machine which is a member of a cluster.
          Nodes are the basic cluster infrastructure, and are not fault
          tolerant.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>Master Node</term>
          <listitem><para>The node which controls the cluster, and from which
          all Ganeti commands must be given.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>Instance</term>
          <listitem><para>A virtual machine which runs on a cluster. It can be
          a fault tolerant, highly available entity.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>Pool</term>
          <listitem><para>A pool is a set of clusters sharing the same
          network.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>Meta-Cluster</term>
          <listitem><para>Anything that concerns more than one
          cluster.</para></listitem>
        </varlistentry>

      </variablelist>

      </para>
    </sect2>
    <sect2>
      <title>Prerequisites</title>

      <para>You need to have your Ganeti cluster installed and configured
      before you try any of the commands in this document. Please follow the
      "installation tutorial" for instructions on how to do that.
      </para>
    </sect2>

  </sect1>
  <sect1>
    <title>Managing Instances</title>

    <sect2>
      <title>Adding/Removing an instance</title>

      <para>Adding a new virtual instance to your Ganeti cluster is really
      easy. The command is:
      <programlisting>
gnt-instance add -n TARGET_NODE -o OS_TYPE -t DISK_TEMPLATE INSTANCE_NAME
      </programlisting>
      The instance name must exist in DNS and of course map to an address in
      the same subnet as the cluster itself. Options you can give to this
      command include:
      <itemizedlist>
        <listitem>
          <simpara>The disk size (-s)</simpara>
        </listitem>
        <listitem>
          <simpara>The swap size (--swap-size)</simpara>
        </listitem>
        <listitem>
          <simpara>The memory size (-m)</simpara>
        </listitem>
        <listitem>
          <simpara>The number of virtual CPUs (-p)</simpara>
        </listitem>
        <listitem>
          <simpara>The instance IP address (-i) (use -i auto to make Ganeti
          record the address from DNS)</simpara>
        </listitem>
        <listitem>
          <simpara>The bridge to connect the instance to (-b), if you don't
          want to use the default one</simpara>
        </listitem>
      </itemizedlist>
      </para>
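      <para>As an illustration, a more fully specified invocation might
      combine several of the options above; the node name, OS name, instance
      name and size values below are placeholders to adapt to your own
      cluster, not values this manual prescribes:
      <programlisting>
gnt-instance add -n node1.example.com -o OS_TYPE -t plain \
  -s 10g -m 512 -i auto instance1.example.com
      </programlisting>
      </para>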
      <para>There are four types of disk template you can choose from:

      <variablelist>
        <varlistentry>
          <term>diskless</term>
          <listitem><para>The instance has no disks. Only used for special
          purpose operating systems or for testing.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>plain</term>
          <listitem><para>The instance will use LVM devices as backend for its
          disks. No redundancy is provided.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>local_raid1</term>
          <listitem><para>A local mirror is set up between LVM devices to back
          the instance. This provides some redundancy for the instance's
          data.</para></listitem>
        </varlistentry>

        <varlistentry>
          <term>remote_raid1</term>
          <listitem><para>A mirror is set up between the local node and a
          remote one, which must be specified with the --secondary-node
          option. Use this option to obtain a highly available instance that
          can be failed over to a remote node should the primary one fail.
          </para></listitem>
        </varlistentry>

      </variablelist>

      For example, if you want to create a highly available instance, use the
      remote_raid1 disk template:
      <programlisting>
gnt-instance add -n TARGET_NODE -o OS_TYPE -t remote_raid1 \
  --secondary-node=SECONDARY_NODE INSTANCE_NAME
      </programlisting>
      To see which operating systems your cluster supports, use:
      <programlisting>
gnt-os list
      </programlisting>
      </para>

      <para>
      Removing an instance is even easier than creating one. This operation is
      irreversible and destroys all the contents of your instance. Use with
      care:
      <programlisting>
gnt-instance remove INSTANCE_NAME
      </programlisting>
      </para>
    </sect2>
    <sect2>
      <title>Starting/Stopping an instance</title>

      <para>Instances are automatically started at instance creation time. To
      manually start one which is currently stopped you can run:
      <programlisting>
gnt-instance startup INSTANCE_NAME
      </programlisting>
      The command to stop one is:
      <programlisting>
gnt-instance shutdown INSTANCE_NAME
      </programlisting>
      The command to see all the configured instances and their status is:
      <programlisting>
gnt-instance list
      </programlisting>
      </para>

      <para>Do not use the Xen commands to stop instances. If you run, for
      example, xm shutdown or xm destroy on an instance, Ganeti will
      automatically restart it (via the
      <citerefentry><refentrytitle>ganeti-watcher</refentrytitle>
      <manvolnum>8</manvolnum></citerefentry>).
      </para>

    </sect2>
    <sect2>
      <title>Exporting/Importing an instance</title>

      <para>You can create a snapshot of an instance's disk and Ganeti
      configuration, which you can then back up, or import into another
      cluster. The way to export an instance is:
      <programlisting>
gnt-backup export -n TARGET_NODE INSTANCE_NAME
      </programlisting>
      The target node can be any node in the cluster with enough space under
      /srv/ganeti to hold the instance image. Use the --noshutdown option to
      snapshot an instance without rebooting it. Any previous snapshots of the
      same instance existing cluster-wide under /srv/ganeti will be removed by
      this operation: if you want to keep them, move them out of the Ganeti
      exports directory.
      </para>

      <para>Importing an instance is as easy as creating a new one. The
      command is:
      <programlisting>
gnt-backup import -n TRGT_NODE -t DISK_TMPL --src-node=NODE --src-dir=DIR INST_NAME
      </programlisting>
      Most of the options available for gnt-instance add are supported here
      too.
      </para>
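      <para>As a sketch of a typical import, the following re-creates an
      exported instance on another node; the node names, instance name and
      source directory are illustrative placeholders only (the actual export
      location depends on where gnt-backup export placed the snapshot under
      the Ganeti exports directory):
      <programlisting>
gnt-backup import -n node2.example.com -t plain \
  --src-node=node1.example.com --src-dir=DIR instance1.example.com
      </programlisting>
      </para>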
    </sect2>
  </sect1>

  <sect1>
    <title>High Availability Features</title>

    <sect2>
      <title>Failing over an instance</title>

      <para>If an instance is built in highly available mode, you can at any
      time fail it over to its secondary node, even if the primary has somehow
      failed and is not up anymore. Doing so is really easy; on the master
      node just run:
      <programlisting>
gnt-instance failover INSTANCE_NAME
      </programlisting>
      That's it. After the command completes, the secondary node is now the
      primary, and vice versa.
      </para>
    </sect2>
    <sect2>
      <title>Replacing an instance's disks</title>

      <para>So what if, instead, the secondary node for an instance has
      failed, or you plan to remove a node from your cluster and have failed
      over all its instances, but it is still secondary for some? The solution
      here is to replace the instance's disks, changing the secondary node:
      <programlisting>
gnt-instance replace-disks -n NEW_SECONDARY INSTANCE_NAME
      </programlisting>
      This process is a bit longer, but involves no instance downtime, and at
      the end of it the instance has changed its secondary node, to which it
      can be failed over if necessary.
      </para>
    </sect2>
    <sect2>
      <title>Failing over the master node</title>

      <para>This is all good as long as the Ganeti Master Node is up. Should
      it go down, or should you wish to decommission it, just run the
      following command on any other node:
      <programlisting>
gnt-cluster masterfailover
      </programlisting>
      and the node you ran it on is now the new master.
      </para>
    </sect2>
    <sect2>
      <title>Adding/Removing nodes</title>

      <para>And of course, now that you know how to move instances around,
      it's easy to free up a node, and then you can remove it from the
      cluster:
      <programlisting>
gnt-node remove NODE_NAME
      </programlisting>
      and maybe add a new one:
      <programlisting>
gnt-node add [--secondary-ip=ADDRESS] NODE_NAME
      </programlisting>
      </para>
    </sect2>
  </sect1>
  <sect1>
    <title>Debugging Features</title>

    <para>At some point you might need to do some debugging operations on your
    cluster or on your instances. This section will help you with the most
    commonly used debugging functionality.
    </para>
    <sect2>
      <title>Accessing an instance's disks</title>

      <para>From an instance's primary node you have access to its disks.
      Never mount the underlying logical volume manually on a fault tolerant
      instance, though, or you risk breaking replication. The correct way to
      access them is to run the command:
      <programlisting>
gnt-instance activate-disks INSTANCE_NAME
      </programlisting>
      and then access the device that gets created. Of course, after you've
      finished you can deactivate the disks with the deactivate-disks command,
      which works in the same way.
      </para>
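      <para>Since deactivate-disks is described as working in the same way as
      activate-disks, the corresponding invocation to release the disks again
      presumably looks like this:
      <programlisting>
gnt-instance deactivate-disks INSTANCE_NAME
      </programlisting>
      </para>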
    </sect2>
    <sect2>
      <title>Accessing an instance's console</title>

      <para>The command to access a running instance's console is:
      <programlisting>
gnt-instance console INSTANCE_NAME
      </programlisting>
      Use the console normally, and then type ^] when done to exit.
      </para>
    </sect2>
    <sect2>
      <title>Instance Operating System Debugging</title>

      <para>Should you have any problems with operating system support, the
      command to run to see a complete status for all your nodes is:
      <programlisting>
gnt-os diagnose
      </programlisting>
      </para>

    </sect2>
    <sect2>
      <title>Cluster-wide debugging</title>

      <para>The gnt-cluster command offers several options to run tests or
      execute cluster-wide operations. For example:
      <programlisting>
gnt-cluster command
gnt-cluster copyfile
gnt-cluster verify
gnt-cluster getmaster
gnt-cluster version
      </programlisting>
      See the respective help to learn more about their usage.
      </para>
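      <para>As a sketch of typical usage (a hedged illustration; the command
      and file below are arbitrary examples, and the output depends on your
      cluster), gnt-cluster command runs a shell command on every node, while
      gnt-cluster copyfile distributes a file to every node:
      <programlisting>
gnt-cluster command uptime
gnt-cluster copyfile /etc/hosts
      </programlisting>
      </para>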
    </sect2>
  </sect1>

  </article>