Ganeti walk-through
===================

Documents Ganeti version |version|

.. contents::

.. highlight:: text

Introduction
------------

This document serves as a more example-oriented guide to Ganeti; while
the administration guide shows a conceptual approach, here you will find
a step-by-step example of managing instances and the cluster.

Our simulated, example cluster will have three machines, named
``node1``, ``node2``, ``node3``. Note that in real life machines will
usually have FQDNs but here we use short names for brevity. We will use a
secondary network for replication data, ``192.0.2.0/24``, with nodes
having the last octet the same as their index. The cluster name will be
``example-cluster``. All nodes have the same simulated hardware
configuration, two disks of 750GB, 32GB of memory and 4 CPUs.

On this cluster, we will create up to seven instances, named
``instance1`` to ``instance7``.


Cluster creation
----------------

Follow the :doc:`install` document and prepare the nodes. Then it's time
to initialise the cluster::

  node1# gnt-cluster init -s 192.0.2.1 --enabled-hypervisors=xen-pvm example-cluster
  node1#

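If you want to double-check that the master daemon is answering, you can
query the cluster itself (both commands are part of the standard
``gnt-cluster`` script; output omitted here)::

  node1# gnt-cluster getmaster
  node1# gnt-cluster version
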
The creation was fine. Let's check that the one node we have is
functioning correctly::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node1# gnt-cluster verify
  Mon Oct 26 02:08:51 2009 * Verifying global settings
  Mon Oct 26 02:08:51 2009 * Gathering data (1 nodes)
  Mon Oct 26 02:08:52 2009 * Verifying node status
  Mon Oct 26 02:08:52 2009 * Verifying instance status
  Mon Oct 26 02:08:52 2009 * Verifying orphan volumes
  Mon Oct 26 02:08:52 2009 * Verifying remaining instances
  Mon Oct 26 02:08:52 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 02:08:52 2009 * Other Notes
  Mon Oct 26 02:08:52 2009 * Hooks Results
  node1#

Since this proceeded correctly, let's add the other two nodes::

  node1# gnt-node add -s 192.0.2.2 node2
  -- WARNING --
  Performing this operation is going to replace the ssh daemon keypair
  on the target machine (node2) with the ones of the current one
  and grant full intra-cluster ssh root access to/from it

  The authenticity of host 'node2 (192.0.2.2)' can't be established.
  RSA key fingerprint is 9f:…
  Are you sure you want to continue connecting (yes/no)? yes
  root@node2's password:
  Mon Oct 26 02:11:54 2009  - INFO: Node will be a master candidate
  node1# gnt-node add -s 192.0.2.3 node3
  -- WARNING --
  Performing this operation is going to replace the ssh daemon keypair
  on the target machine (node3) with the ones of the current one
  and grant full intra-cluster ssh root access to/from it

  The authenticity of host 'node3 (192.0.2.3)' can't be established.
  RSA key fingerprint is 9f:…
  Are you sure you want to continue connecting (yes/no)? yes
  root@node3's password:
  Mon Oct 26 02:11:54 2009  - INFO: Node will be a master candidate

Checking the cluster status again::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node2   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node1# gnt-cluster verify
  Mon Oct 26 02:15:14 2009 * Verifying global settings
  Mon Oct 26 02:15:14 2009 * Gathering data (3 nodes)
  Mon Oct 26 02:15:16 2009 * Verifying node status
  Mon Oct 26 02:15:16 2009 * Verifying instance status
  Mon Oct 26 02:15:16 2009 * Verifying orphan volumes
  Mon Oct 26 02:15:16 2009 * Verifying remaining instances
  Mon Oct 26 02:15:16 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 02:15:16 2009 * Other Notes
  Mon Oct 26 02:15:16 2009 * Hooks Results
  node1#

And let's check that we have a valid OS::

  node1# gnt-os list
  Name
  debootstrap
  node1#

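If you want more detail than the plain name list, ``gnt-os diagnose``
reports, per node, whether each OS definition is valid (output omitted
here, as it depends on the installed OS definitions)::

  node1# gnt-os diagnose
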
Running a burn-in
-----------------

Now that the cluster is created, it is time to check that the hardware
works correctly, that the hypervisor can actually create instances,
etc. This is done with the ``burnin`` tool, using the ``debootstrap``
OS, as described in the admin guide. Similar output lines are replaced
with ``…`` in the log below::

  node1# /usr/lib/ganeti/tools/burnin -o debootstrap -p instance{1..5}
  - Testing global parameters
  - Creating instances
    * instance instance1
      on node1, node2
    * instance instance2
      on node2, node3
    …
    * instance instance5
      on node2, node3
    * Submitted job ID(s) 157, 158, 159, 160, 161
      waiting for job 157 for instance1
      …
      waiting for job 161 for instance5
  - Replacing disks on the same nodes
    * instance instance1
      run replace_on_secondary
      run replace_on_primary
    …
    * instance instance5
      run replace_on_secondary
      run replace_on_primary
    * Submitted job ID(s) 162, 163, 164, 165, 166
      waiting for job 162 for instance1
      …
  - Changing the secondary node
    * instance instance1
      run replace_new_secondary node3
    * instance instance2
      run replace_new_secondary node1
    …
    * instance instance5
      run replace_new_secondary node1
    * Submitted job ID(s) 167, 168, 169, 170, 171
      waiting for job 167 for instance1
      …
  - Growing disks
    * instance instance1
      increase disk/0 by 128 MB
    …
    * instance instance5
      increase disk/0 by 128 MB
    * Submitted job ID(s) 173, 174, 175, 176, 177
      waiting for job 173 for instance1
      …
  - Failing over instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 179, 180, 181, 182, 183
      waiting for job 179 for instance1
      …
  - Migrating instances
    * instance instance1
      migration and migration cleanup
    …
    * instance instance5
      migration and migration cleanup
    * Submitted job ID(s) 184, 185, 186, 187, 188
      waiting for job 184 for instance1
      …
  - Exporting and re-importing instances
    * instance instance1
      export to node node3
      remove instance
      import from node3 to node1, node2
      remove export
    …
    * instance instance5
      export to node node1
      remove instance
      import from node1 to node2, node3
      remove export
    * Submitted job ID(s) 196, 197, 198, 199, 200
      waiting for job 196 for instance1
      …
  - Reinstalling instances
    * instance instance1
      reinstall without passing the OS
      reinstall specifying the OS
    …
    * instance instance5
      reinstall without passing the OS
      reinstall specifying the OS
    * Submitted job ID(s) 203, 204, 205, 206, 207
      waiting for job 203 for instance1
      …
  - Rebooting instances
    * instance instance1
      reboot with type 'hard'
      reboot with type 'soft'
      reboot with type 'full'
    …
    * instance instance5
      reboot with type 'hard'
      reboot with type 'soft'
      reboot with type 'full'
    * Submitted job ID(s) 208, 209, 210, 211, 212
      waiting for job 208 for instance1
      …
  - Adding and removing disks
    * instance instance1
      adding a disk
      removing last disk
    …
    * instance instance5
      adding a disk
      removing last disk
    * Submitted job ID(s) 213, 214, 215, 216, 217
      waiting for job 213 for instance1
      …
  - Adding and removing NICs
    * instance instance1
      adding a NIC
      removing last NIC
    …
    * instance instance5
      adding a NIC
      removing last NIC
    * Submitted job ID(s) 218, 219, 220, 221, 222
      waiting for job 218 for instance1
      …
  - Activating/deactivating disks
    * instance instance1
      activate disks when online
      activate disks when offline
      deactivate disks (when offline)
    …
    * instance instance5
      activate disks when online
      activate disks when offline
      deactivate disks (when offline)
    * Submitted job ID(s) 223, 224, 225, 226, 227
      waiting for job 223 for instance1
      …
  - Stopping and starting instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 230, 231, 232, 233, 234
      waiting for job 230 for instance1
      …
  - Removing instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 235, 236, 237, 238, 239
      waiting for job 235 for instance1
      …
  node1#

You can see in the above what operations the burn-in does. Ideally, the
burn-in log would proceed successfully through all the steps and end
cleanly, without throwing errors.

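The burn-in simply submits regular jobs, so while it runs you can follow
the job queue from another terminal; a small sketch (output omitted; the
job IDs are whatever the queue assigned, e.g. 157 above)::

  node1# gnt-job list
  node1# gnt-job info 157
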
Instance operations
-------------------

Creation
++++++++

At this point, Ganeti and the hardware seem to be functioning
correctly, so we'll follow up with creating the instances manually::

  node1# gnt-instance add -t drbd -o debootstrap -s 256m -I hail instance1
  Mon Oct 26 04:06:52 2009  - INFO: Selected nodes for instance instance1 via iallocator hail: node2, node3
  Mon Oct 26 04:06:53 2009 * creating instance disks...
  Mon Oct 26 04:06:57 2009 adding instance instance1 to cluster config
  Mon Oct 26 04:06:57 2009  - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 04:06:57 2009  - INFO: - device disk/0: 20.00% done, 4 estimated seconds remaining
  Mon Oct 26 04:07:01 2009  - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 04:07:01 2009 creating os for instance instance1 on node node2
  Mon Oct 26 04:07:01 2009 * running the instance OS create scripts...
  Mon Oct 26 04:07:14 2009 * starting instance...
  node1# gnt-instance add -t drbd -o debootstrap -s 256m -n node1:node2 instance2
  Mon Oct 26 04:11:37 2009 * creating instance disks...
  Mon Oct 26 04:11:40 2009 adding instance instance2 to cluster config
  Mon Oct 26 04:11:41 2009  - INFO: Waiting for instance instance2 to sync disks.
  Mon Oct 26 04:11:41 2009  - INFO: - device disk/0: 35.40% done, 1 estimated seconds remaining
  Mon Oct 26 04:11:42 2009  - INFO: - device disk/0: 58.50% done, 1 estimated seconds remaining
  Mon Oct 26 04:11:43 2009  - INFO: - device disk/0: 86.20% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009  - INFO: - device disk/0: 92.40% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009  - INFO: - device disk/0: 97.00% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009  - INFO: Instance instance2's disks are in sync.
  Mon Oct 26 04:11:44 2009 creating os for instance instance2 on node node1
  Mon Oct 26 04:11:44 2009 * running the instance OS create scripts...
  Mon Oct 26 04:11:57 2009 * starting instance...
  node1#

The above shows one instance created via an iallocator script, and one
being created with manual node assignment. The other three instances
were also created and now it's time to check them::

  node1# gnt-instance list
  Instance  Hypervisor OS          Primary_node Status  Memory
  instance1 xen-pvm    debootstrap node2        running   128M
  instance2 xen-pvm    debootstrap node1        running   128M
  instance3 xen-pvm    debootstrap node1        running   128M
  instance4 xen-pvm    debootstrap node3        running   128M
  instance5 xen-pvm    debootstrap node2        running   128M

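The default columns can be changed with ``-o``; for example, to also see
each instance's secondary nodes (field names as documented in the
``gnt-instance`` man page; output omitted)::

  node1# gnt-instance list -o name,pnode,snodes,status
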
Accessing instances
+++++++++++++++++++

Accessing an instance's console is easy::

  node1# gnt-instance console instance2
  [    0.000000] Bootdata ok (command line is root=/dev/sda1 ro)
  [    0.000000] Linux version 2.6…
  [    0.000000] BIOS-provided physical RAM map:
  [    0.000000]  Xen: 0000000000000000 - 0000000008800000 (usable)
  [13138176.018071] Built 1 zonelists.  Total pages: 34816
  [13138176.018074] Kernel command line: root=/dev/sda1 ro
  [13138176.018694] Initializing CPU#0
  …
  Checking file systems...fsck 1.41.3 (12-Oct-2008)
  done.
  Setting kernel variables (/etc/sysctl.conf)...done.
  Mounting local filesystems...done.
  Activating swapfile swap...done.
  Setting up networking....
  Configuring network interfaces...done.
  Setting console screen modes and fonts.
  INIT: Entering runlevel: 2
  Starting enhanced syslogd: rsyslogd.
  Starting periodic command scheduler: crond.

  Debian GNU/Linux 5.0 instance2 tty1

  instance2 login:

At this point you can log in to the instance; after configuring the
network (and doing this on all instances), we can check their
connectivity::

  node1# fping instance{1..5}
  instance1 is alive
  instance2 is alive
  instance3 is alive
  instance4 is alive
  instance5 is alive
  node1#

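Routine stop/start operations are just as simple; as an illustration
(both commands are part of the standard ``gnt-instance`` script; output
omitted)::

  node1# gnt-instance shutdown instance3
  node1# gnt-instance startup instance3
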
Removal
+++++++

Removing unwanted instances is also easy::

  node1# gnt-instance remove instance5
  This will remove the volumes of the instance instance5 (including
  mirrors), thus removing all the data of the instance. Continue?
  y/[n]/?: y
  node1#


Recovering from hardware failures
---------------------------------

Recovering from node failure
++++++++++++++++++++++++++++

We are now left with four instances. Assume that at this point, node3,
which has one primary and one secondary instance, crashes::

  node1# gnt-node info node3
  Node name: node3
    primary ip: 198.51.100.1
    secondary ip: 192.0.2.3
    master candidate: True
    drained: False
    offline: False
    primary for instances:
      - instance4
    secondary for instances:
      - instance1
  node1# fping node3
  node3 is unreachable

At this point, the primary instance of that node (instance4) is down,
but the secondary instance (instance1) is not affected, except that it
has lost its disk redundancy::

  node1# fping instance{1,4}
  instance1 is alive
  instance4 is unreachable
  node1#

If we try to check the status of instance4 via the instance info
command, it fails because it tries to contact node3, which is down::

  node1# gnt-instance info instance4
  Failure: command execution error:
  Error checking node node3: Connection failed (113: No route to host)
  node1#

So we need to mark node3 as *offline*, so that Ganeti won't talk to it
anymore::

  node1# gnt-node modify -O yes -f node3
  Mon Oct 26 04:34:12 2009  - WARNING: Not enough master candidates (desired 10, new value will be 2)
  Mon Oct 26 04:34:15 2009  - WARNING: Communication failure to node node3: Connection failed (113: No route to host)
  Modified node node3
   - offline -> True
   - master_candidate -> auto-demotion due to offline
  node1#

And now we can failover the instance::

  node1# gnt-instance failover instance4
  Failover will happen to image instance4. This requires a shutdown of
  the instance. Continue?
  y/[n]/?: y
  Mon Oct 26 04:35:34 2009 * checking disk consistency between source and target
  Failure: command execution error:
  Disk disk/0 is degraded on target node, aborting failover.
  node1# gnt-instance failover --ignore-consistency instance4
  Failover will happen to image instance4. This requires a shutdown of
  the instance. Continue?
  y/[n]/?: y
  Mon Oct 26 04:35:47 2009 * checking disk consistency between source and target
  Mon Oct 26 04:35:47 2009 * shutting down instance on source node
  Mon Oct 26 04:35:47 2009  - WARNING: Could not shutdown instance instance4 on node node3. Proceeding anyway. Please make sure node node3 is down. Error details: Node is marked offline
  Mon Oct 26 04:35:47 2009 * deactivating the instance's disks on source node
  Mon Oct 26 04:35:47 2009  - WARNING: Could not shutdown block device disk/0 on node node3: Node is marked offline
  Mon Oct 26 04:35:47 2009 * activating the instance's disks on target node
  Mon Oct 26 04:35:47 2009  - WARNING: Could not prepare block device disk/0 on node node3 (is_primary=False, pass=1): Node is marked offline
  Mon Oct 26 04:35:48 2009 * starting the instance on the target node
  node1#

Note that in our first attempt, Ganeti refused to do the failover since
it wasn't sure about the status of the instance's disks. We then pass
the ``--ignore-consistency`` flag and the failover succeeds::

  node1# gnt-instance list
  Instance  Hypervisor OS          Primary_node Status  Memory
  instance1 xen-pvm    debootstrap node2        running   128M
  instance2 xen-pvm    debootstrap node1        running   128M
  instance3 xen-pvm    debootstrap node1        running   128M
  instance4 xen-pvm    debootstrap node1        running   128M
  node1#

But at this point, both instance1 and instance4 are without disk
redundancy::

  node1# gnt-instance info instance1
  Instance name: instance1
  UUID: 45173e82-d1fa-417c-8758-7d582ab7eef4
  Serial number: 2
  Creation time: 2009-10-26 04:06:57
  Modification time: 2009-10-26 04:07:14
  State: configured to be up, actual state is up
    Nodes:
      - primary: node2
      - secondaries: node3
    Operating system: debootstrap
    Allocated network port: None
    Hypervisor: xen-pvm
      - root_path: default (/dev/sda1)
      - kernel_args: default (ro)
      - use_bootloader: default (False)
      - bootloader_args: default ()
      - bootloader_path: default ()
      - kernel_path: default (/boot/vmlinuz-2.6-xenU)
      - initrd_path: default ()
    Hardware:
      - VCPUs: 1
      - maxmem: 512MiB
      - minmem: 256MiB
      - NICs:
        - nic/0: MAC: aa:00:00:78:da:63, IP: None, mode: bridged, link: xen-br0
    Disks:
      - disk/0: drbd8, size 256M
        access mode: rw
        nodeA:       node2, minor=0
        nodeB:       node3, minor=0
        port:        11035
        auth key:    8e950e3cec6854b0181fbc3a6058657701f2d458
        on primary:  /dev/drbd0 (147:0) in sync, status *DEGRADED*
        child devices:
          - child 0: lvm, size 256M
            logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data
            on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data (254:0)
          - child 1: lvm, size 128M
            logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta
            on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta (254:1)

The output is similar for instance4. In order to recover this, we need
to run the node evacuate command, which will change the secondary node
to a new one (in this case, we only have two working nodes, so all
instances will end up on nodes one and two)::

  node1# gnt-node evacuate -I hail node3
  Relocate instance(s) 'instance1','instance4' from node
   node3 using iallocator hail?
  y/[n]/?: y
  Mon Oct 26 05:05:39 2009  - INFO: Selected new secondary for instance 'instance1': node1
  Mon Oct 26 05:05:40 2009  - INFO: Selected new secondary for instance 'instance4': node2
  Mon Oct 26 05:05:40 2009 Replacing disk(s) 0 for instance1
  Mon Oct 26 05:05:40 2009 STEP 1/6 Check device existence
  Mon Oct 26 05:05:40 2009  - INFO: Checking disk/0 on node2
  Mon Oct 26 05:05:40 2009  - INFO: Checking volume groups
  Mon Oct 26 05:05:40 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 05:05:40 2009  - INFO: Checking disk/0 consistency on node node2
  Mon Oct 26 05:05:40 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 05:05:40 2009  - INFO: Adding new local storage on node1 for disk/0
  Mon Oct 26 05:05:41 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 05:05:41 2009  - INFO: activating a new drbd on node1 for disk/0
  Mon Oct 26 05:05:42 2009  - INFO: Shutting down drbd for disk/0 on old node
  Mon Oct 26 05:05:42 2009  - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
  Mon Oct 26 05:05:42 2009       Hint: Please cleanup this device manually as soon as possible
  Mon Oct 26 05:05:42 2009  - INFO: Detaching primary drbds from the network (=> standalone)
  Mon Oct 26 05:05:42 2009  - INFO: Updating instance configuration
  Mon Oct 26 05:05:45 2009  - INFO: Attaching primary drbds to new secondary (standalone => connected)
  Mon Oct 26 05:05:46 2009 STEP 5/6 Sync devices
  Mon Oct 26 05:05:46 2009  - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 05:05:46 2009  - INFO: - device disk/0: 13.90% done, 7 estimated seconds remaining
  Mon Oct 26 05:05:53 2009  - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 05:05:53 2009 STEP 6/6 Removing old storage
  Mon Oct 26 05:05:53 2009  - INFO: Remove logical volumes for 0
  Mon Oct 26 05:05:53 2009  - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:05:53 2009       Hint: remove unused LVs manually
  Mon Oct 26 05:05:53 2009  - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:05:53 2009       Hint: remove unused LVs manually
  Mon Oct 26 05:05:53 2009 Replacing disk(s) 0 for instance4
  Mon Oct 26 05:05:53 2009 STEP 1/6 Check device existence
  Mon Oct 26 05:05:53 2009  - INFO: Checking disk/0 on node1
  Mon Oct 26 05:05:53 2009  - INFO: Checking volume groups
  Mon Oct 26 05:05:53 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 05:05:53 2009  - INFO: Checking disk/0 consistency on node node1
  Mon Oct 26 05:05:54 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 05:05:54 2009  - INFO: Adding new local storage on node2 for disk/0
  Mon Oct 26 05:05:54 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 05:05:54 2009  - INFO: activating a new drbd on node2 for disk/0
  Mon Oct 26 05:05:55 2009  - INFO: Shutting down drbd for disk/0 on old node
  Mon Oct 26 05:05:55 2009  - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
  Mon Oct 26 05:05:55 2009       Hint: Please cleanup this device manually as soon as possible
  Mon Oct 26 05:05:55 2009  - INFO: Detaching primary drbds from the network (=> standalone)
  Mon Oct 26 05:05:55 2009  - INFO: Updating instance configuration
  Mon Oct 26 05:05:55 2009  - INFO: Attaching primary drbds to new secondary (standalone => connected)
  Mon Oct 26 05:05:56 2009 STEP 5/6 Sync devices
  Mon Oct 26 05:05:56 2009  - INFO: Waiting for instance instance4 to sync disks.
  Mon Oct 26 05:05:56 2009  - INFO: - device disk/0: 12.40% done, 8 estimated seconds remaining
  Mon Oct 26 05:06:04 2009  - INFO: Instance instance4's disks are in sync.
  Mon Oct 26 05:06:04 2009 STEP 6/6 Removing old storage
  Mon Oct 26 05:06:04 2009  - INFO: Remove logical volumes for 0
  Mon Oct 26 05:06:04 2009  - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:06:04 2009       Hint: remove unused LVs manually
  Mon Oct 26 05:06:04 2009  - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:06:04 2009       Hint: remove unused LVs manually
  node1#

And now node3 is completely free of instances and can be repaired::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.2G     3     1
  node2   1.3T  1.3T  32.0G  1.0G 30.4G     1     3
  node3      ?     ?      ?     ?     ?     0     0

Re-adding a node to the cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Let's say node3 has been repaired and is now ready to be
reused. Re-adding it is simple::

  node1# gnt-node add --readd node3
  The authenticity of host 'node3 (198.51.100.1)' can't be established.
  RSA key fingerprint is 9f:2e:5a:2e:e0:bd:00:09:e4:5c:32:f2:27:57:7a:f4.
  Are you sure you want to continue connecting (yes/no)? yes
  Mon Oct 26 05:27:39 2009  - INFO: Readding a node, the offline/drained flags were reset
  Mon Oct 26 05:27:39 2009  - INFO: Node will be a master candidate

And it is now working again::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.2G     3     1
  node2   1.3T  1.3T  32.0G  1.0G 30.4G     1     3
  node3   1.3T  1.3T  32.0G  1.0G 30.4G     0     0

.. note:: If Ganeti has been built with the htools
   component enabled, you can shuffle the instances around to have a
   better use of the nodes.

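With all three nodes healthy again, planned maintenance can from now on
be handled with live migration instead of a failover; a small sketch
(``gnt-instance migrate`` is part of the standard CLI; output omitted)::

  node1# gnt-instance migrate instance1
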
Disk failures
+++++++++++++

A disk failure is simpler than a full node failure. First, a single disk
failure should not cause data loss for any redundant instance; only the
performance of some instances might be reduced due to more network
traffic.

Let's take the cluster status in the above listing, and check what
volumes are in use::

  node1# gnt-node volumes -o phys,instance node2
  PhysDev   Instance
  /dev/sdb1 instance4
  /dev/sdb1 instance4
  /dev/sdb1 instance1
  /dev/sdb1 instance1
  /dev/sdb1 instance3
  /dev/sdb1 instance3
  /dev/sdb1 instance2
  /dev/sdb1 instance2
  node1#

You can see that all instances on node2 have logical volumes on
``/dev/sdb1``. Let's simulate a disk failure on that disk::

  node1# ssh node2
  node2# echo offline > /sys/block/sdb/device/state
  node2# vgs
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
    Couldn't find all physical volumes for volume group xenvg.
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
    Couldn't find all physical volumes for volume group xenvg.
    Volume group xenvg not found
  node2#

At this point, the node is broken; if we examine instance2 we get
(simplified output shown)::

  node1# gnt-instance info instance2
  Instance name: instance2
  State: configured to be up, actual state is up
    Nodes:
      - primary: node1
      - secondaries: node2
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status ok
        on secondary: /dev/drbd1 (147:1) in sync, status *DEGRADED* *MISSING DISK*

This instance only has its secondary on node2. Let's also check an
instance that has node2 as its primary node::

  node1# gnt-instance info instance1
  Instance name: instance1
  State: configured to be up, actual state is up
    Nodes:
      - primary: node2
      - secondaries: node1
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status *DEGRADED* *MISSING DISK*
        on secondary: /dev/drbd3 (147:3) in sync, status ok
  node1# gnt-instance console instance1

  Debian GNU/Linux 5.0 instance1 tty1

  instance1 login: root
  Last login: Tue Oct 27 01:24:09 UTC 2009 on tty1
  instance1:~# date > test
  instance1:~# sync
  instance1:~# cat test
  Tue Oct 27 01:25:20 UTC 2009
  instance1:~# dmesg|tail
  [5439785.235448] NET: Registered protocol family 15
  [5439785.235489] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
  [5439785.235495] All bugs added by David S. Miller <davem@redhat.com>
  [5439785.235517] XENBUS: Device with no driver: device/console/0
  [5439785.236576] kjournald starting.  Commit interval 5 seconds
  [5439785.236588] EXT3-fs: mounted filesystem with ordered data mode.
  [5439785.236625] VFS: Mounted root (ext3 filesystem) readonly.
  [5439785.236663] Freeing unused kernel memory: 172k freed
  [5439787.533779] EXT3 FS on sda1, internal journal
  [5440655.065431] eth0: no IPv6 routers present
  instance1:~#

As you can see, the instance is running fine and doesn't see any disk
issues. It is now time to fix node2 and re-establish redundancy for the
involved instances.

.. note:: For Ganeti 2.0 we need to manually fix the volume group on
   node2 by running ``vgreduce --removemissing xenvg``

::

  node1# gnt-node repair-storage node2 lvm-vg xenvg
  Mon Oct 26 18:14:03 2009 Repairing storage unit 'xenvg' on node2 ...
  node1# ssh node2 vgs
    VG    #PV #LV #SN Attr   VSize   VFree
    xenvg   1   8   0 wz--n- 673.84G 673.84G
  node1#

This has removed the 'bad' disk from the volume group, which is now left
with only one PV. We can now replace the disks for the involved
instances::

  node1# for i in instance{1..4}; do gnt-instance replace-disks -a $i; done
  Mon Oct 26 18:15:38 2009 Replacing disk(s) 0 for instance1
  Mon Oct 26 18:15:38 2009 STEP 1/6 Check device existence
  Mon Oct 26 18:15:38 2009  - INFO: Checking disk/0 on node1
  Mon Oct 26 18:15:38 2009  - INFO: Checking disk/0 on node2
  Mon Oct 26 18:15:38 2009  - INFO: Checking volume groups
  Mon Oct 26 18:15:38 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 18:15:38 2009  - INFO: Checking disk/0 consistency on node node1
  Mon Oct 26 18:15:39 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 18:15:39 2009  - INFO: Adding storage on node2 for disk/0
  Mon Oct 26 18:15:39 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 18:15:39 2009  - INFO: Detaching disk/0 drbd from local storage
  Mon Oct 26 18:15:40 2009  - INFO: Renaming the old LVs on the target node
  Mon Oct 26 18:15:40 2009  - INFO: Renaming the new LVs on the target node
  Mon Oct 26 18:15:40 2009  - INFO: Adding new mirror component on node2
  Mon Oct 26 18:15:41 2009 STEP 5/6 Sync devices
  Mon Oct 26 18:15:41 2009  - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 18:15:41 2009  - INFO: - device disk/0: 12.40% done, 9 estimated seconds remaining
  Mon Oct 26 18:15:50 2009  - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 18:15:50 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:15:50 2009  - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:15:52 2009 Replacing disk(s) 0 for instance2
  Mon Oct 26 18:15:52 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:01 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:01 2009  - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:16:02 2009 Replacing disk(s) 0 for instance3
  Mon Oct 26 18:16:02 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:09 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:09 2009  - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:16:10 2009 Replacing disk(s) 0 for instance4
  Mon Oct 26 18:16:10 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:18 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:18 2009  - INFO: Remove logical volumes for disk/0
  node1#

At this point, all instances should be healthy again.

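To double-check the result, you can re-run cluster verify and glance at
the DRBD state on node2; a small sketch (output omitted)::

  node1# gnt-cluster verify
  node1# ssh node2 cat /proc/drbd
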
.. note:: Ganeti 2.0 doesn't have the ``-a`` option to replace-disks, so
   for it you have to run the loop twice, once over primary instances
   with argument ``-p`` and once over secondary instances with argument
   ``-s``, but otherwise the operations are similar::

     node1# gnt-instance replace-disks -p instance1

     node1# for i in instance{2..4}; do gnt-instance replace-disks -s $i; done

Common cluster problems
-----------------------

There are a number of small issues that might appear on a cluster that
can be solved easily as long as the issue is properly identified. For
this exercise we will consider the case of node3, which was broken
previously and re-added to the cluster without reinstallation. Running
cluster verify on the cluster reports::

  node1# gnt-cluster verify
  Mon Oct 26 18:30:08 2009 * Verifying global settings
  Mon Oct 26 18:30:08 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:30:10 2009 * Verifying node status
  Mon Oct 26 18:30:10 2009   - ERROR: node node3: unallocated drbd minor 0 is in use
  Mon Oct 26 18:30:10 2009   - ERROR: node node3: unallocated drbd minor 1 is in use
  Mon Oct 26 18:30:10 2009 * Verifying instance status
  Mon Oct 26 18:30:10 2009   - ERROR: instance instance4: instance should not run on node node3
  Mon Oct 26 18:30:10 2009 * Verifying orphan volumes
  Mon Oct 26 18:30:10 2009   - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_data is unknown
  Mon Oct 26 18:30:10 2009   - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data is unknown
  Mon Oct 26 18:30:10 2009   - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta is unknown
  Mon Oct 26 18:30:10 2009   - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta is unknown
  Mon Oct 26 18:30:10 2009 * Verifying remaining instances
  Mon Oct 26 18:30:10 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:30:10 2009 * Other Notes
  Mon Oct 26 18:30:10 2009 * Hooks Results
  node1#

Instance status
+++++++++++++++

As you can see, *instance4* has a copy running on node3, because we
forced the failover when node3 failed. This case is dangerous as the
instance will have the same IP and MAC address, wreaking havoc on the
network environment and anyone who tries to use it.

Ganeti doesn't directly handle this case. It is recommended to log on to
node3 and run::

  node3# xm destroy instance4

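To confirm that the rogue copy is really gone, you can list the domains
the hypervisor still knows about; a quick sketch (output omitted)::

  node3# xm list
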
Unallocated DRBD minors
+++++++++++++++++++++++

There are still unallocated DRBD minors on node3. Again, these are not
handled by Ganeti directly and need to be cleaned up via DRBD commands::

  node3# drbdsetup /dev/drbd0 down
  node3# drbdsetup /dev/drbd1 down
  node3#

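To check that the minors are indeed gone, the kernel's DRBD status file
can be inspected; a small sketch (output omitted)::

  node3# cat /proc/drbd
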
Orphan volumes
++++++++++++++

At this point, the only remaining problem should be the so-called
*orphan* volumes. This can also happen in the case of an aborted
disk replace, or a similar situation where Ganeti was not able to
recover automatically. Here you need to remove them manually via LVM
commands::

  node3# lvremove xenvg
  Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data"? [y/n]: y
    Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data" successfully removed
  Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta"? [y/n]: y
    Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta" successfully removed
  Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data"? [y/n]: y
    Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data" successfully removed
  Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta"? [y/n]: y
    Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta" successfully removed
  node3#

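Afterwards, ``lvs`` can be used to check that only the expected volumes
are left in the volume group; for illustration (output omitted)::

  node3# lvs xenvg
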
At this point cluster verify shouldn't complain anymore::

  node1# gnt-cluster verify
  Mon Oct 26 18:37:51 2009 * Verifying global settings
  Mon Oct 26 18:37:51 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:37:53 2009 * Verifying node status
  Mon Oct 26 18:37:53 2009 * Verifying instance status
  Mon Oct 26 18:37:53 2009 * Verifying orphan volumes
  Mon Oct 26 18:37:53 2009 * Verifying remaining instances
  Mon Oct 26 18:37:53 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:37:53 2009 * Other Notes
  Mon Oct 26 18:37:53 2009 * Hooks Results
  node1#

N+1 errors
++++++++++

Since redundant instances in Ganeti have a primary/secondary model, each
node needs to keep aside enough memory so that, if one of its peer nodes
fails, all the instances that have the failed node as primary and this
node as secondary can be relocated to it. More specifically, if
instance2 has node1 as primary and node2 as secondary (and node1 and
node2 do not have any other instances in this layout), then node2 must
have enough free memory so that if node1 fails, we can failover
instance2 without any other operations (thus reducing the downtime
window). Let's increase the memory of the current instances to 4G, and
add three new instances, two on node2:node3 with 8GB of RAM and one on
node1:node2, with 12GB of RAM (numbers chosen so that we run out of
memory)::

859 c71a1a3d Iustin Pop
  node1# gnt-instance modify -B memory=4G instance1
860 c71a1a3d Iustin Pop
  Modified instance instance1
861 2a50e2e8 Guido Trotter
   - be/maxmem -> 4096
862 2a50e2e8 Guido Trotter
   - be/minmem -> 4096
863 c71a1a3d Iustin Pop
  Please don't forget that these parameters take effect only at the next start of the instance.
864 c71a1a3d Iustin Pop
  node1# gnt-instance modify …
865 c71a1a3d Iustin Pop
866 c71a1a3d Iustin Pop
  node1# gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap instance5
867 c71a1a3d Iustin Pop
868 c71a1a3d Iustin Pop
  node1# gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap instance6
869 c71a1a3d Iustin Pop
870 c71a1a3d Iustin Pop
  node1# gnt-instance add -t drbd -n node1:node2 -s 512m -B memory=8G -o debootstrap instance7
  node1# gnt-instance reboot --all
  The reboot will operate on 7 instances.
  Do you want to continue?
  Affected instances:
    instance1
    instance2
    instance3
    instance4
    instance5
    instance6
    instance7
  y/[n]/?: y
  Submitted jobs 677, 678, 679, 680, 681, 682, 683
  Waiting for job 677 for instance1...
  Waiting for job 678 for instance2...
  Waiting for job 679 for instance3...
  Waiting for job 680 for instance4...
  Waiting for job 681 for instance5...
  Waiting for job 682 for instance6...
  Waiting for job 683 for instance7...
  node1#

We rebooted the instances for the memory changes to take effect. Now the
cluster looks like::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G  6.5G     4     1
  node2   1.3T  1.3T  32.0G  1.0G 10.5G     3     4
  node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     2
  node1# gnt-cluster verify
  Mon Oct 26 18:59:36 2009 * Verifying global settings
  Mon Oct 26 18:59:36 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:59:37 2009 * Verifying node status
  Mon Oct 26 18:59:37 2009 * Verifying instance status
  Mon Oct 26 18:59:37 2009 * Verifying orphan volumes
  Mon Oct 26 18:59:37 2009 * Verifying remaining instances
  Mon Oct 26 18:59:37 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:59:37 2009   - ERROR: node node2: not enough memory to accommodate instance failovers should node node1 fail
  Mon Oct 26 18:59:37 2009 * Other Notes
  Mon Oct 26 18:59:37 2009 * Hooks Results
  node1#

The cluster verify error above shows that if node1 fails, node2 will not
have enough memory to fail over all the instances that have node1 as
primary and node2 as secondary (in our example, instance7 alone needs
12GB, while node2 has only 10.5GB of free memory). To solve this, you
have a number of options:

- try to manually move instances around (but this can become complicated
  for any non-trivial cluster)
- try to reduce the minimum memory of some instances on the source node
  of the N+1 failure (in the example above ``node1``): this will allow
  them to start and be failed over/migrated with less than their maximum
  memory (see the sketch after this list)
- try to reduce the runtime/maximum memory of some instances on the
  destination node of the N+1 failure (in the example above ``node2``)
  to create additional available node memory (check the :doc:`admin`
  guide for what Ganeti will and won't automatically do in regards to
  instance runtime memory modification)
- if Ganeti has been built with the htools package enabled, you can run
  the ``hbal`` tool, which will try to compute an automated cluster
  solution that complies with the N+1 rule (also sketched below)
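
The commands below are only a minimal sketch of the second and last
options; the values are illustrative and the ``hbal`` flags can differ
between htools versions, so check the respective man pages before
running anything::

  node1# gnt-instance modify -B minmem=2G instance7
  node1# hbal -L
  node1# hbal -L -X

The first command lowers only the minimum memory of instance7 (its
maximum stays unchanged), which can be enough to make an N+1 failover
plan feasible again. ``hbal -L`` collects data from the local master
and only prints the proposed moves; adding ``-X`` asks it to also
execute them.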

Network issues
++++++++++++++

In case a node has problems with the network (usually the secondary
network, as problems with the primary network will render the node
unusable for ganeti commands), it will show up in cluster verify as::

  node1# gnt-cluster verify
  Mon Oct 26 19:07:19 2009 * Verifying global settings
  Mon Oct 26 19:07:19 2009 * Gathering data (3 nodes)
  Mon Oct 26 19:07:23 2009 * Verifying node status
  Mon Oct 26 19:07:23 2009   - ERROR: node node1: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009   - ERROR: node node2: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009   - ERROR: node node3: tcp communication with node 'node1': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009   - ERROR: node node3: tcp communication with node 'node2': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009   - ERROR: node node3: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 * Verifying instance status
  Mon Oct 26 19:07:23 2009 * Verifying orphan volumes
  Mon Oct 26 19:07:23 2009 * Verifying remaining instances
  Mon Oct 26 19:07:23 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 19:07:23 2009 * Other Notes
  Mon Oct 26 19:07:23 2009 * Hooks Results
  node1#

This shows that both node1 and node2 have problems contacting node3 over
the secondary network, and node3 has problems contacting them. From this
output it can be deduced that, since node1 and node2 can communicate
between themselves, node3 is the one having problems, and you need to
investigate its network settings/connection.
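
For that investigation the usual Linux networking tools apply; the
commands below are only a sketch, using the secondary addresses of this
walkthrough (``192.0.2.x``), and the interface names will differ on
your systems::

  node1# ping -c 3 192.0.2.3
  node3# ip addr show
  node3# ip link

If node3's secondary address is missing, assigned to the wrong
interface, or the link is down, fix that first and then re-run
``gnt-cluster verify``.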

Migration problems
++++++++++++++++++

Since live migration can sometimes fail and leave the instance in an
inconsistent state, Ganeti provides a ``--cleanup`` argument to the
migrate command that does:

- check on which node the instance is actually running (has the
  command failed before or after the actual migration?)
- reconfigure the DRBD disks accordingly

It is always safe to run this command as long as the instance has good
data on its primary node (i.e. not showing as degraded). If so, you can
simply run::

  node1# gnt-instance migrate --cleanup instance1
  Instance instance1 will be recovered from a failed migration. Note
  that the migration procedure (including cleanup) is **experimental**
  in this version. This might impact the instance if anything goes
  wrong. Continue?
  y/[n]/?: y
  Mon Oct 26 19:13:49 2009 Migrating instance instance1
  Mon Oct 26 19:13:49 2009 * checking where the instance actually runs (if this hangs, the hypervisor might be in a bad state)
  Mon Oct 26 19:13:49 2009 * instance confirmed to be running on its primary node (node2)
  Mon Oct 26 19:13:49 2009 * switching node node1 to secondary mode
  Mon Oct 26 19:13:50 2009 * wait until resync is done
  Mon Oct 26 19:13:50 2009 * changing into standalone mode
  Mon Oct 26 19:13:50 2009 * changing disks into single-master mode
  Mon Oct 26 19:13:50 2009 * wait until resync is done
  Mon Oct 26 19:13:51 2009 * done
  node1#

In use disks at instance shutdown
+++++++++++++++++++++++++++++++++

If you see something like the following when trying to shut down an
instance or deactivate its disks::

  node1# gnt-instance shutdown instance1
  Mon Oct 26 19:16:23 2009  - WARNING: Could not shutdown block device disk/0 on node node2: drbd0: can't shutdown drbd device: /dev/drbd0: State change failed: (-12) Device is held open by someone\n

It most likely means something is holding open the underlying DRBD
device. This can be bad if the instance is not running, as it might mean
that there was concurrent access from both the node and the instance to
the disks, but not necessarily (for example, the partitions might merely
have been activated via ``kpartx``).

To troubleshoot this issue you need to follow standard Linux practices,
and pay attention to the hypervisor being used (example commands are
sketched after this list):

- check if (in the above example) ``/dev/drbd0`` on node2 is mounted
  somewhere (``cat /proc/mounts``)
- check that the device is not being used by device mapper itself: run
  ``dmsetup ls`` and look for entries of the form ``drbd0pX``; if there
  are any, remove them with either ``kpartx -d`` or ``dmsetup remove``
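
A minimal sequence of checks on node2 could look like this (the device
name is taken from the warning above, so adapt it to your case;
``fuser`` is just a convenient extra check, not something Ganeti
requires)::

  node2# grep drbd0 /proc/mounts
  node2# dmsetup ls | grep drbd0
  node2# fuser -v /dev/drbd0

If ``dmsetup ls`` does show ``drbd0pX`` entries, ``kpartx -d
/dev/drbd0`` (or ``dmsetup remove`` on the individual entries) should
release the device; if all the checks come back empty, the problem is
more likely on the hypervisor side (see below for Xen).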

For Xen, check if it's not using the disks itself::

  node1# xenstore-ls /local/domain/0/backend/vbd|grep -e "domain =" -e physical-device
  domain = "instance2"
  physical-device = "93:0"
  domain = "instance3"
  physical-device = "93:1"
  domain = "instance4"
  physical-device = "93:2"
  node1#

You can see in the above output that the node exports three disks to
three instances. The ``physical-device`` key is in major:minor format in
hexadecimal, and 0x93 represents DRBD's major number. Thus we can see
from the above that instance2 has /dev/drbd0, instance3 /dev/drbd1, and
instance4 /dev/drbd2.
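
If you want to double-check the mapping, converting the hexadecimal
major number is enough: 0x93 is 147 in decimal, which is the major
number DRBD registers with the kernel, and the minor number selects the
individual ``/dev/drbdN`` device::

  node1# printf "%d\n" 0x93
  147
  node1# grep drbd /proc/devices
  147 drbd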

LUXI version mismatch
+++++++++++++++++++++

LUXI is the protocol used for communication between clients and the
master daemon. Starting in Ganeti 2.3, the peers exchange their version
in each message. When they don't match, an error is raised::

  $ gnt-node modify -O yes node3
  Unhandled Ganeti error: LUXI version mismatch, server 2020000, request 2030000

Usually this means that server and client are from different Ganeti
versions or import their libraries from different paths (e.g. an older
version installed in another place). You can print the import path for
Ganeti's modules using the following command (note that depending on
your setup you might have to use an explicit version in the Python
command, e.g. ``python2.6``)::

  python -c 'import ganeti; print ganeti.__file__'
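
As a side note, the numeric versions in the error message are easy to
decode by hand, assuming the usual Ganeti encoding of ``major * 1000000
+ minor * 10000 + revision``; the one-liner below is just a sketch of
that arithmetic::

  $ python -c 'v = 2020000; print "%d.%d.%d" % (v // 1000000, v % 1000000 // 10000, v % 10000)'
  2.2.0

So in the example above the server would be speaking LUXI 2.2 while the
client speaks 2.3.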

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: