Ganeti walk-through
===================

Documents Ganeti version |version|

.. contents::

.. highlight:: text

Introduction
------------

This document serves as a more example-oriented guide to Ganeti; while
the administration guide shows a conceptual approach, here you will find
a step-by-step example of managing instances and the cluster.

Our simulated, example cluster will have three machines, named
``node1``, ``node2``, ``node3``. Note that in real life machines will
usually have FQDNs but here we use short names for brevity. We will use
a secondary network for replication data, ``192.168.2.0/24``, with the
nodes having the last octet the same as their index. The cluster name
will be ``example-cluster``. All nodes have the same simulated hardware
configuration: two disks of 750GB, 32GB of memory and 4 CPUs.
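
For reference, name resolution for such a setup could look like the
following ``/etc/hosts`` fragment (a sketch only; the primary-network
addresses and the master IP are assumptions, not taken from the
examples below)::

  # primary (management) network
  192.168.1.1   node1
  192.168.1.2   node2
  192.168.1.3   node3
  # the cluster name must resolve to the (future) master IP
  192.168.1.10  example-cluster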

On this cluster, we will create up to seven instances, named
``instance1`` to ``instance7``.


Cluster creation
----------------

Follow the :doc:`install` document and prepare the nodes. Then it's time
to initialise the cluster::

  node1# gnt-cluster init -s 192.168.2.1 --enabled-hypervisors=xen-pvm example-cluster
  node1#

The creation went fine. Let's check that the one node we have is
functioning correctly::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node1# gnt-cluster verify
  Mon Oct 26 02:08:51 2009 * Verifying global settings
  Mon Oct 26 02:08:51 2009 * Gathering data (1 nodes)
  Mon Oct 26 02:08:52 2009 * Verifying node status
  Mon Oct 26 02:08:52 2009 * Verifying instance status
  Mon Oct 26 02:08:52 2009 * Verifying orphan volumes
  Mon Oct 26 02:08:52 2009 * Verifying remaining instances
  Mon Oct 26 02:08:52 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 02:08:52 2009 * Other Notes
  Mon Oct 26 02:08:52 2009 * Hooks Results
  node1#

Since this proceeded correctly, let's add the other two nodes::

  node1# gnt-node add -s 192.168.2.2 node2
  -- WARNING --
  Performing this operation is going to replace the ssh daemon keypair
  on the target machine (node2) with the ones of the current one
  and grant full intra-cluster ssh root access to/from it

  The authenticity of host 'node2 (192.168.1.2)' can't be established.
  RSA key fingerprint is 9f:…
  Are you sure you want to continue connecting (yes/no)? yes
  root@node2's password:
  Mon Oct 26 02:11:54 2009 - INFO: Node will be a master candidate
  node1# gnt-node add -s 192.168.2.3 node3
  -- WARNING --
  Performing this operation is going to replace the ssh daemon keypair
  on the target machine (node3) with the ones of the current one
  and grant full intra-cluster ssh root access to/from it

  The authenticity of host 'node3 (192.168.1.3)' can't be established.
  RSA key fingerprint is 9f:…
  Are you sure you want to continue connecting (yes/no)? yes
  root@node3's password:
  Mon Oct 26 02:11:54 2009 - INFO: Node will be a master candidate

Checking the cluster status again::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node2   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node1# gnt-cluster verify
  Mon Oct 26 02:15:14 2009 * Verifying global settings
  Mon Oct 26 02:15:14 2009 * Gathering data (3 nodes)
  Mon Oct 26 02:15:16 2009 * Verifying node status
  Mon Oct 26 02:15:16 2009 * Verifying instance status
  Mon Oct 26 02:15:16 2009 * Verifying orphan volumes
  Mon Oct 26 02:15:16 2009 * Verifying remaining instances
  Mon Oct 26 02:15:16 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 02:15:16 2009 * Other Notes
  Mon Oct 26 02:15:16 2009 * Hooks Results
  node1#

And let's check that we have a valid OS::

  node1# gnt-os list
  Name
  debootstrap
  node1#
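
If an OS were to show up as missing or invalid on some nodes,
``gnt-os diagnose`` gives a per-node breakdown (the output below is
only illustrative)::

  node1# gnt-os diagnose
  OS: debootstrap
    node1: valid
    node2: valid
    node3: valid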

Running a burnin
----------------

Now that the cluster is created, it is time to check that the hardware
works correctly, that the hypervisor can actually create instances,
etc. This is done with the *burnin* tool (here using the debootstrap
OS), as described in the admin guide. Similar output lines are replaced
with ``…`` in the below log::

  node1# /usr/lib/ganeti/tools/burnin -o debootstrap -p instance{1..5}
  - Testing global parameters
  - Creating instances
    * instance instance1
      on node1, node2
    * instance instance2
      on node2, node3
    …
    * instance instance5
      on node2, node3
    * Submitted job ID(s) 157, 158, 159, 160, 161
      waiting for job 157 for instance1
      …
      waiting for job 161 for instance5
  - Replacing disks on the same nodes
    * instance instance1
      run replace_on_secondary
      run replace_on_primary
    …
    * instance instance5
      run replace_on_secondary
      run replace_on_primary
    * Submitted job ID(s) 162, 163, 164, 165, 166
      waiting for job 162 for instance1
      …
  - Changing the secondary node
    * instance instance1
      run replace_new_secondary node3
    * instance instance2
      run replace_new_secondary node1
    …
    * instance instance5
      run replace_new_secondary node1
    * Submitted job ID(s) 167, 168, 169, 170, 171
      waiting for job 167 for instance1
      …
  - Growing disks
    * instance instance1
      increase disk/0 by 128 MB
    …
    * instance instance5
      increase disk/0 by 128 MB
    * Submitted job ID(s) 173, 174, 175, 176, 177
      waiting for job 173 for instance1
      …
  - Failing over instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 179, 180, 181, 182, 183
      waiting for job 179 for instance1
      …
  - Migrating instances
    * instance instance1
      migration and migration cleanup
    …
    * instance instance5
      migration and migration cleanup
    * Submitted job ID(s) 184, 185, 186, 187, 188
      waiting for job 184 for instance1
      …
  - Exporting and re-importing instances
    * instance instance1
      export to node node3
      remove instance
      import from node3 to node1, node2
      remove export
    …
    * instance instance5
      export to node node1
      remove instance
      import from node1 to node2, node3
      remove export
    * Submitted job ID(s) 196, 197, 198, 199, 200
      waiting for job 196 for instance1
      …
  - Reinstalling instances
    * instance instance1
      reinstall without passing the OS
      reinstall specifying the OS
    …
    * instance instance5
      reinstall without passing the OS
      reinstall specifying the OS
    * Submitted job ID(s) 203, 204, 205, 206, 207
      waiting for job 203 for instance1
      …
  - Rebooting instances
    * instance instance1
      reboot with type 'hard'
      reboot with type 'soft'
      reboot with type 'full'
    …
    * instance instance5
      reboot with type 'hard'
      reboot with type 'soft'
      reboot with type 'full'
    * Submitted job ID(s) 208, 209, 210, 211, 212
      waiting for job 208 for instance1
      …
  - Adding and removing disks
    * instance instance1
      adding a disk
      removing last disk
    …
    * instance instance5
      adding a disk
      removing last disk
    * Submitted job ID(s) 213, 214, 215, 216, 217
      waiting for job 213 for instance1
      …
  - Adding and removing NICs
    * instance instance1
      adding a NIC
      removing last NIC
    …
    * instance instance5
      adding a NIC
      removing last NIC
    * Submitted job ID(s) 218, 219, 220, 221, 222
      waiting for job 218 for instance1
      …
  - Activating/deactivating disks
    * instance instance1
      activate disks when online
      activate disks when offline
      deactivate disks (when offline)
    …
    * instance instance5
      activate disks when online
      activate disks when offline
      deactivate disks (when offline)
    * Submitted job ID(s) 223, 224, 225, 226, 227
      waiting for job 223 for instance1
      …
  - Stopping and starting instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 230, 231, 232, 233, 234
      waiting for job 230 for instance1
      …
  - Removing instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 235, 236, 237, 238, 239
      waiting for job 235 for instance1
      …
  node1#

You can see in the above what operations the burnin does. Ideally, the
burnin log would proceed successfully through all the steps and end
cleanly, without throwing errors.
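
While the burnin runs, the submitted jobs can also be watched from
another terminal via the job queue, using ``gnt-job list`` and
``gnt-job info`` (the output below is illustrative only)::

  node1# gnt-job list
  ID  Status  Summary
  157 success INSTANCE_CREATE(instance1)
  158 running INSTANCE_CREATE(instance2)
  159 queued  INSTANCE_CREATE(instance3)
  …
  node1# gnt-job info 158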

Instance operations
-------------------

Creation
++++++++

At this point, Ganeti and the hardware seem to be functioning
correctly, so we'll follow up with creating the instances manually::

  node1# gnt-instance add -t drbd -o debootstrap -s 256m -I hail instance1
  Mon Oct 26 04:06:52 2009 - INFO: Selected nodes for instance instance1 via iallocator hail: node2, node3
  Mon Oct 26 04:06:53 2009 * creating instance disks...
  Mon Oct 26 04:06:57 2009 adding instance instance1 to cluster config
  Mon Oct 26 04:06:57 2009 - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 04:06:57 2009 - INFO: - device disk/0: 20.00% done, 4 estimated seconds remaining
  Mon Oct 26 04:07:01 2009 - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 04:07:01 2009 creating os for instance instance1 on node node2
  Mon Oct 26 04:07:01 2009 * running the instance OS create scripts...
  Mon Oct 26 04:07:14 2009 * starting instance...
  node1# gnt-instance add -t drbd -o debootstrap -s 256m -n node1:node2 instance2
  Mon Oct 26 04:11:37 2009 * creating instance disks...
  Mon Oct 26 04:11:40 2009 adding instance instance2 to cluster config
  Mon Oct 26 04:11:41 2009 - INFO: Waiting for instance instance2 to sync disks.
  Mon Oct 26 04:11:41 2009 - INFO: - device disk/0: 35.40% done, 1 estimated seconds remaining
  Mon Oct 26 04:11:42 2009 - INFO: - device disk/0: 58.50% done, 1 estimated seconds remaining
  Mon Oct 26 04:11:43 2009 - INFO: - device disk/0: 86.20% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 92.40% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 97.00% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009 - INFO: Instance instance2's disks are in sync.
  Mon Oct 26 04:11:44 2009 creating os for instance instance2 on node node1
  Mon Oct 26 04:11:44 2009 * running the instance OS create scripts...
  Mon Oct 26 04:11:57 2009 * starting instance...
  node1#

The above shows one instance created via an iallocator script, and one
created with manual node assignment. The other three instances were
also created, and now it's time to check them::

  node1# gnt-instance list
  Instance  Hypervisor OS          Primary_node Status  Memory
  instance1 xen-pvm    debootstrap node2        running   128M
  instance2 xen-pvm    debootstrap node1        running   128M
  instance3 xen-pvm    debootstrap node1        running   128M
  instance4 xen-pvm    debootstrap node3        running   128M
  instance5 xen-pvm    debootstrap node2        running   128M
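
If you need different columns, ``gnt-instance list`` also accepts an
``-o`` option with a comma-separated list of fields (a sketch; consult
the gnt-instance man page for the exact field names available in your
version)::

  node1# gnt-instance list -o name,pnode,snodes,status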

Accessing instances
+++++++++++++++++++

Accessing an instance's console is easy::

  node1# gnt-instance console instance2
  [    0.000000] Bootdata ok (command line is root=/dev/sda1 ro)
  [    0.000000] Linux version 2.6…
  [    0.000000] BIOS-provided physical RAM map:
  [    0.000000]  Xen: 0000000000000000 - 0000000008800000 (usable)
  [13138176.018071] Built 1 zonelists.  Total pages: 34816
  [13138176.018074] Kernel command line: root=/dev/sda1 ro
  [13138176.018694] Initializing CPU#0
  …
  Checking file systems...fsck 1.41.3 (12-Oct-2008)
  done.
  Setting kernel variables (/etc/sysctl.conf)...done.
  Mounting local filesystems...done.
  Activating swapfile swap...done.
  Setting up networking....
  Configuring network interfaces...done.
  Setting console screen modes and fonts.
  INIT: Entering runlevel: 2
  Starting enhanced syslogd: rsyslogd.
  Starting periodic command scheduler: crond.

  Debian GNU/Linux 5.0 instance2 tty1

  instance2 login:

At this point you can log in to the instance (the console can be left
with the standard Xen escape sequence, ``Ctrl-]``) and, after
configuring the network on all the instances, we can check their
connectivity::

  node1# fping instance{1..5}
  instance1 is alive
  instance2 is alive
  instance3 is alive
  instance4 is alive
  instance5 is alive
  node1#

Removal
+++++++

Removing unwanted instances is also easy::

  node1# gnt-instance remove instance5
  This will remove the volumes of the instance instance5 (including
  mirrors), thus removing all the data of the instance. Continue?
  y/[n]/?: y
  node1#

Recovering from hardware failures
---------------------------------

Recovering from node failure
++++++++++++++++++++++++++++

We are now left with four instances. Assume that at this point, node3,
which has one primary and one secondary instance, crashes::

  node1# gnt-node info node3
  Node name: node3
    primary ip: 172.24.227.1
    secondary ip: 192.168.2.3
    master candidate: True
    drained: False
    offline: False
    primary for instances:
      - instance4
    secondary for instances:
      - instance1
  node1# fping node3
  node3 is unreachable

At this point, the primary instance of that node (instance4) is down,
but the secondary instance (instance1) is not affected, except that it
has lost its disk redundancy::

  node1# fping instance{1,4}
  instance1 is alive
  instance4 is unreachable
  node1#

If we try to check the status of instance4 via the instance info
command, it fails because it tries to contact node3, which is down::

  node1# gnt-instance info instance4
  Failure: command execution error:
  Error checking node node3: Connection failed (113: No route to host)
  node1#

So we need to mark node3 as being *offline*, so that Ganeti won't talk
to it anymore::

  node1# gnt-node modify -O yes -f node3
  Mon Oct 26 04:34:12 2009 - WARNING: Not enough master candidates (desired 10, new value will be 2)
  Mon Oct 26 04:34:15 2009 - WARNING: Communication failure to node node3: Connection failed (113: No route to host)
  Modified node node3
   - offline -> True
   - master_candidate -> auto-demotion due to offline
  node1#

And now we can failover the instance::

  node1# gnt-instance failover instance4
  Failover will happen to image instance4. This requires a shutdown of
  the instance. Continue?
  y/[n]/?: y
  Mon Oct 26 04:35:34 2009 * checking disk consistency between source and target
  Failure: command execution error:
  Disk disk/0 is degraded on target node, aborting failover.
  node1# gnt-instance failover --ignore-consistency instance4
  Failover will happen to image instance4. This requires a shutdown of
  the instance. Continue?
  y/[n]/?: y
  Mon Oct 26 04:35:47 2009 * checking disk consistency between source and target
  Mon Oct 26 04:35:47 2009 * shutting down instance on source node
  Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown instance instance4 on node node3. Proceeding anyway. Please make sure node node3 is down. Error details: Node is marked offline
  Mon Oct 26 04:35:47 2009 * deactivating the instance's disks on source node
  Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown block device disk/0 on node node3: Node is marked offline
  Mon Oct 26 04:35:47 2009 * activating the instance's disks on target node
  Mon Oct 26 04:35:47 2009 - WARNING: Could not prepare block device disk/0 on node node3 (is_primary=False, pass=1): Node is marked offline
  Mon Oct 26 04:35:48 2009 * starting the instance on the target node
  node1#

Note that in our first attempt, Ganeti refused to do the failover,
since it wasn't sure about the status of the instance's disks. Passing
the ``--ignore-consistency`` flag lets the failover proceed::

  node1# gnt-instance list
  Instance  Hypervisor OS          Primary_node Status  Memory
  instance1 xen-pvm    debootstrap node2        running   128M
  instance2 xen-pvm    debootstrap node1        running   128M
  instance3 xen-pvm    debootstrap node1        running   128M
  instance4 xen-pvm    debootstrap node1        running   128M
  node1#

But at this point, both instance1 and instance4 are without disk
redundancy::

  node1# gnt-instance info instance1
  Instance name: instance1
  UUID: 45173e82-d1fa-417c-8758-7d582ab7eef4
  Serial number: 2
  Creation time: 2009-10-26 04:06:57
  Modification time: 2009-10-26 04:07:14
  State: configured to be up, actual state is up
    Nodes:
      - primary: node2
      - secondaries: node3
    Operating system: debootstrap
    Allocated network port: None
    Hypervisor: xen-pvm
      - root_path: default (/dev/sda1)
      - kernel_args: default (ro)
      - use_bootloader: default (False)
      - bootloader_args: default ()
      - bootloader_path: default ()
      - kernel_path: default (/boot/vmlinuz-2.6-xenU)
      - initrd_path: default ()
    Hardware:
      - VCPUs: 1
      - memory: 128MiB
      - NICs:
        - nic/0: MAC: aa:00:00:78:da:63, IP: None, mode: bridged, link: xen-br0
    Disks:
      - disk/0: drbd8, size 256M
        access mode: rw
        nodeA: node2, minor=0
        nodeB: node3, minor=0
        port: 11035
        auth key: 8e950e3cec6854b0181fbc3a6058657701f2d458
        on primary: /dev/drbd0 (147:0) in sync, status *DEGRADED*
        child devices:
          - child 0: lvm, size 256M
            logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data
            on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data (254:0)
          - child 1: lvm, size 128M
            logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta
            on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta (254:1)

The output is similar for instance4. In order to recover, we need to
run the node evacuate command, which will change the current secondary
node to a new one (in this case we only have two working nodes, so all
instances will end up on nodes one and two)::

  node1# gnt-node evacuate -I hail node3
  Relocate instance(s) 'instance1','instance4' from node
   node3 using iallocator hail?
  y/[n]/?: y
  Mon Oct 26 05:05:39 2009 - INFO: Selected new secondary for instance 'instance1': node1
  Mon Oct 26 05:05:40 2009 - INFO: Selected new secondary for instance 'instance4': node2
  Mon Oct 26 05:05:40 2009 Replacing disk(s) 0 for instance1
  Mon Oct 26 05:05:40 2009 STEP 1/6 Check device existence
  Mon Oct 26 05:05:40 2009 - INFO: Checking disk/0 on node2
  Mon Oct 26 05:05:40 2009 - INFO: Checking volume groups
  Mon Oct 26 05:05:40 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 05:05:40 2009 - INFO: Checking disk/0 consistency on node node2
  Mon Oct 26 05:05:40 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 05:05:40 2009 - INFO: Adding new local storage on node1 for disk/0
  Mon Oct 26 05:05:41 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 05:05:41 2009 - INFO: activating a new drbd on node1 for disk/0
  Mon Oct 26 05:05:42 2009 - INFO: Shutting down drbd for disk/0 on old node
  Mon Oct 26 05:05:42 2009 - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
  Mon Oct 26 05:05:42 2009 Hint: Please cleanup this device manually as soon as possible
  Mon Oct 26 05:05:42 2009 - INFO: Detaching primary drbds from the network (=> standalone)
  Mon Oct 26 05:05:42 2009 - INFO: Updating instance configuration
  Mon Oct 26 05:05:45 2009 - INFO: Attaching primary drbds to new secondary (standalone => connected)
  Mon Oct 26 05:05:46 2009 STEP 5/6 Sync devices
  Mon Oct 26 05:05:46 2009 - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 05:05:46 2009 - INFO: - device disk/0: 13.90% done, 7 estimated seconds remaining
  Mon Oct 26 05:05:53 2009 - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 05:05:53 2009 STEP 6/6 Removing old storage
  Mon Oct 26 05:05:53 2009 - INFO: Remove logical volumes for 0
  Mon Oct 26 05:05:53 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:05:53 2009 Hint: remove unused LVs manually
  Mon Oct 26 05:05:53 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:05:53 2009 Hint: remove unused LVs manually
  Mon Oct 26 05:05:53 2009 Replacing disk(s) 0 for instance4
  Mon Oct 26 05:05:53 2009 STEP 1/6 Check device existence
  Mon Oct 26 05:05:53 2009 - INFO: Checking disk/0 on node1
  Mon Oct 26 05:05:53 2009 - INFO: Checking volume groups
  Mon Oct 26 05:05:53 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 05:05:53 2009 - INFO: Checking disk/0 consistency on node node1
  Mon Oct 26 05:05:54 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 05:05:54 2009 - INFO: Adding new local storage on node2 for disk/0
  Mon Oct 26 05:05:54 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 05:05:54 2009 - INFO: activating a new drbd on node2 for disk/0
  Mon Oct 26 05:05:55 2009 - INFO: Shutting down drbd for disk/0 on old node
  Mon Oct 26 05:05:55 2009 - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
  Mon Oct 26 05:05:55 2009 Hint: Please cleanup this device manually as soon as possible
  Mon Oct 26 05:05:55 2009 - INFO: Detaching primary drbds from the network (=> standalone)
  Mon Oct 26 05:05:55 2009 - INFO: Updating instance configuration
  Mon Oct 26 05:05:55 2009 - INFO: Attaching primary drbds to new secondary (standalone => connected)
  Mon Oct 26 05:05:56 2009 STEP 5/6 Sync devices
  Mon Oct 26 05:05:56 2009 - INFO: Waiting for instance instance4 to sync disks.
  Mon Oct 26 05:05:56 2009 - INFO: - device disk/0: 12.40% done, 8 estimated seconds remaining
  Mon Oct 26 05:06:04 2009 - INFO: Instance instance4's disks are in sync.
  Mon Oct 26 05:06:04 2009 STEP 6/6 Removing old storage
  Mon Oct 26 05:06:04 2009 - INFO: Remove logical volumes for 0
  Mon Oct 26 05:06:04 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:06:04 2009 Hint: remove unused LVs manually
  Mon Oct 26 05:06:04 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:06:04 2009 Hint: remove unused LVs manually
  node1#

And now node3 is completely free of instances and can be repaired (the
question marks below mean that Ganeti could not retrieve data from the
offline node)::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.2G     3     1
  node2   1.3T  1.3T  32.0G  1.0G 30.4G     1     3
  node3      ?     ?      ?     ?     ?     0     0

Re-adding a node to the cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's say node3 has been repaired and is now ready to be
reused. Re-adding it is simple::

  node1# gnt-node add --readd node3
  The authenticity of host 'node3 (172.24.227.1)' can't be established.
  RSA key fingerprint is 9f:2e:5a:2e:e0:bd:00:09:e4:5c:32:f2:27:57:7a:f4.
  Are you sure you want to continue connecting (yes/no)? yes
  Mon Oct 26 05:27:39 2009 - INFO: Readding a node, the offline/drained flags were reset
  Mon Oct 26 05:27:39 2009 - INFO: Node will be a master candidate

And it is now working again::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.2G     3     1
  node2   1.3T  1.3T  32.0G  1.0G 30.4G     1     3
  node3   1.3T  1.3T  32.0G  1.0G 30.4G     0     0

.. note:: If you have the ganeti-htools package installed, you can
   shuffle the instances around to make better use of the nodes, as
   sketched below.
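
For example, the ``hbal`` tool from that package can compute a
rebalancing plan and, optionally, execute it (a sketch only; the ``-L``
Luxi-backend flag is an assumption and depends on the htools version
installed)::

  node1# hbal -L      # only print the proposed moves
  node1# hbal -L -X   # also execute them via the job queue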

Disk failures
+++++++++++++

A disk failure is simpler than a full node failure. First, a single
disk failure should not cause data loss for any redundant instance;
only the performance of some instances might be reduced due to more
network traffic.

Let's take the cluster status from the above listing, and check which
volumes are in use::

  node1# gnt-node volumes -o phys,instance node2
  PhysDev   Instance
  /dev/sdb1 instance4
  /dev/sdb1 instance4
  /dev/sdb1 instance1
  /dev/sdb1 instance1
  /dev/sdb1 instance3
  /dev/sdb1 instance3
  /dev/sdb1 instance2
  /dev/sdb1 instance2
  node1#

You can see that all instances on node2 have logical volumes on
``/dev/sdb1``. Let's simulate a disk failure on that disk::

  node1# ssh node2
  node2# echo offline > /sys/block/sdb/device/state
  node2# vgs
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
    Couldn't find all physical volumes for volume group xenvg.
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
    Couldn't find all physical volumes for volume group xenvg.
    Volume group xenvg not found
  node2#

At this point, the node is broken, and if we examine instance2 we get
(simplified output shown)::

  node1# gnt-instance info instance2
  Instance name: instance2
  State: configured to be up, actual state is up
    Nodes:
      - primary: node1
      - secondaries: node2
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status ok
        on secondary: /dev/drbd1 (147:1) in sync, status *DEGRADED* *MISSING DISK*

This instance only has its secondary on node2. Let's also verify an
instance whose primary is node2::

  node1# gnt-instance info instance1
  Instance name: instance1
  State: configured to be up, actual state is up
    Nodes:
      - primary: node2
      - secondaries: node1
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status *DEGRADED* *MISSING DISK*
        on secondary: /dev/drbd3 (147:3) in sync, status ok
  node1# gnt-instance console instance1

  Debian GNU/Linux 5.0 instance1 tty1

  instance1 login: root
  Last login: Tue Oct 27 01:24:09 UTC 2009 on tty1
  instance1:~# date > test
  instance1:~# sync
  instance1:~# cat test
  Tue Oct 27 01:25:20 UTC 2009
  instance1:~# dmesg|tail
  [5439785.235448] NET: Registered protocol family 15
  [5439785.235489] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
  [5439785.235495] All bugs added by David S. Miller <davem@redhat.com>
  [5439785.235517] XENBUS: Device with no driver: device/console/0
  [5439785.236576] kjournald starting.  Commit interval 5 seconds
  [5439785.236588] EXT3-fs: mounted filesystem with ordered data mode.
  [5439785.236625] VFS: Mounted root (ext3 filesystem) readonly.
  [5439785.236663] Freeing unused kernel memory: 172k freed
  [5439787.533779] EXT3 FS on sda1, internal journal
  [5440655.065431] eth0: no IPv6 routers present
  instance1:~#

As you can see, the instance is running fine and doesn't see any disk
issues. It is now time to fix node2 and re-establish redundancy for the
involved instances.
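
Since in this walkthrough the failure was only simulated through
sysfs, the disk can be brought back the same way (with a real failure
you would replace the hardware instead)::

  node2# echo running > /sys/block/sdb/device/state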

.. note:: For Ganeti 2.0 we need to manually fix the volume group on
   node2 by running ``vgreduce --removemissing xenvg``

::

  node1# gnt-node repair-storage node2 lvm-vg xenvg
  Mon Oct 26 18:14:03 2009 Repairing storage unit 'xenvg' on node2 ...
  node1# ssh node2 vgs
    VG    #PV #LV #SN Attr   VSize   VFree
    xenvg   1   8   0 wz--n- 673.84G 673.84G
  node1#

This has removed the 'bad' disk from the volume group, which is now
left with only one PV. We can now replace the disks for the involved
instances::

  node1# for i in instance{1..4}; do gnt-instance replace-disks -a $i; done
  Mon Oct 26 18:15:38 2009 Replacing disk(s) 0 for instance1
  Mon Oct 26 18:15:38 2009 STEP 1/6 Check device existence
  Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node1
  Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node2
  Mon Oct 26 18:15:38 2009 - INFO: Checking volume groups
  Mon Oct 26 18:15:38 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 consistency on node node1
  Mon Oct 26 18:15:39 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 18:15:39 2009 - INFO: Adding storage on node2 for disk/0
  Mon Oct 26 18:15:39 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 18:15:39 2009 - INFO: Detaching disk/0 drbd from local storage
  Mon Oct 26 18:15:40 2009 - INFO: Renaming the old LVs on the target node
  Mon Oct 26 18:15:40 2009 - INFO: Renaming the new LVs on the target node
  Mon Oct 26 18:15:40 2009 - INFO: Adding new mirror component on node2
  Mon Oct 26 18:15:41 2009 STEP 5/6 Sync devices
  Mon Oct 26 18:15:41 2009 - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 18:15:41 2009 - INFO: - device disk/0: 12.40% done, 9 estimated seconds remaining
  Mon Oct 26 18:15:50 2009 - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 18:15:50 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:15:50 2009 - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:15:52 2009 Replacing disk(s) 0 for instance2
  Mon Oct 26 18:15:52 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:01 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:01 2009 - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:16:02 2009 Replacing disk(s) 0 for instance3
  Mon Oct 26 18:16:02 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:09 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:09 2009 - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:16:10 2009 Replacing disk(s) 0 for instance4
  Mon Oct 26 18:16:10 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:18 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:18 2009 - INFO: Remove logical volumes for disk/0
  node1#

At this point, all instances should be healthy again.
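
To double-check, the disk that was previously degraded should now be
reported as in sync again; abbreviated and purely illustrative output::

  node1# gnt-instance info instance1
  …
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status ok
        on secondary: /dev/drbd3 (147:3) in sync, status ok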

.. note:: Ganeti 2.0 doesn't have the ``-a`` option to replace-disks,
   so for it you have to run the loop twice, once over the primary
   instances with argument ``-p`` and once over the secondary instances
   with argument ``-s``, but otherwise the operations are similar::

     node1# gnt-instance replace-disks -p instance1
     …
     node1# for i in instance{2..4}; do gnt-instance replace-disks -s $i; done

Common cluster problems
-----------------------

There are a number of small issues that might appear on a cluster that
can be solved easily as long as the issue is properly identified. For
this exercise we will consider the case of node3, which was broken
previously and re-added to the cluster without reinstallation. Running
cluster verify on the cluster reports::

  node1# gnt-cluster verify
  Mon Oct 26 18:30:08 2009 * Verifying global settings
  Mon Oct 26 18:30:08 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:30:10 2009 * Verifying node status
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 0 is in use
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 1 is in use
  Mon Oct 26 18:30:10 2009 * Verifying instance status
  Mon Oct 26 18:30:10 2009 - ERROR: instance instance4: instance should not run on node node3
  Mon Oct 26 18:30:10 2009 * Verifying orphan volumes
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_data is unknown
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data is unknown
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta is unknown
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta is unknown
  Mon Oct 26 18:30:10 2009 * Verifying remaining instances
  Mon Oct 26 18:30:10 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:30:10 2009 * Other Notes
  Mon Oct 26 18:30:10 2009 * Hooks Results
  node1#

Instance status
+++++++++++++++

As you can see, *instance4* has a copy running on node3, because we
forced the failover when node3 failed. This case is dangerous, as the
stray copy has the same IP and MAC address as the failed-over instance,
wreaking havoc on the network environment and on anyone who tries to
use it.

Ganeti doesn't directly handle this case. It is recommended to log on
to node3 and run::

  node3# xm destroy instance4

Unallocated DRBD minors
+++++++++++++++++++++++

There are still unallocated DRBD minors on node3. Again, these are not
handled by Ganeti directly and need to be cleaned up via DRBD commands::

  node3# drbdsetup /dev/drbd0 down
  node3# drbdsetup /dev/drbd1 down
  node3#

Orphan volumes
++++++++++++++

At this point, the only remaining problem should be the so-called
*orphan* volumes. These can also appear after an aborted disk replace,
or in similar situations where Ganeti was not able to recover
automatically. Here you need to remove them manually via LVM commands::

  node3# lvremove xenvg
  Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data"? [y/n]: y
    Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data" successfully removed
  Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta"? [y/n]: y
    Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta" successfully removed
  Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data"? [y/n]: y
    Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data" successfully removed
  Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta"? [y/n]: y
    Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta" successfully removed
  node3#
827 | c71a1a3d | Iustin Pop | |
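In case of doubt about which volumes are the orphaned ones, listing
all logical volumes in the group and comparing them against the disks
reported by ``gnt-instance info`` is a safe, read-only first step
(``xenvg`` being the volume group used in this walkthrough)::

  node3# lvs xenvg
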
At this point cluster verify shouldn't complain anymore::

  node1# gnt-cluster verify
  Mon Oct 26 18:37:51 2009 * Verifying global settings
  Mon Oct 26 18:37:51 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:37:53 2009 * Verifying node status
  Mon Oct 26 18:37:53 2009 * Verifying instance status
  Mon Oct 26 18:37:53 2009 * Verifying orphan volumes
  Mon Oct 26 18:37:53 2009 * Verifying remaining instances
  Mon Oct 26 18:37:53 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:37:53 2009 * Other Notes
  Mon Oct 26 18:37:53 2009 * Hooks Results
  node1#

N+1 errors
++++++++++

Since redundant instances in Ganeti have a primary/secondary model,
each node must keep enough memory free so that, if one of its peer
nodes fails, all the secondary instances that have the failed node as
primary can be relocated to it. More specifically, if instance2 has
node1 as primary and node2 as secondary (and node1 and node2 do not
have any other instances in this layout), then node2 must have enough
free memory so that if node1 fails, instance2 can be failed over
without any other operations (keeping the downtime window small).
Let's increase the memory of the current instances to 4GB, and add
three new instances: two on node2:node3 with 8GB of RAM each, and one
on node1:node2 with 12GB of RAM (numbers chosen so that we run out of
memory)::

  node1# gnt-instance modify -B memory=4G instance1
  Modified instance instance1
  - be/memory -> 4096
  Please don't forget that these parameters take effect only at the next start of the instance.
  node1# gnt-instance modify …

  node1# gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap instance5
  …
  node1# gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap instance6
  …
  node1# gnt-instance add -t drbd -n node1:node2 -s 512m -B memory=12G -o debootstrap instance7
  node1# gnt-instance reboot --all
  The reboot will operate on 7 instances.
  Do you want to continue?
  Affected instances:
    instance1
    instance2
    instance3
    instance4
    instance5
    instance6
    instance7
  y/[n]/?: y
  Submitted jobs 677, 678, 679, 680, 681, 682, 683
  Waiting for job 677 for instance1...
  Waiting for job 678 for instance2...
  Waiting for job 679 for instance3...
  Waiting for job 680 for instance4...
  Waiting for job 681 for instance5...
  Waiting for job 682 for instance6...
  Waiting for job 683 for instance7...
  node1#

We rebooted the instances for the memory changes to take effect. Now
the cluster looks like::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G  6.5G     4     1
  node2   1.3T  1.3T  32.0G  1.0G 10.5G     3     4
  node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     2
  node1# gnt-cluster verify
  Mon Oct 26 18:59:36 2009 * Verifying global settings
  Mon Oct 26 18:59:36 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:59:37 2009 * Verifying node status
  Mon Oct 26 18:59:37 2009 * Verifying instance status
  Mon Oct 26 18:59:37 2009 * Verifying orphan volumes
  Mon Oct 26 18:59:37 2009 * Verifying remaining instances
  Mon Oct 26 18:59:37 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:59:37 2009 - ERROR: node node2: not enough memory to accommodate failovers should peer node node1 fail
  Mon Oct 26 18:59:37 2009 * Other Notes
  Mon Oct 26 18:59:37 2009 * Hooks Results
  node1#

The cluster verify error above shows that if node1 fails, node2 will
not have enough memory to fail over all of node1's primary instances
to it. To solve this, you have a number of options:

- try to manually move instances around (but this can become
  complicated for any non-trivial cluster)
- try to reduce the memory of some instances so they fit within the
  available node memory
- if you have the ganeti-htools package installed, you can run the
  ``hbal`` tool, which will try to compute an automated rebalancing
  solution that complies with the N+1 rule (see the example below)

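As a rough illustration of the last option, a balancing run against
the live cluster could look like the following; ``-L`` selects the
local (Luxi) backend, and the exact flags and output depend on the
installed htools version::

  node1# hbal -L

By default ``hbal`` only prints the moves it would perform; adding the
``-X`` option would also execute them.
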
Network issues
++++++++++++++

In case a node has problems with the network (usually the secondary
network, as problems with the primary network will render the node
unusable for Ganeti commands), it will show up in cluster verify as::

  node1# gnt-cluster verify
  Mon Oct 26 19:07:19 2009 * Verifying global settings
  Mon Oct 26 19:07:19 2009 * Gathering data (3 nodes)
  Mon Oct 26 19:07:23 2009 * Verifying node status
  Mon Oct 26 19:07:23 2009 - ERROR: node node1: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node2: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node1': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node2': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 * Verifying instance status
  Mon Oct 26 19:07:23 2009 * Verifying orphan volumes
  Mon Oct 26 19:07:23 2009 * Verifying remaining instances
  Mon Oct 26 19:07:23 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 19:07:23 2009 * Other Notes
  Mon Oct 26 19:07:23 2009 * Hooks Results
  node1#

This shows that both node1 and node2 have problems contacting node3
over the secondary network, and that node3 has problems contacting
them. From this output it can be deduced that, since node1 and node2
can communicate with each other, node3 is the one having problems, and
its network settings/connection need to be investigated.

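To confirm this, you can probe the secondary addresses directly;
recall that in our example the secondary network is ``192.168.2.0/24``
with each node's last octet equal to its index, so a simple check
would be::

  node1# ping -c 3 192.168.2.3

Note that if ICMP is filtered on the replication network this check is
not conclusive, and the interface and cabling need to be inspected on
node3 itself.
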
Migration problems
++++++++++++++++++

Since live migration can sometimes fail and leave the instance in an
inconsistent state, Ganeti provides a ``--cleanup`` argument to the
migrate command, which does the following:

- check on which node the instance is actually running (has the
  command failed before or after the actual migration?)
- reconfigure the DRBD disks accordingly

It is always safe to run this command as long as the instance has good
data on its primary node (i.e. not showing as degraded). If so, you
can simply run::

  node1# gnt-instance migrate --cleanup instance1
  Instance instance1 will be recovered from a failed migration. Note
  that the migration procedure (including cleanup) is **experimental**
  in this version. This might impact the instance if anything goes
  wrong. Continue?
  y/[n]/?: y
  Mon Oct 26 19:13:49 2009 Migrating instance instance1
  Mon Oct 26 19:13:49 2009 * checking where the instance actually runs (if this hangs, the hypervisor might be in a bad state)
  Mon Oct 26 19:13:49 2009 * instance confirmed to be running on its primary node (node2)
  Mon Oct 26 19:13:49 2009 * switching node node1 to secondary mode
  Mon Oct 26 19:13:50 2009 * wait until resync is done
  Mon Oct 26 19:13:50 2009 * changing into standalone mode
  Mon Oct 26 19:13:50 2009 * changing disks into single-master mode
  Mon Oct 26 19:13:50 2009 * wait until resync is done
  Mon Oct 26 19:13:51 2009 * done
  node1#

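If you want to confirm the disk status before (or after) running the
cleanup, the per-disk state reported by Ganeti can be inspected; the
exact output format varies between versions, but degraded disks are
flagged as such::

  node1# gnt-instance info instance1 | grep -i degraded
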
In-use disks at instance shutdown
+++++++++++++++++++++++++++++++++

If you see something like the following when trying to shut down an
instance or deactivate its disks::

  node1# gnt-instance shutdown instance1
  Mon Oct 26 19:16:23 2009 - WARNING: Could not shutdown block device disk/0 on node node2: drbd0: can't shutdown drbd device: /dev/drbd0: State change failed: (-12) Device is held open by someone\n

it most likely means that something is holding the underlying DRBD
device open. This can be a bad sign if the instance is not running, as
it might mean there was concurrent access to the disks from both the
node and the instance; but not necessarily (for example, the
partitions might merely have been activated on the node via
``kpartx``).

To troubleshoot this issue you need to follow standard Linux
practices, and pay attention to the hypervisor being used:

- check whether (in the above example) ``/dev/drbd0`` on node2 is
  mounted somewhere (``cat /proc/mounts``)
- check whether the device is being used by device-mapper itself: run
  ``dmsetup ls`` and look for entries of the form ``drbd0pX``; if any
  exist, remove them with either ``kpartx -d`` or ``dmsetup remove``
  (see the example after this list)

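For the example above, the checks could look like this (to be run on
the node that reported the error, here node2)::

  node2# grep drbd0 /proc/mounts
  node2# dmsetup ls | grep drbd0

If the second command shows ``drbd0pX`` entries, running
``kpartx -d /dev/drbd0`` (or ``dmsetup remove`` on each entry) should
clear them.
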
For Xen, check whether the hypervisor itself is using the disks::

  node1# xenstore-ls /local/domain/0/backend/vbd | grep -e "domain =" -e physical-device
  domain = "instance2"
  physical-device = "93:0"
  domain = "instance3"
  physical-device = "93:1"
  domain = "instance4"
  physical-device = "93:2"
  node1#

You can see in the above output that the node exports three disks, to
three instances. The ``physical-device`` key is in hexadecimal
major:minor format, and 0x93 (147 in decimal) is DRBD's major number.
Thus we can see from the above that instance2 has ``/dev/drbd0``,
instance3 ``/dev/drbd1``, and instance4 ``/dev/drbd2``.

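If in doubt, the decoding can be cross-checked on the node itself,
since device nodes carry the major/minor pair directly (``ls -l``
prints it in place of the file size)::

  node1# ls -l /dev/drbd0
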
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: