Revision 395aa879

b/doc/design-2.1.rst
285 285
doesn't have a ganeti provided script, so nothing will be done for that
286 286
hypervisor)
287 287

  
288

  
289
Automated disk repairs infrastructure
290
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
291

  
292
Replacing defective disks in an automated fashion is quite difficult with the
293
current version of Ganeti. These changes will introduce additional
294
functionality and interfaces to simplify automating disk replacements on a
295
Ganeti node.
296

  
297
Fix node volume group
298
+++++++++++++++++++++
299

  
300
This is the most difficult addition, as it can lead to data loss if it's not
301
properly safeguarded.
302

  
303
The operation must be done only when all the other nodes that have instances in
304
common with the target node are fine, i.e. this is the only node with problems,
305
and also we have to double-check that all instances on this node have at least
306
one good copy of the data.
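
The precondition above could be sketched roughly as follows. This is only an
illustration: ``safe_to_fix_vg``, ``node_ok`` and ``instance_disks`` are
hypothetical names and data structures, not existing Ganeti interfaces, and
the sketch assumes that the good copy must live on a node other than the
target.

```python
def safe_to_fix_vg(target, node_ok, instance_disks):
    """Return True if repairing the volume group on `target` looks safe.

    node_ok: dict mapping node name -> bool (hypothetical health flag).
    instance_disks: dict mapping instance name -> dict mapping
        node name -> bool (True if that node's copy of the data is good).
    """
    for copies in instance_disks.values():
        if target not in copies:
            continue  # this instance does not live on the target node
        peers = [node for node in copies if node != target]
        # Every other node sharing an instance with the target must be fine.
        if not all(node_ok.get(node, False) for node in peers):
            return False
        # Assumption: the good copy has to be on a node other than the target.
        if not any(copies[node] for node in peers):
            return False
    return True


# Hypothetical cluster state: node1 is the node whose volume group is broken.
node_ok = {"node1": False, "node2": True, "node3": True}
instance_disks = {
    "inst1": {"node1": False, "node2": True},
    "inst2": {"node2": True, "node3": True},
}
result = safe_to_fix_vg("node1", node_ok, instance_disks)
```

A real implementation would obtain this information from the (possibly
enhanced) mirror status calls rather than from hand-built dictionaries.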
307

  
308
This might mean that we have to enhance the GetMirrorStatus calls, and
309
introduce a smarter version that can tell us more about the status of an
310
instance.
311

  
312
Stop allocation on a given PV
313
+++++++++++++++++++++++++++++
314

  
315
This is somewhat simple. First we need a "list PVs" opcode (and its associated
316
logical unit) and then a "set PV status" opcode/LU. These in combination should
317
allow both checking and changing the disk/PV status.
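
As a rough illustration of how the two operations fit together, the following
sketch models them on an in-memory table. The class ``PVTable`` and its
methods are hypothetical stand-ins for the proposed opcodes, not existing
Ganeti code, and the device names are made up.

```python
class PVTable:
    """In-memory stand-in for the PVs of one node (hypothetical)."""

    def __init__(self, allocatable_by_pv):
        # Maps PV name -> True if new allocations are allowed on it.
        self._allocatable = dict(allocatable_by_pv)

    def list_pvs(self):
        """Model of the "list PVs" operation: sorted (name, allocatable) pairs."""
        return sorted(self._allocatable.items())

    def set_allocatable(self, pv_name, allocatable):
        """Model of the "set PV status" operation."""
        if pv_name not in self._allocatable:
            raise KeyError("unknown PV: %s" % pv_name)
        self._allocatable[pv_name] = bool(allocatable)


pvs = PVTable({"/dev/sda3": True, "/dev/sdb1": True})
pvs.set_allocatable("/dev/sdb1", False)  # stop allocation on the failing PV
```

Listing the PVs before and after the change is what allows an external tool
to verify that allocation was actually stopped.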
318

  
319
Instance disk status
320
++++++++++++++++++++
321

  
322
This new opcode or opcode change must list the instance-disk-index and node
323
combinations of the instance together with their status. This will allow
324
determining what part of the instance is broken (if any).
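
One possible shape for such a result is a mapping from (disk index, node)
pairs to a status value. The sketch below assumes this layout and an ``"ok"``
status string purely for illustration; neither the function ``broken_parts``
nor the status names come from Ganeti.

```python
def broken_parts(disk_status):
    """Return the sorted (disk_index, node) pairs whose status is not "ok".

    disk_status: dict mapping (disk_index, node_name) -> status string
    (hypothetical layout and status values).
    """
    return sorted(key for key, status in disk_status.items() if status != "ok")


# Hypothetical two-disk mirrored instance with one bad copy on node2.
status = {
    (0, "node1"): "ok",
    (0, "node2"): "faulty",
    (1, "node1"): "ok",
    (1, "node2"): "ok",
}
broken = broken_parts(status)
```

An empty result would mean that no part of the instance is broken.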
325

  
326
Repair instance
327
+++++++++++++++
328

  
329
This new opcode/LU/RAPI call will run ``replace-disks -p`` as needed, in order
330
to fix the instance status. It only affects primary instances; secondaries can
331
just be moved away.
332

  
333
Migrate node
334
++++++++++++
335

  
336
This new opcode/LU/RAPI call will take over the current ``gnt-node migrate``
337
code and run migrate for all instances on the node.
338

  
339
Evacuate node
340
++++++++++++++
341

  
342
This new opcode/LU/RAPI call will take over the current ``gnt-node evacuate``
343
code and run replace-secondary with an iallocator script for all instances on
344
the node.
345

  
346

  
288 347
External interface changes
289 348
--------------------------
290 349

  
