Ganeti walk-through
===================

Documents Ganeti version |version|

.. contents::

.. highlight:: text

Introduction
------------

This document serves as a more example-oriented guide to Ganeti; while
the administration guide shows a conceptual approach, here you will find
a step-by-step example of managing instances and the cluster.

Our simulated, example cluster will have three machines, named
``node1``, ``node2``, ``node3``. Note that in real life machines will
usually have FQDNs but here we use short names for brevity. We will use
a secondary network for replication data, ``192.168.2.0/24``, with the
nodes having the last octet the same as their index. The cluster name
will be ``example-cluster``. All nodes have the same simulated hardware
configuration: two disks of 750GB, 32GB of memory and 4 CPUs.
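
For reference, name resolution for such a setup could look like the
following ``/etc/hosts`` fragment (a sketch only; the primary-network
addresses and the master IP are assumptions, not taken from the
examples below)::

  # primary (management) network
  192.168.1.1   node1
  192.168.1.2   node2
  192.168.1.3   node3
  # the cluster name must resolve to the (future) master IP
  192.168.1.10  example-cluster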

On this cluster, we will create up to seven instances, named
``instance1`` to ``instance7``.


Cluster creation
----------------

Follow the :doc:`install` document and prepare the nodes. Then it's time
to initialise the cluster::

  node1# gnt-cluster init -s 192.168.2.1 --enabled-hypervisors=xen-pvm example-cluster
  node1#

The creation went fine. Let's check that the one node we have is
functioning correctly::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node1# gnt-cluster verify
  Mon Oct 26 02:08:51 2009 * Verifying global settings
  Mon Oct 26 02:08:51 2009 * Gathering data (1 nodes)
  Mon Oct 26 02:08:52 2009 * Verifying node status
  Mon Oct 26 02:08:52 2009 * Verifying instance status
  Mon Oct 26 02:08:52 2009 * Verifying orphan volumes
  Mon Oct 26 02:08:52 2009 * Verifying remaining instances
  Mon Oct 26 02:08:52 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 02:08:52 2009 * Other Notes
  Mon Oct 26 02:08:52 2009 * Hooks Results
  node1#

Since this proceeded correctly, let's add the other two nodes::

  node1# gnt-node add -s 192.168.2.2 node2
  -- WARNING --
  Performing this operation is going to replace the ssh daemon keypair
  on the target machine (node2) with the ones of the current one
  and grant full intra-cluster ssh root access to/from it

  The authenticity of host 'node2 (192.168.1.2)' can't be established.
  RSA key fingerprint is 9f:…
  Are you sure you want to continue connecting (yes/no)? yes
  root@node2's password:
  Mon Oct 26 02:11:54 2009 - INFO: Node will be a master candidate
  node1# gnt-node add -s 192.168.2.3 node3
  -- WARNING --
  Performing this operation is going to replace the ssh daemon keypair
  on the target machine (node3) with the ones of the current one
  and grant full intra-cluster ssh root access to/from it

  The authenticity of host 'node3 (192.168.1.3)' can't be established.
  RSA key fingerprint is 9f:…
  Are you sure you want to continue connecting (yes/no)? yes
  root@node3's password:
  Mon Oct 26 02:11:54 2009 - INFO: Node will be a master candidate

Checking the cluster status again::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node2   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
  node1# gnt-cluster verify
  Mon Oct 26 02:15:14 2009 * Verifying global settings
  Mon Oct 26 02:15:14 2009 * Gathering data (3 nodes)
  Mon Oct 26 02:15:16 2009 * Verifying node status
  Mon Oct 26 02:15:16 2009 * Verifying instance status
  Mon Oct 26 02:15:16 2009 * Verifying orphan volumes
  Mon Oct 26 02:15:16 2009 * Verifying remaining instances
  Mon Oct 26 02:15:16 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 02:15:16 2009 * Other Notes
  Mon Oct 26 02:15:16 2009 * Hooks Results
  node1#

And let's check that we have a valid OS::

  node1# gnt-os list
  Name
  debootstrap
  node1#
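
If an OS were to show up as missing or invalid on some nodes,
``gnt-os diagnose`` gives a per-node breakdown (the output below is
only illustrative)::

  node1# gnt-os diagnose
  OS: debootstrap
    node1: valid
    node2: valid
    node3: valid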

Running a burnin
----------------

Now that the cluster is created, it is time to check that the hardware
works correctly, that the hypervisor can actually create instances,
etc. This is done with the *burnin* tool (here using the debootstrap
OS), as described in the admin guide. Similar output lines are replaced
with ``…`` in the below log::

  node1# /usr/lib/ganeti/tools/burnin -o debootstrap -p instance{1..5}
  - Testing global parameters
  - Creating instances
    * instance instance1
      on node1, node2
    * instance instance2
      on node2, node3
    …
    * instance instance5
      on node2, node3
    * Submitted job ID(s) 157, 158, 159, 160, 161
      waiting for job 157 for instance1
      …
      waiting for job 161 for instance5
  - Replacing disks on the same nodes
    * instance instance1
      run replace_on_secondary
      run replace_on_primary
    …
    * instance instance5
      run replace_on_secondary
      run replace_on_primary
    * Submitted job ID(s) 162, 163, 164, 165, 166
      waiting for job 162 for instance1
      …
  - Changing the secondary node
    * instance instance1
      run replace_new_secondary node3
    * instance instance2
      run replace_new_secondary node1
    …
    * instance instance5
      run replace_new_secondary node1
    * Submitted job ID(s) 167, 168, 169, 170, 171
      waiting for job 167 for instance1
      …
  - Growing disks
    * instance instance1
      increase disk/0 by 128 MB
    …
    * instance instance5
      increase disk/0 by 128 MB
    * Submitted job ID(s) 173, 174, 175, 176, 177
      waiting for job 173 for instance1
      …
  - Failing over instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 179, 180, 181, 182, 183
      waiting for job 179 for instance1
      …
  - Migrating instances
    * instance instance1
      migration and migration cleanup
    …
    * instance instance5
      migration and migration cleanup
    * Submitted job ID(s) 184, 185, 186, 187, 188
      waiting for job 184 for instance1
      …
  - Exporting and re-importing instances
    * instance instance1
      export to node node3
      remove instance
      import from node3 to node1, node2
      remove export
    …
    * instance instance5
      export to node node1
      remove instance
      import from node1 to node2, node3
      remove export
    * Submitted job ID(s) 196, 197, 198, 199, 200
      waiting for job 196 for instance1
      …
  - Reinstalling instances
    * instance instance1
      reinstall without passing the OS
      reinstall specifying the OS
    …
    * instance instance5
      reinstall without passing the OS
      reinstall specifying the OS
    * Submitted job ID(s) 203, 204, 205, 206, 207
      waiting for job 203 for instance1
      …
  - Rebooting instances
    * instance instance1
      reboot with type 'hard'
      reboot with type 'soft'
      reboot with type 'full'
    …
    * instance instance5
      reboot with type 'hard'
      reboot with type 'soft'
      reboot with type 'full'
    * Submitted job ID(s) 208, 209, 210, 211, 212
      waiting for job 208 for instance1
      …
  - Adding and removing disks
    * instance instance1
      adding a disk
      removing last disk
    …
    * instance instance5
      adding a disk
      removing last disk
    * Submitted job ID(s) 213, 214, 215, 216, 217
      waiting for job 213 for instance1
      …
  - Adding and removing NICs
    * instance instance1
      adding a NIC
      removing last NIC
    …
    * instance instance5
      adding a NIC
      removing last NIC
    * Submitted job ID(s) 218, 219, 220, 221, 222
      waiting for job 218 for instance1
      …
  - Activating/deactivating disks
    * instance instance1
      activate disks when online
      activate disks when offline
      deactivate disks (when offline)
    …
    * instance instance5
      activate disks when online
      activate disks when offline
      deactivate disks (when offline)
    * Submitted job ID(s) 223, 224, 225, 226, 227
      waiting for job 223 for instance1
      …
  - Stopping and starting instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 230, 231, 232, 233, 234
      waiting for job 230 for instance1
      …
  - Removing instances
    * instance instance1
    …
    * instance instance5
    * Submitted job ID(s) 235, 236, 237, 238, 239
      waiting for job 235 for instance1
      …
  node1#

You can see in the above what operations the burnin does. Ideally, the
burnin log would proceed successfully through all the steps and end
cleanly, without throwing errors.
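
While the burnin runs, the submitted jobs can also be watched from
another terminal via the job queue, using ``gnt-job list`` and
``gnt-job info`` (the output below is illustrative only)::

  node1# gnt-job list
  ID  Status  Summary
  157 success INSTANCE_CREATE(instance1)
  158 running INSTANCE_CREATE(instance2)
  159 queued  INSTANCE_CREATE(instance3)
  …
  node1# gnt-job info 158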

Instance operations
-------------------

Creation
++++++++

At this point, Ganeti and the hardware seem to be functioning
correctly, so we'll follow up with creating the instances manually::

  node1# gnt-instance add -t drbd -o debootstrap -s 256m -I hail instance1
  Mon Oct 26 04:06:52 2009 - INFO: Selected nodes for instance instance1 via iallocator hail: node2, node3
  Mon Oct 26 04:06:53 2009 * creating instance disks...
  Mon Oct 26 04:06:57 2009 adding instance instance1 to cluster config
  Mon Oct 26 04:06:57 2009 - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 04:06:57 2009 - INFO: - device disk/0: 20.00% done, 4 estimated seconds remaining
  Mon Oct 26 04:07:01 2009 - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 04:07:01 2009 creating os for instance instance1 on node node2
  Mon Oct 26 04:07:01 2009 * running the instance OS create scripts...
  Mon Oct 26 04:07:14 2009 * starting instance...
  node1# gnt-instance add -t drbd -o debootstrap -s 256m -n node1:node2 instance2
  Mon Oct 26 04:11:37 2009 * creating instance disks...
  Mon Oct 26 04:11:40 2009 adding instance instance2 to cluster config
  Mon Oct 26 04:11:41 2009 - INFO: Waiting for instance instance2 to sync disks.
  Mon Oct 26 04:11:41 2009 - INFO: - device disk/0: 35.40% done, 1 estimated seconds remaining
  Mon Oct 26 04:11:42 2009 - INFO: - device disk/0: 58.50% done, 1 estimated seconds remaining
  Mon Oct 26 04:11:43 2009 - INFO: - device disk/0: 86.20% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 92.40% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 97.00% done, 0 estimated seconds remaining
  Mon Oct 26 04:11:44 2009 - INFO: Instance instance2's disks are in sync.
  Mon Oct 26 04:11:44 2009 creating os for instance instance2 on node node1
  Mon Oct 26 04:11:44 2009 * running the instance OS create scripts...
  Mon Oct 26 04:11:57 2009 * starting instance...
  node1#

The above shows one instance created via an iallocator script, and one
created with manual node assignment. The other three instances were
also created, and now it's time to check them::

  node1# gnt-instance list
  Instance  Hypervisor OS          Primary_node Status  Memory
  instance1 xen-pvm    debootstrap node2        running   128M
  instance2 xen-pvm    debootstrap node1        running   128M
  instance3 xen-pvm    debootstrap node1        running   128M
  instance4 xen-pvm    debootstrap node3        running   128M
  instance5 xen-pvm    debootstrap node2        running   128M
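
If you need different columns, ``gnt-instance list`` also accepts an
``-o`` option with a comma-separated list of fields (a sketch; consult
the gnt-instance man page for the exact field names available in your
version)::

  node1# gnt-instance list -o name,pnode,snodes,status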

Accessing instances
+++++++++++++++++++

Accessing an instance's console is easy::

  node1# gnt-instance console instance2
  [    0.000000] Bootdata ok (command line is root=/dev/sda1 ro)
  [    0.000000] Linux version 2.6…
  [    0.000000] BIOS-provided physical RAM map:
  [    0.000000]  Xen: 0000000000000000 - 0000000008800000 (usable)
  [13138176.018071] Built 1 zonelists.  Total pages: 34816
  [13138176.018074] Kernel command line: root=/dev/sda1 ro
  [13138176.018694] Initializing CPU#0
  …
  Checking file systems...fsck 1.41.3 (12-Oct-2008)
  done.
  Setting kernel variables (/etc/sysctl.conf)...done.
  Mounting local filesystems...done.
  Activating swapfile swap...done.
  Setting up networking....
  Configuring network interfaces...done.
  Setting console screen modes and fonts.
  INIT: Entering runlevel: 2
  Starting enhanced syslogd: rsyslogd.
  Starting periodic command scheduler: crond.

  Debian GNU/Linux 5.0 instance2 tty1

  instance2 login:

At this point you can log in to the instance (the console can be left
with the standard Xen escape sequence, ``Ctrl-]``) and, after
configuring the network on all the instances, we can check their
connectivity::

  node1# fping instance{1..5}
  instance1 is alive
  instance2 is alive
  instance3 is alive
  instance4 is alive
  instance5 is alive
  node1#

Removal
+++++++

Removing unwanted instances is also easy::

  node1# gnt-instance remove instance5
  This will remove the volumes of the instance instance5 (including
  mirrors), thus removing all the data of the instance. Continue?
  y/[n]/?: y
  node1#

Recovering from hardware failures
---------------------------------

Recovering from node failure
++++++++++++++++++++++++++++

We are now left with four instances. Assume that at this point, node3,
which has one primary and one secondary instance, crashes::

  node1# gnt-node info node3
  Node name: node3
    primary ip: 172.24.227.1
    secondary ip: 192.168.2.3
    master candidate: True
    drained: False
    offline: False
    primary for instances:
      - instance4
    secondary for instances:
      - instance1
  node1# fping node3
  node3 is unreachable

At this point, the primary instance of that node (instance4) is down,
but the secondary instance (instance1) is not affected, except that it
has lost its disk redundancy::

  node1# fping instance{1,4}
  instance1 is alive
  instance4 is unreachable
  node1#

If we try to check the status of instance4 via the instance info
command, it fails because it tries to contact node3, which is down::

  node1# gnt-instance info instance4
  Failure: command execution error:
  Error checking node node3: Connection failed (113: No route to host)
  node1#

So we need to mark node3 as being *offline*, so that Ganeti won't talk
to it anymore::

  node1# gnt-node modify -O yes -f node3
  Mon Oct 26 04:34:12 2009 - WARNING: Not enough master candidates (desired 10, new value will be 2)
  Mon Oct 26 04:34:15 2009 - WARNING: Communication failure to node node3: Connection failed (113: No route to host)
  Modified node node3
   - offline -> True
   - master_candidate -> auto-demotion due to offline
  node1#

And now we can failover the instance::

  node1# gnt-instance failover instance4
  Failover will happen to image instance4. This requires a shutdown of
  the instance. Continue?
  y/[n]/?: y
  Mon Oct 26 04:35:34 2009 * checking disk consistency between source and target
  Failure: command execution error:
  Disk disk/0 is degraded on target node, aborting failover.
  node1# gnt-instance failover --ignore-consistency instance4
  Failover will happen to image instance4. This requires a shutdown of
  the instance. Continue?
  y/[n]/?: y
  Mon Oct 26 04:35:47 2009 * checking disk consistency between source and target
  Mon Oct 26 04:35:47 2009 * shutting down instance on source node
  Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown instance instance4 on node node3. Proceeding anyway. Please make sure node node3 is down. Error details: Node is marked offline
  Mon Oct 26 04:35:47 2009 * deactivating the instance's disks on source node
  Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown block device disk/0 on node node3: Node is marked offline
  Mon Oct 26 04:35:47 2009 * activating the instance's disks on target node
  Mon Oct 26 04:35:47 2009 - WARNING: Could not prepare block device disk/0 on node node3 (is_primary=False, pass=1): Node is marked offline
  Mon Oct 26 04:35:48 2009 * starting the instance on the target node
  node1#

Note that in our first attempt, Ganeti refused to do the failover,
since it wasn't sure about the status of the instance's disks. Passing
the ``--ignore-consistency`` flag lets the failover proceed::

  node1# gnt-instance list
  Instance  Hypervisor OS          Primary_node Status  Memory
  instance1 xen-pvm    debootstrap node2        running   128M
  instance2 xen-pvm    debootstrap node1        running   128M
  instance3 xen-pvm    debootstrap node1        running   128M
  instance4 xen-pvm    debootstrap node1        running   128M
  node1#

But at this point, both instance1 and instance4 are without disk
redundancy::

  node1# gnt-instance info instance1
  Instance name: instance1
  UUID: 45173e82-d1fa-417c-8758-7d582ab7eef4
  Serial number: 2
  Creation time: 2009-10-26 04:06:57
  Modification time: 2009-10-26 04:07:14
  State: configured to be up, actual state is up
    Nodes:
      - primary: node2
      - secondaries: node3
    Operating system: debootstrap
    Allocated network port: None
    Hypervisor: xen-pvm
      - root_path: default (/dev/sda1)
      - kernel_args: default (ro)
      - use_bootloader: default (False)
      - bootloader_args: default ()
      - bootloader_path: default ()
      - kernel_path: default (/boot/vmlinuz-2.6-xenU)
      - initrd_path: default ()
    Hardware:
      - VCPUs: 1
      - memory: 128MiB
      - NICs:
        - nic/0: MAC: aa:00:00:78:da:63, IP: None, mode: bridged, link: xen-br0
    Disks:
      - disk/0: drbd8, size 256M
        access mode: rw
        nodeA: node2, minor=0
        nodeB: node3, minor=0
        port: 11035
        auth key: 8e950e3cec6854b0181fbc3a6058657701f2d458
        on primary: /dev/drbd0 (147:0) in sync, status *DEGRADED*
        child devices:
          - child 0: lvm, size 256M
            logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data
            on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data (254:0)
          - child 1: lvm, size 128M
            logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta
            on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta (254:1)

The output is similar for instance4. In order to recover, we need to
run the node evacuate command, which will change the current secondary
node to a new one (in this case we only have two working nodes, so all
instances will end up on nodes one and two)::

  node1# gnt-node evacuate -I hail node3
  Relocate instance(s) 'instance1','instance4' from node
   node3 using iallocator hail?
  y/[n]/?: y
  Mon Oct 26 05:05:39 2009 - INFO: Selected new secondary for instance 'instance1': node1
  Mon Oct 26 05:05:40 2009 - INFO: Selected new secondary for instance 'instance4': node2
  Mon Oct 26 05:05:40 2009 Replacing disk(s) 0 for instance1
  Mon Oct 26 05:05:40 2009 STEP 1/6 Check device existence
  Mon Oct 26 05:05:40 2009 - INFO: Checking disk/0 on node2
  Mon Oct 26 05:05:40 2009 - INFO: Checking volume groups
  Mon Oct 26 05:05:40 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 05:05:40 2009 - INFO: Checking disk/0 consistency on node node2
  Mon Oct 26 05:05:40 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 05:05:40 2009 - INFO: Adding new local storage on node1 for disk/0
  Mon Oct 26 05:05:41 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 05:05:41 2009 - INFO: activating a new drbd on node1 for disk/0
  Mon Oct 26 05:05:42 2009 - INFO: Shutting down drbd for disk/0 on old node
  Mon Oct 26 05:05:42 2009 - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
  Mon Oct 26 05:05:42 2009 Hint: Please cleanup this device manually as soon as possible
  Mon Oct 26 05:05:42 2009 - INFO: Detaching primary drbds from the network (=> standalone)
  Mon Oct 26 05:05:42 2009 - INFO: Updating instance configuration
  Mon Oct 26 05:05:45 2009 - INFO: Attaching primary drbds to new secondary (standalone => connected)
  Mon Oct 26 05:05:46 2009 STEP 5/6 Sync devices
  Mon Oct 26 05:05:46 2009 - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 05:05:46 2009 - INFO: - device disk/0: 13.90% done, 7 estimated seconds remaining
  Mon Oct 26 05:05:53 2009 - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 05:05:53 2009 STEP 6/6 Removing old storage
  Mon Oct 26 05:05:53 2009 - INFO: Remove logical volumes for 0
  Mon Oct 26 05:05:53 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:05:53 2009 Hint: remove unused LVs manually
  Mon Oct 26 05:05:53 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:05:53 2009 Hint: remove unused LVs manually
  Mon Oct 26 05:05:53 2009 Replacing disk(s) 0 for instance4
  Mon Oct 26 05:05:53 2009 STEP 1/6 Check device existence
  Mon Oct 26 05:05:53 2009 - INFO: Checking disk/0 on node1
  Mon Oct 26 05:05:53 2009 - INFO: Checking volume groups
  Mon Oct 26 05:05:53 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 05:05:53 2009 - INFO: Checking disk/0 consistency on node node1
  Mon Oct 26 05:05:54 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 05:05:54 2009 - INFO: Adding new local storage on node2 for disk/0
  Mon Oct 26 05:05:54 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 05:05:54 2009 - INFO: activating a new drbd on node2 for disk/0
  Mon Oct 26 05:05:55 2009 - INFO: Shutting down drbd for disk/0 on old node
  Mon Oct 26 05:05:55 2009 - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
  Mon Oct 26 05:05:55 2009 Hint: Please cleanup this device manually as soon as possible
  Mon Oct 26 05:05:55 2009 - INFO: Detaching primary drbds from the network (=> standalone)
  Mon Oct 26 05:05:55 2009 - INFO: Updating instance configuration
  Mon Oct 26 05:05:55 2009 - INFO: Attaching primary drbds to new secondary (standalone => connected)
  Mon Oct 26 05:05:56 2009 STEP 5/6 Sync devices
  Mon Oct 26 05:05:56 2009 - INFO: Waiting for instance instance4 to sync disks.
  Mon Oct 26 05:05:56 2009 - INFO: - device disk/0: 12.40% done, 8 estimated seconds remaining
  Mon Oct 26 05:06:04 2009 - INFO: Instance instance4's disks are in sync.
  Mon Oct 26 05:06:04 2009 STEP 6/6 Removing old storage
  Mon Oct 26 05:06:04 2009 - INFO: Remove logical volumes for 0
  Mon Oct 26 05:06:04 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:06:04 2009 Hint: remove unused LVs manually
  Mon Oct 26 05:06:04 2009 - WARNING: Can't remove old LV: Node is marked offline
  Mon Oct 26 05:06:04 2009 Hint: remove unused LVs manually
  node1#

And now node3 is completely free of instances and can be repaired (the
question marks below mean that Ganeti could not retrieve data from the
offline node)::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.2G     3     1
  node2   1.3T  1.3T  32.0G  1.0G 30.4G     1     3
  node3      ?     ?      ?     ?     ?     0     0

Re-adding a node to the cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's say node3 has been repaired and is now ready to be
reused. Re-adding it is simple::

  node1# gnt-node add --readd node3
  The authenticity of host 'node3 (172.24.227.1)' can't be established.
  RSA key fingerprint is 9f:2e:5a:2e:e0:bd:00:09:e4:5c:32:f2:27:57:7a:f4.
  Are you sure you want to continue connecting (yes/no)? yes
  Mon Oct 26 05:27:39 2009 - INFO: Readding a node, the offline/drained flags were reset
  Mon Oct 26 05:27:39 2009 - INFO: Node will be a master candidate

And it is now working again::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G 30.2G     3     1
  node2   1.3T  1.3T  32.0G  1.0G 30.4G     1     3
  node3   1.3T  1.3T  32.0G  1.0G 30.4G     0     0

.. note:: If you have the ganeti-htools package installed, you can
   shuffle the instances around to make better use of the nodes, as
   sketched below.
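
For example, the ``hbal`` tool from that package can compute a
rebalancing plan and, optionally, execute it (a sketch only; the ``-L``
Luxi-backend flag is an assumption and depends on the htools version
installed)::

  node1# hbal -L      # only print the proposed moves
  node1# hbal -L -X   # also execute them via the job queue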

Disk failures
+++++++++++++

A disk failure is simpler than a full node failure. First, a single
disk failure should not cause data loss for any redundant instance;
only the performance of some instances might be reduced due to more
network traffic.

Let's take the cluster status from the above listing, and check which
volumes are in use::

  node1# gnt-node volumes -o phys,instance node2
  PhysDev   Instance
  /dev/sdb1 instance4
  /dev/sdb1 instance4
  /dev/sdb1 instance1
  /dev/sdb1 instance1
  /dev/sdb1 instance3
  /dev/sdb1 instance3
  /dev/sdb1 instance2
  /dev/sdb1 instance2
  node1#

You can see that all instances on node2 have logical volumes on
``/dev/sdb1``. Let's simulate a disk failure on that disk::

  node1# ssh node2
  node2# echo offline > /sys/block/sdb/device/state
  node2# vgs
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
    Couldn't find all physical volumes for volume group xenvg.
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
    Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
    Couldn't find all physical volumes for volume group xenvg.
    Volume group xenvg not found
  node2#

At this point, the node is broken, and if we examine instance2 we get
(simplified output shown)::

  node1# gnt-instance info instance2
  Instance name: instance2
  State: configured to be up, actual state is up
    Nodes:
      - primary: node1
      - secondaries: node2
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status ok
        on secondary: /dev/drbd1 (147:1) in sync, status *DEGRADED* *MISSING DISK*

This instance only has its secondary on node2. Let's also verify an
instance whose primary is node2::

  node1# gnt-instance info instance1
  Instance name: instance1
  State: configured to be up, actual state is up
    Nodes:
      - primary: node2
      - secondaries: node1
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status *DEGRADED* *MISSING DISK*
        on secondary: /dev/drbd3 (147:3) in sync, status ok
  node1# gnt-instance console instance1

  Debian GNU/Linux 5.0 instance1 tty1

  instance1 login: root
  Last login: Tue Oct 27 01:24:09 UTC 2009 on tty1
  instance1:~# date > test
  instance1:~# sync
  instance1:~# cat test
  Tue Oct 27 01:25:20 UTC 2009
  instance1:~# dmesg|tail
  [5439785.235448] NET: Registered protocol family 15
  [5439785.235489] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
  [5439785.235495] All bugs added by David S. Miller <davem@redhat.com>
  [5439785.235517] XENBUS: Device with no driver: device/console/0
  [5439785.236576] kjournald starting.  Commit interval 5 seconds
  [5439785.236588] EXT3-fs: mounted filesystem with ordered data mode.
  [5439785.236625] VFS: Mounted root (ext3 filesystem) readonly.
  [5439785.236663] Freeing unused kernel memory: 172k freed
  [5439787.533779] EXT3 FS on sda1, internal journal
  [5440655.065431] eth0: no IPv6 routers present
  instance1:~#

As you can see, the instance is running fine and doesn't see any disk
issues. It is now time to fix node2 and re-establish redundancy for the
involved instances.
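
Since in this walkthrough the failure was only simulated through
sysfs, the disk can be brought back the same way (with a real failure
you would replace the hardware instead)::

  node2# echo running > /sys/block/sdb/device/state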

.. note:: For Ganeti 2.0 we need to manually fix the volume group on
   node2 by running ``vgreduce --removemissing xenvg``

::

  node1# gnt-node repair-storage node2 lvm-vg xenvg
  Mon Oct 26 18:14:03 2009 Repairing storage unit 'xenvg' on node2 ...
  node1# ssh node2 vgs
    VG    #PV #LV #SN Attr   VSize   VFree
    xenvg   1   8   0 wz--n- 673.84G 673.84G
  node1#

This has removed the 'bad' disk from the volume group, which is now
left with only one PV. We can now replace the disks for the involved
instances::

  node1# for i in instance{1..4}; do gnt-instance replace-disks -a $i; done
  Mon Oct 26 18:15:38 2009 Replacing disk(s) 0 for instance1
  Mon Oct 26 18:15:38 2009 STEP 1/6 Check device existence
  Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node1
  Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node2
  Mon Oct 26 18:15:38 2009 - INFO: Checking volume groups
  Mon Oct 26 18:15:38 2009 STEP 2/6 Check peer consistency
  Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 consistency on node node1
  Mon Oct 26 18:15:39 2009 STEP 3/6 Allocate new storage
  Mon Oct 26 18:15:39 2009 - INFO: Adding storage on node2 for disk/0
  Mon Oct 26 18:15:39 2009 STEP 4/6 Changing drbd configuration
  Mon Oct 26 18:15:39 2009 - INFO: Detaching disk/0 drbd from local storage
  Mon Oct 26 18:15:40 2009 - INFO: Renaming the old LVs on the target node
  Mon Oct 26 18:15:40 2009 - INFO: Renaming the new LVs on the target node
  Mon Oct 26 18:15:40 2009 - INFO: Adding new mirror component on node2
  Mon Oct 26 18:15:41 2009 STEP 5/6 Sync devices
  Mon Oct 26 18:15:41 2009 - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 18:15:41 2009 - INFO: - device disk/0: 12.40% done, 9 estimated seconds remaining
  Mon Oct 26 18:15:50 2009 - INFO: Instance instance1's disks are in sync.
  Mon Oct 26 18:15:50 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:15:50 2009 - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:15:52 2009 Replacing disk(s) 0 for instance2
  Mon Oct 26 18:15:52 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:01 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:01 2009 - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:16:02 2009 Replacing disk(s) 0 for instance3
  Mon Oct 26 18:16:02 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:09 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:09 2009 - INFO: Remove logical volumes for disk/0
  Mon Oct 26 18:16:10 2009 Replacing disk(s) 0 for instance4
  Mon Oct 26 18:16:10 2009 STEP 1/6 Check device existence
  …
  Mon Oct 26 18:16:18 2009 STEP 6/6 Removing old storage
  Mon Oct 26 18:16:18 2009 - INFO: Remove logical volumes for disk/0
  node1#

At this point, all instances should be healthy again.
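
To double-check, the disk that was previously degraded should now be
reported as in sync again; abbreviated and purely illustrative output::

  node1# gnt-instance info instance1
  …
    Disks:
      - disk/0: drbd8, size 256M
        on primary:   /dev/drbd0 (147:0) in sync, status ok
        on secondary: /dev/drbd3 (147:3) in sync, status ok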

.. note:: Ganeti 2.0 doesn't have the ``-a`` option to replace-disks,
   so for it you have to run the loop twice, once over the primary
   instances with argument ``-p`` and once over the secondary instances
   with argument ``-s``, but otherwise the operations are similar::

     node1# gnt-instance replace-disks -p instance1
     …
     node1# for i in instance{2..4}; do gnt-instance replace-disks -s $i; done

Common cluster problems
-----------------------

There are a number of small issues that might appear on a cluster that
can be solved easily as long as the issue is properly identified. For
this exercise we will consider the case of node3, which was broken
previously and re-added to the cluster without reinstallation. Running
cluster verify on the cluster reports::

  node1# gnt-cluster verify
  Mon Oct 26 18:30:08 2009 * Verifying global settings
  Mon Oct 26 18:30:08 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:30:10 2009 * Verifying node status
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 0 is in use
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 1 is in use
  Mon Oct 26 18:30:10 2009 * Verifying instance status
  Mon Oct 26 18:30:10 2009 - ERROR: instance instance4: instance should not run on node node3
  Mon Oct 26 18:30:10 2009 * Verifying orphan volumes
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_data is unknown
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data is unknown
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta is unknown
  Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta is unknown
  Mon Oct 26 18:30:10 2009 * Verifying remaining instances
  Mon Oct 26 18:30:10 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:30:10 2009 * Other Notes
  Mon Oct 26 18:30:10 2009 * Hooks Results
  node1#

Instance status
+++++++++++++++

As you can see, *instance4* has a copy running on node3, because we
forced the failover when node3 failed. This case is dangerous, as the
stray copy has the same IP and MAC address as the failed-over instance,
wreaking havoc on the network environment and on anyone who tries to
use it.

Ganeti doesn't directly handle this case. It is recommended to log on
to node3 and run::

  node3# xm destroy instance4

Unallocated DRBD minors
+++++++++++++++++++++++

There are still unallocated DRBD minors on node3. Again, these are not
handled by Ganeti directly and need to be cleaned up via DRBD commands::

  node3# drbdsetup /dev/drbd0 down
  node3# drbdsetup /dev/drbd1 down
  node3#

Orphan volumes
++++++++++++++

At this point, the only remaining problem should be the so-called
*orphan* volumes. These can also appear after an aborted disk replace,
or in similar situations where Ganeti was not able to recover
automatically. Here you need to remove them manually via LVM commands::

  node3# lvremove xenvg
  Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data"? [y/n]: y
    Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data" successfully removed
  Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta"? [y/n]: y
    Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta" successfully removed
  Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data"? [y/n]: y
    Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data" successfully removed
  Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta"? [y/n]: y
    Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta" successfully removed
  node3#
827 | c71a1a3d | Iustin Pop | |
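In case of doubt about which volumes are the orphaned ones, listing
all logical volumes in the group and comparing them against the disks
reported by ``gnt-instance info`` is a safe, read-only first step
(``xenvg`` being the volume group used in this walkthrough)::

  node3# lvs xenvg
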
At this point cluster verify shouldn't complain anymore::

  node1# gnt-cluster verify
  Mon Oct 26 18:37:51 2009 * Verifying global settings
  Mon Oct 26 18:37:51 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:37:53 2009 * Verifying node status
  Mon Oct 26 18:37:53 2009 * Verifying instance status
  Mon Oct 26 18:37:53 2009 * Verifying orphan volumes
  Mon Oct 26 18:37:53 2009 * Verifying remaining instances
  Mon Oct 26 18:37:53 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:37:53 2009 * Other Notes
  Mon Oct 26 18:37:53 2009 * Hooks Results
  node1#

N+1 errors
++++++++++

Since redundant instances in Ganeti have a primary/secondary model,
each node must keep enough memory free so that, if one of its peer
nodes fails, all the secondary instances that have the failed node as
primary can be relocated to it. More specifically, if instance2 has
node1 as primary and node2 as secondary (and node1 and node2 do not
have any other instances in this layout), then node2 must have enough
free memory so that if node1 fails, instance2 can be failed over
without any other operations (keeping the downtime window small).
Let's increase the memory of the current instances to 4GB, and add
three new instances: two on node2:node3 with 8GB of RAM each, and one
on node1:node2 with 12GB of RAM (numbers chosen so that we run out of
memory)::

  node1# gnt-instance modify -B memory=4G instance1
  Modified instance instance1
  - be/memory -> 4096
  Please don't forget that these parameters take effect only at the next start of the instance.
  node1# gnt-instance modify …

  node1# gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap instance5
  …
  node1# gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap instance6
  …
  node1# gnt-instance add -t drbd -n node1:node2 -s 512m -B memory=12G -o debootstrap instance7
  node1# gnt-instance reboot --all
  The reboot will operate on 7 instances.
  Do you want to continue?
  Affected instances:
    instance1
    instance2
    instance3
    instance4
    instance5
    instance6
    instance7
  y/[n]/?: y
  Submitted jobs 677, 678, 679, 680, 681, 682, 683
  Waiting for job 677 for instance1...
  Waiting for job 678 for instance2...
  Waiting for job 679 for instance3...
  Waiting for job 680 for instance4...
  Waiting for job 681 for instance5...
  Waiting for job 682 for instance6...
  Waiting for job 683 for instance7...
  node1#

We rebooted the instances for the memory changes to take effect. Now
the cluster looks like::

  node1# gnt-node list
  Node  DTotal DFree MTotal MNode MFree Pinst Sinst
  node1   1.3T  1.3T  32.0G  1.0G  6.5G     4     1
  node2   1.3T  1.3T  32.0G  1.0G 10.5G     3     4
  node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     2
  node1# gnt-cluster verify
  Mon Oct 26 18:59:36 2009 * Verifying global settings
  Mon Oct 26 18:59:36 2009 * Gathering data (3 nodes)
  Mon Oct 26 18:59:37 2009 * Verifying node status
  Mon Oct 26 18:59:37 2009 * Verifying instance status
  Mon Oct 26 18:59:37 2009 * Verifying orphan volumes
  Mon Oct 26 18:59:37 2009 * Verifying remaining instances
  Mon Oct 26 18:59:37 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 18:59:37 2009 - ERROR: node node2: not enough memory to accommodate failovers should peer node node1 fail
  Mon Oct 26 18:59:37 2009 * Other Notes
  Mon Oct 26 18:59:37 2009 * Hooks Results
  node1#

The cluster verify error above shows that if node1 fails, node2 will
not have enough memory to fail over all of node1's primary instances
to it. To solve this, you have a number of options:

- try to manually move instances around (but this can become
  complicated for any non-trivial cluster)
- try to reduce the memory of some instances so they fit within the
  available node memory
- if you have the ganeti-htools package installed, you can run the
  ``hbal`` tool, which will try to compute an automated rebalancing
  solution that complies with the N+1 rule (see the example below)

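As a rough illustration of the last option, a balancing run against
the live cluster could look like the following; ``-L`` selects the
local (Luxi) backend, and the exact flags and output depend on the
installed htools version::

  node1# hbal -L

By default ``hbal`` only prints the moves it would perform; adding the
``-X`` option would also execute them.
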
Network issues
++++++++++++++

In case a node has problems with the network (usually the secondary
network, as problems with the primary network will render the node
unusable for Ganeti commands), it will show up in cluster verify as::

  node1# gnt-cluster verify
  Mon Oct 26 19:07:19 2009 * Verifying global settings
  Mon Oct 26 19:07:19 2009 * Gathering data (3 nodes)
  Mon Oct 26 19:07:23 2009 * Verifying node status
  Mon Oct 26 19:07:23 2009 - ERROR: node node1: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node2: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node1': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node2': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node3': failure using the secondary interface(s)
  Mon Oct 26 19:07:23 2009 * Verifying instance status
  Mon Oct 26 19:07:23 2009 * Verifying orphan volumes
  Mon Oct 26 19:07:23 2009 * Verifying remaining instances
  Mon Oct 26 19:07:23 2009 * Verifying N+1 Memory redundancy
  Mon Oct 26 19:07:23 2009 * Other Notes
  Mon Oct 26 19:07:23 2009 * Hooks Results
  node1#

This shows that both node1 and node2 have problems contacting node3
over the secondary network, and that node3 has problems contacting
them. From this output it can be deduced that, since node1 and node2
can communicate with each other, node3 is the one having problems, and
its network settings/connection need to be investigated.

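To confirm this, you can probe the secondary addresses directly;
recall that in our example the secondary network is ``192.168.2.0/24``
with each node's last octet equal to its index, so a simple check
would be::

  node1# ping -c 3 192.168.2.3

Note that if ICMP is filtered on the replication network this check is
not conclusive, and the interface and cabling need to be inspected on
node3 itself.
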
Migration problems
++++++++++++++++++

Since live migration can sometimes fail and leave the instance in an
inconsistent state, Ganeti provides a ``--cleanup`` argument to the
migrate command, which does the following:

- check on which node the instance is actually running (has the
  command failed before or after the actual migration?)
- reconfigure the DRBD disks accordingly

It is always safe to run this command as long as the instance has good
data on its primary node (i.e. not showing as degraded). If so, you
can simply run::

  node1# gnt-instance migrate --cleanup instance1
  Instance instance1 will be recovered from a failed migration. Note
  that the migration procedure (including cleanup) is **experimental**
  in this version. This might impact the instance if anything goes
  wrong. Continue?
  y/[n]/?: y
  Mon Oct 26 19:13:49 2009 Migrating instance instance1
  Mon Oct 26 19:13:49 2009 * checking where the instance actually runs (if this hangs, the hypervisor might be in a bad state)
  Mon Oct 26 19:13:49 2009 * instance confirmed to be running on its primary node (node2)
  Mon Oct 26 19:13:49 2009 * switching node node1 to secondary mode
  Mon Oct 26 19:13:50 2009 * wait until resync is done
  Mon Oct 26 19:13:50 2009 * changing into standalone mode
  Mon Oct 26 19:13:50 2009 * changing disks into single-master mode
  Mon Oct 26 19:13:50 2009 * wait until resync is done
  Mon Oct 26 19:13:51 2009 * done
  node1#

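If you want to confirm the disk status before (or after) running the
cleanup, the per-disk state reported by Ganeti can be inspected; the
exact output format varies between versions, but degraded disks are
flagged as such::

  node1# gnt-instance info instance1 | grep -i degraded
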
In-use disks at instance shutdown
+++++++++++++++++++++++++++++++++

If you see something like the following when trying to shut down an
instance or deactivate its disks::

  node1# gnt-instance shutdown instance1
  Mon Oct 26 19:16:23 2009 - WARNING: Could not shutdown block device disk/0 on node node2: drbd0: can't shutdown drbd device: /dev/drbd0: State change failed: (-12) Device is held open by someone\n

it most likely means that something is holding the underlying DRBD
device open. This can be a bad sign if the instance is not running, as
it might mean there was concurrent access to the disks from both the
node and the instance; but not necessarily (for example, the
partitions might merely have been activated on the node via
``kpartx``).

To troubleshoot this issue you need to follow standard Linux
practices, and pay attention to the hypervisor being used:

- check whether (in the above example) ``/dev/drbd0`` on node2 is
  mounted somewhere (``cat /proc/mounts``)
- check whether the device is being used by device-mapper itself: run
  ``dmsetup ls`` and look for entries of the form ``drbd0pX``; if any
  exist, remove them with either ``kpartx -d`` or ``dmsetup remove``
  (see the example after this list)

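For the example above, the checks could look like this (to be run on
the node that reported the error, here node2)::

  node2# grep drbd0 /proc/mounts
  node2# dmsetup ls | grep drbd0

If the second command shows ``drbd0pX`` entries, running
``kpartx -d /dev/drbd0`` (or ``dmsetup remove`` on each entry) should
clear them.
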
For Xen, check whether the hypervisor itself is using the disks::

  node1# xenstore-ls /local/domain/0/backend/vbd | grep -e "domain =" -e physical-device
  domain = "instance2"
  physical-device = "93:0"
  domain = "instance3"
  physical-device = "93:1"
  domain = "instance4"
  physical-device = "93:2"
  node1#

You can see in the above output that the node exports three disks, to
three instances. The ``physical-device`` key is in hexadecimal
major:minor format, and 0x93 (147 in decimal) is DRBD's major number.
Thus we can see from the above that instance2 has ``/dev/drbd0``,
instance3 ``/dev/drbd1``, and instance4 ``/dev/drbd2``.

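If in doubt, the decoding can be cross-checked on the node itself,
since device nodes carry the major/minor pair directly (``ls -l``
prints it in place of the file size)::

  node1# ls -l /dev/drbd0
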
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: