====================
Linux HA integration
====================

.. contents:: :depth: 4

This is a design document detailing the integration of Ganeti and Linux HA.


Current state and shortcomings
==============================

Ganeti doesn't currently support any self-healing or self-monitoring.

We are now working to improve the situation in this regard:

- The :doc:`autorepair system <design-autorepair>` will take care
  of self-repairing a cluster in the presence of offline nodes.
- The :doc:`monitoring agent <design-monitoring-agent>` will take care
  of exporting data to monitoring.

What is still missing is a way to self-detect "obvious" failures rapidly
and to:

- Keep the master role active.
- Offline resources that are obviously faulty so that the autorepair
  system can perform its work.


Proposed changes
================

Linux-HA provides software that can be used to provide high availability
of services through automatic failover of resources. In particular,
Pacemaker can be used together with Heartbeat or Corosync to make sure a
resource is kept active on a self-monitoring cluster.

Ganeti OCF agents
-----------------

The Ganeti agents will be slightly special in the HA world. The
following will apply:

- The agents will be configurable cluster-wide by tags (which will be
  read on the nodes via ssconf_cluster_tags) and locally by files on
  the filesystem that will allow them to "simulate" a particular
  condition (e.g. simulate a failure even if none is detected).
- The agents will be able to run in "full" or "partial" mode. In
  "partial" mode they will always succeed, and thus never fail a
  resource as long as a node is online, is running the Linux HA
  software and is responding to the network. In "full" mode they will
  also check resources like the cluster master IP or master daemon, and
  act if they are missing.
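
The configuration scheme above could be sketched roughly as follows.
This is only an illustration: the file paths, the ``ocf:mode:full`` tag
and the ``simulate-failure`` marker file are hypothetical names that
this design does not define, and the sketch assumes that
``ssconf_cluster_tags`` holds one tag per line.

```python
import os

# Assumed locations; neither path is specified by this design.
SSCONF_TAGS = "/var/lib/ganeti/ssconf_cluster_tags"
OVERRIDE_DIR = "/etc/ganeti/ocf"


def read_cluster_tags(path=SSCONF_TAGS):
    """Return the set of cluster tags, assuming one tag per line."""
    try:
        with open(path) as handle:
            return {line.strip() for line in handle if line.strip()}
    except OSError:
        # No readable ssconf file: behave as if no tags were set.
        return set()


def agent_mode(tags):
    """Pick "full" or "partial" mode; the tag name is hypothetical."""
    return "full" if "ocf:mode:full" in tags else "partial"


def simulated_failure(override_dir=OVERRIDE_DIR):
    """True if a local marker file asks the agent to simulate a failure."""
    return os.path.exists(os.path.join(override_dir, "simulate-failure"))


def monitor(tags, override_dir=OVERRIDE_DIR, full_check=lambda: True):
    """Return True when the resource should be reported as healthy."""
    if simulated_failure(override_dir):
        return False
    if agent_mode(tags) == "partial":
        # Partial mode never fails a resource on its own.
        return True
    # Full mode delegates to an extra check (master IP, daemon, ...).
    return full_check()
```

In this sketch the local marker file takes precedence over the
cluster-wide tags, so a failure can be simulated on a single node
without touching the cluster configuration.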

Note that OCF agents are needed for what Ganeti does: simply relying on
the LSB scripts will not work for the Ganeti service.


Master role agent
-----------------

This agent will manage the Ganeti master role. It needs to be configured
as a sticky resource (you don't want to flap the master role around, do
you?) that is active on only one node. You can require quorum or fencing
to protect your cluster from multiple masters.

The agent will implement a stateless resource that considers itself
"started" only on the master node, "stopped" on all master candidates
and in error mode for all other nodes.
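
The three states described above might map onto OCF monitor exit codes
as in the following sketch. The function name and its arguments are
hypothetical, and the use of ``OCF_ERR_GENERIC`` for the "error mode" is
an assumption: the design does not say which error code the agent
returns.

```python
# Standard OCF exit codes relevant to a monitor action.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def master_role_status(node, master_node, master_candidates):
    """Map a node's Ganeti role to an OCF monitor result."""
    if node == master_node:
        return OCF_SUCCESS        # reported as "started"
    if node in master_candidates:
        return OCF_NOT_RUNNING    # reported as "stopped"
    # Assumed error code for nodes that cannot hold the master role.
    return OCF_ERR_GENERIC


if __name__ == "__main__":
    candidates = {"node1", "node2"}
    for name in ("node1", "node2", "node3"):
        print(name, master_role_status(name, "node1", candidates))
```

With ``node1`` as master and ``node2`` as a master candidate, the loop
reports 0, 7 and 1 respectively.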

Note that if not all your nodes are master candidates this resource
might have problems:

- if all nodes are configured to run the resource, Heartbeat may decide
  to "fence" (aka STONITH) all your non-master-candidate nodes if told
  to do so. This might not be what you want.
- if only master candidates are configured as nodes for the resource,
  beware of promotions and demotions, as nothing will automatically
  update Pacemaker should a change happen at the Ganeti level.

Other solutions, such as reporting the resource just as "stopped" on
non-master-candidate nodes as well, might mean that Pacemaker would
choose the "wrong" node to promote to master, which is also a bad idea.

Future improvements
+++++++++++++++++++

- Ability to work better with non-master-candidate nodes
- Stateful resource that can "safely" transfer the master role between
  online nodes (with queue drain and such)
- Implement "full" mode, with detection of the cluster IP and the master
  node daemon.


Node role agent
---------------

This agent will manage the Ganeti node role. It needs to be configured
as a cloned resource that is active on all nodes.

In partial mode it will always return success (and thus trigger a
failure only upon an HA level or network failure). Full mode, which
initially will not be implemented, could also check for the node daemon
being unresponsive or other local conditions (TBD).

When a failure happens the HA notification system will trigger on all
other nodes, including the master. The master will then be able to
offline the node. Any other work to restore instance availability should
then be done by the autorepair system.

The following cluster tags are supported:

- ``ocf:node-offline:use-powercycle``: Try to powercycle a node using
  ``gnt-node powercycle`` when offlining.
- ``ocf:node-offline:use-poweroff``: Try to power off a node using
  ``gnt-node power off`` when offlining (requires OOB support).
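
As a rough illustration of how these tags could drive the offlining
step, the sketch below builds the list of commands the master would run.
The function name is hypothetical, and both the ordering of the commands
and the ``gnt-node modify --offline=yes`` invocation are assumptions;
the design only names ``gnt-node powercycle`` and ``gnt-node power
off``, and real invocations may need additional flags.

```python
def offline_commands(node, cluster_tags):
    """Return the gnt-node command lines to run when offlining a node."""
    commands = []
    if "ocf:node-offline:use-powercycle" in cluster_tags:
        commands.append(["gnt-node", "powercycle", node])
    if "ocf:node-offline:use-poweroff" in cluster_tags:
        # Requires OOB support on the cluster.
        commands.append(["gnt-node", "power", "off", node])
    # Assumed final step: mark the node offline in Ganeti.
    commands.append(["gnt-node", "modify", "--offline=yes", node])
    return commands


if __name__ == "__main__":
    tags = {"ocf:node-offline:use-powercycle"}
    for command in offline_commands("node3", tags):
        print(" ".join(command))
```

The sketch only builds the command lines and does not execute them,
which keeps the decision logic testable without a running cluster.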

Future improvements
+++++++++++++++++++

- Handle draining differently than offlining
- Handle different modes of "stopping" the service
- Implement "full" mode


Risks
-----

Running Ganeti with Pacemaker puts the stability of your Ganeti cluster
at greater risk. Events like:

- stopping heartbeat or corosync on a node
- corosync or heartbeat being killed for any reason
- temporary failure in a node's networking

will trigger potentially dangerous operations such as node offlining or
master role failover. Moreover, if the autorepair system is running,
these events can also trigger instance failovers or migrations, and
disk replaces.

Also note that operations like master-failover or a manual node-modify
might interact badly with this setup depending on the way your HA system
is configured (see below).

This of course is an inherent problem with any Linux-HA installation,
but is probably more visible with Ganeti given that our resources tend
to be more heavyweight than many others managed in HA clusters (e.g. an
IP address).

Code status
-----------

This code is heavily experimental, and Linux-HA is a very complex
subsystem. *We might not be able to help you* if you decide to run this
code: please make sure you fully understand high availability before
using it on your production machines. Ganeti only ships this code as an
example, and it might need customization or complex configurations on
your side for it to run properly.

*Ganeti does not automate HA configuration for your cluster*. You need
to do this job by hand. Good luck, don't get it wrong.


Future work
===========

- Integrate the agents better with the Ganeti monitoring
- Add hooks for managing HA at node add/remove/modify/master-failover
  operations
- Provide a stonith system through Ganeti's OOB system
- Provide an OOB system that does "shunning" of offline nodes, for
  emulating a real OOB, at least on all nodes

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: