====================
Linux HA integration
====================

.. contents:: :depth: 4

This is a design document detailing the integration of Ganeti and Linux HA.


Current state and shortcomings
==============================

Ganeti doesn't currently support any self-healing or self-monitoring.

We are now working on improving the situation in this regard:

- The :doc:`autorepair system <design-autorepair>` will take care
  of self-repairing a cluster in the presence of offline nodes.
- The :doc:`monitoring agent <design-monitoring-agent>` will take care
  of exporting data to monitoring.

What is still missing is a way to self-detect "obvious" failures rapidly
and to:

- Keep the master role active.
- Offline resources that are obviously faulty so that the autorepair
  system can perform its work.


Proposed changes
================

Linux-HA provides software that can be used to provide high availability
of services through automatic failover of resources. In particular
Pacemaker can be used together with Heartbeat or Corosync to make sure a
resource is kept active on a self-monitoring cluster.

Ganeti OCF agents
-----------------

The Ganeti agents will be slightly special in the HA world. The
following will apply:

- The agents can be configured cluster-wide by tags (which will be
  read on the nodes via ssconf_cluster_tags) and locally by files on
  the filesystem that allow them to "simulate" a particular condition
  (e.g. simulate a failure even if none is detected).
- The agents can run in "full" or "partial" mode: in "partial" mode
  they will always succeed, and thus never fail a resource as long as a
  node is online, is running the Linux-HA software and is responding to
  the network. In "full" mode they will also check resources like the
  cluster master IP or master daemon, and act if they are missing.

Note that Ganeti needs OCF agents for this: simply relying on the LSB
scripts will not work for the Ganeti service.

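As a rough illustration of the configuration scheme described above,
the sketch below reads cluster tags from an ssconf file, looks for a
local override file, and derives a monitor result. It is not the
shipped agent: the ssconf path, the one-tag-per-line file format, the
override directory, the ``simulate-failure`` file name and the
``ocf:example:full-mode`` tag are all assumptions made for the example.
Only the OCF return codes are standard.

```python
from pathlib import Path

# Standard OCF return codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1

# Assumed locations; both depend on how Ganeti is installed.
SSCONF_CLUSTER_TAGS = Path("/var/lib/ganeti/ssconf_cluster_tags")
LOCAL_OVERRIDE_DIR = Path("/etc/ganeti/ocf")  # hypothetical directory

# Hypothetical tag used here to switch the agent to "full" mode.
FULL_MODE_TAG = "ocf:example:full-mode"


def read_cluster_tags(path=SSCONF_CLUSTER_TAGS):
    """Return the cluster tags, assuming one tag per line."""
    try:
        text = Path(path).read_text()
    except FileNotFoundError:
        return set()
    return {line.strip() for line in text.splitlines() if line.strip()}


def agent_mode(tags):
    """Return "full" if the (hypothetical) tag is set, else "partial"."""
    return "full" if FULL_MODE_TAG in tags else "partial"


def simulated_failure(override_dir=LOCAL_OVERRIDE_DIR):
    """A local file (name is an assumption) forces a simulated failure."""
    return (Path(override_dir) / "simulate-failure").exists()


def monitor(tags, override_dir=LOCAL_OVERRIDE_DIR, full_check=lambda: True):
    """Compute the monitor result.

    full_check stands in for the "full" mode checks (master IP, master
    daemon); in "partial" mode it is never called.
    """
    if simulated_failure(override_dir):
        return OCF_ERR_GENERIC
    if agent_mode(tags) == "full" and not full_check():
        return OCF_ERR_GENERIC
    return OCF_SUCCESS
```

In "partial" mode ``monitor`` succeeds whenever it runs at all, so a
failure is only seen when the HA layer itself cannot reach the node.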

Master role agent
-----------------

This agent will manage the Ganeti master role. It needs to be configured
as a sticky resource (you don't want to flap the master role around, do
you?) that is active on only one node. You can require quorum or fencing
to protect your cluster from multiple masters.

The agent will implement a stateless resource that considers itself
"started" only on the master node, "stopped" on all master candidates
and in error mode on all other nodes.

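The state mapping above can be sketched as a small function. This is
an illustration only: how the agent learns the master and the master
candidates is left out (the names are passed in), and the use of
``OCF_ERR_GENERIC`` for "error mode" is an assumption, since the design
does not name a specific error code.

```python
# Standard OCF return codes.
OCF_SUCCESS = 0       # resource is running ("started")
OCF_ERR_GENERIC = 1   # assumed code for "error mode"
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def monitor_master_role(node, master, master_candidates):
    """Map a node's Ganeti role to the result of the monitor action."""
    if node == master:
        return OCF_SUCCESS
    if node in master_candidates:
        return OCF_NOT_RUNNING
    # Not a master candidate: this node can never hold the role.
    return OCF_ERR_GENERIC
```
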
Note that if not all your nodes are master candidates this resource
might have problems:

- if all nodes are configured to run the resource, Heartbeat may decide
  to "fence" (aka stonith) all your non-master-candidate nodes if told
  to do so. This might not be what you want.
- if only master candidates are configured as nodes for the resource,
  beware of promotions and demotions, as nothing will automatically
  update Pacemaker should a change happen at the Ganeti level.

Other solutions, such as reporting the resource as just "stopped" on
non-master-candidates as well, might mean that Pacemaker would choose
the "wrong" node to promote to master, which is also a bad idea.

Future improvements
+++++++++++++++++++

- Ability to work better with non-master-candidate nodes
- Stateful resource that can "safely" transfer the master role between
  online nodes (with queue drain and such)
- Implement "full" mode, with detection of the cluster IP and the master
  node daemon.


Node role agent
---------------

This agent will manage the Ganeti node role. It needs to be configured
as a cloned resource that is active on all nodes.

In partial mode it will always return success (and thus trigger a
failure only upon an HA level or network failure). Full mode, which
initially will not be implemented, could also check for the node daemon
being unresponsive or other local conditions (TBD).

When a failure happens the HA notification system will trigger on all
other nodes, including the master. The master will then be able to
offline the node. Any other work to restore instance availability should
then be done by the autorepair system.

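A minimal sketch of how a notification could be turned into a list of
nodes to offline is shown below. It relies on the environment
variables Pacemaker passes to the ``notify`` action of a cloned
resource (``OCF_RESKEY_CRM_meta_notify_type``,
``OCF_RESKEY_CRM_meta_notify_operation`` and
``OCF_RESKEY_CRM_meta_notify_stop_uname``). Reacting only to
"post-stop" notifications, and passing in the local and master node
names as arguments, are assumptions of this example.

```python
def nodes_to_offline(env, local_node, master_node):
    """Return the nodes the master should offline for this notification.

    env is a mapping such as os.environ. Nodes other than the master
    take no action and get an empty list.
    """
    if local_node != master_node:
        return []
    if env.get("OCF_RESKEY_CRM_meta_notify_type") != "post":
        return []
    if env.get("OCF_RESKEY_CRM_meta_notify_operation") != "stop":
        return []
    # Space-separated names of the nodes where the clone was stopped.
    stopped = env.get("OCF_RESKEY_CRM_meta_notify_stop_uname", "").split()
    return [name for name in stopped if name != local_node]
```
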
The following cluster tags are supported:

- ``ocf:node-offline:use-powercycle``: Try to powercycle a node using
  ``gnt-node powercycle`` when offlining.
- ``ocf:node-offline:use-poweroff``: Try to power off a node using
  ``gnt-node power off`` when offlining (requires OOB support).

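To make the effect of these tags concrete, the sketch below builds the
list of commands the master might run when offlining a node. The
commands are only constructed, not executed. The ordering (powercycle
first, then offline, then power off) and the exact command-line flags
are assumptions of this example, not a description of the shipped
agent.

```python
POWERCYCLE_TAG = "ocf:node-offline:use-powercycle"
POWEROFF_TAG = "ocf:node-offline:use-poweroff"


def offline_commands(node, cluster_tags):
    """Return the gnt-node command lines to run when offlining node."""
    commands = []
    if POWERCYCLE_TAG in cluster_tags:
        # Assumed flag: --yes skips the interactive confirmation.
        commands.append(["gnt-node", "powercycle", "--yes", node])
    # Marking the node offline always happens.
    commands.append(["gnt-node", "modify", "--offline=yes", node])
    if POWEROFF_TAG in cluster_tags:
        # Requires out-of-band (OOB) support to be configured.
        commands.append(["gnt-node", "power", "off", node])
    return commands
```
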
Future improvements
+++++++++++++++++++

- Handle draining differently than offlining
- Handle different modes of "stopping" the service
- Implement "full" mode


Risks
-----

Running Ganeti with Pacemaker increases the risk of instability for
your Ganeti cluster. Events like:

- stopping Heartbeat or Corosync on a node
- Corosync or Heartbeat being killed for any reason
- temporary failure in a node's networking

will trigger potentially dangerous operations such as node offlining or
master role failover. Moreover, if the autorepair system is running,
these events can also trigger instance failovers or migrations, and
disk replaces.

Also note that operations like master-failover or a manual node-modify
might interact badly with this setup, depending on the way your HA
system is configured (see below).

This of course is an inherent problem with any Linux-HA installation,
but is probably more visible with Ganeti given that our resources tend
to be more heavyweight than many others managed in HA clusters (e.g. an
IP address).

Code status
-----------

This code is heavily experimental, and Linux-HA is a very complex
subsystem. *We might not be able to help you* if you decide to run this
code: please make sure you fully understand high availability before
using it on your production machines. Ganeti only ships this code as an
example, and it might need customization or complex configurations on
your side for it to run properly.

*Ganeti does not automate HA configuration for your cluster*. You need
to do this job by hand. Good luck, don't get it wrong.


Future work
===========

- Integrate the agents better with the Ganeti monitoring
- Add hooks for managing HA at node add/remove/modify/master-failover
  operations
- Provide a stonith system through Ganeti's OOB system
- Provide an OOB system that does "shunning" of offline nodes, for
  emulating a real OOB, at least on all nodes

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: