=======================
Ganeti monitoring agent
=======================

.. contents:: :depth: 4

This is a design document detailing the implementation of a Ganeti
monitoring agent report system that can be queried by a monitoring
system to calculate health information for a Ganeti cluster.

Current state and shortcomings
==============================

There is currently no monitoring support in Ganeti. While we don't want
to build something like Nagios or Pacemaker as part of Ganeti, it would
be useful if such tools could easily extract information from a Ganeti
machine in order to take actions (example actions include logging an
outage for future reporting or alerting a person or system about it).

Proposed changes
================

Each Ganeti node should export a status page that can be queried by a
monitoring system. This status page will be exported on a network port
and will be encoded in JSON (simple text) over HTTP.

JSON is the natural choice because Ganeti already depends on it, so no
extra libraries are needed to use it, as opposed to what would happen
for XML or some other markup format.

Location of agent report
------------------------

The report will be available from all nodes and will cover all
node-local resources. This allows more real-time information to be
available, at the cost of querying all nodes.

Information reported
--------------------

The monitoring agent system will report on the following basic information:

- Instance status
- Instance disk status
- Status of storage for instances
- Ganeti daemons status, CPU usage, memory footprint
- Hypervisor resources report (memory, CPU, network interfaces)
- Node OS resources report (memory, CPU, network interfaces)
- Information from a plugin system

Format of the report
--------------------

The report will be in JSON format, and it will present an array
of report objects.
Each report object will be produced by a specific data collector.
Each report object includes some mandatory fields, to be provided by all
the data collectors:

``name``
  The name of the data collector that produced this part of the report.
  It is supposed to be unique inside a report.

``version``
  The version of the data collector that produces this part of the
  report. Built-in data collectors (as opposed to those implemented as
  plugins) should have "B" as the version number.

``format_version``
  The format of what is represented in the "data" field for each data
  collector might change over time. Every time this happens, the
  format_version should be changed, so that whoever reads the report knows
  what format to expect, and how to correctly interpret it.

``timestamp``
  The time when the reported data were gathered. It has to be expressed
  in nanoseconds since the Unix epoch (0:00:00 January 01, 1970). If not
  enough precision is available (or needed) it can be padded with
  zeroes. If a report object needs multiple timestamps, it can add more
  and/or override this one inside its own "data" section.

``category``
  A collector can belong to a given category of collectors (e.g.: storage
  collectors, daemon collector). This means that it will have to provide a
  minimum set of prescribed fields, as documented for each category.
  This field will contain the name of the category the collector belongs to,
  if any, or just the ``null`` value.

``kind``
  Two kinds of collectors are possible:
  `Performance reporting collectors`_ and `Status reporting collectors`_.
  The respective paragraphs will describe them and the value of this field.

``data``
  This field contains all the data generated by the specific data collector,
  in its own independently defined format. The monitoring agent could check
  this syntactically (according to the JSON specifications) but not
  semantically.

Here follows a minimal example of a report::

  [
  {
      "name" : "TheCollectorIdentifier",
      "version" : "1.2",
      "format_version" : 1,
      "timestamp" : 1351607182000000000,
      "category" : null,
      "kind" : 0,
      "data" : { "plugin_specific_data" : "go_here" }
  },
  {
      "name" : "AnotherDataCollector",
      "version" : "B",
      "format_version" : 7,
      "timestamp" : 1351609526123854000,
      "category" : "storage",
      "kind" : 1,
      "data" : { "status" : { "code" : 1,
                              "message" : "Error on disk 2"
                            },
                 "plugin_specific" : "data",
                 "some_late_data" : { "timestamp" : 1351609526123942720,
                                      ...
                                    }
               }
  }
  ]

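As an illustration of how a consumer might handle this format, the following
is a minimal sketch (not part of the design) that checks a report for the
mandatory fields listed above and converts a ``timestamp`` to a readable date.
The function names ``check_report`` and ``timestamp_to_datetime`` are
hypothetical and exist only for this example.

```python
import json
from datetime import datetime, timezone

# Mandatory fields of every report object, as listed in this section.
MANDATORY_FIELDS = ("name", "version", "format_version", "timestamp",
                    "category", "kind", "data")


def check_report(text):
    """Parse a report and return a list of (collector name, problem) pairs."""
    problems = []
    seen = set()
    for obj in json.loads(text):
        name = obj.get("name", "<unnamed>")
        for field in MANDATORY_FIELDS:
            if field not in obj:
                problems.append((name, "missing field: " + field))
        # Collector names are supposed to be unique inside a report.
        if name in seen:
            problems.append((name, "duplicate collector name"))
        seen.add(name)
    return problems


def timestamp_to_datetime(ns):
    """Convert nanoseconds since the Unix epoch to a UTC datetime.

    Python datetimes only hold microseconds, so the last three digits
    of the nanosecond value are dropped.
    """
    seconds, nanos = divmod(ns, 10**9)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=nanos // 1000)


sample = """[{"name": "TheCollectorIdentifier", "version": "1.2",
              "format_version": 1, "timestamp": 1351607182000000000,
              "category": null, "kind": 0,
              "data": {"plugin_specific_data": "go_here"}}]"""
print(check_report(sample))
print(timestamp_to_datetime(1351607182000000000).isoformat())
```

Only the syntactic layer is checked here; the content of ``data`` is left to
whoever knows the collector-specific format.
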
Performance reporting collectors
++++++++++++++++++++++++++++++++

These collectors only provide data about some component of the system, without
giving any interpretation of its meaning.

The value of the ``kind`` field of the report will be ``0``.

Status reporting collectors
+++++++++++++++++++++++++++

These collectors will provide information about the status of some
component of Ganeti, or managed by Ganeti.

The value of their ``kind`` field will be ``1``.

The rationale behind this kind of collector is that there are some situations
where exporting data about the underlying subsystems would expose potential
issues. But if Ganeti itself is able (and going) to fix the problem, conflicts
might arise between Ganeti and something/somebody else trying to fix the same
problem.
Also, some external monitoring systems might not be aware of the internals of a
particular subsystem (e.g.: DRBD) and might only exploit the high level
response of its data collector, alerting an administrator if anything is wrong.
Still, completely hiding the underlying data is not a good idea, as they might
still be of use in some cases. So status reporting plugins will provide two
output modes: one exporting just high level information about the status,
and one also exporting all the data they gathered.
The default output mode will be the status-only one. The verbose output mode,
providing all the data, can be selected through a command line parameter (for
stand-alone data collectors) or through the HTTP request to the monitoring
agent (when collectors are executed as part of it).

When exporting just the status, each status reporting collector will provide,
in its ``data`` section, at least the following field:

``status``
  summarizes the status of the component being monitored and consists of two
  subfields:

  ``code``
    It assumes a numeric value, encoded in such a way as to allow using a
    bitset to easily distinguish which states are currently present in the
    whole cluster. If the bitwise OR of all the ``code`` fields is 0, the
    cluster is completely healthy.
    The status codes are as follows:

    ``0``
      The collector can determine that everything is working as
      intended.

    ``1``
      Something is temporarily wrong but it is being automatically fixed by
      Ganeti.
      There is no need of external intervention.

    ``2``
      The collector has failed to understand whether the status is good or
      bad. Further analysis is required. Interpret this status as a
      potentially dangerous situation.

    ``4``
      The collector can determine that something is wrong and Ganeti has no
      way to fix it autonomously. External intervention is required.

  ``message``
    A message to better explain the reason of the status.
    The exact format of the message string is data collector dependent.

    The field is mandatory, but the content can be an empty string if the
    ``code`` is ``0`` (working as intended) or ``1`` (being fixed
    automatically).

    If the status code is ``2``, the message should explain why it was not
    possible to determine a proper status.
    If the status code is ``4``, the message should specify what has gone
    wrong.

The ``data`` section will also contain all the fields describing the gathered
data, according to a collector-specific format.

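The bitset encoding of ``code`` can be sketched as follows. This is only an
illustration of how a monitoring system might combine the codes from several
nodes; the constant and function names are hypothetical, and the input is
assumed to be a list of already-parsed reports, one per node.

```python
from functools import reduce
from operator import or_

# Status codes as listed above; the constant names are illustrative only.
WORKING, SELF_HEALING, UNKNOWN, NEEDS_INTERVENTION = 0, 1, 2, 4


def cluster_code(node_reports):
    """OR together the codes of all status reporting collectors (kind 1)."""
    codes = (obj["data"]["status"]["code"]
             for report in node_reports
             for obj in report
             if obj["kind"] == 1)
    return reduce(or_, codes, 0)


def states_present(code):
    """Return the names of the non-zero states set in an aggregated code."""
    names = {SELF_HEALING: "self-healing",
             UNKNOWN: "unknown",
             NEEDS_INTERVENTION: "needs intervention"}
    return [name for bit, name in names.items() if code & bit]


# Two hypothetical node reports, reduced to the fields used here.
reports = [
    [{"kind": 1, "data": {"status": {"code": 1, "message": ""}}}],
    [{"kind": 0, "data": {}},
     {"kind": 1, "data": {"status": {"code": 4,
                                     "message": "Error on disk 2"}}}],
]
total = cluster_code(reports)
print(total, states_present(total))
```

Because every non-zero code is a distinct power of two, the aggregated value
keeps track of each state seen anywhere in the cluster, and it is 0 only when
every collector reports 0.
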
Instance status
+++++++++++++++

At the moment each node knows which instances are running on it and which
instances it is primary for, but not the reason why an instance might not
be running. On the other hand we don't want to distribute full instance
"admin" status information to all nodes, because of the performance
impact this would have.

As such we propose that:

- Any operation that can affect instance status will have an optional
  "reason" attached to it (at opcode level). This can be used for
  example to distinguish an admin request from a scheduled maintenance
  or an automated tool's work. If this reason is not passed, Ganeti will
  just use the information it has about the source of the request.
  This reason information will be structured according to the
  :doc:`Ganeti reason trail <design-reason-trail>` design document.
- RPCs that affect the instance status will be changed so that the
  "reason" and the version of the config object they ran on are passed to
  them. They will then export the new expected instance status, together
  with the associated reason and object version, to the status report
  system, which will then export those itself.

Monitoring and auditing systems can then use the reason to understand
the cause of an instance status, and they can use the timestamp to
understand the freshness of their data even in the absence of atomic
cross-node reporting: for example if they see an instance "up" on a node
after seeing it running on a previous one, they can compare these values
to understand which data is freshest, and repoll the "older" node. Of
course if they keep seeing this status this represents an error (either
an instance continuously "flapping" between nodes, or an instance
constantly up on more than one), which should be reported and acted
upon.

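The freshness comparison described above can be sketched as follows. The
function ``nodes_to_repoll`` and its input layout (a mapping from node name to
the timestamp, in nanoseconds, attached to that node's sighting of the
instance) are assumptions made for this example, not part of the design.

```python
def nodes_to_repoll(sightings):
    """Return the nodes whose data is older than the freshest sighting.

    sightings maps a node name to the timestamp (nanoseconds since the
    Unix epoch) at which that node reported the instance as "up".
    """
    if not sightings:
        return []
    freshest = max(sightings.values())
    return sorted(node for node, ts in sightings.items() if ts < freshest)


# Hypothetical example: the same instance seen "up" on two nodes.
sightings = {
    "node1.example.com": 1351609526123854000,
    "node2.example.com": 1351609526123942720,
}
print(nodes_to_repoll(sightings))
```

If the repolled node keeps reporting the instance as up, the situation should
be treated as an error, as explained above.
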
The instance status will be reported by each node for the instances it is
primary for, and the ``data`` section of the report will contain a list
of instances, with at least the following fields for each instance:

``name``
  The name of the instance.

``uuid``
  The UUID of the instance (stable on name change).

``admin_state``
  The status of the instance (up/down/offline) as requested by the admin.

``actual_state``
  The actual status of the instance. It can be ``up``, ``down``, or
  ``hung`` if the instance is up but it appears to be completely stuck.

``uptime``
  The uptime of the instance (if it is up, "null" otherwise).

``mtime``
  The timestamp of the last known change to the instance state.

``state_reason``
  The last known reason for state change of the instance, described according
  to the JSON representation of a reason trail, as detailed in the
  :doc:`reason trail design document <design-reason-trail>`.

``status``
  It represents the status of the instance, and its format is the same as that
  of the ``status`` field of `Status reporting collectors`_.

Each hypervisor should provide its own instance status data collector, possibly
with the addition of more, specific, fields.
The ``category`` field of all of them will be ``instance``.
The ``kind`` field will be ``1``.

Note that as soon as a node knows it's not the primary anymore for an
instance it will stop reporting status for it: this means the instance
will either disappear, if it has been deleted, or appear on another
node, if it's been moved.

The ``code`` of the ``status`` field of the report of the Instance status data
collector will be:

``0``
  if the ``status`` code is ``0`` for all the instances it is reporting about.

``1``
  otherwise.

Storage status
++++++++++++++

The storage status collectors will be a series of data collectors
(drbd, rbd, plain, file) that will gather data about all the storage types
for the current node (this is currently hardcoded to the enabled storage
types, and will in the future be tied to the enabled storage pools for the
nodegroup).

The ``name`` of each of these collectors will reflect what storage type each of
them refers to.

The ``category`` field of these collectors will be ``storage``.

The ``kind`` field will be ``1`` (`Status reporting collectors`_).

The ``data`` section of the report will provide at least the following fields:

``free``
  The amount of free space (in KBytes).

``used``
  The amount of used space (in KBytes).

``total``
  The total visible space (in KBytes).

Each specific storage type might provide more type-specific fields.

In case of error, the ``message`` subfield of the ``status`` field of the
storage status collector's report will disclose the nature of the error
as type-specific information. Examples of these are "backend pv unavailable"
for lvm storage, "unreachable" for network based storage or "filesystem error"
for filesystem based implementations.

DRBD status
***********

This data collector will run only on nodes where DRBD is actually
present and it will gather information about DRBD devices.

Its ``kind`` in the report will be ``1`` (`Status reporting collectors`_).

Its ``category`` field in the report will contain the value ``storage``.

When executed in verbose mode, the ``data`` section of the report of this
collector will provide the following fields:

``versionInfo``
  Information about the DRBD version number, given by a combination of
  any (but at least one) of the following fields:

  ``version``
    The DRBD driver version.

  ``api``
    The API version number.

  ``proto``
    The protocol version.

  ``srcversion``
    The version of the source files.

  ``gitHash``
    Git hash of the source files.

  ``buildBy``
    Who built the binary, and, optionally, when.

367 | 3301805f | Michele Tartara | ``device`` |
368 | 3301805f | Michele Tartara | A list of structures, each describing a DRBD device (a minor) and containing |
369 | 3301805f | Michele Tartara | the following fields: |
370 | 3301805f | Michele Tartara | |
371 | 3301805f | Michele Tartara | ``minor`` |
372 | 3301805f | Michele Tartara | The device minor number. |
373 | 3301805f | Michele Tartara | |
374 | 3301805f | Michele Tartara | ``connectionState`` |
375 | 3301805f | Michele Tartara | The state of the connection. If it is "Unconfigured", all the following |
376 | 3301805f | Michele Tartara | fields are not present. |
377 | 3301805f | Michele Tartara | |
378 | 3301805f | Michele Tartara | ``localRole`` |
379 | 3301805f | Michele Tartara | The role of the local resource. |
380 | 3301805f | Michele Tartara | |
381 | 3301805f | Michele Tartara | ``remoteRole`` |
382 | 3301805f | Michele Tartara | The role of the remote resource. |
383 | 3301805f | Michele Tartara | |
384 | 3301805f | Michele Tartara | ``localState`` |
385 | 3301805f | Michele Tartara | The status of the local disk. |
386 | 3301805f | Michele Tartara | |
387 | 3301805f | Michele Tartara | ``remoteState`` |
388 | 3301805f | Michele Tartara | The status of the remote disk. |
389 | 3301805f | Michele Tartara | |
390 | 3301805f | Michele Tartara | ``replicationProtocol`` |
391 | 3301805f | Michele Tartara | The replication protocol being used. |
392 | 3301805f | Michele Tartara | |
393 | 3301805f | Michele Tartara | ``ioFlags`` |
394 | 3301805f | Michele Tartara | The input/output flags. |
395 | 3301805f | Michele Tartara | |
396 | 3301805f | Michele Tartara | ``perfIndicators`` |
397 | 3301805f | Michele Tartara | The performance indicators. This field will contain the following |
398 | 3301805f | Michele Tartara | sub-fields: |
399 | 3301805f | Michele Tartara | |
400 | 3301805f | Michele Tartara | ``networkSend`` |
401 | 3301805f | Michele Tartara | KiB of data sent on the network. |
402 | 3301805f | Michele Tartara | |
403 | 3301805f | Michele Tartara | ``networkReceive`` |
404 | 3301805f | Michele Tartara | KiB of data received from the network. |
405 | 3301805f | Michele Tartara | |
406 | 3301805f | Michele Tartara | ``diskWrite`` |
407 | 3301805f | Michele Tartara | KiB of data written on local disk. |
408 | 3301805f | Michele Tartara | |
409 | 3301805f | Michele Tartara | ``diskRead`` |
410 | 3301805f | Michele Tartara | KiB of date read from the local disk. |
411 | 3301805f | Michele Tartara | |
412 | 3301805f | Michele Tartara | ``activityLog`` |
413 | 3301805f | Michele Tartara | Number of updates of the activity log. |
414 | 3301805f | Michele Tartara | |
415 | 3301805f | Michele Tartara | ``bitMap`` |
416 | 3301805f | Michele Tartara | Number of updates to the bitmap area of the metadata. |
417 | 3301805f | Michele Tartara | |
418 | 3301805f | Michele Tartara | ``localCount`` |
419 | 3301805f | Michele Tartara | Number of open requests to the local I/O subsystem. |
420 | 3301805f | Michele Tartara | |
421 | 3301805f | Michele Tartara | ``pending`` |
422 | 3301805f | Michele Tartara | Number of requests sent to the partner but not yet answered. |
423 | 3301805f | Michele Tartara | |
424 | 3301805f | Michele Tartara | ``unacknowledged`` |
425 | 3301805f | Michele Tartara | Number of requests received by the partner but still to be answered. |
426 | 3301805f | Michele Tartara | |
427 | 3301805f | Michele Tartara | ``applicationPending`` |
428 | 3301805f | Michele Tartara | Num of block input/output requests forwarded to DRBD but that have not yet |
429 | 3301805f | Michele Tartara | been answered. |
430 | 3301805f | Michele Tartara | |
431 | 3301805f | Michele Tartara | ``epochs`` |
432 | 3301805f | Michele Tartara | (Optional) Number of epoch objects. Not provided by all DRBD versions. |
433 | 3301805f | Michele Tartara | |
434 | 3301805f | Michele Tartara | ``writeOrder`` |
435 | 3301805f | Michele Tartara | (Optional) Currently used write ordering method. Not provided by all DRBD |
436 | 3301805f | Michele Tartara | versions. |
437 | 3301805f | Michele Tartara | |
438 | 3301805f | Michele Tartara | ``outOfSync`` |
439 | 3301805f | Michele Tartara | (Optional) KiB of storage currently out of sync. Not provided by all DRBD |
440 | 3301805f | Michele Tartara | versions. |
441 | 3301805f | Michele Tartara | |
442 | 3301805f | Michele Tartara | ``syncStatus`` |
443 | 3301805f | Michele Tartara | (Optional) The status of the synchronization of the disk. This is present |
444 | 3301805f | Michele Tartara | only if the disk is being synchronized, and includes the following fields: |
445 | 3301805f | Michele Tartara | |
446 | 3301805f | Michele Tartara | ``percentage`` |
447 | 3301805f | Michele Tartara | The percentage of synchronized data. |
448 | 3301805f | Michele Tartara | |
449 | 3301805f | Michele Tartara | ``progress`` |
450 | 3301805f | Michele Tartara | How far the synchronization is. Written as "x/y", where x and y are |
451 | 3301805f | Michele Tartara | integer numbers expressed in the measurement unit stated in |
452 | 3301805f | Michele Tartara | ``progressUnit`` |
453 | 3301805f | Michele Tartara | |
454 | 3301805f | Michele Tartara | ``progressUnit`` |
455 | 3301805f | Michele Tartara | The measurement unit for the progress indicator. |
456 | 3301805f | Michele Tartara | |
457 | 3301805f | Michele Tartara | ``timeToFinish`` |
458 | 3301805f | Michele Tartara | The expected time before finishing the synchronization. |
459 | 3301805f | Michele Tartara | |
460 | 3301805f | Michele Tartara | ``speed`` |
461 | 3301805f | Michele Tartara | The speed of the synchronization. |
462 | 3301805f | Michele Tartara | |
463 | 3301805f | Michele Tartara | ``want`` |
464 | 3301805f | Michele Tartara | The desired speed of the synchronization. |
465 | 3301805f | Michele Tartara | |
466 | 3301805f | Michele Tartara | ``speedUnit`` |
467 | 3301805f | Michele Tartara | The measurement unit of the ``speed`` and ``want`` values. Expressed |
468 | 3301805f | Michele Tartara | as "size/time". |
469 | 3301805f | Michele Tartara | |
470 | 3301805f | Michele Tartara | ``instance`` |
471 | 3301805f | Michele Tartara | The name of the Ganeti instance this disk is associated with. |
472 | 109e07c2 | Guido Trotter | |
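A consumer of the ``syncStatus`` object might split the ``progress``
string as sketched below. This is only an illustration: the function
name ``parse_progress`` and the sample values are assumptions, not part
of the design.

```python
def parse_progress(sync_status):
    """Return (done, total, unit) from a syncStatus object.

    ``progress`` is documented as "x/y", with both numbers expressed
    in the unit given by ``progressUnit``.
    """
    done, total = (int(part) for part in sync_status["progress"].split("/"))
    return done, total, sync_status["progressUnit"]


# Made-up sample values, for illustration only.
sample = {"percentage": 25.0, "progress": "256/1024", "progressUnit": "M"}
done, total, unit = parse_progress(sample)
print(f"{done} of {total} {unit} synchronized")
```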
473 | 109e07c2 | Guido Trotter | |
474 | 109e07c2 | Guido Trotter | Ganeti daemons status |
475 | 109e07c2 | Guido Trotter | +++++++++++++++++++++ |
476 | 109e07c2 | Guido Trotter | |
477 | 3301805f | Michele Tartara | Ganeti will report what information it has about its own daemons. |
478 | 3301805f | Michele Tartara | This should allow identifying possible problems with the Ganeti system itself: |
479 | 3301805f | Michele Tartara | for example memory leaks, crashes and high resource utilization should be |
480 | 3301805f | Michele Tartara | evident by analyzing this information. |
481 | 3301805f | Michele Tartara | |
482 | 3301805f | Michele Tartara | The ``kind`` field will be ``1`` (`Status reporting collectors`_). |
483 | 3301805f | Michele Tartara | |
484 | 3301805f | Michele Tartara | Each daemon will have its own data collector, and each of them will have |
485 | 3301805f | Michele Tartara | a ``category`` field valued ``daemon``. |
486 | 3301805f | Michele Tartara | |
487 | 3301805f | Michele Tartara | When executed in verbose mode, their data section will include at least: |
488 | 3301805f | Michele Tartara | |
489 | 3301805f | Michele Tartara | ``memory`` |
490 | 3301805f | Michele Tartara | The amount of used memory. |
491 | 3301805f | Michele Tartara | |
492 | 3301805f | Michele Tartara | ``size_unit`` |
493 | 3301805f | Michele Tartara | The measurement unit used for the memory. |
494 | 109e07c2 | Guido Trotter | |
495 | 3301805f | Michele Tartara | ``uptime`` |
496 | 3301805f | Michele Tartara | The uptime of the daemon. |
497 | 3301805f | Michele Tartara | |
498 | 3301805f | Michele Tartara | ``CPU usage`` |
499 | 3301805f | Michele Tartara | How much CPU the daemon is using (percentage). |
500 | 3301805f | Michele Tartara | |
501 | 3301805f | Michele Tartara | Any other daemon-specific information can be included as well in the ``data`` |
502 | 3301805f | Michele Tartara | section. |
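
As a rough illustration, a daemon collector report with the fields
listed above could be assembled as follows. The helper
``daemon_report``, the top-level ``name`` key, the sample values and the
use of seconds for ``uptime`` are assumptions made for this sketch; only
``kind``, ``category`` and the listed ``data`` keys come from this
section.

```python
import json


def daemon_report(name, memory, size_unit, uptime, cpu_usage, **extra):
    """Build a report object for one daemon (hypothetical helper)."""
    data = {
        "memory": memory,
        "size_unit": size_unit,
        "uptime": uptime,        # unit assumed to be seconds
        "CPU usage": cpu_usage,  # percentage
    }
    # Daemon-specific information may be added to the data section.
    data.update(extra)
    return {"name": name, "category": "daemon", "kind": 1, "data": data}


report = daemon_report("ganeti-noded", 20480, "KiB", 3600, 1.5, threads=4)
print(json.dumps(report, indent=2))
```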
503 | 109e07c2 | Guido Trotter | |
504 | 109e07c2 | Guido Trotter | Hypervisor resources report |
505 | 109e07c2 | Guido Trotter | +++++++++++++++++++++++++++ |
506 | 109e07c2 | Guido Trotter | |
507 | 109e07c2 | Guido Trotter | Each hypervisor has a view of system resources that sometimes is |
508 | 109e07c2 | Guido Trotter | different than the one the OS sees (for example in Xen the Node OS, |
509 | 109e07c2 | Guido Trotter | running as Dom0, has access to only part of those resources). In this |
510 | 109e07c2 | Guido Trotter | section we'll report all information we can in a "non hypervisor |
511 | 109e07c2 | Guido Trotter | specific" way. Each hypervisor can then add extra specific information |
512 | 109e07c2 | Guido Trotter | that is not generic enough to be abstracted. |
513 | 109e07c2 | Guido Trotter | |
514 | 3301805f | Michele Tartara | The ``kind`` field will be ``0`` (`Performance reporting collectors`_). |
515 | 3301805f | Michele Tartara | |
516 | 3301805f | Michele Tartara | Each of the hypervisor data collectors will be of ``category``: ``hypervisor``. |
517 | 3301805f | Michele Tartara | |
518 | 109e07c2 | Guido Trotter | Node OS resources report |
519 | 109e07c2 | Guido Trotter | ++++++++++++++++++++++++ |
520 | 109e07c2 | Guido Trotter | |
521 | 109e07c2 | Guido Trotter | Since Ganeti assumes it's running on Linux, it's useful to export some |
522 | 3301805f | Michele Tartara | basic information as seen by the host system. |
523 | 109e07c2 | Guido Trotter | |
524 | 3301805f | Michele Tartara | The ``category`` field of the report will be ``null``. |
525 | 109e07c2 | Guido Trotter | |
526 | 3301805f | Michele Tartara | The ``kind`` field will be ``0`` (`Performance reporting collectors`_). |
527 | 109e07c2 | Guido Trotter | |
528 | 3301805f | Michele Tartara | The ``data`` section will include: |
529 | 109e07c2 | Guido Trotter | |
530 | 3301805f | Michele Tartara | ``cpu_number`` |
531 | 3301805f | Michele Tartara | The number of available CPUs. |
532 | 109e07c2 | Guido Trotter | |
533 | 3301805f | Michele Tartara | ``cpus`` |
534 | 3301805f | Michele Tartara | A list with one element per CPU, showing its average load. |
535 | 109e07c2 | Guido Trotter | |
536 | 3301805f | Michele Tartara | ``memory`` |
537 | 3301805f | Michele Tartara | The current view of memory (free, used, cached, etc.) |
538 | 109e07c2 | Guido Trotter | |
539 | 3301805f | Michele Tartara | ``filesystem`` |
540 | 3301805f | Michele Tartara | A list with one element per filesystem, showing a summary of the |
541 | 3301805f | Michele Tartara | total/available space. |
542 | 109e07c2 | Guido Trotter | |
543 | 3301805f | Michele Tartara | ``NICs`` |
544 | 3301805f | Michele Tartara | A list with one element per network interface, showing the amount of |
545 | 3301805f | Michele Tartara | sent/received data, error rate, IP address of the interface, etc. |
546 | 109e07c2 | Guido Trotter | |
547 | 3301805f | Michele Tartara | ``versions`` |
548 | 3301805f | Michele Tartara | A map using the name of a component Ganeti interacts with (Linux, DRBD, |
549 | 3301805f | Michele Tartara | hypervisor, etc.) as the key and its version number as the value. |
550 | 109e07c2 | Guido Trotter | |
551 | 3301805f | Michele Tartara | Note that we won't go into any hardware specific details (e.g. querying a |
552 | 3301805f | Michele Tartara | node RAID is outside the scope of this, and can be implemented as a |
553 | 3301805f | Michele Tartara | plugin) but we can easily just report the information above, since it's |
554 | 3301805f | Michele Tartara | standard enough across all systems. |
555 | 9ef3e121 | Michele Tartara | |
556 | b166dcfc | Michele Tartara | Format of the query |
557 | b166dcfc | Michele Tartara | ------------------- |
558 | b166dcfc | Michele Tartara | |
559 | 431ff2c1 | Michele Tartara | .. include:: monitoring-query-format.rst |
560 | b166dcfc | Michele Tartara | |
561 | 3301805f | Michele Tartara | Instance disk status propagation |
562 | 3301805f | Michele Tartara | -------------------------------- |
563 | 9ef3e121 | Michele Tartara | |
564 | 3301805f | Michele Tartara | As with the instance status, Ganeti currently has only partial information |
565 | 3301805f | Michele Tartara | about its instance disks: in particular each node is unaware of the |
566 | 3301805f | Michele Tartara | disk-to-instance mapping, which exists only on the master. |
567 | 9ef3e121 | Michele Tartara | |
568 | 3301805f | Michele Tartara | For this design doc we plan to fix this by changing all RPCs that create |
569 | 3301805f | Michele Tartara | a backend storage, or that put an already existing one in use, so that |
570 | 3301805f | Michele Tartara | they pass the relevant instance to the node. The node can then export |
571 | 3301805f | Michele Tartara | this mapping to the status reporting tool. |
572 | 9ef3e121 | Michele Tartara | |
573 | 3301805f | Michele Tartara | Until these RPC changes are implemented, we'll use Confd to |
574 | 3301805f | Michele Tartara | fetch this information in the data collectors. |
575 | 9ef3e121 | Michele Tartara | |
576 | 3301805f | Michele Tartara | Plugin system |
577 | 3301805f | Michele Tartara | ------------- |
578 | 9ef3e121 | Michele Tartara | |
579 | 3301805f | Michele Tartara | The monitoring system will be equipped with a plugin system that can |
580 | 3301805f | Michele Tartara | export specific local information through it. |
581 | 9ef3e121 | Michele Tartara | |
582 | 3301805f | Michele Tartara | The plugin system is expected to be used by local installations to |
583 | 3301805f | Michele Tartara | export any installation specific information that they want to be |
584 | 3301805f | Michele Tartara | monitored, about either hardware or software on their systems. |
585 | 9ef3e121 | Michele Tartara | |
586 | 3301805f | Michele Tartara | Plugins will take the form of either scripts or binaries whose output |
587 | 3301805f | Michele Tartara | will be inserted in the report. |
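
The following sketch shows one way the agent could run such plugins and
insert their output in the report. It rests on several assumptions that
this document does not specify: that plugins print JSON on stdout, that
their output is keyed by a plugin name, and the names ``run_plugin`` and
``collect_plugins`` themselves. The demo plugin is a throwaway Python
one-liner standing in for a real script.

```python
import json
import subprocess
import sys


def run_plugin(argv, timeout=10):
    """Run one plugin and return its stdout parsed as JSON (assumed format)."""
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(result.stdout)


def collect_plugins(plugins):
    """Map each (hypothetical) plugin name to the output of its command."""
    return {name: run_plugin(argv) for name, argv in plugins.items()}


# Stand-in for an installation-specific plugin, e.g. a RAID status check.
demo_plugins = {
    "raid": [sys.executable, "-c",
             "import json; print(json.dumps({'status': 'ok'}))"],
}
print(collect_plugins(demo_plugins))
```

A real implementation would also have to decide how to report plugins
that fail or time out; here ``subprocess.run`` simply raises.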
588 | 109e07c2 | Guido Trotter | |
589 | 3301805f | Michele Tartara | Eventually support for other kinds of plugins might be added as well, such as |
590 | 3301805f | Michele Tartara | plain text files which will be inserted into the report, or local unix or |
591 | 3301805f | Michele Tartara | network sockets from which the information has to be read. This should allow |
592 | 3301805f | Michele Tartara | the most flexibility for implementing an efficient system, while keeping |
593 | 3301805f | Michele Tartara | it as simple as possible. |
594 | 109e07c2 | Guido Trotter | |
595 | 109e07c2 | Guido Trotter | Data collectors |
596 | 109e07c2 | Guido Trotter | --------------- |
597 | 109e07c2 | Guido Trotter | |
598 | 109e07c2 | Guido Trotter | In order to ease testing as well as to make it simple to reuse this |
599 | 109e07c2 | Guido Trotter | subsystem it will be possible to run just the "data collectors" on each |
600 | 3301805f | Michele Tartara | node without passing through the agent daemon. |
601 | 109e07c2 | Guido Trotter | |
602 | 9ef3e121 | Michele Tartara | If a data collector is run independently, it should print on stdout its |
603 | 9ef3e121 | Michele Tartara | report, according to the format corresponding to a single data collector |
604 | 3301805f | Michele Tartara | report object, as described in the previous paragraphs. |
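
A stand-alone collector could therefore be as small as the sketch below,
which writes a single report object to stdout. The collector name
``example-os`` and the choice to fill only ``cpu_number`` are
assumptions made for illustration; a real report would carry all the
fields required by the report format.

```python
import json
import os
import sys


def build_report():
    """Build a minimal report object for a hypothetical collector."""
    return {
        "name": "example-os",   # hypothetical collector name
        "category": None,       # serialized as JSON null
        "kind": 0,              # performance reporting collector
        "data": {"cpu_number": os.cpu_count()},
    }


def main(out=sys.stdout):
    """Print the report as one JSON document, without any caching."""
    json.dump(build_report(), out)
    out.write("\n")


if __name__ == "__main__":
    main()
```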
605 | 109e07c2 | Guido Trotter | |
606 | 109e07c2 | Guido Trotter | Mode of operation |
607 | 109e07c2 | Guido Trotter | ----------------- |
608 | 109e07c2 | Guido Trotter | |
609 | 109e07c2 | Guido Trotter | In order to be able to report information fast the monitoring agent |
610 | 109e07c2 | Guido Trotter | daemon will keep an in-memory or on-disk cache of the status, which will |
611 | 109e07c2 | Guido Trotter | be returned when queries are made. The status system will then |
612 | 109e07c2 | Guido Trotter | periodically check resources to make sure the status is up to date. |
613 | 109e07c2 | Guido Trotter | |
614 | 109e07c2 | Guido Trotter | Different parts of the report will be queried at different speeds. These |
615 | 109e07c2 | Guido Trotter | will depend on: |
616 | 109e07c2 | Guido Trotter | - how often they vary (or we expect them to vary) |
617 | 109e07c2 | Guido Trotter | - how fast they are to query |
618 | 109e07c2 | Guido Trotter | - how important their freshness is |
619 | 109e07c2 | Guido Trotter | |
620 | 109e07c2 | Guido Trotter | Of course the last parameter is installation specific, and while we'll |
621 | 109e07c2 | Guido Trotter | try to have defaults, it will be configurable. The first two instead we |
622 | 109e07c2 | Guido Trotter | can use adaptively to query a certain resource faster or slower |
623 | 109e07c2 | Guido Trotter | depending on those two parameters. |
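
The caching behaviour can be sketched as follows, with one refresh
interval per collector. Note that this sketch refreshes lazily when a
query finds the cached value too old, rather than from a periodic
background check as described above; it is a simplification to keep the
example short. The class name ``CachedCollector`` and its parameters are
hypothetical.

```python
import time


class CachedCollector:
    """Wrap a collector function and cache its result for `interval` seconds."""

    def __init__(self, collect, interval, clock=time.monotonic):
        self._collect = collect
        self._interval = interval
        self._clock = clock
        self._value = None
        self._expires = None  # no value cached yet

    def get(self):
        """Return the cached value, refreshing it first if it has expired."""
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._collect()
            self._expires = now + self._interval
        return self._value


# A slow-changing resource can use a long interval, a volatile one a short one.
versions = CachedCollector(lambda: {"python": "3"}, interval=3600)
print(versions.get())
```

An adaptive variant could adjust ``interval`` based on how often the
collected value actually changes and on how long each collection takes.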
624 | 109e07c2 | Guido Trotter | |
625 | 3301805f | Michele Tartara | When run as stand-alone binaries, the data collectors will not use any |
626 | 3301805f | Michele Tartara | caching system, and will just fetch and return the data immediately. |
627 | 109e07c2 | Guido Trotter | |
628 | 109e07c2 | Guido Trotter | Implementation place |
629 | 109e07c2 | Guido Trotter | -------------------- |
630 | 109e07c2 | Guido Trotter | |
631 | 109e07c2 | Guido Trotter | The status daemon will be implemented as a standalone Haskell daemon. In |
632 | 109e07c2 | Guido Trotter | the future it should be easy to merge multiple daemons into one with |
633 | 109e07c2 | Guido Trotter | multiple entry points, should we find out it saves resources and doesn't |
634 | 109e07c2 | Guido Trotter | impact functionality. |
635 | 109e07c2 | Guido Trotter | |
636 | 109e07c2 | Guido Trotter | The libekg library should be looked at for easily providing metrics in |
637 | 109e07c2 | Guido Trotter | JSON format. |
638 | 109e07c2 | Guido Trotter | |
639 | 109e07c2 | Guido Trotter | Implementation order |
640 | 109e07c2 | Guido Trotter | -------------------- |
641 | 109e07c2 | Guido Trotter | |
642 | 109e07c2 | Guido Trotter | We will implement the agent system in this order: |
643 | 109e07c2 | Guido Trotter | |
644 | 3301805f | Michele Tartara | - initial example data collectors (e.g. for DRBD and instance status) |
645 | 3301805f | Michele Tartara | - initial daemon for exporting data, integrating the existing collectors |
646 | 3301805f | Michele Tartara | - plugin system |
647 | 109e07c2 | Guido Trotter | - RPC updates for instance status reasons and disk to instance mapping |
648 | 3301805f | Michele Tartara | - cache layer for the daemon |
649 | 109e07c2 | Guido Trotter | - more data collectors |
650 | 109e07c2 | Guido Trotter | |
651 | 109e07c2 | Guido Trotter | |
652 | 109e07c2 | Guido Trotter | Future work |
653 | 109e07c2 | Guido Trotter | =========== |
654 | 109e07c2 | Guido Trotter | |
655 | 109e07c2 | Guido Trotter | As a future step it can be useful to "centralize" all this reporting |
656 | 109e07c2 | Guido Trotter | data in a single place. This for example can be just the master node, or |
657 | 109e07c2 | Guido Trotter | all the master candidates. We will evaluate doing this after the first |
658 | 109e07c2 | Guido Trotter | node-local version has been developed and tested. |
659 | 109e07c2 | Guido Trotter | |
660 | 109e07c2 | Guido Trotter | Another possible change is replacing the "read-only" RPCs with queries |
661 | 109e07c2 | Guido Trotter | to the agent system, thus having only one way of collecting information |
662 | 109e07c2 | Guido Trotter | from the nodes, both for a monitoring system and for Ganeti itself. |
663 | 109e07c2 | Guido Trotter | |
664 | 109e07c2 | Guido Trotter | One extra feature we may need is a way to query for only sub-parts of |
665 | 109e07c2 | Guido Trotter | the report (e.g. instances status only). This can be done by passing |
666 | 109e07c2 | Guido Trotter | arguments to the HTTP GET, which will be defined when we get to this |
667 | 109e07c2 | Guido Trotter | functionality. |
668 | 109e07c2 | Guido Trotter | |
669 | 109e07c2 | Guido Trotter | Finally the autorepair system (see the |
670 | 109e07c2 | Guido Trotter | :doc:`autorepair system design <design-autorepair>`) can be expanded to |
671 | 109e07c2 | Guido Trotter | use the monitoring agent system as a source of information to decide |
671 | 109e07c2 | Guido Trotter | which repairs it can perform. |
672 | 109e07c2 | Guido Trotter | |
673 | 109e07c2 | Guido Trotter | .. vim: set textwidth=72 : |
674 | 109e07c2 | Guido Trotter | .. Local Variables: |
675 | 109e07c2 | Guido Trotter | .. mode: rst |
676 | 109e07c2 | Guido Trotter | .. fill-column: 72 |
677 | 109e07c2 | Guido Trotter | .. End: |