=======================
Ganeti monitoring agent
=======================

.. contents:: :depth: 4

This is a design document detailing the implementation of a Ganeti
monitoring agent report system that can be queried by a monitoring
system to calculate health information for a Ganeti cluster.

Current state and shortcomings
==============================

There is currently no monitoring support in Ganeti. While we don't want
to build something like Nagios or Pacemaker as part of Ganeti, it would
be useful if such tools could easily extract information from a Ganeti
machine in order to take actions (example actions include logging an
outage for future reporting or alerting a person or system about it).

Proposed changes
================

Each Ganeti node should export a status page that can be queried by a
monitoring system. This status page will be exported on a network port
and will be encoded in JSON (simple text) over HTTP.

The choice of JSON is obvious as we already depend on it in Ganeti and
thus we don't need to add extra libraries to use it, as opposed to what
would happen for XML or some other markup format.

Location of agent report
------------------------

The report will be available from all nodes, and will cover all
node-local resources. This allows more real-time information to be
available, at the cost of querying all nodes.

Information reported
--------------------

The monitoring agent system will report on the following basic information:

- Instance status
- Instance disk status
- Status of storage for instances
- Ganeti daemons status, CPU usage, memory footprint
- Hypervisor resources report (memory, CPU, network interfaces)
- Node OS resources report (memory, CPU, network interfaces)
- Information from a plugin system

Format of the report
--------------------

The report will be in JSON format, and it will present an array
of report objects.
Each report object will be produced by a specific data collector.
Each report object includes some mandatory fields, to be provided by all
the data collectors:

``name``
  The name of the data collector that produced this part of the report.
  It is supposed to be unique inside a report.

``version``
  The version of the data collector that produces this part of the
  report. Built-in data collectors (as opposed to those implemented as
  plugins) should have "B" as the version number.

``format_version``
  The format of what is represented in the "data" field for each data
  collector might change over time. Every time this happens, the
  format_version should be changed, so that whoever reads the report knows
  what format to expect, and how to correctly interpret it.

``timestamp``
  The time when the reported data were gathered. It has to be expressed
  in nanoseconds since the Unix epoch (0:00:00 January 01, 1970). If not
  enough precision is available (or needed) it can be padded with
  zeroes. If a report object needs multiple timestamps, it can add more
  and/or override this one inside its own "data" section.

``category``
  A collector can belong to a given category of collectors (e.g.: storage
  collectors, daemon collector). This means that it will have to provide a
  minimum set of prescribed fields, as documented for each category.
  This field will contain the name of the category the collector belongs to,
  if any, or just the ``null`` value.

``kind``
  Two kinds of collectors are possible:
  `Performance reporting collectors`_ and `Status reporting collectors`_.
  The respective paragraphs will describe them and the value of this field.

``data``
  This field contains all the data generated by the specific data collector,
  in its own independently defined format. The monitoring agent could check
  this syntactically (according to the JSON specifications) but not
  semantically.

Here follows a minimal example of a report::

  [
  {
      "name" : "TheCollectorIdentifier",
      "version" : "1.2",
      "format_version" : 1,
      "timestamp" : 1351607182000000000,
      "category" : null,
      "kind" : 0,
      "data" : { "plugin_specific_data" : "go_here" }
  },
  {
      "name" : "AnotherDataCollector",
      "version" : "B",
      "format_version" : 7,
      "timestamp" : 1351609526123854000,
      "category" : "storage",
      "kind" : 1,
      "data" : { "status" : { "code" : 1,
                              "message" : "Error on disk 2"
                            },
                 "plugin_specific" : "data",
                 "some_late_data" : { "timestamp" : 1351609526123942720,
                                      ...
                                    }
               }
  }
  ]
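
As an illustration of how a consumer of the report might check the mandatory
fields listed in this section, here is a minimal Python sketch. The function
names ``check_report`` and ``timestamp_to_datetime`` are hypothetical and not
part of Ganeti; the sketch only assumes the field names and the nanosecond
timestamp convention described above.

```python
import json
from datetime import datetime, timezone

# Mandatory fields of every report object, as listed in this section.
MANDATORY_FIELDS = ("name", "version", "format_version", "timestamp",
                    "category", "kind", "data")


def check_report(text):
    """Parse a report and return a list of (collector name, problem) pairs.

    Hypothetical helper: an empty list means no problem was found.
    """
    problems = []
    seen = set()
    for obj in json.loads(text):
        name = obj.get("name", "<unnamed>")
        for field in MANDATORY_FIELDS:
            if field not in obj:
                problems.append((name, "missing field: " + field))
        # Names are supposed to be unique inside a report.
        if name in seen:
            problems.append((name, "duplicate collector name"))
        seen.add(name)
        # Only kinds 0 (performance) and 1 (status) are defined.
        if obj.get("kind") not in (0, 1):
            problems.append((name, "unknown kind"))
    return problems


def timestamp_to_datetime(ns):
    """Convert nanoseconds since the Unix epoch to a UTC datetime.

    datetime has microsecond resolution, so the sub-microsecond part
    of the timestamp is dropped.
    """
    seconds, nanos = divmod(ns, 10**9)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=nanos // 1000)
```

A monitoring system could run such a check before interpreting the ``data``
sections, which remain collector specific.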

Performance reporting collectors
++++++++++++++++++++++++++++++++

These collectors only provide data about some component of the system, without
giving any interpretation of its meaning.

The value of the ``kind`` field of the report will be ``0``.

Status reporting collectors
+++++++++++++++++++++++++++

These collectors will provide information about the status of some
component of Ganeti, or managed by Ganeti.

The value of their ``kind`` field will be ``1``.

The rationale behind this kind of collector is that there are some situations
where exporting data about the underlying subsystems would expose potential
issues. But if Ganeti itself is able (and going) to fix the problem, conflicts
might arise between Ganeti and something/somebody else trying to fix the same
problem.
Also, some external monitoring systems might not be aware of the internals of a
particular subsystem (e.g.: DRBD) and might only exploit the high level
response of its data collector, alerting an administrator if anything is wrong.
Still, completely hiding the underlying data is not a good idea, as they might
still be of use in some cases. So status reporting plugins will provide two
output modes: one just exporting high level information about the status,
and one also exporting all the data they gathered.
The default output mode will be the status-only one. Through a command line
parameter (for stand-alone data collectors) or through the HTTP request to the
monitoring agent (when collectors are executed as part of it) the verbose
output mode providing all the data can be selected.

When exporting just the status each status reporting collector will provide,
in its ``data`` section, at least the following field:

``status``
  summarizes the status of the component being monitored and consists of two
  subfields:

  ``code``
    It assumes a numeric value, encoded in such a way to allow using a bitset
    to easily distinguish which states are currently present in the whole cluster.
    If the bitwise OR of all the ``code`` values is 0, the cluster is
    completely healthy.
    The status codes are as follows:

    ``0``
      The collector can determine that everything is working as
      intended.

    ``1``
      Something is temporarily wrong but it is being automatically fixed by
      Ganeti.
      There is no need of external intervention.

    ``2``
      The collector has failed to understand whether the status is good or
      bad. Further analysis is required. Interpret this status as a
      potentially dangerous situation.

    ``4``
      The collector can determine that something is wrong and Ganeti has no
      way to fix it autonomously. External intervention is required.

  ``message``
    A message to better explain the reason of the status.
    The exact format of the message string is data collector dependent.

    The field is mandatory, but the content can be an empty string if the
    ``code`` is ``0`` (working as intended) or ``1`` (being fixed
    automatically).

    If the status code is ``2``, the message should explain why it was not
    possible to determine a proper status.
    If the status code is ``4``, the message should specify what has gone
    wrong.

The ``data`` section will also contain all the fields describing the gathered
data, according to a collector-specific format.
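
Because the non-zero status codes are distinct powers of two, a consumer can
combine them with a bitwise OR and still tell which states occurred. The
following Python sketch illustrates the idea; the constant and function names
are illustrative assumptions, not identifiers defined by Ganeti.

```python
# Status codes as listed above; the constant names are illustrative only.
WORKING = 0
AUTO_FIXING = 1
UNKNOWN = 2
NEEDS_INTERVENTION = 4


def combine_codes(codes):
    """Return the bitwise OR of all the given status codes."""
    combined = 0
    for code in codes:
        combined |= code
    return combined


def states_present(combined):
    """List the non-zero states whose bit is set in the combined value."""
    names = {
        AUTO_FIXING: "being fixed automatically",
        UNKNOWN: "status unknown",
        NEEDS_INTERVENTION: "external intervention required",
    }
    return [text for bit, text in sorted(names.items()) if combined & bit]
```

A combined value of ``0`` then means that every collector reported code ``0``,
while, for example, a combined value of ``5`` means that codes ``1`` and ``4``
were both seen somewhere.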

Instance status
+++++++++++++++

At the moment each node knows which instances are running on it, which
instances it is primary for, but not the reason why an instance might not
be running. On the other hand we don't want to distribute full instance
"admin" status information to all nodes, because of the performance
impact this would have.

As such we propose that:

- Any operation that can affect instance status will have an optional
  "reason" attached to it (at opcode level). This can be used for
  example to distinguish an admin request from a scheduled maintenance
  or an automated tool's work. If this reason is not passed, Ganeti will
  just use the information it has about the source of the request: for
  example a CLI shutdown operation will have "cli:shutdown" as a reason,
  a CLI failover operation will have "cli:failover". Operations coming
  from the remote API will use "rapi" instead of "cli". Of course
  setting a real site-specific reason is still preferred.
- RPCs that affect the instance status will be changed so that the
  "reason" and the version of the config object they ran on are passed to
  them. They will then export the new expected instance status, together
  with the associated reason and object version to the status report
  system, which then will export those themselves.
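
The fallback rule for the "reason" can be sketched in a few lines of Python.
The function name ``effective_reason`` and its parameters are hypothetical;
only the "cli:shutdown" style of the fallback string comes from the text
above.

```python
def effective_reason(source, operation, user_reason=None):
    """Return the reason to attach to an operation.

    A user-provided reason is preferred; otherwise fall back to
    "<source>:<operation>", e.g. "cli:shutdown" or "rapi:failover".
    """
    if user_reason:
        return user_reason
    return "%s:%s" % (source, operation)
```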

Monitoring and auditing systems can then use the reason to understand
the cause of an instance status, and they can use the timestamp to
understand the freshness of their data even in the absence of atomic
cross-node reporting: for example if they see an instance "up" on a node
after seeing it running on a previous one, they can compare these values
to understand which data is freshest, and repoll the "older" node. Of
course if they keep seeing this status this represents an error (either
an instance continuously "flapping" between nodes, or an instance that is
constantly up on more than one node), which should be reported and acted
upon.
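
A monitoring system could implement the freshness comparison roughly as
follows. This is only a sketch: the function name ``nodes_to_repoll`` and the
shape of its input (a mapping from node name to the timestamp of the report
in which the instance was seen "up") are assumptions made for illustration.

```python
def nodes_to_repoll(sightings):
    """Return the nodes holding older data about an instance seen as "up".

    sightings maps a node name to the timestamp (nanoseconds since the
    Unix epoch) of the report in which the instance appeared as "up".
    The node with the freshest timestamp is trusted; all the others
    should be polled again.
    """
    if len(sightings) < 2:
        return []
    freshest = max(sightings, key=sightings.get)
    return sorted(node for node in sightings if node != freshest)
```

If repolling keeps returning the instance as "up" on more than one node, the
situation should be reported as an error, as described above.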

The instance status will be reported on each node, for the instances it is
primary for, and its ``data`` section of the report will contain a list
of instances, with at least the following fields for each instance:

``name``
  The name of the instance.

``uuid``
  The UUID of the instance (stable on name change).

``admin_state``
  The status of the instance (up/down/offline) as requested by the admin.

``actual_state``
  The actual status of the instance. It can be ``up``, ``down``, or
  ``hung`` if the instance is up but it appears to be completely stuck.

``uptime``
  The uptime of the instance (if it is up, "null" otherwise).

``mtime``
  The timestamp of the last known change to the instance state.

``state_reason``
  The last known reason for state change, described according to the
  following subfields:

  ``text``
    Either a user-provided reason (if any), or the name of the command that
    triggered the state change, as a fallback.

  ``jobID``
    The ID of the job that caused the state change.

  ``source``
    Where the state change was triggered (RAPI, CLI).

``status``
  It represents the status of the instance, and its format is the same as that
  of the ``status`` field of `Status reporting collectors`_.

Each hypervisor should provide its own instance status data collector, possibly
with the addition of more, specific, fields.
The ``category`` field of all of them will be ``instance``.
The ``kind`` field will be ``1``.

Note that as soon as a node knows it's not the primary anymore for an
instance it will stop reporting status for it: this means the instance
will either disappear, if it has been deleted, or appear on another
node, if it's been moved.

The ``code`` of the ``status`` field of the report of the Instance status data
collector will be:

``0``
  if the ``status`` code is ``0`` for all the instances it is reporting about.

``1``
  otherwise.
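
The rule above can be expressed compactly. The following Python sketch
assumes that each instance entry carries a ``status`` structure with a
``code`` subfield, as described for status reporting collectors; the function
name is hypothetical.

```python
def instance_collector_code(instances):
    """Return the overall status code of the instance status collector.

    The result is 0 only if every reported instance has status code 0,
    and 1 otherwise. An empty list of instances yields 0.
    """
    if all(inst["status"]["code"] == 0 for inst in instances):
        return 0
    return 1
```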

Storage status
++++++++++++++

The storage status collectors will be a series of data collectors
(drbd, rbd, plain, file) that will gather data about all the storage types
for the current node (this is right now hardcoded to the enabled storage
types, and will in the future be tied to the enabled storage pools for the
nodegroup).

The ``name`` of each of these collectors will reflect what storage type each of
them refers to.

The ``category`` field of these collectors will be ``storage``.

The ``kind`` field will be ``1`` (`Status reporting collectors`_).

The ``data`` section of the report will provide at least the following fields:

``free``
  The amount of free space (in KBytes).

``used``
  The amount of used space (in KBytes).

``total``
  The total visible space (in KBytes).

Each specific storage type might provide more type-specific fields.

In case of error, the ``message`` subfield of the ``status`` field of the
report of the instance status collector will disclose the nature of the error
as type-specific information. Examples of these are "backend pv unavailable"
for lvm storage, "unreachable" for network based storage or "filesystem error"
for filesystem based implementations.

DRBD status
***********

This data collector will run only on nodes where DRBD is actually
present and it will gather information about DRBD devices.

Its ``kind`` in the report will be ``1`` (`Status reporting collectors`_).

Its ``category`` field in the report will contain the value ``storage``.

When executed in verbose mode, the ``data`` section of the report of this
collector will provide the following fields:

``versionInfo``
  Information about the DRBD version number, given by a combination of
  any (but at least one) of the following fields:

  ``version``
    The DRBD driver version.

  ``api``
    The API version number.

  ``proto``
    The protocol version.

  ``srcversion``
    The version of the source files.

  ``gitHash``
    Git hash of the source files.

  ``buildBy``
    Who built the binary, and, optionally, when.

``device``
  A list of structures, each describing a DRBD device (a minor) and containing
  the following fields:

  ``minor``
    The device minor number.

  ``connectionState``
    The state of the connection. If it is "Unconfigured", all the following
    fields are not present.

  ``localRole``
    The role of the local resource.

  ``remoteRole``
    The role of the remote resource.

  ``localState``
    The status of the local disk.

  ``remoteState``
    The status of the remote disk.

  ``replicationProtocol``
    The replication protocol being used.

  ``ioFlags``
    The input/output flags.

  ``perfIndicators``
    The performance indicators. This field will contain the following
    sub-fields:

    ``networkSend``
      KiB of data sent on the network.

    ``networkReceive``
      KiB of data received from the network.

    ``diskWrite``
      KiB of data written on local disk.

    ``diskRead``
      KiB of data read from the local disk.

    ``activityLog``
      Number of updates of the activity log.

    ``bitMap``
      Number of updates to the bitmap area of the metadata.

    ``localCount``
      Number of open requests to the local I/O subsystem.

    ``pending``
      Number of requests sent to the partner but not yet answered.

    ``unacknowledged``
      Number of requests received by the partner but still to be answered.

    ``applicationPending``
      Number of block input/output requests forwarded to DRBD but that have
      not yet been answered.

    ``epochs``
      (Optional) Number of epoch objects. Not provided by all DRBD versions.

    ``writeOrder``
      (Optional) Currently used write ordering method. Not provided by all DRBD
      versions.

    ``outOfSync``
      (Optional) KiB of storage currently out of sync. Not provided by all DRBD
      versions.

  ``syncStatus``
    (Optional) The status of the synchronization of the disk. This is present
    only if the disk is being synchronized, and includes the following fields:

    ``percentage``
      The percentage of synchronized data.

    ``progress``
      How far the synchronization is. Written as "x/y", where x and y are
      integer numbers expressed in the measurement unit stated in
      ``progressUnit``.

    ``progressUnit``
      The measurement unit for the progress indicator.

    ``timeToFinish``
      The expected time before finishing the synchronization.

    ``speed``
      The speed of the synchronization.

    ``want``
      The desired speed of the synchronization.

    ``speedUnit``
      The measurement unit of the ``speed`` and ``want`` values. Expressed
      as "size/time".

  ``instance``
    The name of the Ganeti instance this disk is associated with.
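
As an example of consuming these fields, the "x/y" value of ``progress`` can
be split into two integers and paired with ``progressUnit``. The Python sketch
below is illustrative only: the function name is hypothetical, and the sample
unit used in the usage note is an assumption, since the possible values of
``progressUnit`` are not listed here.

```python
def parse_sync_progress(sync_status):
    """Split the "x/y" progress string of a syncStatus structure.

    Returns a (done, total, unit) tuple, where done and total are
    integers expressed in the unit given by progressUnit.
    """
    done_text, total_text = sync_status["progress"].split("/")
    return int(done_text), int(total_text), sync_status["progressUnit"]
```

For instance, a ``syncStatus`` with ``"progress": "10/40"`` and a
``progressUnit`` of ``"K"`` (an assumed value) would yield ``(10, 40, "K")``.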


Ganeti daemons status
+++++++++++++++++++++

Ganeti will report the information it has about its own daemons.
This should allow identifying possible problems with the Ganeti system itself:
for example, memory leaks, crashes and high resource utilization should be
evident when analyzing this information.

The ``kind`` field will be ``1`` (`Status reporting collectors`_).

Each daemon will have its own data collector, and each of them will have
a ``category`` field with the value ``daemon``.

When executed in verbose mode, their ``data`` section will include at least:

``memory``
  The amount of used memory.

``size_unit``
  The measurement unit used for the memory.

``uptime``
  The uptime of the daemon.

``CPU usage``
  How much CPU the daemon is using (percentage).

Any other daemon-specific information can be included as well in the ``data``
section.
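
As an illustration only, the verbose ``data`` section of a daemon collector
could be assembled as in the following Python sketch. The four key names come
from the list above; the helper name ``daemon_data``, the ``"MiB"`` unit, the
use of seconds for ``uptime``, the extra ``open_connections`` field and all the
values are assumptions made for the example.

```python
import json


def daemon_data(memory, size_unit, uptime, cpu_usage, **extra):
    """Build the verbose ``data`` section of a daemon collector report.

    Hypothetical helper: only the four key names are taken from the design.
    """
    data = {
        "memory": memory,
        "size_unit": size_unit,
        "uptime": uptime,        # assumed here to be expressed in seconds
        "CPU usage": cpu_usage,  # percentage
    }
    # Daemon-specific information may sit next to the mandatory fields.
    data.update(extra)
    return data


# Only the fields discussed in this section are shown; a complete report
# object carries more fields than these three.
report_fragment = {
    "kind": 1,  # status reporting collector
    "category": "daemon",
    "data": daemon_data(152, "MiB", 86400, 1.5, open_connections=3),
}
print(json.dumps(report_fragment, indent=2, sort_keys=True))
```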

Hypervisor resources report
+++++++++++++++++++++++++++

Each hypervisor has a view of system resources that sometimes is
different from the one the OS sees (for example in Xen the Node OS,
running as Dom0, has access to only part of those resources). In this
section we'll report all information we can in a "non hypervisor
specific" way. Each hypervisor can then add extra specific information
that is not generic enough to be abstracted.

The ``kind`` field will be ``0`` (`Performance reporting collectors`_).

Each of the hypervisor data collectors will be of ``category``: ``hypervisor``.

Node OS resources report
++++++++++++++++++++++++

Since Ganeti assumes it's running on Linux, it's useful to export some
basic information as seen by the host system.

The ``category`` field of the report will be ``null``.

The ``kind`` field will be ``0`` (`Performance reporting collectors`_).

The ``data`` section will include:

``cpu_number``
  The number of available CPUs.

``cpus``
  A list with one element per CPU, showing its average load.

``memory``
  The current view of memory (free, used, cached, etc.)

``filesystem``
  A list with one element per filesystem, showing a summary of the
  total/available space.

``NICs``
  A list with one element per network interface, showing the amount of
  sent/received data, error rate, IP address of the interface, etc.

``versions``
  A map using the name of a component Ganeti interacts with (Linux, drbd,
  hypervisor, etc.) as the key and its version number as the value.

Note that we won't go into any hardware-specific details (e.g. querying a
node RAID is outside the scope of this, and can be implemented as a
plugin), but we can easily report the information above, since it's
standard enough across all systems.
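
A minimal Python sketch of how part of this ``data`` section might be gathered
is shown below. It covers only ``cpu_number`` and ``filesystem``; the other
fields are omitted for brevity. The function names, the keys used inside each
``filesystem`` element and the choice of bytes as the unit are assumptions, not
part of the design.

```python
import os
import shutil


def filesystem_summary(mount_points):
    """Return one entry per filesystem with total/available space in bytes."""
    summary = []
    for mount_point in mount_points:
        usage = shutil.disk_usage(mount_point)
        summary.append({
            "mount_point": mount_point,  # assumed key name
            "total": usage.total,
            "available": usage.free,
        })
    return summary


def node_os_data(mount_points=("/",)):
    """Build a partial ``data`` section for the node OS report."""
    return {
        # os.cpu_count() may return None if the count cannot be determined.
        "cpu_number": os.cpu_count(),
        "filesystem": filesystem_summary(mount_points),
    }
```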

Format of the query
-------------------

The queries to the monitoring agent will be HTTP GET requests on port 1815.
The answer will be encoded in JSON format and will depend on the specific
accessed resource.

If a request is sent to a non-existing resource, a 404 error will be returned by
the HTTP server.

The following paragraphs will present the existing resources supported by the
current protocol version, that is version 1.

``/``
+++++

The root resource. It will return the list of the supported protocol version
numbers.

Currently, this will include only version 1.

``/1``
++++++

Not an actual resource per se: it is the root of all the resources of protocol
version 1.

If requested through GET, the null JSON value will be returned.

``/1/list/collectors``
++++++++++++++++++++++

Returns a list of tuples (kind, category, name) showing all the collectors
available in the system.

``/1/report/all``
+++++++++++++++++

A list of the reports of all the data collectors, as described in the section
`Format of the report`_.

`Status reporting collectors`_ will provide their output in non-verbose format.
The verbose format can be requested by adding the parameter ``verbose=1`` to the
request.
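
From the client side, a request for this resource could look like the
following Python sketch. Only the port, the resource path and the ``verbose=1``
parameter come from this document; the function names, the host name in the
usage note and the timeout are assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

MONITORING_PORT = 1815  # port stated in this design


def all_reports_url(host, verbose=False):
    """Return the URL of the ``/1/report/all`` resource on ``host``."""
    url = "http://%s:%d/1/report/all" % (host, MONITORING_PORT)
    if verbose:
        url += "?" + urlencode({"verbose": 1})
    return url


def fetch_all_reports(host, verbose=False, timeout=10):
    """GET the reports of all collectors and decode the JSON answer.

    urlopen() raises urllib.error.HTTPError for error statuses such as 404.
    """
    with urlopen(all_reports_url(host, verbose), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, ``fetch_all_reports("node1.example.com", verbose=True)`` would
query a (hypothetical) node and return the decoded list of reports.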

``/1/report/[category]/[collector_name]``
+++++++++++++++++++++++++++++++++++++++++

Returns the report of the collector ``[collector_name]`` that belongs to the
specified ``[category]``.

If a collector does not belong to any category, ``collector`` will be used as
the value for ``[category]``.

`Status reporting collectors`_ will provide their output in non-verbose format.
The verbose format can be requested by adding the parameter ``verbose=1`` to the
request.
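
The resources of protocol version 1 can be summarized by a small dispatch
function. The Python sketch below is not the agent's implementation: the
function name ``resolve``, the shape of the ``collectors`` mapping and the
collector names used in the usage note are assumptions, and handling of the
``verbose`` parameter is left out.

```python
SUPPORTED_VERSIONS = [1]


def resolve(path, collectors):
    """Return (HTTP status, JSON-serializable body) for a GET on ``path``.

    ``collectors`` maps (category, name) to (kind, report function); a
    category of None marks a collector that belongs to no category.
    """
    parts = [part for part in path.split("/") if part]
    if not parts:
        return 200, SUPPORTED_VERSIONS
    if parts == ["1"]:
        return 200, None  # serialized as the JSON null value
    if parts == ["1", "list", "collectors"]:
        return 200, [[kind, category, name]
                     for (category, name), (kind, _) in collectors.items()]
    if parts == ["1", "report", "all"]:
        return 200, [report() for _, report in collectors.values()]
    if len(parts) == 4 and parts[:2] == ["1", "report"]:
        # Collectors without a category are addressed as "collector".
        category = None if parts[2] == "collector" else parts[2]
        entry = collectors.get((category, parts[3]))
        if entry is not None:
            return 200, entry[1]()
    return 404, None
```

With a collector registered under ``(None, "node-os")``, a GET on
``/1/report/collector/node-os`` resolves to its report, while an unknown path
such as ``/2`` yields a 404.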

Instance disk status propagation
--------------------------------

As with the instance status, Ganeti currently has only partial information
about its instance disks: in particular, each node is unaware of the
disk-to-instance mapping, which exists only on the master.

For this design doc we plan to fix this by changing all RPCs that create
a backend storage, or that put an already existing one in use, so that
they pass the relevant instance to the node. The node can then export
this information to the status reporting tool.

Until these RPC changes are implemented, we'll use Confd to fetch this
information in the data collectors.

Plugin system
-------------

The monitoring system will be equipped with a plugin system that can
export specific local information through it.

The plugin system is expected to be used by local installations to
export any installation-specific information that they want to be
monitored, about either hardware or software on their systems.

The plugin system will be in the form of either scripts or binaries whose output
will be inserted in the report.

Eventually, support for other kinds of plugins might be added as well, such as
plain text files which will be inserted into the report, or local unix or
network sockets from which the information has to be read. This should allow
the most flexibility for implementing an efficient system, while being able to
keep it as simple as possible.
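
As a rough illustration of the script/binary form, the Python sketch below runs
a plugin and captures its output. This document does not define the plugin
output format, so the assumption that a plugin writes a single JSON document to
stdout, the function name ``run_plugin``, the timeout and the ``raid`` payload
are all hypothetical.

```python
import json
import subprocess
import sys


def run_plugin(command, timeout=30):
    """Run a plugin executable and return its decoded output.

    Assumes (this is not specified by the design) that the plugin prints
    one JSON document on stdout and exits with status 0 on success.
    """
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, timeout=timeout,
                            check=True)
    return json.loads(result.stdout.decode("utf-8"))


# Usage example: a stand-in "plugin" implemented as a Python one-liner.
fake_plugin = [sys.executable, "-c",
               "import json; print(json.dumps({'raid': 'ok'}))"]
plugin_output = run_plugin(fake_plugin)
```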

Data collectors
---------------

In order to ease testing, as well as to make it simple to reuse this
subsystem, it will be possible to run just the "data collectors" on each
node without passing through the agent daemon.

If a data collector is run independently, it should print its report on
stdout, according to the format corresponding to a single data collector
report object, as described in the previous paragraphs.
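
A stand-alone collector could therefore be as small as the following Python
sketch. The collector name ``example`` and the content of ``data`` are
placeholders, and only a subset of the report fields is shown.

```python
import json
import sys


def collect():
    """Return a (partial) data collector report object.

    Placeholder content: a real collector would fill in every field of
    the report format and gather actual measurements.
    """
    return {
        "name": "example",  # hypothetical collector name
        "category": None,
        "kind": 0,
        "data": {"value": 42},
    }


def main(stream=None):
    """Print the report as JSON, by default on stdout."""
    stream = stream or sys.stdout
    json.dump(collect(), stream)
    stream.write("\n")


if __name__ == "__main__":
    main()
```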

Mode of operation
-----------------

In order to be able to report information fast, the monitoring agent
daemon will keep an in-memory or on-disk cache of the status, which will
be returned when queries are made. The status system will then
periodically check resources to make sure the status is up to date.

Different parts of the report will be queried at different speeds. These
will depend on:

- how often they vary (or we expect them to vary)
- how fast they are to query
- how important their freshness is

Of course the last parameter is installation specific, and while we'll
try to have defaults, it will be configurable. The first two, instead,
can be used adaptively to query a certain resource faster or slower.

When run as stand-alone binaries, the data collectors will not use any
caching system, and will just fetch and return the data immediately.
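
The caching behaviour of the daemon can be sketched as follows. This is a
simplified in-memory model with a fixed refresh interval per collector; the
adaptive tuning of the intervals is not shown, and the class name
``ReportCache`` and its methods are assumptions.

```python
import time


class ReportCache:
    """Cache collector reports, refreshing each one at its own interval."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # name -> [collect function, interval, time of last run, report]
        self._entries = {}

    def register(self, name, collect, interval):
        """Register a collector to be refreshed every ``interval`` seconds."""
        self._entries[name] = [collect, interval, None, None]

    def refresh(self):
        """Re-run every collector whose cached report is too old."""
        now = self._clock()
        for entry in self._entries.values():
            collect, interval, last_run, _ = entry
            if last_run is None or now - last_run >= interval:
                entry[3] = collect()
                entry[2] = now

    def get(self, name):
        """Return the cached report without querying the resource."""
        return self._entries[name][3]
```

The daemon would call ``refresh()`` periodically and answer queries with
``get()``, so that a slow collector never delays a reply.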

Implementation place
--------------------

The status daemon will be implemented as a standalone Haskell daemon. In
the future it should be easy to merge multiple daemons into one with
multiple entry points, should we find out it saves resources and doesn't
impact functionality.

The libekg library should be looked at for easily providing metrics in
JSON format.

Implementation order
--------------------

We will implement the agent system in this order:

- initial example data collectors (e.g. for drbd and instance status)
- initial daemon for exporting data, integrating the existing collectors
- plugin system
- RPC updates for instance status reasons and disk to instance mapping
- cache layer for the daemon
- more data collectors


Future work
===========

As a future step it can be useful to "centralize" all this reporting
data in a single place. This can be, for example, just the master node,
or all the master candidates. We will evaluate doing this after the first
node-local version has been developed and tested.

Another possible change is replacing the "read-only" RPCs with queries
to the agent system, thus having only one way of collecting information
from the nodes, both for a monitoring system and for Ganeti itself.

One extra feature we may need is a way to query for only sub-parts of
the report (e.g. instance status only). This can be done by passing
arguments to the HTTP GET, which will be defined when we get to this
functionality.

Finally, the autorepair system (see its
:doc:`design <design-autorepair>`) can be expanded to use the monitoring
agent system as a source of information to decide which repairs it can
perform.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: