=======================
Ganeti monitoring agent
=======================

.. contents:: :depth: 4

This is a design document detailing the implementation of a Ganeti
monitoring agent report system that can be queried by a monitoring
system to calculate health information for a Ganeti cluster.

Current state and shortcomings
==============================

There is currently no monitoring support in Ganeti. While we don't want
to build something like Nagios or Pacemaker as part of Ganeti, it would
be useful if such tools could easily extract information from a Ganeti
machine in order to take actions (example actions include logging an
outage for future reporting or alerting a person or system about it).

Proposed changes
================

Each Ganeti node should export a status page that can be queried by a
monitoring system. This status page will be exported on a network port
and will be encoded in JSON (simple text) over HTTP.

The choice of JSON is obvious as we already depend on it in Ganeti and
thus we don't need to add extra libraries to use it, as opposed to what
would happen for XML or some other markup format.

Location of agent report
------------------------

The report will be available from all nodes, and will cover all
node-local resources. This allows more real-time information to be
available, at the cost of querying all nodes.

Information reported
--------------------

The monitoring agent system will report on the following basic information:

- Instance status
- Instance disk status
- Status of storage for instances
- Ganeti daemons status, CPU usage, memory footprint
- Hypervisor resources report (memory, CPU, network interfaces)
- Node OS resources report (memory, CPU, network interfaces)
- Node OS CPU load average report
- Information from a plugin system

.. _monitoring-agent-format-of-the-report:

Format of the report
--------------------

The report will be in JSON format, and it will present an array
of report objects.
Each report object will be produced by a specific data collector.
Each report object includes some mandatory fields, to be provided by all
the data collectors:

``name``
  The name of the data collector that produced this part of the report.
  It is supposed to be unique inside a report.

``version``
  The version of the data collector that produces this part of the
  report. Built-in data collectors (as opposed to those implemented as
  plugins) should have "B" as the version number.

``format_version``
  The format of what is represented in the "data" field for each data
  collector might change over time. Every time this happens, the
  format_version should be changed, so that whoever reads the report knows
  what format to expect, and how to correctly interpret it.

``timestamp``
  The time when the reported data were gathered. It has to be expressed
  in nanoseconds since the Unix epoch (0:00:00 January 01, 1970). If not
  enough precision is available (or needed) it can be padded with
  zeroes. If a report object needs multiple timestamps, it can add more
  and/or override this one inside its own "data" section.

``category``
  A collector can belong to a given category of collectors (e.g.: storage
  collectors, daemon collector). This means that it will have to provide a
  minimum set of prescribed fields, as documented for each category.
  This field will contain the name of the category the collector belongs to,
  if any, or just the ``null`` value.

``kind``
  Two kinds of collectors are possible:
  `Performance reporting collectors`_ and `Status reporting collectors`_.
  The respective paragraphs will describe them and the value of this field.

``data``
  This field contains all the data generated by the specific data collector,
  in its own independently defined format. The monitoring agent could check
  this syntactically (according to the JSON specifications) but not
  semantically.

Here follows a minimal example of a report::

  [
  {
      "name" : "TheCollectorIdentifier",
      "version" : "1.2",
      "format_version" : 1,
      "timestamp" : 1351607182000000000,
      "category" : null,
      "kind" : 0,
      "data" : { "plugin_specific_data" : "go_here" }
  },
  {
      "name" : "AnotherDataCollector",
      "version" : "B",
      "format_version" : 7,
      "timestamp" : 1351609526123854000,
      "category" : "storage",
      "kind" : 1,
      "data" : { "status" : { "code" : 1,
                              "message" : "Error on disk 2"
                            },
                 "plugin_specific" : "data",
                 "some_late_data" : { "timestamp" : 1351609526123942720,
                                      ...
                                    }
               }
  }
  ]
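
As an illustration of how a consumer might use the mandatory fields, the
following Python sketch checks that every report object carries all of them
and shows how a whole-second time can be padded to nanoseconds. The function
names (``missing_fields``, ``check_report``, ``seconds_to_report_timestamp``)
and the ``SAMPLE`` report are hypothetical and are not part of Ganeti.

```python
import json

# Mandatory top-level fields, as listed in the section above.
MANDATORY_FIELDS = ("name", "version", "format_version", "timestamp",
                    "category", "kind", "data")

NANOS_PER_SECOND = 10 ** 9


def seconds_to_report_timestamp(seconds):
    """Pad a whole-second Unix time with zeroes to nanosecond resolution."""
    return int(seconds) * NANOS_PER_SECOND


def missing_fields(report_object):
    """Return the mandatory fields that are absent from one report object."""
    return [f for f in MANDATORY_FIELDS if f not in report_object]


def check_report(text):
    """Parse a report and map collector names to their missing fields.

    Collectors that provide every mandatory field are left out of the result.
    """
    problems = {}
    for obj in json.loads(text):
        missing = missing_fields(obj)
        if missing:
            problems[obj.get("name", "<unnamed>")] = missing
    return problems


# Hypothetical sample: the second object lacks three mandatory fields.
SAMPLE = """
[
  {"name": "TheCollectorIdentifier", "version": "1.2", "format_version": 1,
   "timestamp": 1351607182000000000, "category": null, "kind": 0,
   "data": {"plugin_specific_data": "go_here"}},
  {"name": "Incomplete", "version": "B", "kind": 1, "data": {}}
]
"""

if __name__ == "__main__":
    print(check_report(SAMPLE))
```

Note that a ``category`` of ``null`` counts as present: the check is on the
key, not on its value.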

Performance reporting collectors
++++++++++++++++++++++++++++++++

These collectors only provide data about some component of the system, without
giving any interpretation of its meaning.

The value of the ``kind`` field of the report will be ``0``.

Status reporting collectors
+++++++++++++++++++++++++++

These collectors will provide information about the status of some
component of Ganeti, or managed by Ganeti.

The value of their ``kind`` field will be ``1``.

The rationale behind this kind of collector is that there are some situations
where exporting data about the underlying subsystems would expose potential
issues. But if Ganeti itself is able (and going) to fix the problem, conflicts
might arise between Ganeti and something/somebody else trying to fix the same
problem.
Also, some external monitoring systems might not be aware of the internals of a
particular subsystem (e.g.: DRBD) and might only exploit the high level
response of its data collector, alerting an administrator if anything is wrong.
Still, completely hiding the underlying data is not a good idea, as they might
still be of use in some cases. So status reporting collectors will provide two
output modes: one exporting just high-level information about the status,
and one also exporting all the data they gathered.
The default output mode will be the status-only one. The verbose output mode,
providing all the data, can be selected through a command line parameter (for
stand-alone data collectors) or through the HTTP request to the monitoring
agent (when collectors are executed as part of it).

When exporting just the status each status reporting collector will provide,
in its ``data`` section, at least the following field:

``status``
  summarizes the status of the component being monitored and consists of two
  subfields:

  ``code``
    It assumes a numeric value, encoded in such a way to allow using a bitset
    to easily distinguish which states are currently present in the whole
    cluster. If the bitwise OR of all the ``code`` fields is 0, the cluster
    is completely healthy.
    The status codes are as follows:

    ``0``
      The collector can determine that everything is working as
      intended.

    ``1``
      Something is temporarily wrong but it is being automatically fixed by
      Ganeti.
      No external intervention is needed.

    ``2``
      The collector has failed to understand whether the status is good or
      bad. Further analysis is required. Interpret this status as a
      potentially dangerous situation.

    ``4``
      The collector can determine that something is wrong and Ganeti has no
      way to fix it autonomously. External intervention is required.

  ``message``
    A message to better explain the reason of the status.
    The exact format of the message string is data collector dependent.

    The field is mandatory, but the content can be an empty string if the
    ``code`` is ``0`` (working as intended) or ``1`` (being fixed
    automatically).

    If the status code is ``2``, the message should explain why it was not
    possible to determine a proper status.
    If the status code is ``4``, the message should specify what has gone
    wrong.

The ``data`` section will also contain all the fields describing the gathered
data, according to a collector-specific format.
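
Because the non-zero status codes are distinct powers of two, a monitoring
system can OR them together and still tell which states are present. The
following Python sketch illustrates this; the constant and function names
(``combined_code``, ``states_present``) and the ``REPORTS`` sample are
hypothetical, chosen only for this example.

```python
from functools import reduce
from operator import or_

# Status codes, as listed above.
WORKING = 0
AUTO_FIXING = 1
UNKNOWN = 2
NEEDS_INTERVENTION = 4

KIND_STATUS = 1  # value of "kind" for status reporting collectors


def combined_code(report_objects):
    """Bitwise OR of the status codes of all status reporting collectors."""
    codes = (obj["data"]["status"]["code"]
             for obj in report_objects
             if obj["kind"] == KIND_STATUS)
    return reduce(or_, codes, WORKING)


def states_present(bits):
    """Names of the non-zero states that are set in a combined code."""
    flags = (("auto_fixing", AUTO_FIXING),
             ("unknown", UNKNOWN),
             ("needs_intervention", NEEDS_INTERVENTION))
    return [name for name, flag in flags if bits & flag]


# Hypothetical report objects, possibly gathered from several nodes.
REPORTS = [
    {"name": "perf", "kind": 0, "data": {}},
    {"name": "a", "kind": 1,
     "data": {"status": {"code": 1, "message": ""}}},
    {"name": "b", "kind": 1,
     "data": {"status": {"code": 4, "message": "Error on disk 2"}}},
]

if __name__ == "__main__":
    bits = combined_code(REPORTS)
    print(bits, states_present(bits))
```

A combined value of ``0`` means that every status reporting collector
reported code ``0``; performance reporting collectors are skipped because
they carry no ``status`` field.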

Instance status
+++++++++++++++

At the moment each node knows which instances are running on it, which
instances it is primary for, but not the reason why an instance might not
be running. On the other hand we don't want to distribute full instance
"admin" status information to all nodes, because of the performance
impact this would have.

As such we propose that:

- Any operation that can affect instance status will have an optional
  "reason" attached to it (at opcode level). This can be used for
  example to distinguish an admin request from a scheduled maintenance
  or an automated tool's work. If this reason is not passed, Ganeti will
  just use the information it has about the source of the request.
  This reason information will be structured according to the
  :doc:`Ganeti reason trail <design-reason-trail>` design document.
- RPCs that affect the instance status will be changed so that the
  "reason" and the version of the config object they ran on are passed to
  them. They will then export the new expected instance status, together
  with the associated reason and object version, to the status report
  system, which will then export them.

Monitoring and auditing systems can then use the reason to understand
the cause of an instance status, and they can use the timestamp to
understand the freshness of their data even in the absence of an atomic
cross-node reporting: for example if they see an instance "up" on a node
after seeing it running on a previous one, they can compare these values
to understand which data is freshest, and repoll the "older" node. Of
course if they keep seeing this status this represents an error (either
an instance continuously "flapping" between nodes, or an instance being
constantly up on more than one node), which should be reported and acted
upon.
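
The freshness comparison described above could look roughly like the
following Python sketch. It assumes the monitoring system has already
collected, for a single instance, the timestamp at which each node reported
it as up; the function name ``nodes_to_repoll`` and the node names are
hypothetical.

```python
def nodes_to_repoll(sightings):
    """Return the nodes whose data about one instance may be stale.

    ``sightings`` maps a node name to the timestamp (nanoseconds since the
    Unix epoch) at which that node reported the instance as up. Every node
    whose timestamp is older than the freshest one should be polled again.
    """
    if len(sightings) < 2:
        return []
    freshest = max(sightings.values())
    return sorted(node for node, ts in sightings.items() if ts < freshest)


if __name__ == "__main__":
    print(nodes_to_repoll({
        "node1.example.com": 1351609526123854000,
        "node2.example.com": 1351609526123942720,
    }))
```

If the older node still reports the instance as up after being repolled, the
situation is one of the error cases described above and should be reported.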

The instance status will be reported by each node for the instances it is
primary for, and the ``data`` section of its report will contain a list
of instances, named ``instances``, with at least the following fields for
each instance:

``name``
  The name of the instance.

``uuid``
  The UUID of the instance (stable on name change).

``admin_state``
  The status of the instance (up/down/offline) as requested by the admin.

``actual_state``
  The actual status of the instance. It can be ``up``, ``down``, or
  ``hung`` if the instance is up but it appears to be completely stuck.

``uptime``
  The uptime of the instance (if it is up, ``null`` otherwise).

``mtime``
  The timestamp of the last known change to the instance state.

``state_reason``
  The last known reason for state change of the instance, described according
  to the JSON representation of a reason trail, as detailed in the :doc:`reason
  trail design document <design-reason-trail>`.

``status``
  It represents the status of the instance, and its format is the same as that
  of the ``status`` field of `Status reporting collectors`_.

Each hypervisor should provide its own instance status data collector, possibly
with the addition of more, specific, fields.
The ``category`` field of all of them will be ``instance``.
The ``kind`` field will be ``1``.

Note that as soon as a node knows it's not the primary anymore for an
instance it will stop reporting status for it: this means the instance
will either disappear, if it has been deleted, or appear on another
node, if it's been moved.

The ``code`` of the ``status`` field of the report of the Instance status data
collector will be:

``0``
  if the ``status`` code is ``0`` for all the instances it is reporting about.

``1``
  otherwise.
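
The rule above for deriving the collector-level code from the per-instance
statuses can be sketched in Python as follows. The function name
``instance_collector_code`` and the ``INSTANCES`` sample are hypothetical; a
node reporting no instances yields ``0``, since the condition then holds
vacuously.

```python
def instance_collector_code(instances):
    """Return 0 if every instance has status code 0, and 1 otherwise."""
    return 0 if all(i["status"]["code"] == 0 for i in instances) else 1


# Hypothetical content of the "instances" list in the "data" section,
# reduced to the fields this example needs.
INSTANCES = [
    {"name": "instance1.example.com",
     "status": {"code": 0, "message": ""}},
    {"name": "instance2.example.com",
     "status": {"code": 4, "message": "Instance is hung"}},
]

if __name__ == "__main__":
    print(instance_collector_code(INSTANCES))
```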

Storage collectors
++++++++++++++++++

The storage collectors will be a series of data collectors
that will gather data about storage for the current node. The collection
will be performed at different granularity and abstraction levels, from
the physical disks, to partitions, logical volumes and to the specific
storage types used by Ganeti itself (drbd, rbd, plain, file).

The ``name`` of each of these collectors will reflect what storage type each of
them refers to.

The ``category`` field of these collectors will be ``storage``.

The ``kind`` field will depend on the specific collector.

Each ``storage`` collector's ``data`` section will provide collector-specific
fields.

The various storage collectors will provide keys to join the data they provide,
in order to allow the user to get a better understanding of the system, for
example through device names or instance names.

Diskstats collector
*******************

This storage data collector will gather information about the status of the
disks installed in the system, as listed in the ``/proc/diskstats`` file. This
means that not only physical hard drives, but also ramdisks and loopback
devices will be listed.

Its ``kind`` in the report will be ``0`` (`Performance reporting collectors`_).

Its ``category`` field in the report will contain the value ``storage``.

When executed in verbose mode, the ``data`` section of the report of this
collector will be a list of items, each representing one disk, each providing
the following fields:

``major``
  The major number of the device.

``minor``
  The minor number of the device.

``name``
  The name of the device.

``readsNum``
  This is the total number of reads completed successfully.

``mergedReads``
  Reads which are adjacent to each other may be merged for efficiency. Thus
  two 4K reads may become one 8K read before it is ultimately handed to the
  disk, and so it will be counted (and queued) as only one I/O. This field
  specifies how often this was done.

``secRead``
  This is the total number of sectors read successfully.

``timeRead``
  This is the total number of milliseconds spent by all reads.

``writes``
  This is the total number of writes completed successfully.

``mergedWrites``
  Writes which are adjacent to each other may be merged for efficiency. Thus
  two 4K writes may become one 8K write before it is ultimately handed to the
  disk, and so it will be counted (and queued) as only one I/O. This field
  specifies how often this was done.

``secWritten``
  This is the total number of sectors written successfully.

``timeWrite``
  This is the total number of milliseconds spent by all writes.

``ios``
  The number of I/Os currently in progress.
  This is the only field that should go to zero. It is incremented as requests
  are given to the appropriate ``struct request_queue`` and decremented as they
  finish.

``timeIO``
  The number of milliseconds spent doing I/Os. This field increases so long
  as field ``ios`` is nonzero.

``wIOmillis``
  The weighted number of milliseconds spent doing I/Os.
  This field is incremented at each I/O start, I/O completion, I/O merge,
  or read of these stats by the number of I/Os in progress (field ``ios``)
  times the number of milliseconds spent doing I/O since the last update of
  this field. This can provide an easy measure of both I/O completion time
  and the backlog that may be accumulating.
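
To show how the fields above line up with the columns of ``/proc/diskstats``,
here is a minimal Python sketch of a parser. It is not the collector's
implementation: the function names and the sample line are hypothetical, and
it assumes each line carries at least the three identification columns plus
the eleven counters described above, in that order. Any further columns that
newer kernels may append are ignored.

```python
# Counter names from the field list above, in /proc/diskstats column order.
COUNTER_NAMES = ("readsNum", "mergedReads", "secRead", "timeRead",
                 "writes", "mergedWrites", "secWritten", "timeWrite",
                 "ios", "timeIO", "wIOmillis")

# major, minor, name, then one column per counter.
MIN_COLUMNS = 3 + len(COUNTER_NAMES)


def parse_diskstats_line(line):
    """Turn one /proc/diskstats line into a dictionary for the report."""
    tokens = line.split()
    if len(tokens) < MIN_COLUMNS:
        raise ValueError("unexpected diskstats line: %r" % line)
    entry = {
        "major": int(tokens[0]),
        "minor": int(tokens[1]),
        "name": tokens[2],
    }
    for field, token in zip(COUNTER_NAMES, tokens[3:MIN_COLUMNS]):
        entry[field] = int(token)
    return entry


def parse_diskstats(text):
    """Parse the whole file content, skipping empty lines."""
    return [parse_diskstats_line(l) for l in text.splitlines() if l.strip()]


# Hypothetical sample line, with made-up counter values.
SAMPLE_LINE = "   8       0 sda 4567 123 98765 4321 2345 678 54321 8765 0 6543 13086"

if __name__ == "__main__":
    print(parse_diskstats_line(SAMPLE_LINE))
```

On a Linux node the parser could be fed the real file with
``parse_diskstats(open("/proc/diskstats").read())``.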

Logical Volume collector
************************

This data collector will gather information about the attributes of logical
volumes present in the system.

Its ``kind`` in the report will be ``0`` (`Performance reporting collectors`_).

Its ``category`` field in the report will contain the value ``storage``.

The ``data`` section of the report of this collector will be a list of items,
each representing one logical volume and providing the following fields:

``uuid``
  The UUID of the logical volume.

``name``
  The name of the logical volume.

``attr``
  The attributes of the logical volume.

``major``
  Persistent major number or -1 if not persistent.

``minor``
  Persistent minor number or -1 if not persistent.

``kernel_major``
  Currently assigned major number or -1 if LV is not active.

``kernel_minor``
  Currently assigned minor number or -1 if LV is not active.

``size``
  Size of LV in bytes.

``seg_count``
  Number of segments in LV.

``tags``
  Tags, if any.

``modules``
  Kernel device-mapper modules required for this LV, if any.

``vg_uuid``
  Unique identifier of the volume group.

``vg_name``
  Name of the volume group.

``segtype``
  Type of LV segment.

``seg_start``
  Offset within the LV to the start of the segment in bytes.

``seg_start_pe``
  Offset within the LV to the start of the segment in physical extents.

``seg_size``
  Size of the segment in bytes.

``seg_tags``
  Tags for the segment, if any.

``seg_pe_ranges``
  Ranges of Physical Extents of underlying devices in ``lvs`` command line
  format.

``devices``
  Underlying devices used with starting extent numbers.

``instance``
  The name of the instance this LV is used by, or ``null`` if it was not
  possible to determine it.
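
As an example of joining the data of two storage collectors, the following
Python sketch matches logical volumes with their entries from the Diskstats
collector. Using the ``kernel_major``/``kernel_minor`` pair as the join key is
an assumption made for this example, not something this design prescribes;
the function name ``join_lvs_with_diskstats`` and the sample values are
hypothetical.

```python
def join_lvs_with_diskstats(lvs, disks):
    """Pair each active LV with the diskstats entry of the same device.

    Returns a list of (LV name, instance name, diskstats entry) tuples.
    LVs that are not active (device numbers of -1) or that have no matching
    diskstats entry are skipped.
    """
    by_devno = {(d["major"], d["minor"]): d for d in disks}
    joined = []
    for lv in lvs:
        if lv["kernel_major"] == -1 or lv["kernel_minor"] == -1:
            continue  # LV is not active, so it has no block device
        disk = by_devno.get((lv["kernel_major"], lv["kernel_minor"]))
        if disk is not None:
            joined.append((lv["name"], lv["instance"], disk))
    return joined


# Hypothetical data, reduced to the fields needed for the join.
LVS = [
    {"name": "disk0", "instance": "instance1.example.com",
     "kernel_major": 253, "kernel_minor": 0},
    {"name": "unused", "instance": None,
     "kernel_major": -1, "kernel_minor": -1},
]
DISKS = [
    {"major": 8, "minor": 0, "name": "sda", "readsNum": 4567},
    {"major": 253, "minor": 0, "name": "dm-0", "readsNum": 42},
]

if __name__ == "__main__":
    for lv_name, instance, disk in join_lvs_with_diskstats(LVS, DISKS):
        print(lv_name, instance, disk["name"], disk["readsNum"])
```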
473 a1f2fb58 Michele Tartara
474 3301805f Michele Tartara
DRBD status
475 3301805f Michele Tartara
***********
476 3301805f Michele Tartara
477 3301805f Michele Tartara
This data collector will run only on nodes where DRBD is actually
478 3301805f Michele Tartara
present and it will gather information about DRBD devices.
479 3301805f Michele Tartara
480 3301805f Michele Tartara
Its ``kind`` in the report will be ``1`` (`Status reporting collectors`_).
481 3301805f Michele Tartara
482 3301805f Michele Tartara
Its ``category`` field in the report will contain the value ``storage``.
483 3301805f Michele Tartara
484 3301805f Michele Tartara
When executed in verbose mode, the ``data`` section of the report of this
485 3301805f Michele Tartara
collector will provide the following fields:
486 3301805f Michele Tartara
``versionInfo``
  Information about the DRBD version number, given by a combination of
  any (but at least one) of the following fields:

  ``version``
    The DRBD driver version.

  ``api``
    The API version number.

  ``proto``
    The protocol version.

  ``srcversion``
    The version of the source files.

  ``gitHash``
    Git hash of the source files.

  ``buildBy``
    Who built the binary, and, optionally, when.

``device``
  A list of structures, each describing a DRBD device (a minor) and containing
  the following fields:

  ``minor``
    The device minor number.

  ``connectionState``
    The state of the connection. If it is "Unconfigured", none of the
    following fields are present.

  ``localRole``
    The role of the local resource.

  ``remoteRole``
    The role of the remote resource.

  ``localState``
    The status of the local disk.

  ``remoteState``
    The status of the remote disk.

  ``replicationProtocol``
    The replication protocol being used.

  ``ioFlags``
    The input/output flags.

  ``perfIndicators``
    The performance indicators. This field will contain the following
    sub-fields:

    ``networkSend``
      KiB of data sent on the network.

    ``networkReceive``
      KiB of data received from the network.

    ``diskWrite``
      KiB of data written on local disk.

    ``diskRead``
      KiB of data read from the local disk.

    ``activityLog``
      Number of updates of the activity log.

    ``bitMap``
      Number of updates to the bitmap area of the metadata.

    ``localCount``
      Number of open requests to the local I/O subsystem.

    ``pending``
      Number of requests sent to the partner but not yet answered.

    ``unacknowledged``
      Number of requests received by the partner but still to be answered.

    ``applicationPending``
      Number of block input/output requests forwarded to DRBD but that have
      not yet been answered.

    ``epochs``
      (Optional) Number of epoch objects. Not provided by all DRBD versions.

    ``writeOrder``
      (Optional) Currently used write ordering method. Not provided by all DRBD
      versions.

    ``outOfSync``
      (Optional) KiB of storage currently out of sync. Not provided by all DRBD
      versions.

  ``syncStatus``
    (Optional) The status of the synchronization of the disk. This is present
    only if the disk is being synchronized, and includes the following fields:

    ``percentage``
      The percentage of synchronized data.

    ``progress``
      How far the synchronization is. Written as "x/y", where x and y are
      integer numbers expressed in the measurement unit stated in
      ``progressUnit``.

    ``progressUnit``
      The measurement unit for the progress indicator.

    ``timeToFinish``
      The expected time before finishing the synchronization.

    ``speed``
      The speed of the synchronization.

    ``want``
      The desired speed of the synchronization.

    ``speedUnit``
      The measurement unit of the ``speed`` and ``want`` values. Expressed
      as "size/time".

  ``instance``
    The name of the Ganeti instance this disk is associated with.

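As an illustration of how a monitoring system might consume the DRBD
``data`` section, here is a minimal Python sketch. The field names come
from the list above; every value in ``SAMPLE_DATA``, the value types, and
the helper names (``parse_progress``, ``configured_devices``,
``syncing_devices``) are assumptions made for the example and not part of
the design. The meaning of x and y in ``progress`` is not interpreted
here; the string is only split.

```python
import json

# Hypothetical verbose ``data`` section. Field names follow the DRBD
# collector description; all values are invented for illustration.
SAMPLE_DATA = json.loads("""
{
  "versionInfo": {"version": "8.3.11", "api": "88"},
  "device": [
    {"minor": 0, "connectionState": "Unconfigured"},
    {"minor": 1,
     "connectionState": "SyncSource",
     "localRole": "Primary",
     "remoteRole": "Secondary",
     "localState": "UpToDate",
     "remoteState": "Inconsistent",
     "replicationProtocol": "C",
     "ioFlags": "r-----",
     "perfIndicators": {"networkSend": 1024, "networkReceive": 0,
                        "diskWrite": 2048, "diskRead": 512,
                        "activityLog": 3, "bitMap": 1, "localCount": 0,
                        "pending": 0, "unacknowledged": 0,
                        "applicationPending": 0},
     "syncStatus": {"percentage": 25.0, "progress": "256/1024",
                    "progressUnit": "M", "timeToFinish": "0:01:40",
                    "speed": 8, "want": 10, "speedUnit": "M/sec"},
     "instance": "instance1.example.com"}
  ]
}
""")


def parse_progress(progress):
    """Split a ``progress`` string "x/y" into a pair of integers (x, y)."""
    x, y = progress.split("/")
    return int(x), int(y)


def configured_devices(data):
    """Return the devices whose connection state is not "Unconfigured".

    Only these devices carry the fields after ``connectionState``.
    """
    return [dev for dev in data["device"]
            if dev["connectionState"] != "Unconfigured"]


def syncing_devices(data):
    """Map minor number -> (x, y) for devices that report ``syncStatus``."""
    return {dev["minor"]: parse_progress(dev["syncStatus"]["progress"])
            for dev in configured_devices(data)
            if "syncStatus" in dev}
```

With the sample above, ``syncing_devices(SAMPLE_DATA)`` contains a single
entry for minor ``1``, because minor ``0`` is unconfigured and therefore
has no ``syncStatus`` field.
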

Ganeti daemons status
+++++++++++++++++++++

Ganeti will report what information it has about its own daemons.
This should allow identifying possible problems with the Ganeti system itself:
for example memory leaks, crashes and high resource utilization should be
evident by analyzing this information.

The ``kind`` field will be ``1`` (`Status reporting collectors`_).

Each daemon will have its own data collector, and each of them will have
a ``category`` field set to ``daemon``.

When executed in verbose mode, their ``data`` section will include at least:

``memory``
  The amount of used memory.

``size_unit``
  The measurement unit used for the memory.

``uptime``
  The uptime of the daemon.

``CPU usage``
  How much CPU the daemon is using (percentage).

Any other daemon-specific information can be included as well in the ``data``
section.

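A minimal Python sketch of how such a ``data`` section could be
assembled follows. It assumes the memory figure is taken from the
``VmRSS`` line of the Linux ``/proc/<pid>/status`` file, which is one
possible source and not something this design mandates. The function
names, the sample text, and the choice of seconds for ``uptime`` are
hypothetical.

```python
# Sample content in the style of /proc/<pid>/status (values invented).
SAMPLE_STATUS = """\
Name:\tganeti-noded
VmPeak:\t   52340 kB
VmRSS:\t   18244 kB
Threads:\t1
"""


def parse_rss_kib(status_text):
    """Return the resident set size in KiB, or None if it is not listed."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     18244 kB".
            return int(line.split()[1])
    return None


def daemon_data(status_text, uptime_seconds, cpu_percent):
    """Build the minimal ``data`` section described above.

    The uptime and CPU figures are passed in by the caller; how they are
    measured is outside the scope of this sketch.
    """
    return {
        "memory": parse_rss_kib(status_text),
        "size_unit": "KiB",
        "uptime": uptime_seconds,
        "CPU usage": cpu_percent,
    }
```

Daemon-specific fields would simply be added as further keys of the
returned dictionary.
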
Hypervisor resources report
+++++++++++++++++++++++++++

Each hypervisor has a view of system resources that sometimes is
different than the one the OS sees (for example in Xen the Node OS,
running as Dom0, has access to only part of those resources). In this
section we'll report all information we can in a "non hypervisor
specific" way. Each hypervisor can then add extra specific information
that is not generic enough to be abstracted.

The ``kind`` field will be ``0`` (`Performance reporting collectors`_).

Each of the hypervisor data collectors will be of ``category``: ``hypervisor``.

Node OS resources report
++++++++++++++++++++++++

Since Ganeti assumes it's running on Linux, it's useful to export some
basic information as seen by the host system.

The ``category`` field of the report will be ``null``.

The ``kind`` field will be ``0`` (`Performance reporting collectors`_).

The ``data`` section will include:

``cpu_number``
  The number of available CPUs.

``cpus``
  A list with one element per CPU, showing its average load.

``memory``
  The current view of memory (free, used, cached, etc.)

``filesystem``
  A list with one element per filesystem, showing a summary of the
  total/available space.

``NICs``
  A list with one element per network interface, showing the amount of
  sent/received data, error rate, IP address of the interface, etc.

``versions``
  A map using the name of a component Ganeti interacts with (Linux, DRBD,
  hypervisor, etc.) as the key and its version number as the value.

Note that we won't go into any hardware specific details (e.g. querying a
node RAID is outside the scope of this, and can be implemented as a
plugin) but we can easily just report the information above, since it's
standard enough across all systems.

Node OS CPU load average report
+++++++++++++++++++++++++++++++

This data collector will export CPU load statistics as seen by the host
system. Besides being consumed by an external monitoring system, the
data can also be used to improve instance allocation and/or the Ganeti
cluster balance. To compute the CPU load average we will use a number of
values collected inside a time window. The collection process will be
done by an independent thread (see `Mode of Operation`_).

This report is a subset of the previous report (`Node OS resources
report`_) and they might eventually get merged, once reporting for the
other fields (memory, filesystem, NICs) gets implemented too.

Specifically:

The ``category`` field of the report will be ``null``.

The ``kind`` field will be ``0`` (`Performance reporting collectors`_).

The ``data`` section will include:

``cpu_number``
  The number of available CPUs.

``cpus``
  A list with one element per CPU, showing its average load.

``cpu_total``
  The total CPU load average, as the sum of the averages of all the
  separate CPUs.

The CPU load report function will get N values, collected by the
CPU load collection function, and calculate the above averages. Please
see the section `Mode of Operation`_ for more information on how the
two functions of the data collector interact.

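The averaging step of the report function can be sketched in Python as
follows. This is only an illustration: it assumes each collected sample
is a list with one load value per CPU, and the function name
``cpu_load_report`` is hypothetical. How a single load value is measured
is not covered here.

```python
def cpu_load_report(samples):
    """Build the ``data`` section from the last N collected samples.

    ``samples`` is a list of samples, oldest first; each sample is a list
    holding one load value per CPU (an assumption of this sketch).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    cpu_number = len(samples[0])
    if any(len(sample) != cpu_number for sample in samples):
        raise ValueError("all samples must cover the same number of CPUs")
    # Average each CPU's load over the window of N samples.
    cpus = [sum(sample[i] for sample in samples) / len(samples)
            for i in range(cpu_number)]
    return {
        "cpu_number": cpu_number,
        "cpus": cpus,
        # Total load average: the sum of the per-CPU averages.
        "cpu_total": sum(cpus),
    }
```

For two samples ``[0.5, 0.25]`` and ``[1.0, 0.75]`` on a two-CPU node,
the per-CPU averages are ``0.75`` and ``0.5`` and ``cpu_total`` is
``1.25``.
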
Format of the query
-------------------

.. include:: monitoring-query-format.rst

Instance disk status propagation
--------------------------------

As with the instance status, Ganeti currently has only partial
information about its instance disks: in particular each node is unaware
of the disk to instance mapping, which exists only on the master.

For this design doc we plan to fix this by changing all RPCs that create
a backend storage or that put an already existing one in use, so that
they pass the relevant instance to the node. The node can then export
this information to the status reporting tool.

Until these RPC changes are implemented, we'll use Confd to fetch this
information in the data collectors.

Plugin system
-------------

The monitoring system will be equipped with a plugin system that can
export specific local information through it.

The plugin system is expected to be used by local installations to
export any installation specific information that they want to be
monitored, about either hardware or software on their systems.

The plugin system will be in the form of either scripts or binaries whose output
will be inserted in the report.

Eventually support for other kinds of plugins might be added as well, such as
plain text files which will be inserted into the report, or local Unix or
network sockets from which the information has to be read. This should allow
the most flexibility for implementing an efficient system, while keeping it as
simple as possible.

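A possible shape for the script/binary variant is sketched below in
Python. The design does not specify the plugin output format, the
timeout, or what happens when a plugin fails, so all of these are
assumptions of the sketch: plugins are taken to print one JSON value on
stdout, and failing plugins are skipped. The names ``run_plugin`` and
``add_plugin_reports`` are hypothetical.

```python
import json
import subprocess


def run_plugin(argv, timeout=10.0):
    """Run one plugin executable and return its parsed JSON output.

    ``argv`` is the command line as a list of strings. A non-zero exit
    status or a timeout raises a ``subprocess.SubprocessError`` subclass.
    """
    proc = subprocess.run(argv, capture_output=True, text=True,
                          timeout=timeout, check=True)
    return json.loads(proc.stdout)


def add_plugin_reports(report, plugins):
    """Append the output of every working plugin to ``report`` (a list).

    Plugins that cannot be started, fail, time out, or print invalid JSON
    are skipped, so that one broken plugin does not break the whole report.
    """
    for argv in plugins:
        try:
            report.append(run_plugin(argv))
        except (OSError, subprocess.SubprocessError, ValueError):
            continue
    return report
```

Skipping failures silently keeps the sketch short; a real agent would
more likely record the failure somewhere visible to the monitoring
system.
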
Data collectors
---------------

In order to ease testing as well as to make it simple to reuse this
subsystem it will be possible to run just the "data collectors" on each
node without passing through the agent daemon.

If a data collector is run independently, it should print its report on
stdout, according to the format corresponding to a single data collector
report object, as described in the previous paragraphs.

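A stand-alone collector could then be as small as the following Python
sketch. Only the ``kind``, ``category`` and ``data`` fields are shown;
the remaining fields of the report object are omitted for brevity, and
the function names and the single ``cpu_number`` entry are assumptions
made for the example.

```python
import json
import os
import sys


def collect():
    """Fetch the data immediately (no cache) and build a partial report."""
    return {
        "kind": 0,          # performance reporting collector
        "category": None,   # serialized as ``null``
        "data": {"cpu_number": os.cpu_count()},
    }


def render_report():
    """Serialize the report object as a JSON string."""
    return json.dumps(collect())


def main(out=sys.stdout):
    """Print the report on stdout, as a stand-alone collector would."""
    out.write(render_report() + "\n")


if __name__ == "__main__":
    main()
```
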
Mode of operation
-----------------

In order to be able to report information fast the monitoring agent
daemon will keep an in-memory or on-disk cache of the status, which will
be returned when queries are made. The status system will then
periodically check resources to make sure the status is up to date.

Different parts of the report will be queried at different speeds. These
will depend on:

- how often they vary (or we expect them to vary)
- how fast they are to query
- how important their freshness is

Of course the last parameter is installation specific, and while we'll
try to have defaults, it will be configurable. The first two, instead,
can be used adaptively to query a certain resource more or less often.

When run as stand-alone binaries, the data collectors will not use any
caching system, and will just fetch and return the data immediately.

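The in-memory variant of the cache can be sketched in Python as below.
This is an illustration under stated assumptions: one fixed refresh
interval per collector (the adaptive tuning is left out), and a class
name, method names and injectable ``clock`` parameter that are all
hypothetical.

```python
import time


class CachedCollector:
    """Keep the last result of a collector and refresh it when stale."""

    def __init__(self, collect, interval, clock=time.monotonic):
        self._collect = collect      # function that queries the resource
        self._interval = interval    # seconds between two refreshes
        self._clock = clock          # injectable for testing
        self._value = None
        self._fetched_at = None

    def refresh_if_stale(self):
        """Called periodically by the agent; queries only when needed."""
        now = self._clock()
        if (self._fetched_at is None
                or now - self._fetched_at >= self._interval):
            self._value = self._collect()
            self._fetched_at = now

    def cached(self):
        """Called when answering a query; never touches the resource."""
        return self._value
```

Keeping ``cached`` free of any resource access is what lets queries be
answered quickly, at the price of returning data that may be up to
``interval`` seconds old.
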
Since some performance collectors have to operate on a number of values
collected in previous times, we need a mechanism independent of the data
collector which will trigger the collection of those values and also
store them, so that they are available for calculation by the data
collectors.

To collect data periodically, a thread will be created by the monitoring
agent which will run the collection function of every data collector
that provides one. The values returned by the collection function of
the data collector will be saved in an appropriate map, associating each
value to the corresponding collector, using the collector's name as the
key of the map. This map will be stored in mond's memory.

The collectors are divided in two categories:

- stateless collectors, collectors that have immediate access to the
  reported information
- stateful collectors, collectors whose report is based on data collected
  in a previous time window

For example: the collection function of the CPU load collector will
collect a CPU load value and save it in the map mentioned above. The
collection function will be called by the collector thread every t
milliseconds. When the report function of the collector is called, it
will process the last N values of the map and calculate the
corresponding average.

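The interaction between the collector thread, the map and a stateful
collector's report function can be sketched in Python as follows. The
agent itself is planned in Haskell, so this is only a model of the
mechanism; the class and function names are hypothetical, the interval
is given in seconds for convenience, and the bounded queue that keeps
only the last N values is one possible way of limiting the stored data.

```python
import threading
from collections import deque


class CollectorState:
    """Map from collector name to the last N values it collected."""

    def __init__(self, window):
        self._window = window            # N: values kept per collector
        self._values = {}
        self._lock = threading.Lock()    # shared with the query handlers

    def store(self, name, value):
        with self._lock:
            queue = self._values.setdefault(name, deque(maxlen=self._window))
            queue.append(value)          # the oldest value drops out when full

    def last_values(self, name):
        with self._lock:
            return list(self._values.get(name, ()))


def collection_loop(state, collectors, interval, stop):
    """Body of the collector thread.

    ``collectors`` maps a collector name to its collection function; only
    stateful collectors provide one. Runs until ``stop`` is set.
    """
    while not stop.is_set():
        for name, collect in collectors.items():
            state.store(name, collect())
        stop.wait(interval)              # sleep, but wake up early on stop


def average_report(state, name):
    """Report function of a stateful collector: average of stored values."""
    values = state.last_values(name)
    return sum(values) / len(values) if values else None
```

The loop would typically be started with
``threading.Thread(target=collection_loop, args=(state, collectors, t, stop))``,
while ``average_report`` is called from the code answering a query.
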
Implementation place
--------------------

The status daemon will be implemented as a standalone Haskell daemon. In
the future it should be easy to merge multiple daemons into one with
multiple entry points, should we find out it saves resources and doesn't
impact functionality.

The libekg library should be looked at for easily providing metrics in
JSON format.

Implementation order
--------------------

We will implement the agent system in this order:

- initial example data collectors (e.g. for DRBD and instance status)
- initial daemon for exporting data, integrating the existing collectors
- plugin system
- RPC updates for instance status reasons and disk to instance mapping
- cache layer for the daemon
- more data collectors


Future work
===========

As a future step it can be useful to "centralize" all this reporting
data in a single place. This for example can be just the master node, or
all the master candidates. We will evaluate doing this after the first
node-local version has been developed and tested.

Another possible change is replacing the "read-only" RPCs with queries
to the agent system, thus having only one way of collecting information
from the nodes, both for a monitoring system and for Ganeti itself.

One extra feature we may need is a way to query for only sub-parts of
the report (e.g. instances status only). This can be done by passing
arguments to the HTTP GET, which will be defined when we get to this
functionality.

Finally, the autorepair system (see the
:doc:`autorepair system design <design-autorepair>`) can be expanded to
use the monitoring agent system as a source of information to decide
which repairs it can perform.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: