===================
Ganeti reason trail
===================

.. contents:: :depth: 2

This is a design document detailing the implementation of a way for Ganeti to
track the origin and the reason of every executed command, from its starting
point (command line, remote API, some htool, etc.) to its actual execution
time.

Current state and shortcomings
==============================

There is currently no way to track why a job and all the operations that are
part of it were executed, and who or what triggered the execution.
This is an inconvenience in general, and it also makes it impossible to obtain
certain information, such as the reason why an instance last changed its
status (i.e., why it was started, stopped, rebooted, etc.), or to distinguish
an admin request from a scheduled maintenance or an automated tool's work.

Proposed changes
================

We propose to introduce a new piece of information, called the "reason
trail", to track the path from the issuing of a command to its execution.

The reason trail will be a list of 3-tuples ``(source, reason, timestamp)``,
with:

``source``
  The entity deciding to perform (or forward) a command.
  It is represented by an arbitrary string, but strings prefixed with "gnt:"
  are reserved for Ganeti components, and they will be refused by the
  interfaces towards the external world.

``reason``
  The reason why the entity decided to perform the operation.
  It is represented by an arbitrary string. The string may be empty,
  because certain components of the system might just "pass on" the operation
  (therefore wanting to be recorded in the trail) without having an explicit
  reason.

``timestamp``
  The time when the element was added to the reason trail. It has to be
  expressed in nanoseconds since the Unix epoch (00:00:00 UTC, January 1, 1970).
  If not enough precision is available (or needed), it can be padded with
  zeroes.

The reason trail will be attached at the OpCode level. When it has to be
serialized externally (such as on the RAPI interface), it will be serialized in
JSON format. Specifically, it will be serialized as a list of elements.
Each element will be a list with two strings (for ``source`` and ``reason``)
and one integer number (the ``timestamp``).
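
The structure and its JSON form can be sketched in Python as follows. This is
only an illustration under stated assumptions: the helper names ``make_entry``,
``pad_seconds`` and ``serialize_trail`` are hypothetical and not part of
Ganeti, and ``time.time_ns`` requires Python 3.7 or later.

```python
import json
import time


def pad_seconds(seconds):
    """Pad a timestamp that only has second precision to nanoseconds."""
    return int(seconds) * 10**9


def make_entry(source, reason="", timestamp=None):
    """Return one (source, reason, timestamp) element.

    Hypothetical helper for illustration. The timestamp is an integer
    number of nanoseconds since the Unix epoch; the current time is used
    when none is given.
    """
    if timestamp is None:
        timestamp = time.time_ns()
    return (source, reason, int(timestamp))


def serialize_trail(trail):
    """Serialize a trail as a JSON list of [source, reason, timestamp]."""
    return json.dumps([list(entry) for entry in trail])


trail = [make_entry("user", "Cleanup of unused instances",
                    pad_seconds(1363088484))]
print(serialize_trail(trail))
# prints [["user", "Cleanup of unused instances", 1363088484000000000]]
```

Python integers have arbitrary precision, so the nanosecond value survives the
round trip unchanged; a consumer written in a language with 64-bit floats for
JSON numbers would need to take care here.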

Any component the operation goes through is allowed (but not required) to append
its own reason to the list. Other than this, the list shouldn't be modified.
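
A minimal sketch of this append-only rule, combined with the refusal of the
reserved "gnt:" prefix on external interfaces, might look like the following.
The function ``append_reason`` and its ``external`` flag are assumptions made
for illustration, not an existing Ganeti API.

```python
import time

# Prefix reserved for Ganeti components, as stated in this design.
RESERVED_PREFIX = "gnt:"


def append_reason(trail, source, reason="", external=True, now_ns=None):
    """Return a new trail with one element appended at the end.

    Hypothetical helper: existing elements are copied untouched, and a
    source using the reserved prefix is refused when the caller is an
    interface towards the external world.
    """
    if external and source.startswith(RESERVED_PREFIX):
        raise ValueError("source %r uses the reserved prefix" % source)
    if now_ns is None:
        now_ns = time.time_ns()
    return list(trail) + [(source, reason, now_ns)]


# An external caller adds its own reason, then a Ganeti component appends.
trail = append_reason([], "user", "Cleanup of unused instances")
trail = append_reason(trail, "gnt:client:gnt-instance", "stop",
                      external=False)
```

Returning a new list instead of mutating the argument is one way to make
accidental modification of earlier elements harder; it is a design choice of
this sketch, not a requirement of the design.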

As an example, here is the reason trail for a shutdown operation invoked from
the command line through the gnt-instance tool::

  [("user", "Cleanup of unused instances", 1363088484000000000),
   ("gnt:client:gnt-instance", "stop", 1363088484020000000),
   ("gnt:opcode:shutdown", "job=1234;index=0", 1363088484026000000),
   ("gnt:daemon:noded:shutdown", "", 1363088484135000000)]

where the first 3-tuple is determined by a user-specified message, passed to
gnt-instance through a command line parameter.

The same operation, launched by an external GUI tool, and executed through the
remote API, would have a reason trail like::

  [("user", "Cleanup of unused instances", 1363088484000000000),
   ("other-app:tool-name", "gui:stop", 1363088484000300000),
   ("gnt:client:rapi:shutdown", "", 1363088484020000000),
   ("gnt:library:rlib2:shutdown", "", 1363088484023000000),
   ("gnt:opcode:shutdown", "job=1234;index=0", 1363088484026000000),
   ("gnt:daemon:noded:shutdown", "", 1363088484135000000)]
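
Because every element carries a nanosecond timestamp, a consumer of the trail
can derive how long the command spent between components. The sketch below
does this for the command line example above; ``hop_delays_ms`` is a
hypothetical helper, not something this design defines.

```python
trail = [
    ("user", "Cleanup of unused instances", 1363088484000000000),
    ("gnt:client:gnt-instance", "stop", 1363088484020000000),
    ("gnt:opcode:shutdown", "job=1234;index=0", 1363088484026000000),
    ("gnt:daemon:noded:shutdown", "", 1363088484135000000),
]


def hop_delays_ms(trail):
    """Return (source, milliseconds since the previous element) pairs."""
    delays = []
    for previous, current in zip(trail, trail[1:]):
        elapsed_ns = current[2] - previous[2]
        delays.append((current[0], elapsed_ns // 10**6))
    return delays


for source, delay in hop_delays_ms(trail):
    print("%-28s +%d ms" % (source, delay))
```

For this trail the delays are 20 ms, 6 ms and 109 ms, which add up to the
135 ms between the user's request and the node daemon's element.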

Implementation
==============

The OpCode base class will be modified to include a new field, OP_REASON.
This will receive the reason trail as built by all the previous steps.

When an OpCode is added to a job (in jqueue.py) the job number and the opcode
index will be recorded as the reason for the existence of that opcode.
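
As a rough illustration of this step, the reason string could be built and
read back as shown below. The ``job=...;index=...`` layout is taken from the
example trails in this document; the function names are hypothetical and do
not describe the actual code in jqueue.py.

```python
def opcode_reason(job_id, index):
    """Format the reason recorded when an opcode is added to a job."""
    return "job=%d;index=%d" % (job_id, index)


def parse_opcode_reason(reason):
    """Parse a "job=N;index=M" reason back into (job_id, index)."""
    fields = dict(item.split("=", 1) for item in reason.split(";"))
    return int(fields["job"]), int(fields["index"])


# Element appended for the first opcode of job 1234, reusing the source
# name and timestamp from the example trail.
entry = ("gnt:opcode:shutdown", opcode_reason(1234, 0), 1363088484026000000)
print(entry[1])
# prints job=1234;index=0
```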

The implementation of this design will start from the operations that affect the
instance status. They will be changed so that the "reason" is passed to them.
They will then export the new expected instance status, together
with the associated reason, for use by the monitoring daemon.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: