===================
Ganeti reason trail
===================

.. contents:: :depth: 2

This design document describes a way for Ganeti to track the origin of, and
the reason for, every executed command, from its starting point (command line,
remote API, an htool, etc.) to its actual execution.

Current state and shortcomings
==============================

There is currently no way to track why a job, and all the operations that are
part of it, were executed, or who or what triggered the execution.
This is an inconvenience in general, and it also makes certain information
impossible to obtain, such as the reason why an instance last changed its
status (i.e. why it was started, stopped, rebooted, etc.), or whether a change
came from an admin request, a scheduled maintenance or an automated tool.

Proposed changes
================

We propose to introduce a new piece of information, called the "reason
trail", to track the path from the issuing of a command to its execution.

The reason trail will be a list of 3-tuples ``(source, reason, timestamp)``,
with:

``source``
  The entity deciding to perform (or forward) a command.
  It is represented by an arbitrary string, but strings prefixed with "gnt:"
  are reserved for Ganeti components, and they will be refused by the
  interfaces towards the external world.

``reason``
  The reason why the entity decided to perform the operation.
  It is represented by an arbitrary string. The string may be empty,
  because certain components of the system might just "pass on" the operation
  (and therefore want to be recorded in the trail) without having an explicit
  reason.

``timestamp``
  The time when the element was added to the reason trail. It has to be
  expressed in nanoseconds since the Unix epoch (00:00:00 UTC, January 1,
  1970). If not enough precision is available (or needed) it can be padded
  with zeroes.

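The three fields described above can be sketched in Python as follows. This
is only an illustration of the data layout, not the Ganeti implementation: the
names ``make_element``, ``check_external_source`` and ``seconds_to_ns`` are
hypothetical, and raising ``ValueError`` is just one possible way for an
external interface to refuse a reserved source.

```python
import time

# Prefix reserved for Ganeti components, as stated in the design.
RESERVED_PREFIX = "gnt:"


def make_element(source, reason="", timestamp_ns=None):
    """Build one (source, reason, timestamp) element of a reason trail.

    The reason may be empty; the timestamp defaults to the current time
    in nanoseconds since the Unix epoch.
    """
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    return (source, reason, timestamp_ns)


def check_external_source(source):
    """Refuse a source coming from the external world if it is reserved."""
    if source.startswith(RESERVED_PREFIX):
        raise ValueError("source %r uses the reserved prefix %r"
                         % (source, RESERVED_PREFIX))
    return source


def seconds_to_ns(seconds):
    """Pad a whole-second timestamp with zeroes to nanosecond precision."""
    return int(seconds) * 10**9
```

For instance, ``seconds_to_ns(1363088484)`` yields the first timestamp used in
the examples of this document, ``1363088484000000000``.
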
The reason trail will be attached at the OpCode level. When it has to be
serialized externally (such as on the RAPI interface), it will be serialized in
JSON format. Specifically, it will be serialized as a list of elements.
Each element will be a list with two strings (for ``source`` and ``reason``)
and one integer number (the ``timestamp``).

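A minimal sketch of this serialization, using only the Python standard
``json`` module. The helper names ``trail_to_json`` and ``trail_from_json``
are hypothetical, and the type checks on decoding are an assumption about how
strictly a receiver might validate its input:

```python
import json


def trail_to_json(trail):
    """Serialize a reason trail as a JSON list of 3-element lists."""
    return json.dumps([list(element) for element in trail])


def trail_from_json(text):
    """Decode a JSON reason trail back into a list of 3-tuples."""
    trail = []
    for element in json.loads(text):
        # Unpacking fails with ValueError unless there are exactly 3 items.
        source, reason, timestamp = element
        if not isinstance(source, str) or not isinstance(reason, str):
            raise TypeError("source and reason must be strings")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(timestamp, bool) or not isinstance(timestamp, int):
            raise TypeError("timestamp must be an integer")
        trail.append((source, reason, timestamp))
    return trail


trail = [
    ("user", "Cleanup of unused instances", 1363088484000000000),
    ("gnt:client:gnt-instance", "stop", 1363088484020000000),
]
encoded = trail_to_json(trail)
decoded = trail_from_json(encoded)
```

Here ``decoded`` is equal to ``trail`` again: Python integers hold nanosecond
timestamps without loss of precision.
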
Any component the operation goes through is allowed (but not required) to append
its own reason to the list. Other than this, the list shouldn't be modified.

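The append-only rule could be honoured by a helper along these lines. This is
a sketch under assumptions: ``append_reason`` is a hypothetical name, and
returning a new list rather than mutating the argument is one design choice
among several.

```python
import time


def append_reason(trail, source, reason=""):
    """Return a new trail with one element appended at the end.

    The existing elements are copied unchanged, so the caller's list is
    never modified or reordered.
    """
    return list(trail) + [(source, reason, time.time_ns())]


received = [("user", "Cleanup of unused instances", 1363088484000000000)]
forwarded = append_reason(received, "gnt:client:gnt-instance", "stop")
```

After the call, ``received`` still holds a single element, while ``forwarded``
holds the same element followed by the new one.
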
As an example, here is the reason trail for a shutdown operation invoked from
the command line through the gnt-instance tool::

  [("user", "Cleanup of unused instances", 1363088484000000000),
   ("gnt:client:gnt-instance", "stop", 1363088484020000000),
   ("gnt:opcode:shutdown", "job=1234;index=0", 1363088484026000000),
   ("gnt:daemon:noded:shutdown", "", 1363088484135000000)]

where the first 3-tuple is determined by a user-specified message, passed to
gnt-instance through a command line parameter.

The same operation, launched by an external GUI tool and executed through the
remote API, would have a reason trail like::

  [("user", "Cleanup of unused instances", 1363088484000000000),
   ("other-app:tool-name", "gui:stop", 1363088484000300000),
   ("gnt:client:rapi:shutdown", "", 1363088484020000000),
   ("gnt:library:rlib2:shutdown", "", 1363088484023000000),
   ("gnt:opcode:shutdown", "job=1234;index=0", 1363088484026000000),
   ("gnt:daemon:noded:shutdown", "", 1363088484135000000)]

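To illustrate how such a trail could be read back, the sketch below splits
the remote API example into the entries added outside Ganeti and computes the
time elapsed between the first and the last entry. The function names are
hypothetical; only the data and the "gnt:" prefix come from this document.

```python
RAPI_TRAIL = [
    ("user", "Cleanup of unused instances", 1363088484000000000),
    ("other-app:tool-name", "gui:stop", 1363088484000300000),
    ("gnt:client:rapi:shutdown", "", 1363088484020000000),
    ("gnt:library:rlib2:shutdown", "", 1363088484023000000),
    ("gnt:opcode:shutdown", "job=1234;index=0", 1363088484026000000),
    ("gnt:daemon:noded:shutdown", "", 1363088484135000000),
]


def external_entries(trail):
    """Entries whose source is not a Ganeti component."""
    return [entry for entry in trail if not entry[0].startswith("gnt:")]


def elapsed_ns(trail):
    """Nanoseconds between the first and the last entry of the trail."""
    return trail[-1][2] - trail[0][2]


origin_sources = [source for source, _, _ in external_entries(RAPI_TRAIL)]
total_ns = elapsed_ns(RAPI_TRAIL)
```

With the data above, ``origin_sources`` is ``["user", "other-app:tool-name"]``
and ``total_ns`` is ``135000000``, i.e. 135 milliseconds from the user request
to the entry added by the node daemon.
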
Implementation
==============

The OpCode base class will be modified to include a new field, OP_REASON.
This will receive the reason trail as built by all the previous steps.

When an OpCode is added to a job (in jqueue.py) the job number and the opcode
index will be recorded as the reason for the existence of that opcode.

The implementation of this design will start from the operations that affect the
instance status. They will be changed so that the "reason" is passed to them.
They will then export the new expected instance status, together with the
associated reason, for the monitoring daemon.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: