================
Node State Cache
================

.. contents:: :depth: 4

This is a design doc about the optimization of machine info retrieval.

Current State
=============

Currently every RPC call is quite expensive, as a TCP handshake has to
be made as well as an SSL negotiation. This is especially visible when
getting node and instance info over and over again.

This data, however, is quite easy to cache, but caching it needs some
changes to how we retrieve data over RPC, as the retrieval is spread
over several RPC calls that are hard to unify.

Proposed changes
================

To overcome this situation with multiple information retrieval calls
we introduce one single RPC call that returns all the info in an
organized manner, so it can easily be stored in the cache.

As of now we have 3 different information RPC calls:

- ``call_node_info``: To retrieve disk and hypervisor information
- ``call_instance_info``: To retrieve hypervisor information for one
  instance
- ``call_all_instance_info``: To retrieve hypervisor information for
  all instances

Note also that ``call_all_instance_info`` and ``call_instance_info``
return different information in their result dicts.

To unify and organize the data we introduce a new RPC call,
``call_node_snapshot``, which does all of the above in one go. Which
data we want to retrieve is specified by a dict of request types:
``CACHE_REQ_HV``, ``CACHE_REQ_DISKINFO``, ``CACHE_REQ_BOOTID``. A
sketch of such a request is shown below.

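
The following is a minimal sketch of how a caller could assemble such
a request dict. The constant values, the node name and the exact
signature of ``call_node_snapshot`` used here are assumptions made for
this illustration, not part of the proposal::

  # Proposed new request-type constants; the concrete values are
  # illustrative only.
  CACHE_REQ_HV = "hv"
  CACHE_REQ_DISKINFO = "diskinfo"
  CACHE_REQ_BOOTID = "bootid"

  # Ask for everything we may want to cache about this node.
  request = {
    CACHE_REQ_HV: ["xen-pvm"],       # hypervisors we care about
    CACHE_REQ_DISKINFO: ["xenvg"],   # volume groups to query
    CACHE_REQ_BOOTID: None,          # no parameters needed
  }

  # Hypothetical invocation through the RPC runner; the exact call
  # signature is not fixed by this document.
  result = rpc_runner.call_node_snapshot("node1.example.com", request)
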

As this cache represents the state of a given node, we use the name of
a node as the key to retrieve the data from the cache. A namespace
separation of node and instance data is not possible at this point,
because some of the node's hypervisor information, such as free
memory, is correlated with the instances running on it.

An example of what the cached data for a node looks like::

  {
    constants.CACHE_REQ_HV: {
      constants.HT_XEN_PVM: {
        _NODE_DATA: {
          "memory_total": 32763,
          "memory_free": 9159,
          "memory_dom0": 1024,
          "cpu_total": 4,
          "cpu_sockets": 2
        },
        _INSTANCES_DATA: {
          "inst1": {
            "memory": 4096,
            "state": "-b----",
            "time": 102399.3,
            "vcpus": 1
          },
          "inst2": {
            "memory": 4096,
            "state": "-b----",
            "time": 12280.0,
            "vcpus": 3
          }
        }
      }
    },
    constants.CACHE_REQ_DISKINFO: {
      "xenvg": {
        "vg_size": 1048576,
        "vg_free": 491520
      }
    },
    constants.CACHE_REQ_BOOTID: "0dd0983c-913d-4ce6-ad94-0eceb77b69f9"
  }

This way the information is easy to organize and can simply be
arranged in the cache.

The 3 RPC calls mentioned above will remain for compatibility reasons,
but they will become simple wrappers around this new RPC call, along
the lines of the sketch below.

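
As a minimal sketch (the signature, result shape and helper names here
are assumptions made for this illustration, not the final interface),
one of the old calls could be reduced to something like::

  def call_node_info(node, vg_name, hypervisor_type):
    """Compatibility wrapper around the new snapshot call."""
    request = {
      CACHE_REQ_HV: [hypervisor_type],
      CACHE_REQ_DISKINFO: [vg_name],
      CACHE_REQ_BOOTID: None,
    }
    snapshot = call_node_snapshot(node, request)
    # Reassemble the result in the shape the old callers expect.
    return (snapshot[CACHE_REQ_BOOTID],
            snapshot[CACHE_REQ_DISKINFO][vg_name],
            snapshot[CACHE_REQ_HV][hypervisor_type])
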

Cache invalidation
------------------

The cache is invalidated on every RPC call that is not proven to leave
the state of a given node unmodified. This is to avoid inconsistencies
between the cache and the actual node state.

There are some corner cases which invalidate the whole cache at once,
as they usually affect the state of other nodes too:

- migrate/failover
- import/export

A request will be served from the cache if and only if it can be
fulfilled entirely from it (i.e. all the requested ``CACHE_REQ_*``
entries are already present). Otherwise, we will invalidate the cache
and actually do the remote call.

In addition, every cache entry will have a TTL of about 10 minutes,
which should be enough to accommodate most use cases.

We also allow an option on the calls to bypass the cache completely
and force a remote call. However, this will invalidate the present
entries and repopulate the cache with the newly retrieved values. A
sketch of this lookup logic is given below.

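
The following sketch illustrates the intended lookup behaviour. The
class, its methods and the way the remote call is made are assumptions
made for this illustration only::

  import time

  CACHE_TTL = 10 * 60  # seconds, per the TTL proposed above


  class NodeStateCache(object):
    """Illustrative cache keyed by node name."""

    def __init__(self):
      self._data = {}        # node name -> {request type: value}
      self._timestamps = {}  # node name -> time of last refresh

    def Get(self, node, request, bypass=False):
      """Serve a request from the cache or fall back to RPC."""
      entry = self._data.get(node)
      stale = (entry is None or
               time.time() - self._timestamps[node] > CACHE_TTL)
      # Serve from the cache only if it is fresh, not bypassed and
      # can satisfy *all* requested CACHE_REQ_* types.
      if not bypass and not stale and all(r in entry for r in request):
        return dict((r, entry[r]) for r in request)
      # Otherwise invalidate the entry and do the actual remote call.
      self.Invalidate(node)
      result = call_node_snapshot(node, request)
      self._data[node] = result
      self._timestamps[node] = time.time()
      return result

    def Invalidate(self, node=None):
      """Drop one node's entry, or everything for the corner cases."""
      if node is None:
        self._data.clear()
        self._timestamps.clear()
      else:
        self._data.pop(node, None)
        self._timestamps.pop(node, None)
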

Additional cache population
---------------------------

Besides the commands which invoke the above RPC calls, a full cache
population can also be done by a separate new op-code that is run
periodically by ``ganeti-watcher``. This op-code will be used instead
of the old ones; a sketch is given below.
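
A purely illustrative sketch of what the periodic full population
could boil down to, with the op-code logic reduced to a plain function
and all names assumed rather than taken from the actual code base::

  def PopulateNodeStateCache(cache, nodes, hypervisors, vgs):
    """Warm the cache for every node in one sweep.

    This is roughly what the new op-code, run periodically by
    ganeti-watcher, would do.
    """
    request = {
      CACHE_REQ_HV: hypervisors,
      CACHE_REQ_DISKINFO: vgs,
      CACHE_REQ_BOOTID: None,
    }
    for node in nodes:
      # Bypass the cache so the entries are refreshed unconditionally.
      cache.Get(node, request, bypass=True)
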

Possible regressions
====================

As we change from getting "one hypervisor's information" to a "get
everything we know about this hypervisor" style, there is a regression
in execution time: process execution time is about 1.8x higher.
However, this does not include the latency and negotiation time needed
for each separate RPC call. Furthermore, if we hit the cache, all
three costs (TCP handshake, SSL negotiation and process execution)
disappear; the only time spent is the lookup in the cache and the
deserialization of the data, which brings the time down from today's
~300ms to ~100ms.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: