================
Node State Cache
================

.. contents:: :depth: 4

This is a design doc about the optimization of machine info retrieval.

    
Current State
=============

Currently every RPC call is quite expensive, as a TCP handshake has to
be made as well as an SSL negotiation. This is especially visible when
getting node and instance info over and over again.

This data, however, is quite easy to cache, but caching it needs some
changes to how we retrieve data over RPC, as the retrieval is spread
over several RPC calls that are hard to unify.

    
Proposed changes
================

To overcome this situation with multiple information retrieval calls,
we introduce one single RPC call that gets all the info in an
organized manner, so it can easily be stored in the cache.

As of now we have 3 different information RPC calls:

- ``call_node_info``: To retrieve disk and hypervisor information
- ``call_instance_info``: To retrieve hypervisor information for one
  instance
- ``call_all_instance_info``: To retrieve hypervisor information for
  all instances

In addition, ``call_all_instance_info`` and ``call_instance_info``
return different information in their result dicts.

To unify and organize the data, we introduce a new RPC call
``call_node_snapshot`` doing all of the above in one go. The data we
want to retrieve is specified through a dict of request types:
``CACHE_REQ_HV``, ``CACHE_REQ_DISKINFO`` and ``CACHE_REQ_BOOTID``.

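A minimal sketch of such a request (the exact request format, the
value types and the ``rpc`` runner object used here are illustrative
assumptions, not the final interface)::

  # Illustrative only: ask one node for hypervisor, disk and boot-id
  # information in a single round trip instead of three RPC calls.
  request = {
    constants.CACHE_REQ_HV: [constants.HT_XEN_PVM],
    constants.CACHE_REQ_DISKINFO: ["xenvg"],
    constants.CACHE_REQ_BOOTID: True,
  }
  result = rpc.call_node_snapshot("node1.example.com", request)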
    
As this cache represents the state of a given node, we use the name of
the node as the key to retrieve the data from the cache. A name-space
separation of node and instance data is not possible at this point,
because some of the node hypervisor information, such as free memory,
correlates with the instances running on the node.

An example of how the data for a node looks in the cache::

  {
    constants.CACHE_REQ_HV: {
      constants.HT_XEN_PVM: {
        _NODE_DATA: {
          "memory_total": 32763,
          "memory_free": 9159,
          "memory_dom0": 1024,
          "cpu_total": 4,
          "cpu_sockets": 2
        },
        _INSTANCES_DATA: {
          "inst1": {
            "memory": 4096,
            "state": "-b----",
            "time": 102399.3,
            "vcpus": 1
          },
          "inst2": {
            "memory": 4096,
            "state": "-b----",
            "time": 12280.0,
            "vcpus": 3
          }
        }
      }
    },
    constants.CACHE_REQ_DISKINFO: {
      "xenvg": {
        "vg_size": 1048576,
        "vg_free": 491520
      },
    },
    constants.CACHE_REQ_BOOTID: "0dd0983c-913d-4ce6-ad94-0eceb77b69f9"
  }

This way we get well-organized information which can simply be stored
in the cache.

The 3 RPC calls mentioned above will remain for compatibility reasons,
but will become simple wrappers around this new RPC call.

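As a rough illustration of the wrapper idea (the parameter list and
the result unpacking below are simplified assumptions, not the actual
``call_node_info`` signature; the wrapper is assumed to live on the
RPC runner)::

  # Illustrative only: the legacy call keeps its behaviour, but is
  # now implemented on top of the unified snapshot RPC.
  def call_node_info(self, node, vg_name, hypervisor_type):
    request = {
      constants.CACHE_REQ_HV: [hypervisor_type],
      constants.CACHE_REQ_DISKINFO: [vg_name],
      constants.CACHE_REQ_BOOTID: True,
    }
    snapshot = self.call_node_snapshot(node, request)
    # Return only the subset of data the old call provided.
    return (snapshot[constants.CACHE_REQ_DISKINFO][vg_name],
            snapshot[constants.CACHE_REQ_HV][hypervisor_type])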
    
Cache invalidation
------------------

The cache is invalidated at every RPC call which is not proven to
leave the state of a given node unmodified. This is to avoid
inconsistencies between the cache and the actual node state.

There are some corner cases which invalidate the whole cache at once,
as they usually affect other nodes' states too:

 - migrate/failover
 - import/export

A request will be served from the cache if and only if it can be
fulfilled entirely from it (i.e. all the CACHE_REQ_* entries are
already present). Otherwise, we will invalidate the cache and actually
do the remote call.

In addition, every cache entry will have a TTL of about 10 minutes,
which should be enough to accommodate most use cases.

We also allow an option to the calls to bypass the cache completely
and do a forced remote call. However, this will invalidate the present
entries and populate the cache with the newly retrieved values.

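A condensed sketch of this lookup logic (all names below, including
the cache structure and the ``rpc`` runner, are illustrative
assumptions rather than the final implementation)::

  import collections
  import time

  # Illustrative only: one entry per node, stamped for TTL checks.
  CacheEntry = collections.namedtuple("CacheEntry", "data timestamp")
  CACHE_TTL = 10 * 60  # seconds, matching the TTL described above
  _cache = {}

  def GetNodeSnapshot(node, request, bypass_cache=False):
    entry = _cache.get(node)
    if (not bypass_cache and entry is not None
        and time.time() - entry.timestamp < CACHE_TTL
        and all(req in entry.data for req in request)):
      # Serve from the cache: every requested CACHE_REQ_* entry is
      # present and the entry has not expired.
      return entry.data
    # Cache miss, stale entry or forced remote call: drop the old
    # entry and repopulate with the newly retrieved values.
    _cache.pop(node, None)
    data = rpc.call_node_snapshot(node, request)
    _cache[node] = CacheEntry(data=data, timestamp=time.time())
    return data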
    
Additional cache population
---------------------------

Besides the commands which use the above RPC calls, a full cache
population can also be done by a separate new op-code run periodically
by ``ganeti-watcher``. This op-code will be used instead of the old
ones.

    
Possible regressions
====================

As we change from getting one piece of hypervisor information to
getting everything we know about a hypervisor, we have a regression in
execution time: the process execution time is about 1.8x higher.
However, this does not include the latency and negotiation time needed
for each separate RPC call. Also, if we hit the cache, all 3 costs
will be 0; the only time taken is the cache lookup and the
deserialization of the data, which takes the time down from ~300ms
today to ~100ms.

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: