
===========================================
Splitting the query and job execution paths
===========================================

Introduction
============

Currently, the master daemon performs two main roles:

- execute jobs that change the cluster state
- respond to queries

Due to technical details of the implementation, the job execution
and query paths interact with each other; for example, the "masterd
hang" issue that we had late in the 2.5 release cycle was caused by
the interaction between job queries and job execution.

Furthermore, also because of implementation details (Python lacking
read-only variables being one example), we cannot share internal job
data structures; instead, in the query path, we read them from disk
in order not to block job execution on locks.

All this points to the fact that integrating both queries and job
execution in the same (multi-threaded) process creates more problems
than advantages, and hence we should look into separating them.

Proposed design
===============

In Ganeti 2.7, we will introduce a separate, optional daemon to handle
queries (note: whether this is an actual "new" daemon, or its
functionality is folded into confd, remains to be seen).

This daemon will expose exactly the same Luxi interface as masterd,
except that job submission will be disabled. If so configured (at
build time), clients will be changed to:

- keep sending REQ_SUBMIT_JOB, REQ_SUBMIT_MANY_JOBS, and all other
  non-query requests to the masterd socket (as well as generic queries
  for QR_LOCK)
- redirect all other REQ_QUERY_* requests to the Luxi socket of the
  new daemon
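
The client-side dispatch described above could be sketched roughly as
follows; the socket paths, the request names, and the helper function
are illustrative assumptions, not the actual Ganeti client code:

```python
# Hypothetical socket paths; the real locations depend on the build
# configuration.
MASTERD_SOCKET = "/var/run/ganeti/socket/ganeti-master"
QUERY_SOCKET = "/var/run/ganeti/socket/ganeti-query"


def socket_for_request(method, args):
    """Return the Luxi socket a request should be sent to.

    Sketch of the routing rules above: everything except REQ_QUERY_*
    stays on masterd, and QR_LOCK queries are never redirected since
    locks are internal to the master daemon.
    """
    if not method.startswith("REQ_QUERY_"):
        # Job submission and all other non-query requests go to masterd.
        return MASTERD_SOCKET
    if args and args[0] == "QR_LOCK":
        # Lock queries are internal to masterd and are not redirected.
        return MASTERD_SOCKET
    # All remaining queries go to the new query daemon.
    return QUERY_SOCKET
```

The decision is purely local to the client, which keeps the two
daemons unaware of each other.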
45

    
46
This new daemon will serve both pure configuration queries (which
47
confd can already serve), and run-time queries (which currently only
48
masterd can serve). Since the RPC can be done from any node to any
49
node, the new daemon can run on all master candidates, not only on the
50
master node. This means that all gnt-* list options can be now run on
51
other nodes than the master node. If we implement this as a separate
52
daemon that talks to confd, then we could actually run this on all
53
nodes of the cluster (to be decided).
54

    
During the 2.7 release, masterd will still respond to queries itself,
but it will log all such queries so that "misbehaving" clients can be
identified.

Advantages
----------

As far as I can see, this will bring some significant advantages.

First, we remove any interaction between the job execution and
cluster query paths. This means that bugs in the locking code (job
execution) will impact neither queries of the cluster state nor
queries of the job execution itself. Furthermore, we will be able to
tune the two paths separately: job execution might use, e.g., 25
threads, whereas queries, being transient, could be served by a
practically unlimited number of threads.

As a result of the above split, we move from the current model, where
a shutdown of the master daemon practically breaks the entire Ganeti
functionality (no job execution, no queries, not even connecting to
the instance console), to a split model:

- if just masterd is stopped, other cluster functionality remains
  available: listing instances, connecting to the console of an
  instance, etc.
- if just "luxid" is stopped, masterd can still process jobs, and one
  can furthermore run queries from other nodes (master candidates)
- only if both are stopped do we end up in the previous state

This will help, for example, in the case where the master node has
crashed and we haven't failed it over yet: querying and investigating
the cluster state will still be possible from other master candidates
(on small clusters, this will mean from all nodes).

A last advantage is that we will finally be able to reduce the
footprint of masterd; instead of the previously discussed splitting
out of individual jobs, which would require duplicating all the base
functionality, this splits out just the queries, a much simpler piece
of code than job execution. This should be a reasonable amount of
work, with a much smaller impact in case of failure (we can still run
masterd as before).

Disadvantages
-------------

We might get increased inconsistency during queries, as there will be
a delay between masterd saving an updated configuration and
confd/query loading and parsing it. However, this could be compensated
for by the fact that queries will only look at "snapshots" of the
configuration, whereas before they could also see "in-progress"
modifications (due to the non-atomic updates). I think these effects
will cancel each other out; we will have to see in practice how it
works.

Another disadvantage *might* be that we have a more complex setup, due
to the introduction of a new daemon. However, the query path will be
much simpler, and when we remove the query functionality from masterd
we should have a more robust system.

Finally, there is QR_LOCK, an internal query related to the master
daemon that uses the same infrastructure as the other (cluster state)
queries. This is unfortunate, and will require some untangling in
order to keep code duplication low.

Long-term plans
===============

If this works well, the plan would be (tentatively) to disable the
query functionality in masterd completely in Ganeti 2.8, in order to
remove the duplication. This might change depending on whether and how
we split out the configuration/locking daemon.

Once we split this out, there is no technical reason why we can't
execute any query from any node, except perhaps practical ones
(network topology, remote nodes, etc.) or security ones (depending on
whether we want to change the cluster security model). In any case,
it should be possible to do this reliably from all master candidates.

Update: we decided to keep the restriction that queries run on the
master node. The reason is usability: it would be confusing if
querying worked on any node but job submission then suddenly failed.

Some implementation details
---------------------------

We will fold this into confd, at least initially, to reduce the
proliferation of daemons. Haskell (if used properly) will limit any
too-deep integration between the old confd functionality and the new
query one. As an advantage, we'll have a single daemon that handles
configuration queries.

The redirection of Luxi requests can easily be done based on the
request type, if we have both sockets open or if we open them on
demand.

We don't want masterd to talk to luxid itself (hidden redirection),
since we want to be able to run queries while masterd is down.

During the 2.7 release cycle, we can test all queries against both
masterd and luxid in QA, so that we know we have exactly the same
interface and that it is consistent.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: