======================
Query version 2 design
======================

.. contents:: :depth: 4
.. highlight:: python

Current state and shortcomings
==============================

Queries are used to retrieve information about the cluster, e.g. a list
of instances or nodes. For historical reasons they use a simple data
structure for their result. The client submits the fields it would like
to receive and the query returns a list for each item (instance, node,
etc.) available. Each item consists of another list representing the
fields' values.

This data structure has a few drawbacks. It can't associate a status
(e.g. “node offline”) with fields as using special values can lead to
ambiguities. Additionally it can't mark fields as “not found” as the
list of returned columns must match the fields requested.

Example::

  >>> cli.GetClient().QueryNodes([], ["name", "pip", "mfree"], False)
  [
    ['node1.example.com', '192.0.2.18', 14800],
    ['node2.example.com', '192.0.2.19', 31280]
  ]

    
There is no way for clients to determine the list of possible fields,
meaning they have to be hardcoded. Selecting unknown fields raises
an exception::

  >>> cli.GetClient().QueryNodes([], ["name", "UnknownField"], False)
  ganeti.errors.OpPrereqError: (u'Unknown output fields selected: UnknownField', u'wrong_input')

The client must also know each field's kind, that is whether a field is
numeric, boolean, describes a storage size, etc. Centralizing this
information in one place, the master daemon, is desirable.

    
Proposed changes
----------------

The current query result format cannot be changed as it's being used in
various places. Changing the format from one Ganeti version to another
would cause too much disruption. For this reason the ability to
explicitly request a new result format must be added while the old
format stays the default.

The implementation of query filters is planned for the future. To avoid
having to change the calls again, a (hopefully) future-compatible
interface will be implemented now.

In Python code, the objects described below will be implemented using
subclasses of ``objects.ConfigObject``, providing existing facilities
for de-/serializing.

    
Regular expressions
+++++++++++++++++++

As it turned out, only very few fields for instances used regular
expressions, all of which can easily be turned into static field names.
Therefore their use in field names is dropped. Reasons:

- When regexps are used and a field name is not listed as a simple
  string in the field dictionary, every key in the field dictionary has
  to be checked for being a regular expression object and, if it is one,
  matched against the name (see ``utils.FindMatch``).
- Code becomes simpler. There would be no need anymore to care about
  regular expressions as field names; they'd all be simple strings, even
  if there are many more. The list of field names would be static once
  built at module-load time.
- There's the issue of formatting titles for the clients. Should it be
  done in the server? In the client? The field definition's title would
  contain backreferences to the regexp groups in the field name
  (``re.MatchObject.expand`` can be used). With just strings, the field
  definitions can be passed directly to the client. They're static.
- As a side note: the memory consumed by 1,000
  ``_sre.SRE_Pattern`` objects (as returned by ``re.compile`` for an
  expression with one group) can easily hold 10,000 strings of the
  same length (the regexp objects keep the expression string around, so
  compiling the expression always uses more memory).
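
The static field list argued for above can be illustrated with a minimal
sketch. The upper bounds ``MAX_NICS`` and ``MAX_DISKS``, the helper names
and the short list of fields are assumptions made for this example only;
this design does not define them::

  # Hypothetical upper bounds; the real limits are not given in this design.
  MAX_NICS = 4
  MAX_DISKS = 4


  def build_field_names():
      """Expand the former regexp-style fields into plain strings."""
      names = ["name", "mem"]
      names.extend("nic%d.ip" % idx for idx in range(MAX_NICS))
      names.extend("disk%d.size" % idx for idx in range(MAX_DISKS))
      return frozenset(names)


  # Built once at module-load time; afterwards the set never changes.
  FIELD_NAMES = build_field_names()


  def is_known_field(name):
      """Plain set membership, no regular expression matching needed."""
      return name in FIELD_NAMES

With such a set, a lookup like ``is_known_field("nic0.ip")`` needs no
iteration over compiled patterns.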

    
.. _item-types:

Item types
++++++++++

The proposal is to implement this new interface for the following
items:

``instance``
  Instances
``node``
  Nodes
``job``
  Jobs
``lock``
  Locks

    
.. _data-query:

Data query
++++++++++

.. _data-query-request:

Request
^^^^^^^

The request is a dictionary with the following entries:

``what`` (string, required)
  An :ref:`item type <item-types>`.
``fields`` (list of strings, required)
  List of names of fields to return. Example::

    ["name", "mem", "nic0.ip", "disk0.size", "disk1.size"]

``filter`` (optional)
  This will be used to filter queries. In this implementation only names
  can be filtered, which replaces the previous ``names`` parameter to
  queries. An empty filter (``None``) will return all items. To retrieve
  specific names, the filter must be specified as follows, with the
  inner part repeated for each name::

    ["|", ["=", "name", "node1"], ["=", "name", "node2"], …]

  Filters consist of S-expressions (``["operator", <operands…>]``) and
  extensions will be made in the future to allow for more operators and
  fields. Such extensions might include a Python-style "in" operator,
  but for simplicity only "=" is supported in this implementation.

  To reiterate: Filters for this implementation must consist of exactly
  one OR expression (``["|", …]``) and one or more name equality filters
  (``["=", "name", "…"]``).
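
The filter shape described for ``filter`` can be sketched as a pair of
helpers. The function names are hypothetical and not part of this
design; the sketch only covers the single OR expression of name equality
filters that this implementation accepts::

  def make_name_filter(names):
      """Build a filter matching any of the given names."""
      if not names:
          # An empty filter (None) returns all items.
          return None
      return ["|"] + [["=", "name", name] for name in names]


  def check_name_filter(qfilter):
      """Check that a filter has the only shape accepted for now."""
      if qfilter is None:
          return True
      if not isinstance(qfilter, list) or len(qfilter) < 2:
          return False
      if qfilter[0] != "|":
          return False
      # Every operand must be a name equality filter.
      return all(isinstance(term, list) and len(term) == 3 and
                 term[0] == "=" and term[1] == "name" and
                 isinstance(term[2], str)
                 for term in qfilter[1:])

A caller replacing the old ``names`` parameter would pass its list of
names to ``make_name_filter`` and use the result as the ``filter`` entry.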

    
Support for synchronous queries, currently available in the interface
but disabled in the master daemon, will be dropped. Direct calls to
opcodes have to be used instead.

.. _data-query-response:

Response
^^^^^^^^

The result is a dictionary with the following entries:

``fields`` (list of :ref:`field definitions <field-def>`)
  In-order list of a :ref:`field definition <field-def>` for each
  requested field. Unknown fields are returned with the kind
  ``unknown``. Length must be equal to number of requested fields.
``data`` (list of lists of tuples)
  List of lists, one list for each item found. Each item's list must
  have one entry for each field listed in ``fields`` (meaning their
  length is equal). Each field entry is a tuple of ``(status, value)``.
  ``status`` must be one of the following values:

  Normal (numeric 0)
    Value is available and matches the kind in the :ref:`field
    definition <field-def>`.
  Unknown field (numeric 1)
    Field for this column is not known. Value must be ``None``.
  No data (numeric 2)
    Exact meaning depends on query, e.g. node is unreachable or marked
    offline. Value must be ``None``.
  Value unavailable for item (numeric 3)
    Used if, for example, NIC 3 is requested for an instance with only
    one network interface. Value must be ``None``.
  Resource offline (numeric 4)
    Used if resource is marked offline. Value must be ``None``.

    
Example response after requesting the fields ``name``, ``mfree``,
``xyz``, ``mtotal``, ``nic0.ip``, ``nic1.ip`` and ``nic2.ip``::

  {
    "fields": [
      { "name": "name", "title": "Name", "kind": "text", },
      { "name": "mfree", "title": "MemFree", "kind": "unit", },
      # Unknown field
      { "name": "xyz", "title": None, "kind": "unknown", },
      { "name": "mtotal", "title": "MemTotal", "kind": "unit", },
      { "name": "nic0.ip", "title": "Nic.IP/0", "kind": "text", },
      { "name": "nic1.ip", "title": "Nic.IP/1", "kind": "text", },
      { "name": "nic2.ip", "title": "Nic.IP/2", "kind": "text", },
      ],

    "data": [
      [(0, "node1"), (0, 128), (1, None), (0, 4096),
       (0, "192.0.2.1"), (0, "192.0.2.2"), (3, None)],
      [(0, "node2"), (0, 96), (1, None), (0, 5000),
       (0, "192.0.2.21"), (0, "192.0.2.39"), (0, "192.0.2.90")],
      # Node not available, can't get "mfree" or "mtotal"
      [(0, "node3"), (2, None), (1, None), (2, None),
       (0, "192.0.2.30"), (3, None), (3, None)],
      ],
  }
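
A client could consume such a response roughly as follows. This is only
a sketch: the constant and function names are invented for this example,
and dropping every non-normal value is just one possible policy::

  # Status code for a normal value, as listed in the response description.
  STATUS_NORMAL = 0


  def rows_as_dicts(response):
      """Yield one dict per item, containing only normal values."""
      names = [fdef["name"] for fdef in response["fields"]]
      for row in response["data"]:
          if len(row) != len(names):
              raise ValueError("Row length differs from number of fields")
          yield dict((name, value)
                     for name, (status, value) in zip(names, row)
                     if status == STATUS_NORMAL)


  response = {
      "fields": [
          {"name": "name", "title": "Name", "kind": "text"},
          {"name": "mfree", "title": "MemFree", "kind": "unit"},
          {"name": "xyz", "title": None, "kind": "unknown"},
      ],
      "data": [
          [(0, "node1"), (0, 128), (1, None)],
          # Node not available, "mfree" carries status 2 (no data)
          [(0, "node3"), (2, None), (1, None)],
      ],
  }

  result = list(rows_as_dicts(response))

Because the status travels with each value, the client can tell a
missing value apart from a legitimate one without reserving special
values.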

    
.. _fields-query:

Fields query
++++++++++++

.. _fields-query-request:

Request
^^^^^^^

The request is a dictionary with the following entries:

``what`` (string, required)
  An :ref:`item type <item-types>`.
``fields`` (list of strings, optional)
  List of names of fields to return. If not set, all fields are
  returned. Example::

    ["name", "mem", "nic0.ip", "disk0.size", "disk1.size"]

.. _fields-query-response:

Response
^^^^^^^^

The result is a dictionary with the following entries:

``fields`` (list of :ref:`field definitions <field-def>`)
  List of a :ref:`field definition <field-def>` for each field. If
  ``fields`` was set in the request and contained an unknown field, it
  is returned as type ``unknown``.

Example::

  {
    "fields": [
      { "name": "name", "title": "Name", "kind": "text", },
      { "name": "mfree", "title": "MemFree", "kind": "unit", },
      { "name": "mtotal", "title": "MemTotal", "kind": "unit", },
      { "name": "nic0.ip", "title": "Nic.IP/0", "kind": "text", },
      { "name": "nic1.ip", "title": "Nic.IP/1", "kind": "text", },
      { "name": "nic2.ip", "title": "Nic.IP/2", "kind": "text", },
      { "name": "nic3.ip", "title": "Nic.IP/3", "kind": "text", },
      # …
      { "name": "disk0.size", "title": "Disk.Size/0", "kind": "unit", },
      { "name": "disk1.size", "title": "Disk.Size/1", "kind": "unit", },
      { "name": "disk2.size", "title": "Disk.Size/2", "kind": "unit", },
      { "name": "disk3.size", "title": "Disk.Size/3", "kind": "unit", },
      # …
      ]
  }
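
How a fields query might be answered from a static table can be
sketched as below. ``NODE_FIELDS`` and ``query_fields`` are hypothetical
names, the table is a small made-up subset, and returning all fields
sorted by name is an assumption, since this design does not specify an
order for that case::

  # Hypothetical subset of node field definitions, keyed by field name.
  NODE_FIELDS = {
      "name": {"name": "name", "title": "Name", "kind": "text"},
      "mfree": {"name": "mfree", "title": "MemFree", "kind": "unit"},
      "mtotal": {"name": "mtotal", "title": "MemTotal", "kind": "unit"},
  }


  def query_fields(known, requested=None):
      """Return a fields query response for the requested field names."""
      if requested is None:
          # No selection: return all fields (sorted here for stable output).
          return {"fields": [known[name] for name in sorted(known)]}

      result = []
      for name in requested:
          if name in known:
              result.append(known[name])
          else:
              # Unknown fields are reported instead of raising an error.
              result.append({"name": name, "title": None, "kind": "unknown"})
      return {"fields": result}

Requesting ``["mfree", "xyz"]`` would then yield the definition of
``mfree`` followed by an entry of kind ``unknown`` for ``xyz``.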

    
.. _field-def:

Field definition
++++++++++++++++

A field definition is a dictionary with the following entries:

``name`` (string)
  Field name. Must only contain characters matching ``[a-z0-9/._]``.
``title`` (string)
  Human-readable title to use in output. Must not contain whitespace.
``kind`` (string)
  Field type, one of the following:

  ``unknown``
    Unknown field
  ``text``
    String
  ``bool``
    Boolean, true/false
  ``number``
    Numeric
  ``unit``
    Numeric, in megabytes
  ``timestamp``
    Unix timestamp in seconds since the epoch
  ``other``
    Free-form type, depending on query

  More types can be added in the future, so clients should default to
  formatting any unknown types the same way as "other", which should be
  a string representation in most cases.

.. TODO: Investigate whether there are fields with floating point
.. numbers

Example 1 (item name)::

  {
    "name": "name",
    "title": "Name",
    "kind": "text",
  }

Example 2 (free memory)::

  {
    "name": "mfree",
    "title": "MemFree",
    "kind": "unit",
  }

Example 3 (list of primary instances)::

  {
    "name": "pinst",
    "title": "PrimaryInstances",
    "kind": "other",
  }
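
On the client side, the ``kind`` entry can drive value formatting. The
sketch below is an assumption-laden illustration: the helper names and
the concrete output forms (``Y``/``N`` for booleans, an ``M`` suffix for
units, UTC timestamps) are choices made for this example, not part of
the design. It also shows the suggested fallback to "other" for kinds
the client does not know, and a check of the field name character set::

  import re
  import time

  # Field names may only contain these characters (see ``name`` above).
  # Rejecting the empty string is an extra assumption of this sketch.
  _NAME_RE = re.compile(r"\A[a-z0-9/._]+\Z")


  def is_valid_field_name(name):
      return bool(_NAME_RE.match(name))


  def _format_other(value):
      # Free-form values fall back to their string representation.
      return str(value)


  def _format_timestamp(value):
      return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(value))


  _FORMATTERS = {
      "text": str,
      "bool": lambda value: "Y" if value else "N",
      "number": str,
      "unit": lambda value: "%dM" % value,
      "timestamp": _format_timestamp,
      "other": _format_other,
  }


  def format_value(kind, value):
      """Format a value by kind, treating unknown kinds like "other"."""
      return _FORMATTERS.get(kind, _format_other)(value)

A kind introduced by a newer server therefore still produces readable
output in an older client.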

    
.. _old-result-format:

Old result format
+++++++++++++++++

To limit the amount of code necessary, the :ref:`new result format
<data-query-response>` will be converted for clients calling the old
methods. Unavailable values are set to ``None``. If unknown fields were
requested, the whole query fails as the client expects exactly the
fields it requested.
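
The conversion could look roughly like this. ``UnknownFieldsError`` and
``to_old_format`` are placeholder names for this sketch; the real code
would presumably raise one of the existing Ganeti error classes::

  STATUS_NORMAL = 0


  class UnknownFieldsError(Exception):
      """Placeholder for the error raised on unknown fields."""


  def to_old_format(response):
      """Convert a data query response into the old list-of-lists format."""
      unknown = [fdef["name"] for fdef in response["fields"]
                 if fdef["kind"] == "unknown"]
      if unknown:
          # The old format cannot express unknown fields, so fail as before.
          raise UnknownFieldsError("Unknown output fields selected: %s" %
                                   ",".join(unknown))

      # Drop the status; anything that is not a normal value becomes None.
      return [[value if status == STATUS_NORMAL else None
               for (status, value) in row]
              for row in response["data"]]

Old clients thus keep receiving plain value lists, at the price of
losing the distinction between the different non-normal statuses.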

    
.. _luxi:

LUXI
++++

Currently query calls take a number of parameters, e.g. names, fields
and whether to use locking. These will continue to work and return the
:ref:`old result format <old-result-format>`. Only clients using the
new calls described below will be able to make use of new features such
as filters. Two new calls are introduced:

``Query``
  Execute a query on items, optionally filtered. Takes a single
  parameter, a :ref:`query object <data-query-request>` encoded as a
  dictionary and returns a :ref:`data query response
  <data-query-response>`.
``QueryFields``
  Return list of supported fields as :ref:`field definitions
  <field-def>`. Takes a single parameter, a :ref:`fields query object
  <fields-query-request>` encoded as a dictionary and returns a
  :ref:`fields query response <fields-query-response>`.
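
The single dictionary parameter of each call can be assembled as in the
following sketch. The helper names are hypothetical, and how the
dictionaries are actually sent over LUXI is deliberately left out
because this design does not describe it::

  # Item types listed in this design.
  ITEM_TYPES = frozenset(["instance", "node", "job", "lock"])


  def make_query_request(what, fields, qfilter=None):
      """Build the dictionary passed to the ``Query`` call."""
      if what not in ITEM_TYPES:
          raise ValueError("Unknown item type: %s" % what)
      return {"what": what, "fields": list(fields), "filter": qfilter}


  def make_fields_request(what, fields=None):
      """Build the dictionary passed to the ``QueryFields`` call."""
      if what not in ITEM_TYPES:
          raise ValueError("Unknown item type: %s" % what)
      request = {"what": what}
      if fields is not None:
          # Leaving "fields" unset asks for all fields.
          request["fields"] = list(fields)
      return request

For example, ``make_query_request("node", ["name", "mfree"])`` describes
a query for two fields of all nodes, since the filter stays ``None``.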

    
Python
++++++

The LUXI API is more or less mapped directly into Python. In addition to
the existing stub functions, new ones will be added for the new query
requests.

RAPI
++++

The RAPI interface already returns dictionaries for each item, but to
avoid breaking compatibility no changes should be made to the structure
(e.g. to include field definitions). The proposal here is to add a new
parameter to allow clients to execute the requests described in this
proposal directly and to receive the unmodified result. The new formats
are a lot more verbose, flexible and extensible.

.. _cli-programs:

CLI programs
++++++++++++

Command line programs might have difficulty displaying the verbose
status data to the user. There are several options:

- Use colours to indicate missing values
- Display status as value in parentheses, e.g. "(unavailable)"
- Hide unknown columns from the result table and print a warning
- Exit with non-zero code to indicate failures and/or missing data

Some are better for interactive usage, some better for use by other
programs. It is expected that a combination will be used. The column
separator (``--separator=…``) can be used to differentiate between
interactive and programmatic usage.
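
A combination of three of these options (status in parentheses, hidden
unknown columns with a warning, non-zero exit code) might be sketched as
follows. All names and the status texts are assumptions made for this
example, and column alignment for interactive use is left out::

  STATUS_NORMAL = 0

  # Hypothetical display texts for the non-normal status codes.
  STATUS_TEXT = {
      1: "(unknown)",
      2: "(nodata)",
      3: "(unavailable)",
      4: "(offline)",
  }


  def format_table(response, separator=" "):
      """Return (lines, warnings, exit_code) for a data query response."""
      fields = response["fields"]
      # Hide columns of unknown fields and warn about them instead.
      keep = [idx for idx, fdef in enumerate(fields)
              if fdef["kind"] != "unknown"]
      warnings = ["Unknown field: %s" % fdef["name"]
                  for fdef in fields if fdef["kind"] == "unknown"]

      lines = [separator.join(fields[idx]["title"] for idx in keep)]
      missing = False
      for row in response["data"]:
          cells = []
          for idx in keep:
              status, value = row[idx]
              if status == STATUS_NORMAL:
                  cells.append(str(value))
              else:
                  missing = True
                  cells.append(STATUS_TEXT.get(status, "(?)"))
          lines.append(separator.join(cells))

      # A non-zero exit code signals unknown fields and/or missing data.
      exit_code = 1 if (warnings or missing) else 0
      return lines, warnings, exit_code

A caller would print ``lines`` to standard output, ``warnings`` to
standard error, and exit with ``exit_code``.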

    
Other discussed solutions
-------------------------

Another solution discussed was to add an additional column for each
non-static field containing the status. Clients interested in the status
could explicitly query for it.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: