======================
Query version 2 design
======================

.. contents:: :depth: 4
.. highlight:: python
    
Current state and shortcomings
==============================

Queries are used to retrieve information about the cluster, e.g. a list
of instances or nodes. For historical reasons they use a simple data
structure for their result. The client submits the fields it would like
to receive and the query returns a list for each item (instance, node,
etc.) available. Each item consists of another list representing the
fields' values.

This data structure has a few drawbacks. It can't associate a status
(e.g. “node offline”) with fields, as using special values can lead to
ambiguities. Additionally, it can't mark fields as “not found”, as the
list of returned columns must match the fields requested.

Example::

  >>> cli.GetClient().QueryNodes([], ["name", "pip", "mfree"], False)
  [
    ['node1.example.com', '192.0.2.18', 14800],
    ['node2.example.com', '192.0.2.19', 31280]
  ]

There is no way for clients to determine the list of possible fields,
meaning they have to be hardcoded. Selecting unknown fields raises
an exception::

  >>> cli.GetClient().QueryNodes([], ["name", "UnknownField"], False)
  ganeti.errors.OpPrereqError: (u'Unknown output fields selected: UnknownField', u'wrong_input')

The client must also know each field's kind, that is whether a field is
numeric, boolean, describes a storage size, etc. Centralizing this
information in one place, the master daemon, is desirable.
    
Proposed changes
----------------

The current query result format cannot be changed as it's being used in
various places. Changing the format from one Ganeti version to another
would cause too much disruption. For this reason the ability to
explicitly request a new result format must be added while the old
format stays the default.

The implementation of query filters is planned for the future. To avoid
having to change the calls again, a (hopefully) future-compatible
interface will be implemented now.

In Python code, the objects described below will be implemented using
subclasses of ``objects.ConfigObject``, providing existing facilities
for de-/serializing.
60
Regular expressions
61
+++++++++++++++++++
62

    
63
As it turned out, only very few fields for instances used regular
64
expressions, all of which can easily be turned into static field names.
65
Therefore their use in field names is dropped. Reasons:
66

    
67
- When regexps are used and a field name is not listed as a simple
68
  string in the field dictionary, all keys in the field dictionary have
69
  to be checked whether they're a regular expression object and if so,
70
  matched (see ``utils.FindMatch``).
71
- Code becomes simpler. There would be no need anymore to care about
72
  regular expressions as field names—they'd all be simple strings, even
73
  if there are many more. The list of field names would be static once
74
  built at module-load time.
75
- There's the issue of formatting titles for the clients. Should it be
76
  done in the server? In the client? The field definition's title would
77
  contain backreferences to the regexp groups in the field name
78
  (``re.MatchObject.expand`` can be used). With just strings, the field
79
  definitions can be passed directly to the client. They're static.
80
- Only a side note: In the memory consumed for 1'000
81
  ``_sre.SRE_Pattern`` objects (as returned by ``re.compile`` for an
82
  expression with one group) one can easily store 10'000 strings of the
83
  same length (the regexp objects keep the expression string around, so
84
  compiling the expression always uses more memory).
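
The idea of a static list of field names built at module-load time can
be sketched as follows. This is only an illustration: the function name
``build_static_fields``, the limit of eight NICs and the title format
are assumptions made for the example, not part of this design.

```python
# Hypothetical upper bound on NICs per instance, chosen only for this sketch
MAX_NICS = 8


def build_static_fields(max_nics=MAX_NICS):
  """Returns a dictionary mapping static field names to titles."""
  fields = {"name": "Name"}
  for idx in range(max_nics):
    # One plain string per index replaces a regular expression field name
    fields["nic%d.ip" % idx] = "Nic.IP/%d" % idx
  return fields


# Built once at module-load time; lookups are plain dictionary accesses
FIELDS = build_static_fields()
```

With such a dictionary, looking up a field is a single key access and
the titles need no backreference expansion.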
    
.. _item-types:

Item types
++++++++++

The proposal is to implement this new interface for the following
items:

``instance``
  Instances
``node``
  Nodes
``job``
  Jobs
``lock``
  Locks
    
.. _data-query:

Data query
++++++++++

.. _data-query-request:

Request
^^^^^^^

The request is a dictionary with the following entries:

``kind`` (string, required)
  An :ref:`item type <item-types>`.
``fields`` (list of strings, required)
  List of names of fields to return. Example::

    ["name", "mem", "nic0.ip", "disk0.size", "disk1.size"]

``filter`` (optional)
  This will be used to filter queries. In this implementation only names
  can be filtered, replacing the previous ``names`` parameter to
  queries. An empty filter (``None``) will return all items. To retrieve
  specific names, the filter must be specified as follows, with the
  inner part repeated for each name::

    ["|", ["=", "name", "node1"], ["=", "name", "node2"], …]

  Filters consist of S-expressions (``["operator", <operands…>]``) and
  extensions will be made in the future to allow for more operators and
  fields. Such extensions might include a Python-style "in" operator,
  but for simplicity only "=" is supported in this implementation.

  To reiterate: Filters for this implementation must consist of exactly
  one OR expression (``["|", …]``) and one or more name equality filters
  (``["=", "name", "…"]``).

Support for synchronous queries, currently available in the interface
but disabled in the master daemon, will be dropped. Direct calls to
opcodes have to be used instead.
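
A client could build the name filter described above with a small
helper. The following is a minimal sketch; the function name
``make_name_filter`` is hypothetical and not part of the proposed
interface.

```python
def make_name_filter(names):
  """Builds a filter matching any of the given names.

  Returns None (the empty filter, matching all items) for an empty list.
  """
  if not names:
    return None
  # Exactly one OR expression wrapping one equality filter per name
  return ["|"] + [["=", "name", name] for name in names]


# Complete request dictionary for a data query on two nodes
request = {
  "kind": "node",
  "fields": ["name", "mfree"],
  "filter": make_name_filter(["node1", "node2"]),
}
```

Keeping the construction in one place would make it easier to adapt
clients once more operators become available.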
    
.. _data-query-response:

Response
^^^^^^^^

The result is a dictionary with the following entries:

``fields`` (list of :ref:`field definitions <field-def>`)
  In-order list of a :ref:`field definition <field-def>` for each
  requested field. Unknown fields are returned with the kind
  ``unknown``. Length must be equal to number of requested fields.
``data`` (list of lists of tuples)
  List of lists, one list for each item found. Each item's list must
  have one entry for each field listed in ``fields`` (meaning their
  length is equal). Each field entry is a tuple of ``(status, value)``.
  ``status`` must be one of the following values:

  Normal (numeric 0)
    Value is available and matches the kind in the :ref:`field
    definition <field-def>`.
  Unknown field (numeric 1)
    Field for this column is not known. Value must be ``None``.
  No data (numeric 2)
    Exact meaning depends on query, e.g. node is unreachable or marked
    offline. Value must be ``None``.
  Value unavailable for item (numeric 3)
    Used if, for example, NIC 3 is requested for an instance with only
    one network interface. Value must be ``None``.

Example response after requesting the fields ``name``, ``mfree``,
``xyz``, ``mtotal``, ``nic0.ip``, ``nic1.ip`` and ``nic2.ip``::

  {
    "fields": [
      { "name": "name", "title": "Name", "kind": "text", },
      { "name": "mfree", "title": "MemFree", "kind": "unit", },
      # Unknown field
      { "name": "xyz", "title": None, "kind": "unknown", },
      { "name": "mtotal", "title": "MemTotal", "kind": "unit", },
      { "name": "nic0.ip", "title": "Nic.IP/0", "kind": "text", },
      { "name": "nic1.ip", "title": "Nic.IP/1", "kind": "text", },
      { "name": "nic2.ip", "title": "Nic.IP/2", "kind": "text", },
      ],

    "data": [
      [(0, "node1"), (0, 128), (1, None), (0, 4096),
       (0, "192.0.2.1"), (0, "192.0.2.2"), (3, None)],
      [(0, "node2"), (0, 96), (1, None), (0, 5000),
       (0, "192.0.2.21"), (0, "192.0.2.39"), (0, "192.0.2.90")],
      # Node not available, can't get "mfree" or "mtotal"
      [(0, "node3"), (2, None), (1, None), (2, None),
       (0, "192.0.2.30"), (3, None), (3, None)],
      ],
  }
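
A client might interpret such a response roughly as follows. This is a
sketch under assumptions: the constant names ``STATUS_*`` and the
function ``rows_as_dicts`` are invented for the example; only the
numeric status values and the layout of the dictionary come from the
description above.

```python
# Numeric status values from the list above; the constant names are
# invented for this sketch
STATUS_NORMAL = 0
STATUS_UNKNOWN = 1
STATUS_NODATA = 2
STATUS_UNAVAIL = 3


def rows_as_dicts(response):
  """Turns each data row into a (values, problems) pair.

  "values" maps field names to values (None unless the status is
  normal); "problems" maps field names to their non-normal status.
  """
  names = [fdef["name"] for fdef in response["fields"]]
  result = []
  for row in response["data"]:
    # Each row must have exactly one entry per field definition
    assert len(row) == len(names)
    values = {}
    problems = {}
    for name, (status, value) in zip(names, row):
      if status == STATUS_NORMAL:
        values[name] = value
      else:
        values[name] = None
        problems[name] = status
    result.append((values, problems))
  return result


# Shortened version of the example response
response = {
  "fields": [
    {"name": "name", "title": "Name", "kind": "text"},
    {"name": "mfree", "title": "MemFree", "kind": "unit"},
    {"name": "xyz", "title": None, "kind": "unknown"},
  ],
  "data": [
    [(0, "node1"), (0, 128), (1, None)],
    [(0, "node3"), (2, None), (1, None)],
  ],
}

rows = rows_as_dicts(response)
```

Because the status travels with every value, the client can tell a
missing value apart from an unknown field without relying on special
values.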
    
.. _fields-query:

Fields query
++++++++++++

.. _fields-query-request:

Request
^^^^^^^

The request is a dictionary with the following entries:

``kind`` (string, required)
  An :ref:`item type <item-types>`.
``fields`` (list of strings, optional)
  List of names of fields to return. If not set, all fields are
  returned. Example::

    ["name", "mem", "nic0.ip", "disk0.size", "disk1.size"]

.. _fields-query-response:

Response
^^^^^^^^

The result is a dictionary with the following entries:

``fields`` (list of :ref:`field definitions <field-def>`)
  List of a :ref:`field definition <field-def>` for each field. If
  ``fields`` was set in the request and contained an unknown field, it
  is returned with the kind ``unknown``.

Example::

  {
    "fields": [
      { "name": "name", "title": "Name", "kind": "text", },
      { "name": "mfree", "title": "MemFree", "kind": "unit", },
      { "name": "mtotal", "title": "MemTotal", "kind": "unit", },
      { "name": "nic0.ip", "title": "Nic.IP/0", "kind": "text", },
      { "name": "nic1.ip", "title": "Nic.IP/1", "kind": "text", },
      { "name": "nic2.ip", "title": "Nic.IP/2", "kind": "text", },
      { "name": "nic3.ip", "title": "Nic.IP/3", "kind": "text", },
      # …
      { "name": "disk0.size", "title": "Disk.Size/0", "kind": "unit", },
      { "name": "disk1.size", "title": "Disk.Size/1", "kind": "unit", },
      { "name": "disk2.size", "title": "Disk.Size/2", "kind": "unit", },
      { "name": "disk3.size", "title": "Disk.Size/3", "kind": "unit", },
      # …
      ]
  }
    
.. _field-def:

Field definition
++++++++++++++++

A field definition is a dictionary with the following entries:

``name`` (string)
  Field name. Must only contain characters matching ``[a-z0-9/._]``.
``title`` (string)
  Human-readable title to use in output. Must not contain whitespace.
``kind`` (string)
  Field type, one of the following:

  ``unknown``
    Unknown field
  ``text``
    String
  ``bool``
    Boolean, true/false
  ``number``
    Numeric
  ``unit``
    Numeric, in megabytes
  ``timestamp``
    Unix timestamp in seconds since the epoch
  ``other``
    Free-form type, depending on query

  More types can be added in the future, so clients should default to
  formatting any unknown types the same way as "other", which should be
  a string representation in most cases.
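
The fallback rule for field kinds could look like this on the client
side. The sketch is not prescriptive: the function name
``format_value`` and the concrete output formats (``Y``/``N`` for
booleans, an ``M`` suffix for units, UTC for timestamps) are
assumptions chosen for illustration.

```python
import time


def format_value(kind, value):
  """Formats a value with "normal" status according to its field kind."""
  if kind == "text":
    return value
  elif kind == "bool":
    return "Y" if value else "N"
  elif kind == "number":
    return str(value)
  elif kind == "unit":
    # Values of this kind are in megabytes
    return "%dM" % value
  elif kind == "timestamp":
    # Seconds since the epoch, shown in UTC
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(value))
  # "other" and any kind added in the future: plain string representation
  return str(value)
```

A kind the client has never seen therefore still produces readable
output instead of an error.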
    
.. TODO: Investigate whether there are fields with floating point
.. numbers

Example 1 (item name)::

  {
    "name": "name",
    "title": "Name",
    "kind": "text",
  }

Example 2 (free memory)::

  {
    "name": "mfree",
    "title": "MemFree",
    "kind": "unit",
  }

Example 3 (list of primary instances)::

  {
    "name": "pinst",
    "title": "PrimaryInstances",
    "kind": "other",
  }
    
.. _old-result-format:

Old result format
+++++++++++++++++

To limit the amount of code necessary, the :ref:`new result format
<data-query-response>` will be converted for clients calling the old
methods. Unavailable values are set to ``None``. If unknown fields were
requested, the whole query fails as the client expects exactly the
fields it requested.
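
The conversion could be as small as the following sketch. The function
name ``to_old_format`` and the exception class ``UnknownFieldsError``
are hypothetical; the actual implementation may raise a different error
type.

```python
class UnknownFieldsError(Exception):
  """Hypothetical error type; raised when unknown fields were requested."""


def to_old_format(response):
  """Converts a new-style response into the old list-of-lists format."""
  unknown = [fdef["name"] for fdef in response["fields"]
             if fdef["kind"] == "unknown"]
  if unknown:
    # Old clients expect exactly the fields they requested
    raise UnknownFieldsError("Unknown output fields selected: %s" %
                             ",".join(unknown))
  # Values of entries without normal status are None already, so
  # dropping the status is sufficient
  return [[value for (_, value) in row] for row in response["data"]]


new_style = {
  "fields": [
    {"name": "name", "title": "Name", "kind": "text"},
    {"name": "mfree", "title": "MemFree", "kind": "unit"},
  ],
  "data": [
    [(0, "node1"), (0, 128)],
    [(0, "node3"), (2, None)],
  ],
}

old_style = to_old_format(new_style)
```

This relies on the rule that every entry with a status other than
normal carries ``None`` as its value.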
    
.. _luxi:

LUXI
++++

Currently query calls take a number of parameters, e.g. names, fields
and whether to use locking. These will continue to work and return the
:ref:`old result format <old-result-format>`. Only clients using the
new calls described below will be able to make use of new features such
as filters. Two new calls are introduced:

``Query``
  Execute a query on items, optionally filtered. Takes a single
  parameter, a :ref:`query object <data-query-request>` encoded as a
  dictionary and returns a :ref:`data query response
  <data-query-response>`.
``QueryFields``
  Return list of supported fields as :ref:`field definitions
  <field-def>`. Takes a single parameter, a :ref:`fields query object
  <fields-query-request>` encoded as a dictionary and returns a
  :ref:`fields query response <fields-query-response>`.
    
Python
++++++

The LUXI API is more or less mapped directly into Python. In addition to
the existing stub functions, new ones will be added for the new query
requests.

RAPI
++++

The RAPI interface already returns dictionaries for each item, but to
avoid breaking compatibility no changes should be made to the structure
(e.g. to include field definitions). The proposal here is to add a new
parameter to allow clients to execute the requests described in this
proposal directly and to receive the unmodified result. The new formats
are a lot more verbose, flexible and extensible.
    
.. _cli-programs:

CLI programs
++++++++++++

Command line programs might have difficulty displaying the verbose
status data to the user. There are several options:

- Use colours to indicate missing values
- Display status as value in parentheses, e.g. "(unavailable)"
- Hide unknown columns from the result table and print a warning
- Exit with non-zero code to indicate failures and/or missing data

Some are better for interactive usage, some better for use by other
programs. It is expected that a combination will be used. The column
separator (``--separator=…``) can be used to differentiate between
interactive and programmatic usage.
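
One possible combination is sketched below: verbose placeholders in
parentheses for interactive output and short ones when a separator was
given. All placeholder texts and the function names ``format_cell`` and
``format_row`` are assumptions for this example; the design does not
prescribe them.

```python
# Placeholder texts per status: (interactive, programmatic). These are
# assumptions made for this sketch.
STATUS_TEXT = {
  1: ("(unknown)", "??"),
  2: ("(nodata)", "?"),
  3: ("(unavail)", "-"),
}


def format_cell(status, value, separator=None):
  """Formats one (status, value) entry for a table cell.

  Verbose placeholders are used for interactive output; short ones if a
  column separator was given, which suggests programmatic use.
  """
  if status == 0:
    return str(value)
  (verbose, terse) = STATUS_TEXT[status]
  return verbose if separator is None else terse


def format_row(row, separator=None):
  """Joins the cells of one row, using a space if no separator is set."""
  cells = [format_cell(status, value, separator) for (status, value) in row]
  return (" " if separator is None else separator).join(cells)
```

For a row ``[(0, "node3"), (2, None)]`` this yields ``node3 (nodata)``
interactively and ``node3|?`` with ``|`` as the separator.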
    
Other discussed solutions
-------------------------

Another solution discussed was to add an additional column for each
non-static field containing the status. Clients interested in the status
could explicitly query for it.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: