======================
Query version 2 design
======================

.. contents:: :depth: 4
.. highlight:: python

Current state and shortcomings
==============================

Queries are used to retrieve information about the cluster, e.g. a list
of instances or nodes. For historical reasons they use a simple data
structure for their result. The client submits the fields it would like
to receive and the query returns a list for each item (instance, node,
etc.) available. Each item consists of another list representing the
fields' values.

    
This data structure has a few drawbacks. It can't associate a status
(e.g. “node offline”) with a field, because using special values for
that purpose can lead to ambiguities. Nor can it mark fields as “not
found”, because the list of returned columns must match the fields
requested.

    
Example::

  >>> cli.GetClient().QueryNodes([], ["name", "pip", "mfree"], False)
  [
    ['node1.example.com', '192.0.2.18', 14800],
    ['node2.example.com', '192.0.2.19', 31280]
  ]
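
Because the old result identifies values only by position, a client has
to pair each row with its own copy of the requested field list. A
minimal sketch of that bookkeeping, reusing the data from the example
above; the helper name ``rows_to_dicts`` is hypothetical and not part of
Ganeti:

```python
# Illustrative only: the rows are copied from the example above.
fields = ["name", "pip", "mfree"]
rows = [
    ["node1.example.com", "192.0.2.18", 14800],
    ["node2.example.com", "192.0.2.19", 31280],
]


def rows_to_dicts(fields, rows):
    """Pair each positional value with the field name the client asked for."""
    result = []
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("row length does not match requested fields")
        result.append(dict(zip(fields, row)))
    return result


nodes = rows_to_dicts(fields, rows)
```

Nothing in such a row can express that a value is missing for a
particular reason, which is the gap the drawbacks above describe.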

    
There is no way for clients to determine the list of possible fields,
meaning they have to be hardcoded. Selecting unknown fields raises
an exception::

  >>> cli.GetClient().QueryNodes([], ["name", "UnknownField"], False)
  ganeti.errors.OpPrereqError: (u'Unknown output fields selected: UnknownField', u'wrong_input')

    
The client must also know each field's kind, that is whether a field is
numeric, boolean, describes a storage size, etc. Centralizing this
information in one place, the master daemon, is desirable.

    
Proposed changes
----------------

The current query result format cannot be changed as it's being used in
various places. Changing the format from one Ganeti version to another
would cause too much disruption. For this reason the ability to
explicitly request a new result format must be added while the old
format stays the default.

    
The implementation of query filters is planned for the future. To avoid
having to change the calls again, a (hopefully) future-compatible
interface will be implemented now.

In Python code, the objects described below will be implemented using
subclasses of ``objects.ConfigObject``, providing existing facilities
for de-/serializing.

    
Regular expressions
+++++++++++++++++++

As it turned out, only very few fields for instances used regular
expressions, all of which can easily be turned into static field names.
Therefore their use in field names is dropped. Reasons:

- When regexps are used and a field name is not listed as a simple
  string in the field dictionary, every key in the field dictionary has
  to be checked to see whether it is a regular expression object and,
  if so, matched against the name (see ``utils.FindMatch``).
- Code becomes simpler. There would no longer be any need to care about
  regular expressions as field names; they'd all be simple strings, even
  if there are many more. The list of field names would be static once
  built at module-load time.
- There's the issue of formatting titles for the clients. Should it be
  done in the server? In the client? The field definition's title would
  contain backreferences to the regexp groups in the field name
  (``re.MatchObject.expand`` can be used). With just strings, the field
  definitions can be passed directly to the client. They're static.
- Only a side note: In the memory consumed for 1'000
  ``_sre.SRE_Pattern`` objects (as returned by ``re.compile`` for an
  expression with one group) one can easily store 10'000 strings of the
  same length (the regexp objects keep the expression string around, so
  compiling the expression always uses more memory).
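
The static alternative described in the list above can be sketched as
follows. The limits ``MAX_NICS`` and ``MAX_DISKS``, the helper names and
the table layout are assumptions made for illustration; only the name
and title patterns are taken from the examples in this document:

```python
# Hypothetical limits, chosen only for this sketch.
MAX_NICS = 4
MAX_DISKS = 4


def build_field_table():
    """Return a dict mapping static field names to (title, kind) tuples."""
    table = {
        "name": ("Name", "text"),
    }
    # One plain-string entry per index instead of one regexp per family.
    for idx in range(MAX_NICS):
        table["nic%d.ip" % idx] = ("Nic.IP/%d" % idx, "text")
    for idx in range(MAX_DISKS):
        table["disk%d.size" % idx] = ("Disk.Size/%d" % idx, "unit")
    return table


# Built once at import time; lookups are then plain dictionary accesses.
FIELDS = build_field_table()


def lookup(name):
    """Return the (title, kind) tuple for a field, or None if unknown."""
    return FIELDS.get(name)
```

With a table like this, a lookup never has to iterate over pattern
objects, and the titles need no backreference expansion.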

    
.. _item-types:

Item types
++++++++++

The proposal is to implement this new interface for the following
items:

``instance``
  Instances
``node``
  Nodes
``job``
  Jobs
``lock``
  Locks
``os``
  Operating systems

    
.. _data-query:

Data query
++++++++++

.. _data-query-request:

Request
^^^^^^^

The request is a dictionary with the following entries:

    
``what`` (string, required)
  An :ref:`item type <item-types>`.
``fields`` (list of strings, required)
  List of names of fields to return. Example::

    ["name", "mem", "nic0.ip", "disk0.size", "disk1.size"]

``filter`` (optional)
  This will be used to filter queries. In this implementation only names
  can be filtered to replace the previous ``names`` parameter to
  queries. An empty filter (``None``) will return all items. To retrieve
  specific names, the filter must be specified as follows, with the
  inner part repeated for each name::

    ["|", ["=", "name", "node1"], ["=", "name", "node2"], …]

  Filters consist of S-expressions (``["operator", <operands…>]``) and
  extensions will be made in the future to allow for more operators and
  fields. Such extensions might include a Python-style "in" operator,
  but for simplicity only "=" is supported in this implementation.

  To reiterate: Filters for this implementation must consist of exactly
  one OR expression (``["|", …]``) containing one or more name equality
  filters (``["=", "name", "…"]``).
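
The restricted filter shape can be built and checked with a few lines of
code. This is a sketch only: ``make_name_filter`` and
``is_valid_name_filter`` are hypothetical helper names, and mapping an
empty list of names to ``None`` (all items) is an assumption:

```python
def make_name_filter(names):
    """Build the OR-of-equalities filter for the given item names.

    Returns None (meaning "all items") when no names are given.
    """
    if not names:
        return None
    return ["|"] + [["=", "name", name] for name in names]


def is_valid_name_filter(qfilter):
    """Check the restricted filter shape accepted by this design."""
    if qfilter is None:
        return True
    # Exactly one OR expression with at least one operand.
    if not isinstance(qfilter, list) or len(qfilter) < 2 or qfilter[0] != "|":
        return False
    # Every operand must be a name equality filter.
    return all(
        isinstance(part, list) and len(part) == 3
        and part[0] == "=" and part[1] == "name"
        and isinstance(part[2], str)
        for part in qfilter[1:]
    )
```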

    
Support for synchronous queries, currently available in the interface
but disabled in the master daemon, will be dropped. Direct calls to
opcodes have to be used instead.

    
.. _data-query-response:

Response
^^^^^^^^

The result is a dictionary with the following entries:

``fields`` (list of :ref:`field definitions <field-def>`)
  In-order list of a :ref:`field definition <field-def>` for each
  requested field. Unknown fields are returned with the kind
  ``unknown``. Length must be equal to the number of requested fields.
``data`` (list of lists of tuples)
  List of lists, one list for each item found. Each item's list must
  have one entry for each field listed in ``fields`` (meaning their
  length is equal). Each field entry is a tuple of ``(status, value)``.
  ``status`` must be one of the following values:

  Normal (numeric 0)
    Value is available and matches the kind in the :ref:`field
    definition <field-def>`.
  Unknown field (numeric 1)
    Field for this column is not known. Value must be ``None``.
  No data (numeric 2)
    Exact meaning depends on query, e.g. node is unreachable or marked
    offline. Value must be ``None``.
  Value unavailable for item (numeric 3)
    Used if, for example, NIC 3 is requested for an instance with only
    one network interface. Value must be ``None``.
  Resource offline (numeric 4)
    Used if resource is marked offline. Value must be ``None``.
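
The status rules above translate into a small consistency check. The
numeric codes come from the list above; the constant names and the
``check_entry`` helper are hypothetical and used only for this sketch:

```python
# Numeric status codes as listed above; the constant names are assumptions.
STATUS_NORMAL = 0
STATUS_UNKNOWN = 1
STATUS_NODATA = 2
STATUS_UNAVAIL = 3
STATUS_OFFLINE = 4

ALL_STATUSES = frozenset([
    STATUS_NORMAL,
    STATUS_UNKNOWN,
    STATUS_NODATA,
    STATUS_UNAVAIL,
    STATUS_OFFLINE,
])


def check_entry(entry):
    """Raise ValueError unless entry is a well-formed (status, value) pair."""
    if len(entry) != 2:
        raise ValueError("entry must be a (status, value) pair")
    status, value = entry
    if status not in ALL_STATUSES:
        raise ValueError("unknown status %r" % (status,))
    # Every status except "normal" requires the value to be None.
    if status != STATUS_NORMAL and value is not None:
        raise ValueError("value must be None for status %d" % status)
    return entry
```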

    
Example response after requesting the fields ``name``, ``mfree``,
``xyz``, ``mtotal``, ``nic0.ip``, ``nic1.ip`` and ``nic2.ip``::

  {
    "fields": [
      { "name": "name", "title": "Name", "kind": "text", },
      { "name": "mfree", "title": "MemFree", "kind": "unit", },
      # Unknown field
      { "name": "xyz", "title": None, "kind": "unknown", },
      { "name": "mtotal", "title": "MemTotal", "kind": "unit", },
      { "name": "nic0.ip", "title": "Nic.IP/0", "kind": "text", },
      { "name": "nic1.ip", "title": "Nic.IP/1", "kind": "text", },
      { "name": "nic2.ip", "title": "Nic.IP/2", "kind": "text", },
      ],

    "data": [
      [(0, "node1"), (0, 128), (1, None), (0, 4096),
       (0, "192.0.2.1"), (0, "192.0.2.2"), (3, None)],
      [(0, "node2"), (0, 96), (1, None), (0, 5000),
       (0, "192.0.2.21"), (0, "192.0.2.39"), (0, "192.0.2.90")],
      # Node not available, can't get "mfree" or "mtotal"
      [(0, "node3"), (2, None), (1, None), (2, None),
       (0, "192.0.2.30"), (3, None), (3, None)],
      ],
  }
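
A client can combine the ``fields`` and ``data`` entries to obtain one
dictionary per item. The following sketch keeps only values with status
0 (normal); the function name ``normal_values`` and the shortened sample
response are assumptions for illustration:

```python
def normal_values(response):
    """Yield one dict per item containing only fields with status 0."""
    names = [fdef["name"] for fdef in response["fields"]]
    for row in response["data"]:
        if len(row) != len(names):
            raise ValueError("row length differs from number of fields")
        yield dict(
            (name, value)
            for name, (status, value) in zip(names, row)
            if status == 0
        )


# Shortened variant of the example response above.
response = {
    "fields": [
        {"name": "name", "title": "Name", "kind": "text"},
        {"name": "mfree", "title": "MemFree", "kind": "unit"},
        {"name": "xyz", "title": None, "kind": "unknown"},
    ],
    "data": [
        [(0, "node1"), (0, 128), (1, None)],
        [(0, "node3"), (2, None), (1, None)],
    ],
}
items = list(normal_values(response))
```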

    
.. _fields-query:

Fields query
++++++++++++

.. _fields-query-request:

Request
^^^^^^^

The request is a dictionary with the following entries:

``what`` (string, required)
  An :ref:`item type <item-types>`.
``fields`` (list of strings, optional)
  List of names of fields to return. If not set, all fields are
  returned. Example::

    ["name", "mem", "nic0.ip", "disk0.size", "disk1.size"]

    
.. _fields-query-response:

Response
^^^^^^^^

The result is a dictionary with the following entries:

``fields`` (list of :ref:`field definitions <field-def>`)
  List of a :ref:`field definition <field-def>` for each field. If
  ``fields`` was set in the request and contained an unknown field, it
  is returned with the kind ``unknown``.

Example::

  {
    "fields": [
      { "name": "name", "title": "Name", "kind": "text", },
      { "name": "mfree", "title": "MemFree", "kind": "unit", },
      { "name": "mtotal", "title": "MemTotal", "kind": "unit", },
      { "name": "nic0.ip", "title": "Nic.IP/0", "kind": "text", },
      { "name": "nic1.ip", "title": "Nic.IP/1", "kind": "text", },
      { "name": "nic2.ip", "title": "Nic.IP/2", "kind": "text", },
      { "name": "nic3.ip", "title": "Nic.IP/3", "kind": "text", },
      # …
      { "name": "disk0.size", "title": "Disk.Size/0", "kind": "unit", },
      { "name": "disk1.size", "title": "Disk.Size/1", "kind": "unit", },
      { "name": "disk2.size", "title": "Disk.Size/2", "kind": "unit", },
      { "name": "disk3.size", "title": "Disk.Size/3", "kind": "unit", },
      # …
      ]
  }
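
One way to answer such a request from a static registry is sketched
below. The registry ``FIELD_DEFS``, the function ``query_fields`` and
the sorted order used when no selection is given are assumptions; the
sketch also ignores ``what`` and handles a single item type:

```python
# Hypothetical registry; a real one would cover every field of an item type.
FIELD_DEFS = {
    "name": {"name": "name", "title": "Name", "kind": "text"},
    "mfree": {"name": "mfree", "title": "MemFree", "kind": "unit"},
    "mtotal": {"name": "mtotal", "title": "MemTotal", "kind": "unit"},
}


def query_fields(request):
    """Answer a fields query from the static registry."""
    names = request.get("fields")
    if names is None:
        # No selection: return everything, sorted by name (an assumption).
        names = sorted(FIELD_DEFS)
    return {
        "fields": [
            # Unknown names get a placeholder definition of kind "unknown".
            FIELD_DEFS.get(n, {"name": n, "title": None, "kind": "unknown"})
            for n in names
        ],
    }
```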

    
.. _field-def:

Field definition
++++++++++++++++

A field definition is a dictionary with the following entries:

``name`` (string)
  Field name. Must only contain characters matching ``[a-z0-9/._]``.
``title`` (string)
  Human-readable title to use in output. Must not contain whitespace.
``kind`` (string)
  Field type, one of the following:

  ``unknown``
    Unknown field
  ``text``
    String
  ``bool``
    Boolean, true/false
  ``number``
    Numeric
  ``unit``
    Numeric, in megabytes
  ``timestamp``
    Unix timestamp in seconds since the epoch
  ``other``
    Free-form type, depending on query

  More types can be added in the future, so clients should default to
  formatting any unknown types the same way as "other", which should be
  a string representation in most cases.

``doc`` (string)
  Human-readable description. Must start with an uppercase character
  and must not end with punctuation or contain newlines.

.. TODO: Investigate whether there are fields with floating point
.. numbers

    
Example 1 (item name)::

  {
    "name": "name",
    "title": "Name",
    "kind": "text",
  }

Example 2 (free memory)::

  {
    "name": "mfree",
    "title": "MemFree",
    "kind": "unit",
  }

Example 3 (list of primary instances)::

  {
    "name": "pinst",
    "title": "PrimaryInstances",
    "kind": "other",
  }
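
The constraints on field definitions can be checked mechanically. In
this sketch the function ``check_field_def`` is hypothetical, ``doc`` is
treated as optional because the examples above omit it, a ``title`` of
``None`` is tolerated as in the unknown-field example, and the set of
characters counted as punctuation is an assumption:

```python
import re

_NAME_RE = re.compile(r"[a-z0-9/._]+")
KINDS = frozenset([
    "unknown", "text", "bool", "number", "unit", "timestamp", "other",
])
# Assumed set of punctuation characters a description must not end with.
_PUNCTUATION = tuple(".,;:!?")


def check_field_def(fdef):
    """Return a list of rule violations; empty if the definition is valid."""
    problems = []
    if not _NAME_RE.fullmatch(fdef.get("name", "")):
        problems.append("invalid name")
    title = fdef.get("title")
    if title is not None and re.search(r"\s", title):
        problems.append("title contains whitespace")
    if fdef.get("kind") not in KINDS:
        problems.append("kind is not in the list above")
    doc = fdef.get("doc")
    if doc is not None:
        if not doc[:1].isupper():
            problems.append("doc does not start with uppercase character")
        if "\n" in doc or doc.endswith(_PUNCTUATION):
            problems.append("doc ends with punctuation or contains newline")
    return problems
```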

    
.. _old-result-format:

Old result format
+++++++++++++++++

To limit the amount of code necessary, the :ref:`new result format
<data-query-response>` will be converted for clients calling the old
methods. Unavailable values are set to ``None``. If unknown fields were
requested, the whole query fails as the client expects exactly the
fields it requested.

    
331
.. _query2-luxi:
332

    
333
LUXI
334
++++
335

    
336
Currently query calls take a number of parameters, e.g. names, fields
337
and whether to use locking. These will continue to work and return the
338
:ref:`old result format <old-result-format>`. Only clients using the
339
new calls described below will be able to make use of new features such
340
as filters. Two new calls are introduced:
341

    
342
``Query``
343
  Execute a query on items, optionally filtered. Takes a single
344
  parameter, a :ref:`query object <data-query-request>` encoded as a
345
  dictionary and returns a :ref:`data query response
346
  <data-query-response>`.
347
``QueryFields``
348
  Return list of supported fields as :ref:`field definitions
349
  <field-def>`. Takes a single parameter, a :ref:`fields query object
350
  <fields-query-request>` encoded as a dictionary and returns a
351
  :ref:`fields query response <fields-query-response>`.
352

    
353

    
Python
++++++

The LUXI API is more or less mapped directly into Python. In addition to
the existing stub functions, new ones will be added for the new query
requests.

    
RAPI
++++

The RAPI interface already returns dictionaries for each item, but to
avoid breaking compatibility no changes should be made to the structure
(e.g. to include field definitions). The proposal here is to add a new
parameter to allow clients to execute the requests described in this
proposal directly and to receive the unmodified result. The new formats
are a lot more verbose, flexible and extensible.

    
.. _cli-programs:

CLI programs
++++++++++++

Command line programs might have difficulty displaying the verbose
status data to the user. There are several options:

- Use colours to indicate missing values
- Display status as value in parentheses, e.g. "(unavailable)"
- Hide unknown columns from the result table and print a warning
- Exit with non-zero code to indicate failures and/or missing data

Some are better for interactive usage, some better for use by other
programs. It is expected that a combination will be used. The column
separator (``--separator=…``) can be used to differentiate between
interactive and programmatic usage.
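
One possible combination of these options, shown purely as a sketch:
statuses are printed in parentheses and a non-zero exit code signals
missing data. The placeholder texts, the ``render`` function and the way
the separator is interpreted are all assumptions, not decided behaviour:

```python
# Hypothetical placeholder texts for the non-normal status codes.
STATUS_TEXT = {
    1: "(unknown)",
    2: "(nodata)",
    3: "(unavail)",
    4: "(offline)",
}


def render(response, separator=None):
    """Return (lines, exit_code) for a data query response.

    A separator of None is treated as interactive use: columns are
    joined with a single space. Any other value is used verbatim.
    """
    joiner = " " if separator is None else separator
    lines = []
    incomplete = False
    for row in response["data"]:
        cells = []
        for status, value in row:
            if status == 0:
                cells.append(str(value))
            else:
                # Remember that at least one value was missing.
                incomplete = True
                cells.append(STATUS_TEXT.get(status, "(?)"))
        lines.append(joiner.join(cells))
    return lines, (1 if incomplete else 0)


lines, exit_code = render(
    {"data": [[(0, "node1"), (0, 128)], [(0, "node3"), (2, None)]]},
    separator="|",
)
```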

    
Other discussed solutions
-------------------------

Another solution discussed was to add an additional column for each
non-static field containing the status. Clients interested in the status
could explicitly query for it.

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: