Revision c9af0703 docs/source/devguide.rst

b/docs/source/devguide.rst
6 6

  
7 7
Pithos is a storage service implemented by GRNET (http://www.grnet.gr). Data is stored as objects, organized in containers, belonging to an account. This hierarchy of storage layers has been inspired by the OpenStack Object Storage (OOS) API and the similar CloudFiles API by Rackspace. The Pithos API follows the OOS API as closely as possible. One of the design requirements has been that clients built for OOS can be used with Pithos without changes.
8 8

  
9
However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated, or appended to. They can also be versioned, meaning that the server will track changes, assign version numbers and allow reading previous instances.
9
However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated, or appended to. Automatic version management allows taking account and container listings back in time, as well as reading previous instances of objects.
10 10

  
11
The storage backend of Pithos is block oriented, which allows for efficient, deduplicated data placement. The block structure of objects is exposed at the API layer, in order to encourage external software to implement advanced data management operations.
11
The storage backend of Pithos is block oriented, permitting efficient, deduplicated data placement. The block structure of objects is exposed at the API layer, in order to encourage external software to implement advanced data management operations.
12 12

  
13 13
This document's goals are:
14 14

  
......
25 25
=========================  ================================
26 26
Revision                   Description
27 27
=========================  ================================
28
0.3 (June 10, 2011)        Allow for publicly available objects via ``https://hostname/public``.
29
\                          Support time-variant account/container listings. 
30
\                          Add source version when duplicating with PUT/COPY/MOVE.
31
\                          Request version in object HEAD/GET requests (list versions with GET).
28 32
0.2 (May 31, 2011)         Add object meta listing and filtering in containers.
29 33
\                          Include underlying storage characteristics in container meta.
30 34
\                          Support for partial object updates through POST.
......
36 40
The Pithos API
37 41
--------------
38 42

  
39
The URI requests supported by Pithos follow one of the following forms:
43
The URI requests supported by the Pithos API follow one of the following forms:
40 44

  
41 45
* Top level: ``https://hostname/v1/``
42 46
* Account level: ``https://hostname/v1/<account>``
43 47
* Container level: ``https://hostname/v1/<account>/<container>``
44 48
* Object level: ``https://hostname/v1/<account>/<container>/<object>``
45 49

  
46
All requests must include an ``X-Auth-Token``, except from those that refer to publicly available files (**TBD**). The process of obtaining the token is still to be determined (**TBD**).
50
All requests must include an ``X-Auth-Token``. The process of obtaining the token is still to be determined (**TBD**).
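
The URI forms and the token header can be sketched as follows, assuming Python's standard library. The helper names ``pithos_url`` and ``authed_request``, the host name and the token value are illustrative only and not part of the Pithos API; how the token is obtained is not covered.

```python
from urllib.parse import quote
from urllib.request import Request


def pithos_url(host, account=None, container=None, obj=None):
    """Build a top, account, container or object level URI (hypothetical helper)."""
    segments = []
    # Object names may contain "/" (virtual directories), so it stays unescaped there.
    for name, safe in ((account, ""), (container, ""), (obj, "/")):
        if name is None:
            break  # deeper levels only make sense below the shallower ones
        segments.append(quote(name, safe=safe))
    return "https://{}/v1/{}".format(host, "/".join(segments))


def authed_request(url, token, method="GET"):
    """Prepare (but do not send) a request carrying the X-Auth-Token header."""
    return Request(url, method=method, headers={"X-Auth-Token": token})


req = authed_request(
    pithos_url("hostname", "user", "pithos", "docs/a b.txt"),
    token="0000",  # placeholder value
    method="HEAD",
)
print(req.get_method(), req.full_url)
```

The request object is only constructed here; sending it (for example with ``urllib.request.urlopen``) requires a reachable server and a valid token.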
47 51

  
48 52
The allowable request operations and respective return codes per level are presented in the remainder of this chapter. Common to all requests are the following return codes.
49 53

  
......
95 99
HEAD
96 100
""""
97 101

  
98
No request parameters/headers.
102
======================  ===================================
103
Request Parameter Name  Value
104
======================  ===================================
105
until                   Optional timestamp
106
======================  ===================================
107

  
108
|
99 109

  
100 110
==========================  =====================
101 111
Reply Header Name           Value
......
105 115
X-Account-Bytes-Used        The total number of bytes stored
106 116
X-Account-Bytes-Remaining   The total number of bytes remaining (**TBD**)
107 117
X-Account-Last-Login        The last login (**TBD**)
118
X-Account-Until-Timestamp   The last account modification date until the timestamp provided
108 119
X-Account-Meta-*            Optional user defined metadata
109
Last-Modified               The last object modification date
120
Last-Modified               The last account modification date (regardless of ``until``)
110 121
==========================  =====================
111 122

  
112 123
|
......
136 147
limit                   The amount of results requested (default is 10000)
137 148
marker                  Return containers with name lexicographically after marker
138 149
format                  Optional extended reply type (can be ``json`` or ``xml``)
150
until                   Optional timestamp
139 151
======================  =========================
140 152

  
141 153
The reply is a list of container names. Account headers (as in a ``HEAD`` request) will also be included.
142 154
If a ``format=xml`` or ``format=json`` argument is given, extended information on the containers will be returned, serialized in the chosen format.
143 155
For each container, the information will include all container metadata (names will be in lower case and with hyphens replaced with underscores):
144 156

  
145
===================  ============================
146
Name                 Description
147
===================  ============================
148
name                 The name of the container
149
count                The number of objects inside the container
150
bytes                The total size of the objects inside the container
151
last_modified        The last object modification date
152
x_container_meta_*   Optional user defined metadata
153
===================  ============================
157
===========================  ============================
158
Name                         Description
159
===========================  ============================
160
name                         The name of the container
161
count                        The number of objects inside the container
162
bytes                        The total size of the objects inside the container
163
last_modified                The last container modification date (regardless of ``until``)
164
x_container_until_timestamp  The last container modification date until the timestamp provided
165
x_container_meta_*           Optional user defined metadata
166
===========================  ============================
154 167

  
155 168
For examples of container details returned in JSON/XML formats refer to the OOS API documentation.
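
As a rough illustration of the request parameters for an account level ``GET``, the sketch below assembles a listing URI. The function name ``account_listing_url`` is hypothetical, and passing ``until`` as an integer Unix timestamp is an assumption; the accepted timestamp format is not specified here.

```python
from urllib.parse import urlencode


def account_listing_url(host, account, limit=None, marker=None, fmt=None, until=None):
    """Build an account level listing URI with the optional query parameters."""
    candidates = (
        ("limit", limit),
        ("marker", marker),
        ("format", fmt),
        ("until", until),  # assumed to be a Unix timestamp
    )
    # Only parameters that were actually given end up in the query string.
    query = urlencode([(key, value) for key, value in candidates if value is not None])
    url = "https://{}/v1/{}".format(host, account)
    return url + "?" + query if query else url


print(account_listing_url("hostname", "user", fmt="json", until=1307700892))
```

Without ``until`` the listing reflects the current state; with it, the reply is expected to describe the account as it was up to that timestamp.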
156 169

  
......
205 218
HEAD
206 219
""""
207 220

  
208
No request parameters/headers.
221
======================  ===================================
222
Request Parameter Name  Value
223
======================  ===================================
224
until                   Optional timestamp
225
======================  ===================================
209 226

  
210
==========================  ===============================
211
Reply Header Name           Value
212
==========================  ===============================
213
X-Container-Object-Count    The total number of objects in the container
214
X-Container-Bytes-Used      The total number of bytes of all objects stored
215
X-Container-Meta-*          Optional user defined metadata
216
X-Container-Object-Meta     A list with all meta keys used by objects
217
X-Container-Block-Size      The block size used by the storage backend
218
X-Container-Block-Hash      The hash algorithm used for block identifiers in object hashmaps
219
Last-Modified               The last object modification date
220
==========================  ===============================
227
|
228

  
229
===========================  ===============================
230
Reply Header Name            Value
231
===========================  ===============================
232
X-Container-Object-Count     The total number of objects in the container
233
X-Container-Bytes-Used       The total number of bytes of all objects stored
234
X-Container-Block-Size       The block size used by the storage backend
235
X-Container-Block-Hash       The hash algorithm used for block identifiers in object hashmaps
236
X-Container-Until-Timestamp  The last container modification date until the timestamp provided
237
X-Container-Object-Meta      A list with all meta keys used by objects
238
X-Container-Meta-*           Optional user defined metadata
239
Last-Modified                The last container modification date (regardless of ``until``)
240
===========================  ===============================
221 241

  
222 242
The keys returned in ``X-Container-Object-Meta`` are all the unique strings after the ``X-Object-Meta-`` prefix.
223 243

  
......
250 270
path                    Assume ``prefix=path`` and ``delimiter=/``
251 271
format                  Optional extended reply type (can be ``json`` or ``xml``)
252 272
meta                    Return objects having the specified meta keys (can be a comma separated list)
273
until                   Optional timestamp
253 274
======================  ===================================
254 275

  
255 276
The ``path`` parameter overrides ``prefix`` and ``delimiter``. When using ``path``, results will include objects ending in ``delimiter``.
......
260 281
If a ``format=xml`` or ``format=json`` argument is given, extended information on the objects will be returned, serialized in the chosen format.
261 282
For each object, the information will include all object metadata (names will be in lower case and with hyphens replaced with underscores):
262 283

  
263
===================  ======================================
264
Name                 Description
265
===================  ======================================
266
name                 The name of the object
267
hash                 The ETag of the object
268
bytes                The size of the object
269
content_type         The MIME content type of the object
270
content_encoding     The encoding of the object (optional)
271
last_modified        The last object modification date
272
x_object_manifest    Large object support
273
x_object_meta_*      Optional user defined metadata
274
===================  ======================================
284
==========================  ======================================
285
Name                        Description
286
==========================  ======================================
287
name                        The name of the object
288
hash                        The ETag of the object
289
bytes                       The size of the object
290
content_type                The MIME content type of the object
291
content_encoding            The encoding of the object (optional)
292
content_disposition         The presentation style of the object (optional)
293
last_modified               The last object modification date (regardless of version)
294
x_object_version            The object's version identifier
295
x_object_version_timestamp  The object's version timestamp
296
x_object_manifest           Large object support (optional)
297
x_object_public             Object is publicly accessible (optional)
298
x_object_meta_*             Optional user defined metadata
299
==========================  ======================================
275 300

  
276 301
Extended replies may also include virtual directory markers in separate sections of the ``json`` or ``xml`` results.
277 302
Virtual directory markers are only included when ``delimiter`` is explicitly set. They correspond to the substrings up to and including the first occurrence of the delimiter.
278 303
In JSON results they appear as dictionaries with only a ``"subdir"`` key. In XML results they appear interleaved with ``<object>`` tags as ``<subdir name="..." />``.
279 304
In case there is an object with the same name as a virtual directory marker, the object will be returned.
280
 
305

  
281 306
For examples of object details returned in JSON/XML formats refer to the OOS API documentation.
282 307

  
283 308
===========================  ===============================
......
367 392
HEAD
368 393
""""
369 394

  
370
No request parameters/headers.
395
======================  ===================================
396
Request Parameter Name  Value
397
======================  ===================================
398
version                 Optional version identifier
399
======================  ===================================
400

  
401
|
371 402

  
372 403
==========================  ===============================
373 404
Reply Header Name           Value
......
375 406
ETag                        The ETag of the object
376 407
Content-Length              The size of the object
377 408
Content-Type                The MIME content type of the object
378
Last-Modified               The last object modification date
409
Last-Modified               The last object modification date (regardless of version)
379 410
Content-Encoding            The encoding of the object (optional)
380 411
Content-Disposition         The presentation style of the object (optional)
412
X-Object-Version            The object's version identifier
413
X-Object-Version-Timestamp  The object's version timestamp
381 414
X-Object-Manifest           Large object support (optional)
415
X-Object-Public             Object is publicly accessible (optional)
382 416
X-Object-Meta-*             Optional user defined metadata
383 417
==========================  ===============================
384 418

  
......
410 444
Request Parameter Name  Value
411 445
======================  ===================================
412 446
format                  Optional extended reply type (can be ``json`` or ``xml``)
447
version                 Optional version identifier or ``list`` (specify a format if requesting a list)
413 448
======================  ===================================
414 449

  
415
The reply is the object's data (or part of it), except if a hashmap is requested with the ``format`` parameter. Object headers (as in a ``HEAD`` request) are always included.
450
The reply is the object's data (or part of it), except if a hashmap is requested with the ``format`` parameter, or a version list with ``version=list`` (in which case an extended reply format must be specified). Object headers (as in a ``HEAD`` request) are always included.
416 451

  
417 452
Hashmaps expose the underlying storage format of the object. Note that each hash is computed after trimming trailing null bytes of the corresponding block.
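
A client that wants to compare local data against a hashmap could compute block hashes along these lines. This is a minimal sketch, not the server's implementation: the function name ``block_hashes`` is hypothetical, the block size and algorithm are assumed to come from the ``X-Container-Block-Size`` and ``X-Container-Block-Hash`` container headers, and the treatment of empty objects is a guess.

```python
import hashlib


def block_hashes(data, block_size=131072, algorithm="sha1"):
    """Return the hex digest of every block, trimming trailing null bytes first."""
    hashes = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Each hash is computed after trimming trailing null bytes of the block.
        trimmed = block.rstrip(b"\x00")
        hashes.append(hashlib.new(algorithm, trimmed).hexdigest())
    # Assumption: an empty object yields an empty list of hashes.
    return hashes


print(block_hashes(b"aaaaa\x00\x00\x00", block_size=4))
```

Only trailing null bytes are removed; null bytes in the middle of a block still contribute to its hash.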
418 453

  
......
420 455

  
421 456
::
422 457

  
423
  {"block_hash": "sha1", "hashes": ["7295c41da03d7f916440b98e32c4a2a39351546c"], "block_size": 131072, "bytes": 242}
458
  {"block_hash": "sha1", "hashes": ["7295c41da03d7f916440b98e32c4a2a39351546c", ...], "block_size": 131072, "bytes": 242}
424 459

  
425 460
Example ``format=xml`` reply:
426 461

  
......
432 467
    <hash>...</hash>
433 468
  </object>
434 469

  
470
Version lists include the version identifier and timestamp for each available object version. Version identifiers are integers, with the only requirement that newer versions have a larger identifier than previous ones.
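
A client could turn a JSON version list into sorted pairs as sketched here. The function name ``parse_versions`` is hypothetical, the reply is assumed to be a ``versions`` key holding ``[identifier, timestamp]`` pairs, and reading the timestamps as Unix epoch seconds is an assumption.

```python
import json
from datetime import datetime, timezone


def parse_versions(body):
    """Return (identifier, datetime) pairs sorted from oldest to newest."""
    pairs = json.loads(body)["versions"]
    return sorted(
        # Assumption: timestamps are seconds since the Unix epoch, in UTC.
        (int(version), datetime.fromtimestamp(int(stamp), tz=timezone.utc))
        for version, stamp in pairs
    )


sample = '{"versions": [[28, 1307700898], [23, 1307700892]]}'
versions = parse_versions(sample)
latest_id, latest_time = versions[-1]
print(latest_id, latest_time.isoformat())
```

Because newer versions always have larger identifiers, sorting by identifier is enough to find the latest version without relying on the order of the reply.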
471

  
472
Example ``format=json`` reply:
473

  
474
::
475

  
476
  {"versions": [[23, 1307700892], [28, 1307700898], ...]}
477

  
478
Example ``format=xml`` reply:
479

  
480
::
481

  
482
  <?xml version="1.0" encoding="UTF-8"?>
483
  <object name="file">
484
    <version timestamp="1307700892">23</version>
485
    <version timestamp="1307700898">28</version>
486
    <version timestamp="...">...</version>
487
  </object>
488

  
435 489
The ``Range`` header may include multiple ranges, as outlined in RFC2616. Then the ``Content-Type`` of the reply will be ``multipart/byteranges`` and each part will include a ``Content-Range`` header.
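
For illustration, a multi-range ``Range`` header value in the RFC2616 byte-range syntax could be assembled as below. The function name ``byte_ranges_header`` is hypothetical; positions are inclusive byte offsets, and ``None`` stands for an omitted bound.

```python
def byte_ranges_header(ranges):
    """Build a Range header value from (first, last) byte position pairs."""
    specs = []
    for first, last in ranges:
        if first is None:
            specs.append("-{}".format(last))  # suffix range: the last N bytes
        elif last is None:
            specs.append("{}-".format(first))  # from an offset to the end
        else:
            specs.append("{}-{}".format(first, last))  # inclusive on both ends
    return "bytes=" + ",".join(specs)


print(byte_ranges_header([(0, 99), (200, None), (None, 50)]))
```

A request with a single pair yields an ordinary single-range reply with a top-level ``Content-Range`` header; two or more pairs lead to the ``multipart/byteranges`` reply described above.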
436 490

  
437 491
==========================  ===============================
......
441 495
Content-Length              The size of the data returned
442 496
Content-Type                The MIME content type of the object
443 497
Content-Range               The range of data included (only on a single range request)
444
Last-Modified               The last object modification date
498
Last-Modified               The last object modification date (regardless of version)
445 499
Content-Encoding            The encoding of the object (optional)
446 500
Content-Disposition         The presentation style of the object (optional)
501
X-Object-Version            The object's version identifier
502
X-Object-Version-Timestamp  The object's version timestamp
447 503
X-Object-Manifest           Large object support (optional)
504
X-Object-Public             Object is publicly accessible (optional)
448 505
X-Object-Meta-*             Optional user defined metadata
449 506
==========================  ===============================
450 507

  
......
473 530
Transfer-Encoding     Set to ``chunked`` to specify incremental uploading (if used, ``Content-Length`` is ignored)
474 531
X-Copy-From           The source path in the form ``/<container>/<object>``
475 532
X-Move-From           The source path in the form ``/<container>/<object>``
533
X-Source-Version      The source version to copy/move from
476 534
Content-Encoding      The encoding of the object (optional)
477 535
Content-Disposition   The presentation style of the object (optional)
478 536
X-Object-Manifest     Large object support (optional)
537
X-Object-Public       Object is publicly accessible (optional)
479 538
X-Object-Meta-*       Optional user defined metadata
480 539
====================  ================================
481 540

  
......
508 567
Content-Type          The MIME content type of the object (optional)
509 568
Content-Encoding      The encoding of the object (optional)
510 569
Content-Disposition   The presentation style of the object (optional)
570
X-Source-Version      The source version to copy/move from
511 571
X-Object-Manifest     Large object support (optional)
572
X-Object-Public       Object is publicly accessible (optional)
512 573
X-Object-Meta-*       Optional user defined metadata
513 574
====================  ================================
514 575

  
......
540 601
Content-Encoding      The encoding of the object (optional)
541 602
Content-Disposition   The presentation style of the object (optional)
542 603
X-Object-Manifest     Large object support (optional)
604
X-Object-Public       Object is publicly accessible (optional)
543 605
X-Object-Meta-*       Optional user defined metadata
544 606
====================  ================================
545 607

  
546
The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest`` and ``X-Object-Meta-*`` headers are considered to be user defined metadata. The update operation will overwrite all previous values and remove any keys not supplied.
608
The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``, ``X-Object-Public`` and ``X-Object-Meta-*`` headers are considered to be user defined metadata. The update operation will overwrite all previous values and remove any keys not supplied.
547 609

  
548 610
To update an object:
549 611

  
......
590 652
204 (No Content)             The request succeeded
591 653
===========================  ==============================
592 654

  
655
Public Objects
656
^^^^^^^^^^^^^^
657

  
658
Objects that are marked as public, via the ``X-Object-Public`` meta, are also available at the corresponding URI ``https://hostname/public/<account>/<container>/<object>`` for ``HEAD`` or ``GET``. Requests for public objects do not need to include an ``X-Auth-Token``. Pithos will ignore request parameters and only include the following headers in the reply (all ``X-Object-*`` meta is hidden).
659

  
660
==========================  ===============================
661
Reply Header Name           Value
662
==========================  ===============================
663
ETag                        The ETag of the object
664
Content-Length              The size of the data returned
665
Content-Type                The MIME content type of the object
666
Content-Range               The range of data included (only on a single range request)
667
Last-Modified               The last object modification date (regardless of version)
668
Content-Encoding            The encoding of the object (optional)
669
Content-Disposition         The presentation style of the object (optional)
670
==========================  ===============================
593 671

  
594 672
Summary
595 673
^^^^^^^
......
603 681
* All metadata replies, at all levels, include latest modification information.
604 682
* At all levels, a ``GET`` request may use ``If-Modified-Since`` and ``If-Unmodified-Since`` headers.
605 683
* Container/object lists include all associated metadata if the reply is of type json/xml. Some names are kept to their OOS API equivalents for compatibility. 
606
* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``. These are all replaced with every update operation.
684
* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``, ``X-Object-Public``. These are all replaced with every update operation.
607 685
* Multi-range object GET support as outlined in RFC2616.
608 686
* Object hashmap retrieval through GET and the ``format`` parameter.
609 687
* Partial object updates through POST, using the ``Content-Length``, ``Content-Type``, ``Content-Range`` and ``Transfer-Encoding`` headers.
610 688
* Object ``MOVE`` support.
689
* Time-variant account/container listings via the ``until`` parameter.
690
* Object versions - parameter ``version`` in HEAD/GET (list versions with GET), ``X-Object-Version-*`` meta in replies, ``X-Source-Version`` in PUT/COPY/MOVE.
691
* Publicly accessible objects via ``https://hostname/public``. Control with ``X-Object-Public``.
611 692

  
612 693
Clarifications/suggestions:
613 694

  
......
618 699
* The ``Accept`` header may be used in requests instead of the ``format`` parameter to specify the desired reply format. The parameter overrides the header.
619 700
* Container/object lists use a ``200`` return code if the reply is of type json/xml. The reply will include an empty json/xml.
620 701
* In headers, dates are formatted according to RFC 1123. In extended information listings, dates are formatted according to ISO 8601.
702
* The ``Last-Modified`` header value always reflects the actual latest change timestamp, regardless of time control parameters and version requests. Time precondition checks with ``If-Modified-Since`` and ``If-Unmodified-Since`` headers are applied to this value.
621 703
* While ``X-Object-Manifest`` can be set and unset, large object support is not yet implemented (**TBD**).
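
The two date formats mentioned above can be handled with Python's standard library, as sketched here. The sample values are made up, and the exact ISO 8601 variant used in extended listings is an assumption.

```python
from datetime import datetime
from email.utils import formatdate, parsedate_to_datetime

# RFC 1123 style date, as used in headers such as Last-Modified.
header_value = formatdate(1307700892, usegmt=True)
from_header = parsedate_to_datetime(header_value)

# ISO 8601 style date, as used in extended (json/xml) listings.
# Assumption: the listing value carries an explicit UTC offset.
from_listing = datetime.fromisoformat("2011-06-10T10:14:52+00:00")

print(header_value)
print(from_header == from_listing)
```

Parsing both forms into timezone-aware ``datetime`` objects lets a client compare a header value against a listing value directly.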
622 704

  
623 705
The Pithos Client
......
643 725
* Moved to trash and then deleted.
644 726
* Shared with specific permissions.
645 727
* Made public (shared with non-Pithos users).
646
* Set to monitor changes via version tracking.
728
* Restored from previous versions.
647 729

  
648 730
Some of these functions are performed by the client software and some by the Pithos server. Client-driven functionality is based on specific metadata that should be handled equally across implementations. These metadata names are discussed in the next chapter. 
649 731

  
