Revision c9af0703 docs/source/devguide.rst
b/docs/source/devguide.rst | ||
---|---|---|
6 | 6 |
|
7 | 7 |
Pithos is a storage service implemented by GRNET (http://www.grnet.gr). Data is stored as objects, organized in containers, belonging to an account. This hierarchy of storage layers has been inspired by the OpenStack Object Storage (OOS) API and similar CloudFiles API by Rackspace. The Pithos API follows the OOS API as closely as possible. One of the design requirements has been to be able to use Pithos with clients built for the OOS, without changes. |
8 | 8 |
|
9 |
However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated, or appended to. They can also be versioned, meaning that the server will track changes, assign version numbers and allow reading previous instances.
|
|
9 |
However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated, or appended to. Automatic version management allows taking account and container listings back in time, as well as reading previous instances of objects.
|
|
10 | 10 |
|
11 |
The storage backend of Pithos is block oriented, which allows for efficient, deduplicated data placement. The block structure of objects is exposed at the API layer, in order to encourage external software to implement advanced data management operations.
|
|
11 |
The storage backend of Pithos is block oriented, permitting efficient, deduplicated data placement. The block structure of objects is exposed at the API layer, in order to encourage external software to implement advanced data management operations.
|
|
12 | 12 |
|
13 | 13 |
This document's goals are: |
14 | 14 |
|
... | ... | |
25 | 25 |
========================= ================================ |
26 | 26 |
Revision Description |
27 | 27 |
========================= ================================ |
28 |
0.3 (June 10, 2011) Allow for publicly available objects via ``https://hostname/public``. |
|
29 |
\ Support time-variant account/container listings. |
|
30 |
\ Add source version when duplicating with PUT/COPY/MOVE. |
|
31 |
\ Request version in object HEAD/GET requests (list versions with GET). |
|
28 | 32 |
0.2 (May 31, 2011) Add object meta listing and filtering in containers. |
29 | 33 |
\ Include underlying storage characteristics in container meta. |
30 | 34 |
\ Support for partial object updates through POST. |
... | ... | |
36 | 40 |
The Pithos API |
37 | 41 |
-------------- |
38 | 42 |
|
39 |
The URI requests supported by Pithos follow one of the following forms:
|
|
43 |
URI requests supported by the Pithos API take one of the following forms:
|
|
40 | 44 |
|
41 | 45 |
* Top level: ``https://hostname/v1/`` |
42 | 46 |
* Account level: ``https://hostname/v1/<account>`` |
43 | 47 |
* Container level: ``https://hostname/v1/<account>/<container>`` |
44 | 48 |
* Object level: ``https://hostname/v1/<account>/<container>/<object>`` |
45 | 49 |
|
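The four URI forms above can be assembled programmatically. The following is a minimal sketch, not part of the Pithos code base: the helper name ``pithos_url`` is hypothetical, and the choice to percent-encode names while keeping slashes inside object names is an assumption.

```python
from urllib.parse import quote

API_VERSION = "v1"  # taken from the URI forms listed above


def pithos_url(hostname, account=None, container=None, obj=None):
    """Build a top, account, container or object level URL (hypothetical helper)."""
    if obj is not None and container is None:
        raise ValueError("an object requires a container")
    if container is not None and account is None:
        raise ValueError("a container requires an account")
    parts = []
    if account is not None:
        parts.append(quote(account, safe=""))
    if container is not None:
        parts.append(quote(container, safe=""))
    if obj is not None:
        # Assumption: slashes in object names are kept as path separators.
        parts.append(quote(obj, safe="/"))
    return "https://%s/%s/%s" % (hostname, API_VERSION, "/".join(parts))
```

Calling ``pithos_url("hostname", "user", "pics")`` yields the container level form; omitting all optional arguments yields the top level form.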
46 |
All requests must include an ``X-Auth-Token``, except from those that refer to publicly available files (**TBD**). The process of obtaining the token is still to be determined (**TBD**).
|
|
50 |
All requests must include an ``X-Auth-Token``. The process of obtaining the token is still to be determined (**TBD**). |
|
47 | 51 |
|
48 | 52 |
The allowable request operations and respective return codes per level are presented in the remainder of this chapter. Common to all requests are the following return codes. |
49 | 53 |
|
... | ... | |
95 | 99 |
HEAD |
96 | 100 |
"""" |
97 | 101 |
|
98 |
No request parameters/headers. |
|
102 |
====================== =================================== |
|
103 |
Request Parameter Name Value |
|
104 |
====================== =================================== |
|
105 |
until Optional timestamp |
|
106 |
====================== =================================== |
|
107 |
|
|
108 |
| |
|
99 | 109 |
|
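As an illustration of the ``until`` parameter, the sketch below prepares (but does not send) an account level ``HEAD`` request. The function name is hypothetical, and the timestamp is assumed to be integer seconds since the epoch, which this document does not state explicitly.

```python
from urllib.parse import urlencode
from urllib.request import Request


def account_head_request(hostname, account, token, until=None):
    """Prepare an account HEAD request, optionally limited by ``until``."""
    url = "https://%s/v1/%s" % (hostname, account)
    if until is not None:
        # Assumption: ``until`` is expressed as integer epoch seconds.
        url += "?" + urlencode({"until": int(until)})
    return Request(url, method="HEAD", headers={"X-Auth-Token": token})
```

Passing the returned object to ``urllib.request.urlopen`` would perform the request; the reply headers would then be expected to include ``X-Account-Until-Timestamp`` when ``until`` is given.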
100 | 110 |
========================== ===================== |
101 | 111 |
Reply Header Name Value |
... | ... | |
105 | 115 |
X-Account-Bytes-Used The total number of bytes stored |
106 | 116 |
X-Account-Bytes-Remaining The total number of bytes remaining (**TBD**) |
107 | 117 |
X-Account-Last-Login The last login (**TBD**) |
118 |
X-Account-Until-Timestamp The last account modification date until the timestamp provided |
|
108 | 119 |
X-Account-Meta-* Optional user defined metadata |
109 |
Last-Modified The last object modification date
|
|
120 |
Last-Modified The last account modification date (regardless of ``until``)
|
|
110 | 121 |
========================== ===================== |
111 | 122 |
|
112 | 123 |
| |
... | ... | |
136 | 147 |
limit The amount of results requested (default is 10000) |
137 | 148 |
marker Return containers with name lexicographically after marker |
138 | 149 |
format Optional extended reply type (can be ``json`` or ``xml``) |
150 |
until Optional timestamp |
|
139 | 151 |
====================== ========================= |
140 | 152 |
|
141 | 153 |
The reply is a list of container names. Account headers (as in a ``HEAD`` request) will also be included. |
142 | 154 |
If a ``format=xml`` or ``format=json`` argument is given, extended information on the containers will be returned, serialized in the chosen format. |
143 | 155 |
For each container, the information will include all container metadata (names will be in lower case and with hyphens replaced with underscores): |
144 | 156 |
|
145 |
=================== ============================ |
|
146 |
Name Description |
|
147 |
=================== ============================ |
|
148 |
name The name of the container |
|
149 |
count The number of objects inside the container |
|
150 |
bytes The total size of the objects inside the container |
|
151 |
last_modified The last object modification date |
|
152 |
x_container_meta_* Optional user defined metadata |
|
153 |
=================== ============================ |
|
157 |
=========================== ============================ |
|
158 |
Name Description |
|
159 |
=========================== ============================ |
|
160 |
name The name of the container |
|
161 |
count The number of objects inside the container |
|
162 |
bytes The total size of the objects inside the container |
|
163 |
last_modified The last container modification date (regardless of ``until``) |
|
164 |
x_container_until_timestamp The last container modification date until the timestamp provided |
|
165 |
x_container_meta_* Optional user defined metadata |
|
166 |
=========================== ============================ |
|
154 | 167 |
|
155 | 168 |
For examples of container details returned in JSON/XML formats refer to the OOS API documentation. |
156 | 169 |
|
... | ... | |
205 | 218 |
HEAD |
206 | 219 |
"""" |
207 | 220 |
|
208 |
No request parameters/headers. |
|
221 |
====================== =================================== |
|
222 |
Request Parameter Name Value |
|
223 |
====================== =================================== |
|
224 |
until Optional timestamp |
|
225 |
====================== =================================== |
|
209 | 226 |
|
210 |
========================== =============================== |
|
211 |
Reply Header Name Value |
|
212 |
========================== =============================== |
|
213 |
X-Container-Object-Count The total number of objects in the container |
|
214 |
X-Container-Bytes-Used The total number of bytes of all objects stored |
|
215 |
X-Container-Meta-* Optional user defined metadata |
|
216 |
X-Container-Object-Meta A list with all meta keys used by objects |
|
217 |
X-Container-Block-Size The block size used by the storage backend |
|
218 |
X-Container-Block-Hash The hash algorithm used for block identifiers in object hashmaps |
|
219 |
Last-Modified The last object modification date |
|
220 |
========================== =============================== |
|
227 |
| |
|
228 |
|
|
229 |
=========================== =============================== |
|
230 |
Reply Header Name Value |
|
231 |
=========================== =============================== |
|
232 |
X-Container-Object-Count The total number of objects in the container |
|
233 |
X-Container-Bytes-Used The total number of bytes of all objects stored |
|
234 |
X-Container-Block-Size The block size used by the storage backend |
|
235 |
X-Container-Block-Hash The hash algorithm used for block identifiers in object hashmaps |
|
236 |
X-Container-Until-Timestamp The last container modification date until the timestamp provided |
|
237 |
X-Container-Object-Meta A list with all meta keys used by objects |
|
238 |
X-Container-Meta-* Optional user defined metadata |
|
239 |
Last-Modified The last container modification date (regardless of ``until``) |
|
240 |
=========================== =============================== |
|
221 | 241 |
|
222 | 242 |
The keys returned in ``X-Container-Object-Meta`` are all the unique strings after the ``X-Object-Meta-`` prefix. |
223 | 243 |
|
... | ... | |
250 | 270 |
path Assume ``prefix=path`` and ``delimiter=/`` |
251 | 271 |
format Optional extended reply type (can be ``json`` or ``xml``) |
252 | 272 |
meta Return objects having the specified meta keys (can be a comma separated list) |
273 |
until Optional timestamp |
|
253 | 274 |
====================== =================================== |
254 | 275 |
|
255 | 276 |
The ``path`` parameter overrides ``prefix`` and ``delimiter``. When using ``path``, results will include objects ending in ``delimiter``. |
... | ... | |
260 | 281 |
If a ``format=xml`` or ``format=json`` argument is given, extended information on the objects will be returned, serialized in the chosen format. |
261 | 282 |
For each object, the information will include all object metadata (names will be in lower case and with hyphens replaced with underscores): |
262 | 283 |
|
263 |
=================== ====================================== |
|
264 |
Name Description |
|
265 |
=================== ====================================== |
|
266 |
name The name of the object |
|
267 |
hash The ETag of the object |
|
268 |
bytes The size of the object |
|
269 |
content_type The MIME content type of the object |
|
270 |
content_encoding The encoding of the object (optional) |
|
271 |
last_modified The last object modification date |
|
272 |
x_object_manifest Large object support |
|
273 |
x_object_meta_* Optional user defined metadata |
|
274 |
=================== ====================================== |
|
284 |
========================== ====================================== |
|
285 |
Name Description |
|
286 |
========================== ====================================== |
|
287 |
name The name of the object |
|
288 |
hash The ETag of the object |
|
289 |
bytes The size of the object |
|
290 |
content_type The MIME content type of the object |
|
291 |
content_encoding The encoding of the object (optional) |
|
292 |
content_disposition        The presentation style of the object (optional) |
|
293 |
last_modified The last object modification date (regardless of version) |
|
294 |
x_object_version The object's version identifier |
|
295 |
x_object_version_timestamp The object's version timestamp |
|
296 |
x_object_manifest Large object support (optional) |
|
297 |
x_object_public Object is publicly accessible (optional) |
|
298 |
x_object_meta_* Optional user defined metadata |
|
299 |
========================== ====================================== |
|
275 | 300 |
|
276 | 301 |
Extended replies may also include virtual directory markers in separate sections of the ``json`` or ``xml`` results. |
277 | 302 |
Virtual directory markers are only included when ``delimiter`` is explicitly set. They correspond to the substrings up to and including the first occurrence of the delimiter. |
278 | 303 |
In JSON results they appear as dictionaries with only a ``"subdir"`` key. In XML results they appear interleaved with ``<object>`` tags as ``<subdir name="..." />``. |
279 | 304 |
In case there is an object with the same name as a virtual directory marker, the object will be returned. |
280 |
|
|
305 |
|
|
281 | 306 |
For examples of object details returned in JSON/XML formats refer to the OOS API documentation. |
282 | 307 |
|
283 | 308 |
=========================== =============================== |
... | ... | |
367 | 392 |
HEAD |
368 | 393 |
"""" |
369 | 394 |
|
370 |
No request parameters/headers. |
|
395 |
====================== =================================== |
|
396 |
Request Parameter Name Value |
|
397 |
====================== =================================== |
|
398 |
version Optional version identifier |
|
399 |
====================== =================================== |
|
400 |
|
|
401 |
| |
|
371 | 402 |
|
372 | 403 |
========================== =============================== |
373 | 404 |
Reply Header Name Value |
... | ... | |
375 | 406 |
ETag The ETag of the object |
376 | 407 |
Content-Length The size of the object |
377 | 408 |
Content-Type The MIME content type of the object |
378 |
Last-Modified The last object modification date |
|
409 |
Last-Modified The last object modification date (regardless of version)
|
|
379 | 410 |
Content-Encoding The encoding of the object (optional) |
380 | 411 |
Content-Disposition The presentation style of the object (optional) |
412 |
X-Object-Version The object's version identifier |
|
413 |
X-Object-Version-Timestamp The object's version timestamp |
|
381 | 414 |
X-Object-Manifest Large object support (optional) |
415 |
X-Object-Public Object is publicly accessible (optional) |
|
382 | 416 |
X-Object-Meta-* Optional user defined metadata |
383 | 417 |
========================== =============================== |
384 | 418 |
|
... | ... | |
410 | 444 |
Request Parameter Name Value |
411 | 445 |
====================== =================================== |
412 | 446 |
format Optional extended reply type (can be ``json`` or ``xml``) |
447 |
version Optional version identifier or ``list`` (specify a format if requesting a list) |
|
413 | 448 |
====================== =================================== |
414 | 449 |
|
415 |
The reply is the object's data (or part of it), except if a hashmap is requested with the ``format`` parameter. Object headers (as in a ``HEAD`` request) are always included. |
|
450 |
The reply is the object's data (or part of it), except if a hashmap is requested with the ``format`` parameter, or a version list with ``version=list`` (in which case an extended reply format must be specified). Object headers (as in a ``HEAD`` request) are always included.
|
|
416 | 451 |
|
417 | 452 |
Hashmaps expose the underlying storage format of the object. Note that each hash is computed after trimming trailing null bytes of the corresponding block. |
418 | 453 |
|
... | ... | |
420 | 455 |
|
421 | 456 |
:: |
422 | 457 |
|
423 |
{"block_hash": "sha1", "hashes": ["7295c41da03d7f916440b98e32c4a2a39351546c"], "block_size": 131072, "bytes": 242} |
|
458 |
{"block_hash": "sha1", "hashes": ["7295c41da03d7f916440b98e32c4a2a39351546c", ...], "block_size": 131072, "bytes": 242}
|
|
424 | 459 |
|
425 | 460 |
Example ``format=xml`` reply: |
426 | 461 |
|
... | ... | |
432 | 467 |
<hash>...</hash> |
433 | 468 |
</object> |
434 | 469 |
|
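A client can compute a comparable hashmap locally, for example to find out which blocks differ from the server's copy. The following is only a sketch of the rule described above (split into blocks, trim trailing null bytes, hash each block). The function name is hypothetical, the defaults mirror the example reply, and the exact server-side procedure may differ.

```python
import hashlib


def compute_hashmap(data, block_size=131072, block_hash="sha1"):
    """Return a dict shaped like the ``format=json`` hashmap reply."""
    hashes = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Each hash is computed after trimming the block's trailing null bytes.
        trimmed = block.rstrip(b"\x00")
        hashes.append(hashlib.new(block_hash, trimmed).hexdigest())
    return {
        "block_hash": block_hash,
        "hashes": hashes,
        "block_size": block_size,
        "bytes": len(data),
    }
```

In practice ``block_size`` and ``block_hash`` would be taken from the container's ``X-Container-Block-Size`` and ``X-Container-Block-Hash`` headers rather than hard-coded.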
470 |
Version lists include the version identifier and timestamp for each available object version. Version identifiers are integers, with the only requirement that newer versions have a larger identifier than previous ones. |
|
471 |
|
|
472 |
Example ``format=json`` reply: |
|
473 |
|
|
474 |
:: |
|
475 |
|
|
476 |
{"versions": [[23, 1307700892], [28, 1307700898], ...]} |
|
477 |
|
|
478 |
Example ``format=xml`` reply: |
|
479 |
|
|
480 |
:: |
|
481 |
|
|
482 |
<?xml version="1.0" encoding="UTF-8"?> |
|
483 |
<object name="file"> |
|
484 |
<version timestamp="1307700892">23</version> |
|
485 |
<version timestamp="1307700898">28</version> |
|
486 |
<version timestamp="...">...</version> |
|
487 |
</object> |
|
488 |
|
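Both reply formats carry the same pairs of version identifier and timestamp. The sketch below parses either format into a list of tuples and picks the newest version, relying only on the stated rule that newer versions have larger identifiers. All function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def parse_versions_json(text):
    """Parse a ``format=json`` version list into (version, timestamp) tuples."""
    return [(int(v), int(ts)) for v, ts in json.loads(text)["versions"]]


def parse_versions_xml(text):
    """Parse a ``format=xml`` version list into (version, timestamp) tuples."""
    root = ET.fromstring(text)
    return [(int(el.text), int(el.get("timestamp")))
            for el in root.findall("version")]


def latest_version(versions):
    """Return the largest identifier, which denotes the newest version."""
    return max(version for version, _ in versions)
```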
|
435 | 489 |
The ``Range`` header may include multiple ranges, as outlined in RFC2616. Then the ``Content-Type`` of the reply will be ``multipart/byteranges`` and each part will include a ``Content-Range`` header. |
436 | 490 |
|
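A multi-range request only differs from a single-range one in the value of the ``Range`` header. As a minimal sketch (the helper name is hypothetical), the header value can be built from pairs of byte offsets following the RFC2616 byte range syntax:

```python
def range_header(ranges):
    """Build a ``Range`` header value from (first, last) byte offset pairs.

    Use ``(first, None)`` for "from first to the end" and
    ``(None, n)`` for "the last n bytes".
    """
    specs = []
    for first, last in ranges:
        if first is None:
            specs.append("-%d" % last)
        elif last is None:
            specs.append("%d-" % first)
        else:
            if last < first:
                raise ValueError("last byte offset precedes first")
            specs.append("%d-%d" % (first, last))
    if not specs:
        raise ValueError("at least one range is required")
    return "bytes=" + ",".join(specs)
```

For instance, ``range_header([(0, 99), (200, 299)])`` requests the first hundred bytes and the bytes at offsets 200 through 299 in one ``GET``.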
437 | 491 |
========================== =============================== |
... | ... | |
441 | 495 |
Content-Length The size of the data returned |
442 | 496 |
Content-Type The MIME content type of the object |
443 | 497 |
Content-Range The range of data included (only on a single range request) |
444 |
Last-Modified The last object modification date |
|
498 |
Last-Modified The last object modification date (regardless of version)
|
|
445 | 499 |
Content-Encoding The encoding of the object (optional) |
446 | 500 |
Content-Disposition The presentation style of the object (optional) |
501 |
X-Object-Version The object's version identifier |
|
502 |
X-Object-Version-Timestamp The object's version timestamp |
|
447 | 503 |
X-Object-Manifest Large object support (optional) |
504 |
X-Object-Public Object is publicly accessible (optional) |
|
448 | 505 |
X-Object-Meta-* Optional user defined metadata |
449 | 506 |
========================== =============================== |
450 | 507 |
|
... | ... | |
473 | 530 |
Transfer-Encoding Set to ``chunked`` to specify incremental uploading (if used, ``Content-Length`` is ignored) |
474 | 531 |
X-Copy-From The source path in the form ``/<container>/<object>`` |
475 | 532 |
X-Move-From The source path in the form ``/<container>/<object>`` |
533 |
X-Source-Version The source version to copy/move from |
|
476 | 534 |
Content-Encoding The encoding of the object (optional) |
477 | 535 |
Content-Disposition The presentation style of the object (optional) |
478 | 536 |
X-Object-Manifest Large object support (optional) |
537 |
X-Object-Public Object is publicly accessible (optional) |
|
479 | 538 |
X-Object-Meta-* Optional user defined metadata |
480 | 539 |
==================== ================================ |
481 | 540 |
|
... | ... | |
508 | 567 |
Content-Type The MIME content type of the object (optional) |
509 | 568 |
Content-Encoding The encoding of the object (optional) |
510 | 569 |
Content-Disposition The presentation style of the object (optional) |
570 |
X-Source-Version The source version to copy/move from |
|
511 | 571 |
X-Object-Manifest Large object support (optional) |
572 |
X-Object-Public Object is publicly accessible (optional) |
|
512 | 573 |
X-Object-Meta-* Optional user defined metadata |
513 | 574 |
==================== ================================ |
514 | 575 |
|
... | ... | |
540 | 601 |
Content-Encoding The encoding of the object (optional) |
541 | 602 |
Content-Disposition The presentation style of the object (optional) |
542 | 603 |
X-Object-Manifest Large object support (optional) |
604 |
X-Object-Public Object is publicly accessible (optional) |
|
543 | 605 |
X-Object-Meta-* Optional user defined metadata |
544 | 606 |
==================== ================================ |
545 | 607 |
|
546 |
The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest`` and ``X-Object-Meta-*`` headers are considered to be user defined metadata. The update operation will overwrite all previous values and remove any keys not supplied. |
|
608 |
The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``, ``X-Object-Public`` and ``X-Object-Meta-*`` headers are considered to be user defined metadata. The update operation will overwrite all previous values and remove any keys not supplied.
|
|
547 | 609 |
|
548 | 610 |
To update an object: |
549 | 611 |
|
... | ... | |
590 | 652 |
204 (No Content) The request succeeded |
591 | 653 |
=========================== ============================== |
592 | 654 |
|
655 |
Public Objects |
|
656 |
^^^^^^^^^^^^^^ |
|
657 |
|
|
658 |
Objects that are marked as public, via the ``X-Object-Public`` meta, are also available at the corresponding URI ``https://hostname/public/<account>/<container>/<object>`` for ``HEAD`` or ``GET``. Requests for public objects do not need to include an ``X-Auth-Token``. Pithos will ignore request parameters and only include the following headers in the reply (all ``X-Object-*`` meta is hidden). |
|
659 |
|
|
660 |
========================== =============================== |
|
661 |
Reply Header Name Value |
|
662 |
========================== =============================== |
|
663 |
ETag The ETag of the object |
|
664 |
Content-Length The size of the data returned |
|
665 |
Content-Type The MIME content type of the object |
|
666 |
Content-Range The range of data included (only on a single range request) |
|
667 |
Last-Modified The last object modification date (regardless of version) |
|
668 |
Content-Encoding The encoding of the object (optional) |
|
669 |
Content-Disposition The presentation style of the object (optional) |
|
670 |
========================== =============================== |
|
593 | 671 |
|
594 | 672 |
Summary |
595 | 673 |
^^^^^^^ |
... | ... | |
603 | 681 |
* All metadata replies, at all levels, include latest modification information. |
604 | 682 |
* At all levels, a ``GET`` request may use ``If-Modified-Since`` and ``If-Unmodified-Since`` headers. |
605 | 683 |
* Container/object lists include all associated metadata if the reply is of type json/xml. Some names are kept to their OOS API equivalents for compatibility. |
606 |
* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``. These are all replaced with every update operation. |
|
684 |
* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``, ``X-Object-Public``. These are all replaced with every update operation.
|
|
607 | 685 |
* Multi-range object GET support as outlined in RFC2616. |
608 | 686 |
* Object hashmap retrieval through GET and the ``format`` parameter. |
609 | 687 |
* Partial object updates through POST, using the ``Content-Length``, ``Content-Type``, ``Content-Range`` and ``Transfer-Encoding`` headers. |
610 | 688 |
* Object ``MOVE`` support. |
689 |
* Time-variant account/container listings via the ``until`` parameter. |
|
690 |
* Object versions - parameter ``version`` in HEAD/GET (list versions with GET), ``X-Object-Version-*`` meta in replies, ``X-Source-Version`` in PUT/COPY/MOVE. |
|
691 |
* Publicly accessible objects via ``https://hostname/public``. Control with ``X-Object-Public``. |
|
611 | 692 |
|
612 | 693 |
Clarifications/suggestions: |
613 | 694 |
|
... | ... | |
618 | 699 |
* The ``Accept`` header may be used in requests instead of the ``format`` parameter to specify the desired reply format. The parameter overrides the header. |
619 | 700 |
* Container/object lists use a ``200`` return code if the reply is of type json/xml. The reply will include an empty json/xml. |
620 | 701 |
* In headers, dates are formatted according to RFC 1123. In extended information listings, dates are formatted according to ISO 8601. |
702 |
* The ``Last-Modified`` header value always reflects the actual latest change timestamp, regardless of time control parameters and version requests. Time precondition checks with ``If-Modified-Since`` and ``If-Unmodified-Since`` headers are applied to this value. |
|
621 | 703 |
* While ``X-Object-Manifest`` can be set and unset, large object support is not yet implemented (**TBD**). |
622 | 704 |
|
623 | 705 |
The Pithos Client |
... | ... | |
643 | 725 |
* Moved to trash and then deleted. |
644 | 726 |
* Shared with specific permissions. |
645 | 727 |
* Made public (shared with non-Pithos users). |
646 |
* Set to monitor changes via version tracking.
|
|
728 |
* Restored from previous versions.
|
|
647 | 729 |
|
648 | 730 |
Some of these functions are performed by the client software and some by the Pithos server. Client-driven functionality is based on specific metadata that should be handled equally across implementations. These metadata names are discussed in the next chapter. |
649 | 731 |
|