Revision e9285524 docs/source/devguide.rst
b/docs/source/devguide.rst | ||
---|---|---|
6 | 6 |
|
7 | 7 |
Pithos is a storage service implemented by GRNET (http://www.grnet.gr). Data is stored as objects, organized in containers, belonging to an account. This hierarchy of storage layers has been inspired by the OpenStack Object Storage (OOS) API and the similar CloudFiles API by Rackspace. The Pithos API follows the OOS API as closely as possible. One of the design requirements has been to be able to use Pithos with clients built for the OOS, without changes. |
8 | 8 |
|
9 |
However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated, or appended to. They can also be versioned, meaning that the server will track changes, assign version numbers and allow reading previous instances. |
|
10 |
|
|
11 |
The storage backend of Pithos is block oriented, which allows for efficient, deduplicated data placement. The block structure of objects is exposed at the API layer, in order to encourage external software to implement advanced data management operations. |
|
12 |
|
|
9 | 13 |
This document's goals are: |
10 | 14 |
|
11 | 15 |
* Define the Pithos ReST API that allows the storage and retrieval of data and metadata via HTTP calls |
... | ... | |
21 | 25 |
========================= ================================ |
22 | 26 |
Revision Description |
23 | 27 |
========================= ================================ |
24 |
0.2 (May 25, 2011) Add object meta listing and filtering in containers. |
|
28 |
0.2 (May 29, 2011) Add object meta listing and filtering in containers. |
|
29 |
\ Support for partial object updates through POST. |
|
30 |
\ Expose object hashmaps through GET. |
|
25 | 31 |
\ Support for multi-range object GET requests. |
26 | 32 |
0.1 (May 17, 2011) Initial release. Based on OpenStack Object Storage Developer Guide API v1 (Apr. 15, 2011). |
27 | 33 |
========================= ================================ |
... | ... | |
38 | 44 |
|
39 | 45 |
All requests must include an ``X-Auth-Token``, except for those that refer to publicly available files (**TBD**). The process of obtaining the token is still to be determined (**TBD**). |
40 | 46 |
|
41 |
The allowable request operations and corresponding return codes per level are presented in the remainder of this chapter. Common to all requests are the following return codes.
|
|
47 |
The allowable request operations and respective return codes per level are presented in the remainder of this chapter. Common to all requests are the following return codes.
|
|
42 | 48 |
|
43 | 49 |
========================= ================================ |
44 | 50 |
Return Code Description |
... | ... | |
253 | 259 |
Name Description |
254 | 260 |
=================== ====================================== |
255 | 261 |
name The name of the object |
256 |
hash The MD5 hash of the object
|
|
262 |
hash The ETag of the object
|
|
257 | 263 |
bytes The size of the object |
258 | 264 |
content_type The MIME content type of the object |
259 | 265 |
content_encoding The encoding of the object (optional) |
... | ... | |
359 | 365 |
========================== =============================== |
360 | 366 |
Reply Header Name Value |
361 | 367 |
========================== =============================== |
362 |
ETag The MD5 hash of the object
|
|
368 |
ETag The ETag of the object
|
|
363 | 369 |
Content-Length The size of the object |
364 | 370 |
Content-Type The MIME content type of the object |
365 | 371 |
Last-Modified The last object modification date |
... | ... | |
391 | 397 |
If-Unmodified-Since Retrieve if object has not changed since provided timestamp |
392 | 398 |
==================== ================================ |
393 | 399 |
|
394 |
The reply is the object's data (or part of it). Object headers (as in a ``HEAD`` request) will also be included. |
|
400 |
| |
|
401 |
|
|
402 |
====================== =================================== |
|
403 |
Request Parameter Name Value |
|
404 |
====================== =================================== |
|
405 |
format Optional extended reply type (can be ``json`` or ``xml``) |
|
406 |
====================== =================================== |
|
407 |
|
|
408 |
The reply is the object's data (or part of it), except if a hashmap is requested with the ``format`` parameter. Object headers (as in a ``HEAD`` request) are always included. |
|
409 |
|
|
410 |
Hashmaps expose the underlying storage format of the object: |
|
411 |
|
|
412 |
* Blocksize of 4MB. |
|
413 |
* Blocks stored indexed by SHA256 hash. |
|
414 |
* Hash is computed after trimming trailing null bytes. |
|
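The three properties above can be sketched in a few lines of Python. This is only an illustration of the description, not the server's implementation: the function names are hypothetical, and the hex encoding of each hash, the handling of a short final block, and the empty list for an empty object are assumptions.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4 * 1024 * 1024  # 4MB blocks, as stated in the list above


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Split data into fixed-size blocks and hash each one with SHA256.

    Trailing null bytes are trimmed from each block before hashing.
    Hex encoding of the digest is an assumption made for this sketch.
    """
    hashes = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        trimmed = block.rstrip(b"\x00")
        hashes.append(hashlib.sha256(trimmed).hexdigest())
    return hashes


def hashmap(data: bytes, block_size: int = BLOCK_SIZE) -> Dict[str, object]:
    """Build a dictionary shaped like the ``format=json`` reply."""
    return {"hashes": block_hashes(data, block_size), "bytes": len(data)}
```

A client could compare such a locally computed list against the hashmap returned by the server to find which blocks differ.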
415 |
|
|
416 |
Example ``format=json`` reply: |
|
417 |
|
|
418 |
:: |
|
419 |
|
|
420 |
{"hashes": ["7295c41da03d7f916440b98e32c4a2a39351546c", ...], "bytes": 24223726} |
|
421 |
|
|
422 |
Example ``format=xml`` reply: |
|
423 |
|
|
424 |
:: |
|
425 |
|
|
426 |
<?xml version="1.0" encoding="UTF-8"?> |
|
427 |
<object name="file" bytes="24223726"> |
|
428 |
<hash>7295c41da03d7f916440b98e32c4a2a39351546c</hash> |
|
429 |
<hash>...</hash> |
|
430 |
</object> |
|
395 | 431 |
|
396 | 432 |
The ``Range`` header may include multiple ranges, as outlined in RFC2616. Then the ``Content-Type`` of the reply will be ``multipart/byteranges`` and each part will include a ``Content-Range`` header. |
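As a minimal sketch, a client could assemble a multi-range ``Range`` header value as follows. The helper name ``range_header`` is hypothetical, and only ``first-last`` and open-ended ``first-`` ranges are covered; suffix ranges are left out.

```python
from typing import Iterable, Optional, Tuple


def range_header(ranges: Iterable[Tuple[int, Optional[int]]]) -> str:
    """Build a Range header value from (first, last) byte positions.

    A last position of None produces an open-ended range ("first-").
    """
    parts = []
    for first, last in ranges:
        last_text = "" if last is None else str(last)
        parts.append("{}-{}".format(first, last_text))
    return "bytes=" + ",".join(parts)
```

For example, ``range_header([(0, 99), (200, 299)])`` yields ``bytes=0-99,200-299``; a request carrying more than one range is the case where the reply is expected to be ``multipart/byteranges``.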
397 | 433 |
|
398 | 434 |
========================== =============================== |
399 | 435 |
Reply Header Name Value |
400 | 436 |
========================== =============================== |
401 |
ETag The MD5 hash of the object
|
|
437 |
ETag The ETag of the object
|
|
402 | 438 |
Content-Length The size of the data returned |
403 | 439 |
Content-Type The MIME content type of the object |
404 | 440 |
Content-Range The range of data included (only on a single range request) |
... | ... | |
494 | 530 |
==================== ================================ |
495 | 531 |
Request Header Name Value |
496 | 532 |
==================== ================================ |
533 |
Content-Length The size of the data written (optional, to update) |
|
534 |
Content-Type The MIME content type of the object (optional, to update) |
|
535 |
Content-Range The range of data supplied (optional, to update) |
|
536 |
Transfer-Encoding Set to ``chunked`` to specify incremental uploading (if used, ``Content-Length`` is ignored) |
|
497 | 537 |
Content-Encoding The encoding of the object (optional) |
498 | 538 |
Content-Disposition The presentation style of the object (optional) |
499 | 539 |
X-Object-Manifest Large object support (optional) |
500 | 540 |
X-Object-Meta-* Optional user defined metadata |
501 | 541 |
==================== ================================ |
502 | 542 |
|
503 |
No reply content/headers. |
|
543 |
The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest`` and ``X-Object-Meta-*`` headers are considered to be user defined metadata. The update operation will overwrite all previous values and remove any keys not supplied. |
|
544 |
|
|
545 |
To update an object: |
|
546 |
|
|
547 |
* Supply ``Content-Length`` (except if using chunked transfers), ``Content-Type`` and ``Content-Range`` headers. |
|
548 |
* Set ``Content-Type`` to ``application/octet-stream``. |
|
549 |
* Set ``Content-Range`` as specified in RFC2616, with the following differences: |
|
550 |
|
|
551 |
* Client software MAY omit ``last-byte-pos`` if the length of the range being transferred is unknown or difficult to determine. |
|
552 |
* Client software SHOULD NOT specify the ``instance-length`` (use a ``*``), unless there is a reason for performing a size check at the server. |
|
553 |
* If the ``Content-Range`` used has ``byte-range-resp-spec = *``, the data supplied will be appended to the object. |
|
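The header rules above can be illustrated with a short Python sketch. The helper names are hypothetical, and the exact wire form used when ``last-byte-pos`` is omitted (``bytes 10-/*``) is an assumption drawn from the wording of the rules, not a confirmed server format.

```python
from typing import Dict, Optional


def content_range(first: Optional[int] = None,
                  last: Optional[int] = None,
                  length: Optional[int] = None) -> str:
    """Build a Content-Range value for a partial update.

    first=None  -> "bytes */..." (append to the object)
    last=None   -> last-byte-pos omitted (assumed form: "first-")
    length=None -> instance-length given as "*" (no size check)
    """
    total = "*" if length is None else str(length)
    if first is None:
        return "bytes */" + total
    last_text = "" if last is None else str(last)
    return "bytes {}-{}/{}".format(first, last_text, total)


def update_headers(data: bytes,
                   first: Optional[int] = None,
                   last: Optional[int] = None,
                   chunked: bool = False) -> Dict[str, str]:
    """Collect the headers a data update needs, per the list above."""
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Range": content_range(first, last),
    }
    if chunked:
        # Content-Length is ignored when chunked transfer is used.
        headers["Transfer-Encoding"] = "chunked"
    else:
        headers["Content-Length"] = str(len(data))
    return headers
```

Calling ``update_headers(b"hello", first=0, last=4)`` gives a ``Content-Range`` of ``bytes 0-4/*``, while ``update_headers(b"more")`` gives ``bytes */*``, which requests an append.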
554 |
|
|
555 |
A data update will trigger an ETag change. The new ETag will not correspond to the object's MD5 sum (**TBD**) and will be included in reply headers. |
|
504 | 556 |
|
505 |
The allowed headers are considered to be user defined metadata. The update operation will overwrite all previous values and remove any keys not supplied. |
|
557 |
No reply content. No reply headers if only metadata is updated. |
|
558 |
|
|
559 |
========================== =============================== |
|
560 |
Reply Header Name Value |
|
561 |
========================== =============================== |
|
562 |
ETag The new ETag of the object (data updated) |
|
563 |
========================== =============================== |
|
564 |
|
|
565 |
| |
|
506 | 566 |
|
507 | 567 |
=========================== ============================== |
508 | 568 |
Return Code Description |
509 | 569 |
=========================== ============================== |
510 |
202 (Accepted) The request has been accepted |
|
570 |
202 (Accepted) The request has been accepted (not a data update) |
|
571 |
204 (No Content) The request succeeded (data updated) |
|
572 |
416 (Range Not Satisfiable) The supplied range is out of limits or its size is invalid |
|
511 | 573 |
=========================== ============================== |
512 | 574 |
|
513 | 575 |
|
... | ... | |
538 | 600 |
* Container/object lists include all associated metadata if the reply is of type json/xml. Some names are kept to their OOS API equivalents for compatibility. |
539 | 601 |
* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``. These are all replaced with every update operation. |
540 | 602 |
* Multi-range object GET support as outlined in RFC2616. |
603 |
* Object hashmap retrieval through GET and the ``format`` parameter. |
|
604 |
* Partial object updates through POST, using the ``Content-Length``, ``Content-Type``, ``Content-Range`` and ``Transfer-Encoding`` headers. |
|
541 | 605 |
* Object ``MOVE`` support. |
542 | 606 |
|
543 | 607 |
Clarifications/suggestions: |