Pithos v2 Developer Guide
=========================

Pithos is a storage service implemented by GRNET (http://www.grnet.gr). Data is stored as objects, organized in containers, belonging to an account. This hierarchy of storage layers was inspired by the OpenStack Object Storage (OOS) API and the similar CloudFiles API by Rackspace. The Pithos API follows the OOS API as closely as possible. One design requirement is that clients built for OOS can use Pithos without changes.

However, to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated or appended to. They can also be versioned, meaning that the server tracks changes, assigns version numbers, and allows reading previous instances.

The storage backend of Pithos is block oriented, which allows for efficient, deduplicated data placement. The block structure of objects is exposed at the API layer to encourage external software to implement advanced data management operations.

This document's goals are:

* Define the Pithos ReST API that allows the storage and retrieval of data and metadata via HTTP calls
* Specify metadata semantics and user interface guidelines for a common experience across client software implementations

This document is meant to be read alongside the OOS API documentation. The reader should therefore be familiar with the associated technologies, the OOS API, and the first version of the Pithos API. This document refers to the second version of Pithos. Information on the first version of the storage API can be found at http://code.google.com/p/gss.

Anything marked as to be determined (**TBD**) should not be considered by implementors.

=========================  ================================
Revision                   Description
=========================  ================================
0.2 (May 31, 2011)         Add object meta listing and filtering in containers.
\                          Include underlying storage characteristics in container meta.
\                          Support for partial object updates through POST.
\                          Expose object hashmaps through GET.
\                          Support for multi-range object GET requests.
0.1 (May 17, 2011)         Initial release. Based on OpenStack Object Storage Developer Guide API v1 (Apr. 15, 2011).
=========================  ================================

URI requests supported by Pithos take one of the following forms:

* Top level: ``https://hostname/v1/``
* Account level: ``https://hostname/v1/<account>``
* Container level: ``https://hostname/v1/<account>/<container>``
* Object level: ``https://hostname/v1/<account>/<container>/<object>``

All requests must include an ``X-Auth-Token``, except for those that refer to publicly available files (**TBD**). The process of obtaining the token is still to be determined (**TBD**).

The allowable request operations and respective return codes per level are presented in the remainder of this chapter. The following return codes are common to all requests.

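The four URI forms can be produced by one small helper. The following is a minimal sketch rather than part of the specification: the names ``pithos_url`` and ``auth_headers`` are hypothetical, and percent-encoding of path components is an assumption about how a client would normally build URLs.

```python
from urllib.parse import quote


def pithos_url(host, account=None, container=None, obj=None):
    """Return the top, account, container or object level URL."""
    if container is not None and account is None:
        raise ValueError("a container needs an account")
    if obj is not None and container is None:
        raise ValueError("an object needs a container")
    parts = []
    if account is not None:
        parts.append(quote(account, safe=""))
    if container is not None:
        parts.append(quote(container, safe=""))
    if obj is not None:
        # Keep "/" unescaped (assumption): object names may contain it.
        parts.append(quote(obj, safe="/"))
    return f"https://{host}/v1/" + "/".join(parts)


def auth_headers(token):
    """Headers carrying the token required by (almost) every request."""
    return {"X-Auth-Token": token}
```

For instance, ``pithos_url("hostname", "user", "pithos")`` yields the container level form shown in the list above.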
=========================  ================================
Return Code                Description
=========================  ================================
400 (Bad Request)          The request is invalid
401 (Unauthorized)         Request not allowed
404 (Not Found)            The requested resource was not found
503 (Service Unavailable)  The request cannot be completed because of an internal error
=========================  ================================

=========  ==================
Operation  Description
=========  ==================
GET        Authentication. This is kept for compatibility with the OOS API
=========  ==================

If the ``X-Auth-User`` and ``X-Auth-Key`` headers are given, the reply includes a dummy ``X-Auth-Token`` and ``X-Storage-Url``, which can be used as a guest token/namespace for testing Pithos.

================  =====================
Return Code       Description
================  =====================
204 (No Content)  The request succeeded
================  =====================

=========  ==================
Operation  Description
=========  ==================
HEAD       Retrieve account metadata
GET        List containers
POST       Update account metadata
=========  ==================

No request parameters/headers.

==========================  =====================
Reply Header Name           Value
==========================  =====================
X-Account-Container-Count   The total number of containers
X-Account-Object-Count      The total number of objects (**TBD**)
X-Account-Bytes-Used        The total number of bytes stored
X-Account-Bytes-Remaining   The total number of bytes remaining (**TBD**)
X-Account-Last-Login        The last login (**TBD**)
X-Account-Meta-*            Optional user defined metadata
Last-Modified               The last object modification date
==========================  =====================

================  =====================
Return Code       Description
================  =====================
204 (No Content)  The request succeeded
================  =====================

====================  ===========================
Request Header Name   Value
====================  ===========================
If-Modified-Since     Retrieve if account has changed since provided timestamp
If-Unmodified-Since   Retrieve if account has not changed since provided timestamp
====================  ===========================

======================  =========================
Request Parameter Name  Value
======================  =========================
limit                   The number of results requested (default is 10000)
marker                  Return containers with name lexicographically after marker
format                  Optional extended reply type (can be ``json`` or ``xml``)
======================  =========================

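Together, ``limit`` and ``marker`` allow a client to walk a long listing page by page. The sketch below shows one way to do that, assuming (as in OOS practice) that a page shorter than ``limit`` ends the listing. Both ``iter_listing`` and the ``fetch_page`` callable are hypothetical names; ``fake_fetch`` is an in-memory stand-in for the server, used for demonstration only.

```python
def iter_listing(fetch_page, limit=10000):
    """Yield every name of a listing by following ``marker``.

    ``fetch_page(limit, marker)`` is a hypothetical callable that performs
    the GET request and returns the list of names for one page.
    """
    marker = None
    while True:
        page = fetch_page(limit=limit, marker=marker)
        yield from page
        if len(page) < limit:
            # A short (or empty) page means nothing is left to list.
            return
        # Ask for names lexicographically after the last one received.
        marker = page[-1]


def fake_fetch(names):
    """Build an in-memory replacement for the server (demonstration only)."""
    ordered = sorted(names)

    def fetch_page(limit, marker):
        rest = [n for n in ordered if marker is None or n > marker]
        return rest[:limit]

    return fetch_page
```

A real client would replace ``fake_fetch`` with a function that issues the ``GET`` request and splits the reply into names.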
The reply is a list of container names. Account headers (as in a ``HEAD`` request) will also be included.
If a ``format=xml`` or ``format=json`` argument is given, extended information on the containers will be returned, serialized in the chosen format.
For each container, the information will include all container metadata (names will be in lower case and with hyphens replaced with underscores):

===================  ============================
Name                 Description
===================  ============================
name                 The name of the container
count                The number of objects inside the container
bytes                The total size of the objects inside the container
last_modified        The last object modification date
x_container_meta_*   Optional user defined metadata
===================  ============================

For examples of container details returned in JSON/XML formats refer to the OOS API documentation.

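The naming rule for extended listings (lower case, hyphens replaced with underscores) is easy to apply on the client side. This is a minimal sketch: ``listing_key`` and ``user_metadata`` are hypothetical helpers, and ``SAMPLE_REPLY`` is a made-up reply used only to exercise them.

```python
import json


def listing_key(header_name):
    """Map a header name to the key used in extended listings."""
    return header_name.lower().replace("-", "_")


def user_metadata(entry, prefix="x_container_meta_"):
    """Extract the user defined metadata from one listing entry."""
    return {k[len(prefix):]: v for k, v in entry.items() if k.startswith(prefix)}


# Made-up extended reply for an account listing (illustration only).
SAMPLE_REPLY = (
    '[{"name": "pithos", "count": 2, "bytes": 1024, '
    '"x_container_meta_color": "blue"}]'
)

for container in json.loads(SAMPLE_REPLY):
    print(container["name"], user_metadata(container))
```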
===========================  =====================
Return Code                  Description
===========================  =====================
200 (OK)                     The request succeeded
204 (No Content)             The account has no containers (only for non-extended replies)
304 (Not Modified)           The account has not been modified
412 (Precondition Failed)    The condition set cannot be satisfied
===========================  =====================

A ``200`` return code is used if the reply is of type json/xml.

====================  ===========================
Request Header Name   Value
====================  ===========================
X-Account-Meta-*      Optional user defined metadata
====================  ===========================

No reply content/headers.

The update operation will overwrite all user defined metadata.

================  ===============================
Return Code       Description
================  ===============================
202 (Accepted)    The request has been accepted
================  ===============================

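Because the update overwrites all user defined metadata, a client that wants to change a single key has to resend the complete set. The sketch below shows this read-modify-write step under stated assumptions: ``merged_meta_headers`` is a hypothetical helper, ``current_headers`` stands for the headers of a previous ``HEAD`` reply, and keys are lower-cased because HTTP header names are case-insensitive.

```python
def merged_meta_headers(current_headers, prefix, updates=None, removals=()):
    """Return the full set of metadata headers to send in a POST.

    ``current_headers`` is a dict of header names to values, ``prefix`` is
    for example ``X-Account-Meta-``.
    """
    meta = {}
    for name, value in current_headers.items():
        if name.lower().startswith(prefix.lower()):
            meta[name[len(prefix):].lower()] = value
    for key in removals:
        meta.pop(key.lower(), None)
    for key, value in (updates or {}).items():
        meta[key.lower()] = value
    # Everything not included here is dropped by the update.
    return {prefix + key: value for key, value in sorted(meta.items())}
```

Headers that do not start with the given prefix, such as ``X-Account-Bytes-Used``, are ignored and never sent back.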
=========  ============================
Operation  Description
=========  ============================
HEAD       Retrieve container metadata
GET        List objects
PUT        Create/update container
POST       Update container metadata
DELETE     Delete container
=========  ============================

No request parameters/headers.

==========================  ===============================
Reply Header Name           Value
==========================  ===============================
X-Container-Object-Count    The total number of objects in the container
X-Container-Bytes-Used      The total number of bytes of all objects stored
X-Container-Meta-*          Optional user defined metadata
X-Container-Object-Meta     A list with all meta keys used by objects
X-Container-Block-Size      The block size used by the storage backend
X-Container-Block-Hash      The hash algorithm used for block identifiers in object hashmaps
Last-Modified               The last object modification date
==========================  ===============================

The keys returned in ``X-Container-Object-Meta`` are all the unique strings after the ``X-Object-Meta-`` prefix.

================  ===============================
Return Code       Description
================  ===============================
204 (No Content)  The request succeeded
================  ===============================

====================  ===========================
Request Header Name   Value
====================  ===========================
If-Modified-Since     Retrieve if container has changed since provided timestamp
If-Unmodified-Since   Retrieve if container has not changed since provided timestamp
====================  ===========================

======================  ===================================
Request Parameter Name  Value
======================  ===================================
limit                   The number of results requested (default is 10000)
marker                  Return objects with name lexicographically after marker
prefix                  Return objects starting with prefix
delimiter               Return objects up to the delimiter (discussion follows)
path                    Assume ``prefix=path`` and ``delimiter=/``
format                  Optional extended reply type (can be ``json`` or ``xml``)
meta                    Return objects having the specified meta keys (can be a comma separated list)
======================  ===================================

The ``path`` parameter overrides ``prefix`` and ``delimiter``. When using ``path``, results will include objects ending in ``delimiter``.

The keys given with ``meta`` will be matched with the strings after the ``X-Object-Meta-`` prefix.

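The listing parameters combine into an ordinary query string. The following sketch assembles one with the standard library; ``listing_query`` is a hypothetical helper, and dropping ``prefix`` and ``delimiter`` on the client when ``path`` is given is a design choice that mirrors the override rule, not a documented requirement.

```python
from urllib.parse import urlencode


def listing_query(limit=None, marker=None, prefix=None, delimiter=None,
                  path=None, fmt=None, meta=None):
    """Build the query string for a container level GET request."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if marker is not None:
        params["marker"] = marker
    if path is not None:
        # ``path`` overrides ``prefix`` and ``delimiter``, so send it alone.
        params["path"] = path
    else:
        if prefix is not None:
            params["prefix"] = prefix
        if delimiter is not None:
            params["delimiter"] = delimiter
    if fmt is not None:
        params["format"] = fmt
    if meta:
        # Keys are the strings after the ``X-Object-Meta-`` prefix.
        params["meta"] = ",".join(meta)
    return urlencode(params)
```

The result is appended to the container URL after a ``?``.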
The reply is a list of object names. Container headers (as in a ``HEAD`` request) will also be included.
If a ``format=xml`` or ``format=json`` argument is given, extended information on the objects will be returned, serialized in the chosen format.
For each object, the information will include all object metadata (names will be in lower case and with hyphens replaced with underscores):

===================  ======================================
Name                 Description
===================  ======================================
name                 The name of the object
hash                 The ETag of the object
bytes                The size of the object
content_type         The MIME content type of the object
content_encoding     The encoding of the object (optional)
last_modified        The last object modification date
x_object_manifest    Large object support
x_object_meta_*      Optional user defined metadata
===================  ======================================

Extended replies may also include virtual directory markers in separate sections of the ``json`` or ``xml`` results.
Virtual directory markers are only included when ``delimiter`` is explicitly set. They correspond to the substrings up to and including the first occurrence of the delimiter.
In JSON results they appear as dictionaries with only a ``"subdir"`` key. In XML results they appear interleaved with ``<object>`` tags as ``<subdir name="..." />``.
In case there is an object with the same name as a virtual directory marker, the object will be returned.

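The marker rules can be simulated on a plain list of names, which helps when testing client code offline. This is a sketch of the behaviour as described above, not the server implementation. ``list_with_delimiter`` is a hypothetical function, and searching for the delimiter only after the ``prefix`` is an assumption carried over from the OOS API.

```python
def list_with_delimiter(names, prefix="", delimiter=None):
    """Split names into (objects, virtual directory markers)."""
    existing = set(names)
    objects, subdirs = [], []
    for name in sorted(existing):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut == -1:
            objects.append(name)
            continue
        # Substring up to and including the first delimiter occurrence.
        marker = prefix + rest[:cut + len(delimiter)]
        if marker == name:
            # The name itself ends at the delimiter: it is a real object.
            objects.append(name)
        elif marker not in existing and marker not in subdirs:
            subdirs.append(marker)
        # If an object named like the marker exists, only the object is kept.
    return objects, subdirs
```

Names below a marker are rolled up into it, which is why object names ending with the delimiter are best avoided.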
For examples of object details returned in JSON/XML formats refer to the OOS API documentation.

===========================  ===============================
Return Code                  Description
===========================  ===============================
200 (OK)                     The request succeeded
204 (No Content)             The container has no objects (only for non-extended replies)
304 (Not Modified)           The container has not been modified
412 (Precondition Failed)    The condition set cannot be satisfied
===========================  ===============================

A ``200`` return code is used if the reply is of type json/xml.

====================  ================================
Request Header Name   Value
====================  ================================
X-Container-Meta-*    Optional user defined metadata
====================  ================================

No reply content/headers.

================  ===============================
Return Code       Description
================  ===============================
201 (Created)     The container has been created
202 (Accepted)    The request has been accepted
================  ===============================

====================  ================================
Request Header Name   Value
====================  ================================
X-Container-Meta-*    Optional user defined metadata
====================  ================================

No reply content/headers.

The update operation will overwrite all user defined metadata.

================  ===============================
Return Code       Description
================  ===============================
202 (Accepted)    The request has been accepted
================  ===============================

No request parameters/headers.

No reply content/headers.

================  ===============================
Return Code       Description
================  ===============================
204 (No Content)  The request succeeded
409 (Conflict)    The container is not empty
================  ===============================

=========  =================================
Operation  Description
=========  =================================
HEAD       Retrieve object metadata
GET        Read object data
PUT        Write object data or copy/move object
COPY       Copy object
MOVE       Move object
POST       Update object metadata/data
DELETE     Delete object
=========  =================================

No request parameters/headers.

==========================  ===============================
Reply Header Name           Value
==========================  ===============================
ETag                        The ETag of the object
Content-Length              The size of the object
Content-Type                The MIME content type of the object
Last-Modified               The last object modification date
Content-Encoding            The encoding of the object (optional)
Content-Disposition         The presentation style of the object (optional)
X-Object-Manifest           Large object support (optional)
X-Object-Meta-*             Optional user defined metadata
==========================  ===============================

================  ===============================
Return Code       Description
================  ===============================
204 (No Content)  The request succeeded
================  ===============================

====================  ================================
Request Header Name   Value
====================  ================================
Range                 Optional range of data to retrieve
If-Match              Retrieve if ETags match
If-None-Match         Retrieve if ETags don't match
If-Modified-Since     Retrieve if object has changed since provided timestamp
If-Unmodified-Since   Retrieve if object has not changed since provided timestamp
====================  ================================

======================  ===================================
Request Parameter Name  Value
======================  ===================================
format                  Optional extended reply type (can be ``json`` or ``xml``)
======================  ===================================

The reply is the object's data (or part of it), except if a hashmap is requested with the ``format`` parameter. Object headers (as in a ``HEAD`` request) are always included.

Hashmaps expose the underlying storage format of the object. Note that each hash is computed after trimming trailing null bytes of the corresponding block.

Example ``format=json`` reply::

  {"block_hash": "sha1", "hashes": ["7295c41da03d7f916440b98e32c4a2a39351546c"], "block_size": 131072, "bytes": 242}

Example ``format=xml`` reply::

  <?xml version="1.0" encoding="UTF-8"?>
  <object name="file" bytes="24223726" block_size="131072" block_hash="sha1">
    <hash>7295c41da03d7f916440b98e32c4a2a39351546c</hash>
  </object>

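A client can compute the same kind of hashmap locally, for example to find out which blocks of a file have changed. The sketch below follows the trimming rule stated above. ``object_hashmap`` is a hypothetical helper, its defaults are taken from the example replies, and returning an empty hash list for an empty object is an assumption.

```python
import hashlib


def object_hashmap(data, block_size=131072, block_hash="sha1"):
    """Return a hashmap dict shaped like the ``format=json`` reply."""
    hashes = []
    for offset in range(0, len(data), block_size):
        # Trailing null bytes are trimmed before hashing each block.
        block = data[offset:offset + block_size].rstrip(b"\x00")
        hashes.append(hashlib.new(block_hash, block).hexdigest())
    return {
        "block_hash": block_hash,
        "block_size": block_size,
        "bytes": len(data),
        "hashes": hashes,
    }
```

In practice the block size and hash algorithm should be read from the ``X-Container-Block-Size`` and ``X-Container-Block-Hash`` headers rather than hard-coded.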
The ``Range`` header may include multiple ranges, as outlined in RFC2616. In that case the ``Content-Type`` of the reply will be ``multipart/byteranges`` and each part will include a ``Content-Range`` header.

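A multi-range request only differs from a single-range one in the value of the ``Range`` header. The sketch below builds that value following the RFC2616 byte range syntax; ``range_header`` is a hypothetical helper name.

```python
def range_header(ranges):
    """Build a ``Range`` header value from (first, last) byte positions.

    Use ``(first, None)`` for an open-ended range and ``(None, n)`` for
    the final ``n`` bytes of the object.
    """
    specs = []
    for first, last in ranges:
        if first is None and last is None:
            raise ValueError("a range needs at least one position")
        if first is None:
            specs.append(f"-{last}")
        elif last is None:
            specs.append(f"{first}-")
        elif last < first:
            raise ValueError("last-byte-pos is before first-byte-pos")
        else:
            specs.append(f"{first}-{last}")
    if not specs:
        raise ValueError("no ranges given")
    return "bytes=" + ",".join(specs)
```

With more than one range in the list, the client should be prepared to parse a ``multipart/byteranges`` reply.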
==========================  ===============================
Reply Header Name           Value
==========================  ===============================
ETag                        The ETag of the object
Content-Length              The size of the data returned
Content-Type                The MIME content type of the object
Content-Range               The range of data included (only on a single range request)
Last-Modified               The last object modification date
Content-Encoding            The encoding of the object (optional)
Content-Disposition         The presentation style of the object (optional)
X-Object-Manifest           Large object support (optional)
X-Object-Meta-*             Optional user defined metadata
==========================  ===============================

===========================  ==============================
Return Code                  Description
===========================  ==============================
200 (OK)                     The request succeeded
206 (Partial Content)        The range request succeeded
304 (Not Modified)           The object has not been modified
412 (Precondition Failed)    The condition set cannot be satisfied
416 (Range Not Satisfiable)  The requested range is out of limits
===========================  ==============================

====================  ================================
Request Header Name   Value
====================  ================================
ETag                  The MD5 hash of the object (optional to check written data)
Content-Length        The size of the data written
Content-Type          The MIME content type of the object
Transfer-Encoding     Set to ``chunked`` to specify incremental uploading (if used, ``Content-Length`` is ignored)
X-Copy-From           The source path in the form ``/<container>/<object>``
X-Move-From           The source path in the form ``/<container>/<object>``
Content-Encoding      The encoding of the object (optional)
Content-Disposition   The presentation style of the object (optional)
X-Object-Manifest     Large object support (optional)
X-Object-Meta-*       Optional user defined metadata
====================  ================================

==========================  ===============================
Reply Header Name           Value
==========================  ===============================
ETag                        The MD5 hash of the object (on create)
==========================  ===============================

===========================  ==============================
Return Code                  Description
===========================  ==============================
201 (Created)                The object has been created
411 (Length Required)        Missing ``Content-Length`` or ``Content-Type`` in the request
422 (Unprocessable Entity)   The MD5 checksum of the data written to the storage system does not match the (optionally) supplied ETag value
===========================  ==============================

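Supplying the ``ETag`` on upload lets the server reject corrupted data with a ``422`` reply. The sketch below prepares the headers for a non-chunked ``PUT``. ``put_headers`` is a hypothetical helper, and sending the MD5 sum as a lower-case hexadecimal digest is an assumption based on common OOS practice.

```python
import hashlib


def put_headers(data, content_type="application/octet-stream", meta=None):
    """Return request headers for writing ``data`` as a new object."""
    headers = {
        "Content-Length": str(len(data)),
        "Content-Type": content_type,
        # Optional: allows the server to verify the written data.
        "ETag": hashlib.md5(data).hexdigest(),
    }
    for key, value in (meta or {}).items():
        headers["X-Object-Meta-" + key] = value
    return headers
```

The same digest can be compared with the ``ETag`` reply header after the object has been created.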
====================  ================================
Request Header Name   Value
====================  ================================
Destination           The destination path in the form ``/<container>/<object>``
Content-Type          The MIME content type of the object (optional)
Content-Encoding      The encoding of the object (optional)
Content-Disposition   The presentation style of the object (optional)
X-Object-Manifest     Large object support (optional)
X-Object-Meta-*       Optional user defined metadata
====================  ================================

No reply content/headers.

===========================  ==============================
Return Code                  Description
===========================  ==============================
201 (Created)                The object has been created
===========================  ==============================

====================  ================================
Request Header Name   Value
====================  ================================
Content-Length        The size of the data written (optional, to update)
Content-Type          The MIME content type of the object (optional, to update)
Content-Range         The range of data supplied (optional, to update)
Transfer-Encoding     Set to ``chunked`` to specify incremental uploading (if used, ``Content-Length`` is ignored)
Content-Encoding      The encoding of the object (optional)
Content-Disposition   The presentation style of the object (optional)
X-Object-Manifest     Large object support (optional)
X-Object-Meta-*       Optional user defined metadata
====================  ================================

The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest`` and ``X-Object-Meta-*`` headers are considered to be user defined metadata. The update operation will overwrite all previous values and remove any keys not supplied.

To update an object's data:

* Supply ``Content-Length`` (except if using chunked transfers), ``Content-Type`` and ``Content-Range`` headers.
* Set ``Content-Type`` to ``application/octet-stream``.
* Set ``Content-Range`` as specified in RFC2616, with the following differences:

  * Client software MAY omit ``last-byte-pos`` if the length of the range being transferred is unknown or difficult to determine.
  * Client software SHOULD NOT specify the ``instance-length`` (use a ``*``), unless there is a reason for performing a size check at the server.
* If the ``Content-Range`` used has a ``byte-range-resp-spec = *``, the data supplied will be appended to the object.

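The header rules for data updates can be collected in one place. The following is a minimal sketch under stated assumptions: ``content_range`` and ``partial_update_headers`` are hypothetical helpers, and the exact strings (such as ``bytes */*`` for an append) are derived from the RFC2616 grammar rather than quoted from the Pithos specification.

```python
def content_range(first=None, last=None, instance_length=None):
    """Build a ``Content-Range`` value for a partial update."""
    total = "*" if instance_length is None else str(instance_length)
    if first is None:
        # byte-range-resp-spec = *: the data is appended to the object.
        return f"bytes */{total}"
    if last is None:
        # last-byte-pos omitted: the length of the range is unknown.
        return f"bytes {first}-/{total}"
    return f"bytes {first}-{last}/{total}"


def partial_update_headers(data, first=None, chunked=False):
    """Headers for overwriting at ``first`` or appending if it is None."""
    if first is not None and not data:
        raise ValueError("an overwrite needs at least one byte of data")
    headers = {"Content-Type": "application/octet-stream"}
    if first is None:
        headers["Content-Range"] = content_range()
    else:
        headers["Content-Range"] = content_range(first, first + len(data) - 1)
    if chunked:
        headers["Transfer-Encoding"] = "chunked"
    else:
        headers["Content-Length"] = str(len(data))
    return headers
```

Leaving ``instance_length`` unset follows the recommendation to use ``*`` unless a server-side size check is wanted.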
A data update will trigger an ETag change. The new ETag will not correspond to the object's MD5 sum (**TBD**) and will be included in reply headers.

No reply content. No reply headers if only metadata is updated.

==========================  ===============================
Reply Header Name           Value
==========================  ===============================
ETag                        The new ETag of the object (data updated)
==========================  ===============================

===========================  ==============================
Return Code                  Description
===========================  ==============================
202 (Accepted)               The request has been accepted (not a data update)
204 (No Content)             The request succeeded (data updated)
416 (Range Not Satisfiable)  The supplied range is out of limits or invalid size
===========================  ==============================

No request parameters/headers.

No reply content/headers.

===========================  ==============================
Return Code                  Description
===========================  ==============================
204 (No Content)             The request succeeded
===========================  ==============================

List of differences from the OOS API:

* Support for ``X-Account-Meta-*`` style headers at the account level. Use ``POST`` to update.
* Support for ``X-Container-Meta-*`` style headers at the container level. Can be set when creating via ``PUT``. Use ``POST`` to update.
* Header ``X-Container-Object-Meta`` at the container level and parameter ``meta`` in container listings.
* Headers ``X-Container-Block-*`` at the container level, exposing the underlying storage characteristics.
* All metadata replies, at all levels, include latest modification information.
* At all levels, a ``GET`` request may use ``If-Modified-Since`` and ``If-Unmodified-Since`` headers.
* Container/object lists include all associated metadata if the reply is of type json/xml. Some names are kept to their OOS API equivalents for compatibility.
* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``. These are all replaced with every update operation.
* Multi-range object GET support as outlined in RFC2616.
* Object hashmap retrieval through GET and the ``format`` parameter.
* Partial object updates through POST, using the ``Content-Length``, ``Content-Type``, ``Content-Range`` and ``Transfer-Encoding`` headers.
* Object ``MOVE`` support.

Clarifications/suggestions:

* Authentication is done by another system. The token is used in the same way, but it is obtained differently. The top level ``GET`` request is kept compatible with the OOS API and allows for guest/testing operations.
* Some processing is done in the variable part of all ``X-*-Meta-*`` headers. If it includes underscores, they will be converted to dashes and the first letter of all intra-dash strings will be capitalized.
* A ``GET`` reply for a level will include all headers of the corresponding ``HEAD`` request.
* To avoid conflicts between objects and virtual directory markers in container listings, it is recommended that object names do not end with the delimiter used.
* The ``Accept`` header may be used in requests instead of the ``format`` parameter to specify the desired reply format. The parameter overrides the header.
* Container/object lists use a ``200`` return code if the reply is of type json/xml. The reply will include an empty json/xml.
* In headers, dates are formatted according to RFC 1123. In extended information listings, dates are formatted according to ISO 8601.
* While ``X-Object-Manifest`` can be set and unset, large object support is not yet implemented (**TBD**).

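Two of the clarifications above translate directly into code: the processing of ``X-*-Meta-*`` header names and the two date formats. The sketch below is illustrative only. All function names are hypothetical, letters other than the first of each part are left untouched because the text does not say otherwise, and the precise ISO 8601 variant used in listings is an assumption.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def normalize_meta_header(name):
    """Apply the documented processing to an ``X-*-Meta-*`` header name."""
    split = name.lower().find("-meta-")
    if not name.lower().startswith("x-") or split == -1:
        raise ValueError("not an X-*-Meta-* header: " + name)
    head, variable = name[:split + 6], name[split + 6:]
    # Underscores become dashes, each intra-dash string is capitalized.
    parts = variable.replace("_", "-").split("-")
    return head + "-".join(p[:1].upper() + p[1:] for p in parts)


def header_date(moment):
    """Format an aware UTC datetime for use in headers (RFC 1123)."""
    return format_datetime(moment, usegmt=True)


def listing_date(moment):
    """Format a datetime as in extended listings (ISO 8601)."""
    return moment.isoformat()


when = datetime(2011, 5, 31, 12, 0, 0, tzinfo=timezone.utc)
print(normalize_meta_header("X-Object-Meta-my_first_tag"))
print(header_date(when))
print(listing_date(when))
```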
This API is intended to allow for a multitude of client implementations, each supporting a different device or operating system. All clients will be able to manipulate containers and objects, even software designed only for OOS API compatibility. But a Pithos interface should not be only about showing containers and folders. There are some extra user interface elements and functionalities that should be common to all implementations.

Upon entering the service, a user is presented with the following elements, which can be represented as folders or with other related icons:

* The ``home`` element, which is used as the default entry point to the user's "files". Objects under ``home`` are represented in the usual hierarchical organization of folders and files.
* The ``trash`` element, which contains files that have been marked for deletion, but can still be recovered.
* The ``shared`` element, which contains all objects that the user shares with other users of the system.
* The ``others`` element, which contains all objects that other users share with the user.
* The ``tags`` element, which lists the names of tags the user has defined. This can be an entry point to list all files that have been assigned a specific tag or manage tags in general (remove a tag completely, rename a tag etc.).
* The ``groups`` element, which contains the names of groups the user has defined. Each group consists of a user list. Group creation, deletion, and manipulation is carried out by actions originating here.

Objects in Pithos can be:

* Assigned custom tags.
* Moved to trash and then deleted.
* Shared with specific permissions.
* Made public (shared with non-Pithos users).
* Set to monitor changes via version tracking.

Some of these functions are performed by the client software and some by the Pithos server. Client-driven functionality is based on specific metadata that should be handled equally across implementations. These metadata names are discussed in the next chapter.

649 Conventions and Metadata Specification
650 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Pithos clients should use the ``pithos`` container for all Pithos objects. Object names use the ``/`` delimiter to impose a hierarchy of folders and files.

At the object level, tags are implemented by managing metadata keys. The client software should allow the user to use any string as a tag (except ``trash``) and then set the corresponding ``X-Object-Meta-<tag>`` key at the server. The provided API extensions allow listing all tags in a container and filtering object listings based on one or more tags. The tag list is sufficient for implementing the ``tags`` element, either as a special, virtual folder (as done in the first version of Pithos), or as an application menu.

To manage the deletion of files, use the same API and the ``X-Object-Meta-Trash`` key. The string ``trash`` cannot be used as a tag. The ``trash`` element should be presented as a folder, although with no hierarchy.

The metadata specification is summarized in the following table.

===========================  ==============================
Metadata Name                Value
===========================  ==============================
X-Object-Meta-Trash          Set to ``true`` if the object has been moved to the trash
X-Object-Meta-*              Use for other tags that apply to the object
===========================  ==============================

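The conventions in the table can be wrapped in a few client-side helpers. This is a sketch under stated assumptions: all function names are hypothetical, and using ``true`` as the value of ordinary tag keys is an assumption (only the trash key has a documented value). Because an object metadata update replaces all previous values, each helper returns the complete set of headers to send.

```python
PREFIX = "X-Object-Meta-"
TRASH_KEY = PREFIX + "Trash"


def with_tag(meta_headers, tag, value="true"):
    """Return a copy of the metadata headers with ``tag`` added."""
    if not tag or tag.lower() == "trash":
        raise ValueError("'trash' is reserved and cannot be used as a tag")
    updated = dict(meta_headers)
    updated[PREFIX + tag] = value
    return updated


def move_to_trash(meta_headers):
    """Return a copy of the metadata headers marking the object as trashed."""
    updated = dict(meta_headers)
    updated[TRASH_KEY] = "true"
    return updated


def restore_from_trash(meta_headers):
    """Return a copy of the metadata headers without the trash key."""
    return {k: v for k, v in meta_headers.items()
            if k.lower() != TRASH_KEY.lower()}


def tags_of(meta_headers):
    """List the tags of an object, leaving out the trash key."""
    return sorted(k[len(PREFIX):] for k in meta_headers
                  if k.lower().startswith(PREFIX.lower())
                  and k.lower() != TRASH_KEY.lower())
```

A client would send the returned headers in a ``POST`` request at the object level.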