Pithos API v. 2

Basic Assumptions

The Pithos API follows version 1.1 of the Rackspace Cloud Files API as closely as possible.

The present document is meant to be read alongside the Cloud Files API documentation; it explains how Pithos functionality is expressed in terms of that API.

Authorization

All requests must be authorized, except for those that refer to publicly available files. Authorization is performed via an Authorization header signed with the token returned to the user after a successful login.

URI Forms

The URIs supported by Pithos start with:

https://hostname/v2/<account>

where <account> is the username of the user in Pithos.

Retrieve Account Metadata

A HEAD request against an account, that is, against a URI of the form:

https://hostname/v2/<account>

will return the account metadata as HTTP headers. The following headers are used by Pithos:

Header Name Value
X-Account-Container-Count The total number of containers for the account
X-Account-Bytes-Used The total number of bytes stored in Pithos for the account

Return Codes

Code Description
204 (No Content) The request succeeds
401 (Unauthorized) Request for invalid account or invalid access token
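
As a minimal sketch of how a client might read these two headers, the following Python function converts them to integers. The function name and the sample values are illustrative assumptions; the lookup is case-insensitive because HTTP header names are.

```python
def parse_account_metadata(headers):
    """Extract the Pithos account counters from HEAD response headers.

    `headers` is any mapping of header names to string values.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "container_count": int(lowered["x-account-container-count"]),
        "bytes_used": int(lowered["x-account-bytes-used"]),
    }


# Illustrative values only, not taken from a real Pithos response.
example_headers = {
    "X-Account-Container-Count": "3",
    "X-Account-Bytes-Used": "1024",
}
print(parse_account_metadata(example_headers))
```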

List Containers

A GET request to a URI of the following form:

https://hostname/v2/<account>

will return a list of all existing containers of the user, ordered by name. Note that this list will contain some Special Containers.

List Container Details

If a format=xml or format=json argument is given, extended information on the containers will be returned, serialized in the chosen format. For each container, the information will include:

Name Description
name The name of the container
count The number of objects inside the container
bytes The total size of the objects inside the container

The following is an example of container details returned in JSON format:


{"name":"test_container_1",
"count":2,
"bytes":78}

If more than one container's details are returned, they are returned in a JSON array, with each element as above.

The following is an example of container details returned in XML format:


<?xml version="1.0" encoding="UTF-8"?>

<account name="FooBar">
  <container>
    <name>test_container_1</name>
    <count>2</count>
    <bytes>78</bytes>
  </container>
</account>

If more than one container's details are returned, they are put in a sequence of <container> elements.

Return Codes

Code Description
200 (OK) The request succeeds
204 (No Content) The account has no containers
401 (Unauthorized) Request for invalid account or invalid access token
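
The two serializations carry the same fields, so a client can normalize them into one structure. The sketch below assumes the response bodies look like the examples above, with an opening account element that wraps the container elements; the function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def containers_from_json(body):
    """Parse a format=json container listing into a list of dicts."""
    data = json.loads(body)
    # Tolerate a single bare object as well as an array of objects.
    return data if isinstance(data, list) else [data]


def containers_from_xml(body):
    """Parse a format=xml container listing into a list of dicts."""
    root = ET.fromstring(body)
    return [
        {
            "name": node.findtext("name"),
            "count": int(node.findtext("count")),
            "bytes": int(node.findtext("bytes")),
        }
        for node in root.findall("container")
    ]


JSON_BODY = '[{"name":"test_container_1","count":2,"bytes":78}]'
XML_BODY = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<account name="FooBar"><container>'
    b"<name>test_container_1</name><count>2</count><bytes>78</bytes>"
    b"</container></account>"
)

print(containers_from_json(JSON_BODY))
print(containers_from_xml(XML_BODY))
```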

Create Container

PUT operations against a container are used to create that container. Containers may be assigned custom metadata by including additional HTTP headers, identified by the X-Container-Meta- prefix, in the PUT request.

Return Codes

Code Description
201 (Created) The container was created as requested
202 (Accepted) The container already exists
401 (Unauthorized) Request for invalid account or invalid access token
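
A container creation request could be assembled as in the following sketch, which only builds the request and does not send it. The helper name, the container name photos, the metadata key Project, and the placeholder Authorization value are all assumptions; producing a correctly signed Authorization header is outside the scope of this example.

```python
from urllib.request import Request


def build_create_container_request(container_uri, metadata, auth_headers):
    """Build (but do not send) a PUT request that creates a container.

    Each metadata key is sent as an X-Container-Meta-<key> header.
    """
    headers = dict(auth_headers)
    for key, value in metadata.items():
        headers["X-Container-Meta-" + key] = value
    return Request(container_uri, method="PUT", headers=headers)


request = build_create_container_request(
    "https://hostname/v2/aaitest@grnet.gr/photos",  # hypothetical container
    {"Project": "alpha"},                           # hypothetical metadata
    {"Authorization": "placeholder-signed-value"},  # not a real signature
)
print(request.get_method(), request.full_url)
print(request.header_items())
```

Note that urllib.request normalizes the capitalization of header names when it stores them; this does not matter on the wire, since HTTP header names are case-insensitive.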

Delete Container

DELETE operations against a container are used to permanently remove that container. The container must be empty before it can be deleted. The user cannot delete the Home Container.

Return Codes

Code Description
204 (No Content) The container was successfully deleted
401 (Unauthorized) Request for invalid account or invalid access token
404 (Not Found) The requested container was not found
409 (Conflict) The container is not empty, or is the user's Home Container

Retrieve Container Metadata

HEAD operations against a storage container are used to determine the number of objects, the total bytes of all objects stored in the container, and any additional metadata. These are returned as HTTP headers. The following headers are used by Pithos:

Header Name Value
X-Container-Object-Count The total number of objects in the container
X-Container-Bytes-Used The total number of bytes of all objects stored in the container

Return Codes

Code Description
204 (No Content) The container exists
401 (Unauthorized) Request for invalid account or invalid access token
404 (Not Found) The container does not exist

List Objects

GET operations are used to list objects. Objects in a location are listed by a GET request to a URI of the following form:

https://hostname/v2/<account>/<container>[?parm=value]

A request without a parameter will return a list of all the user's files in the specified <container>. A request with the parameter prefix, for string value x, will return all object names beginning with x. If x contains a path, this is a way to return objects in a pseudo-directory (since the Cloud Files API does not support true directories). A request with the parameter path, for string value x, will return all objects in the pseudo path, provided pseudo-hierarchical directories are defined as described in the Cloud Files API.

The response body will contain a list of objects, one object per line. Return code 204 (No Content) will be passed back if there are no files to be listed (for the specified path, if any). If an incorrect account is specified, the HTTP return code will be 404 (Not Found).
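
The query parameters described above can be attached to the container URI as in this sketch. The helper names are hypothetical, and percent-encoding the parameter values with the standard library is an assumption about what the server accepts.

```python
from urllib.parse import urlencode


def list_objects_uri(container_uri, prefix=None, path=None, fmt=None):
    """Append the optional prefix, path, and format parameters to a container URI."""
    params = {}
    if prefix is not None:
        params["prefix"] = prefix
    if path is not None:
        params["path"] = path
    if fmt is not None:
        params["format"] = fmt
    return container_uri + ("?" + urlencode(params) if params else "")


def parse_object_listing(body):
    """Split a plain listing (one object name per line) into a list of names."""
    return [line for line in body.splitlines() if line]


print(list_objects_uri("https://hostname/v2/aaitest/home", prefix="Documents/"))
print(parse_object_listing("Documents/doc.txt\nDocuments/notes.txt\n"))
```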

List Object Details

If a format=xml or format=json argument is given, extended information for the objects will be returned. For each object, the information will include:

Name Description
name The name of the object
hash The MD5 hash of the object
bytes The size of the object
content_type The MIME content type of the object
created The object creation date
last_modified The last object modification date
deleted True if the object has been moved to the trash, false otherwise
versioned True if the object is versioned, false otherwise
version The version number of the object
uri The URI of the object
tags The tags that apply to the object
public True if the object is readable by all, false otherwise
permissions The permissions that apply to the object

The following is an example of object details returned in JSON format:


{"name":"test_obj_1",
"hash":"4281c348eaf83e70ddce0e07221c3d28",
"bytes":14,
"content_type":"application/octet-stream",
"created":"2009-01-02T04:25:34.611226",
"last_modified":"2009-02-03T05:26:32.612278",
"deleted":"false",
"versioned":"true",
"version":14,
"uri":"http://hostname/pithos/aaitest@grnet.gr/files/Documents/doc.txt",
"tags": [
  "work",
  "personal"
],
"public":false,
"permissions": [
  {"modifyACL":true,
  "write":true,
  "read":true,
  "user":"aaitest@grnet.gr"
  },
  {"modifyACL":false,
  "write":true,
  "read":true,
  "group":"Work"
  }
]
}

If more than one object's details are returned, they are returned in a JSON array, with each element as above.

The following is an example of object details returned in XML format:


<?xml version="1.0" encoding="UTF-8"?>

<container name="test_container_1">
  <object>
    <name>test_obj_1</name>
    <hash>4281c348eaf83e70ddce0e07221c3d28</hash>
    <bytes>14</bytes>
    <content_type>application/octet-stream</content_type>
    <created>2009-01-02T04:25:34.611226</created>
    <last_modified>2009-02-03T05:26:32.612278</last_modified>
    <deleted>false</deleted>
    <versioned>true</versioned>
    <version>14</version>
    <uri>http://hostname/pithos/aaitest@grnet.gr/files/Documents/doc.txt</uri>
    <tags>
      <tag>work</tag>
      <tag>personal</tag>
    </tags>
    <public>false</public>
    <permissions>
      <permission>
        <user>aaitest@grnet.gr</user>
        <read>true</read>
        <write>true</write>
        <modify_acl>true</modify_acl>
      </permission>
      <permission>
        <user>aaitest@grnet.gr</user>
        <read>true</read>
        <write>true</write>
        <modify_acl>true</modify_acl>
      </permission>
    </permissions>
  </object>
</container>

If more than one object's details are returned, they are put in a sequence of <object> elements.

Return Codes

Code Description
200 (OK) The request succeeds
204 (No Content) The container is empty or does not exist for the specified account
401 (Unauthorized) Invalid access token
404 (Not Found) Incorrect account
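
In the JSON example above, the deleted and versioned fields appear as the strings "false" and "true", while public appears as a JSON boolean. A client that wants to be tolerant of both forms might normalize them as in the following sketch; the helper names and the second sample object are assumptions made for illustration.

```python
import json


def as_bool(value):
    """Accept both JSON booleans and the strings "true" and "false"."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def live_object_names(body):
    """Return the names of listed objects that are not in the trash."""
    objects = json.loads(body)
    if isinstance(objects, dict):
        objects = [objects]
    return [o["name"] for o in objects if not as_bool(o.get("deleted", False))]


# Hypothetical listing with one live object and one trashed object.
SAMPLE = (
    '[{"name":"test_obj_1","deleted":"false","public":false},'
    ' {"name":"old_obj","deleted":"true","public":false}]'
)
print(live_object_names(SAMPLE))
```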

Retrieve Object Metadata

HEAD operations against an object are used to retrieve an object's metadata. For versioned objects, previous versions can be retrieved by using the X-Object-Meta-Version header.

Metadata is returned as HTTP headers. The following headers are used by Pithos:

Header Name Value
X-Object-Meta-Created The object creation timestamp in ISO 8601 separate date and time UTC format
X-Object-Meta-Created-By The user that created the object
X-Object-Meta-Modified-By The user that last modified the object
X-Object-Meta-Modification-Date The object last modification timestamp in ISO 8601 separate date and time UTC format
X-Object-Meta-Deleted True if the object has been moved to the trash, false otherwise
X-Object-Meta-Versioned True if the object is versioned
X-Object-Meta-Version The numerical version of the object (positive integer value)
X-Object-Meta-Size The size of the object in bytes
X-Object-Meta-Content The MIME content type
X-Object-Meta-URI The URI of the object
X-Object-Meta-Signature True if the object is a signature file
X-Object-Meta-Deltafile True if the object is a delta file
X-Object-Meta-Tag A tag that applies to the object
X-Object-Meta-Public True if the object is readable by all
X-Object-Meta-Permissions-<user>-Read True if <user> can read the object, false otherwise
X-Object-Meta-Permissions-<user>-Write True if <user> can update the object, false otherwise
X-Object-Meta-Permissions-<user>-ModifyACL True if <user> can modify permissions for the object, false otherwise
X-Object-Meta-Permissions-<group>-Read True if <group> can read the object, false otherwise
X-Object-Meta-Permissions-<group>-Write True if <group> can update the object, false otherwise
X-Object-Meta-Permissions-<group>-ModifyACL True if <group> can modify permissions for the object, false otherwise

Return Codes

Code Description
200 (OK) Success
401 (Unauthorized) Invalid access token
404 (Not Found) No such object exists
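
The per-user and per-group permission headers can be folded into a single mapping, as in the sketch below. The function name and the sample names alice and Work are illustrative assumptions. The header names alone do not say whether a name refers to a user or a group, so the sketch does not attempt to distinguish them.

```python
PREFIX = "x-object-meta-permissions-"
RIGHTS = {"read", "write", "modifyacl"}


def parse_permissions(headers):
    """Collect X-Object-Meta-Permissions-<name>-<right> headers per name."""
    permissions = {}
    for name, value in headers.items():
        if not name.lower().startswith(PREFIX):
            continue
        # Split on the last hyphen so that names containing hyphens survive.
        principal, _, right = name[len(PREFIX):].rpartition("-")
        if principal and right.lower() in RIGHTS:
            granted = value.strip().lower() == "true"
            permissions.setdefault(principal, {})[right.lower()] = granted
    return permissions


# Illustrative headers, not taken from a real Pithos response.
example = {
    "X-Object-Meta-Permissions-alice-Read": "true",
    "X-Object-Meta-Permissions-alice-Write": "false",
    "X-Object-Meta-Permissions-Work-ModifyACL": "false",
    "X-Object-Meta-Size": "14",
}
print(parse_permissions(example))
```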

Create Object

PUT operations are used to write or overwrite an object's metadata and content. Objects can be uploaded in segments by using a common prefix and a manifest file as described in the Cloud Files API. Objects of unknown size may be uploaded by specifying an HTTP header of Transfer-Encoding: chunked and not using a Content-Length header, as described in the Cloud Files API. A PUT operation can associate metadata with an object, if appropriate HTTP headers are passed. Upon successful creation, the MD5 hash of the object is returned in the ETag header.

Return Codes

Code Description
201 (Created) Successful write
401 (Unauthorized) Invalid access token
404 (Not Found) The requested object does not exist
411 (Length Required) Missing Content-Length or Content-Type header in the request
422 (Unprocessable Entity) The MD5 checksum of the data written to the storage system does not match the (optionally) supplied ETag value
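
The ETag check described above can be sketched as follows: the client computes the MD5 hash of the data, supplies it in the ETag header, and compares it with the value the server returns. The helper names are hypothetical, the request is built but not sent, and the handling of surrounding quotes in the returned ETag is an assumption.

```python
import hashlib
from urllib.request import Request


def build_put_object_request(object_uri, data, content_type, auth_headers):
    """Build (but do not send) a PUT request carrying the data's MD5 as ETag."""
    etag = hashlib.md5(data).hexdigest()
    headers = dict(auth_headers)
    headers["Content-Type"] = content_type
    headers["Content-Length"] = str(len(data))
    headers["ETag"] = etag
    request = Request(object_uri, data=data, method="PUT", headers=headers)
    return request, etag


def etag_matches(response_etag, expected):
    """Compare a returned ETag with the locally computed MD5 hash."""
    # Assumption: the server may or may not wrap the value in double quotes.
    return response_etag.strip('"').lower() == expected.lower()


request, etag = build_put_object_request(
    "https://hostname/v2/aaitest/home/hello.txt",   # hypothetical object
    b"hello",
    "text/plain",
    {"Authorization": "placeholder-signed-value"},  # not a real signature
)
print(request.get_method(), etag)
```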

Update Object

If a PUT operation is carried out to an existing object, it is overwritten by the new object. If the existing object has versioning enabled, the object's version is increased by one, the new object becomes the current version, and the existing object is available as the previous version. Upon successful update, the MD5 hash of the object is returned in the ETag header.

Return Codes

Code Description
201 (Created) Successful write
401 (Unauthorized) Invalid access token or insufficient permissions
404 (Not Found) The requested object does not exist
411 (Length Required) Missing Content-Length or Content-Type header in the request
422 (Unprocessable Entity) The MD5 checksum of the data written to the storage system does not match the (optionally) supplied ETag value

Update Object Metadata

POST operations against an object are used to set and overwrite arbitrary key/value metadata. Key names must carry the X-Object-Meta- prefix. A POST request will delete all existing metadata added by a previous PUT or POST. The Pithos metadata are described in Retrieve Object Metadata.

Return Codes

Code Description
202 (Accepted) Success
401 (Unauthorized) Invalid access token
404 (Not Found) The requested object does not exist

Retrieve Object

GET operations against an object are used to retrieve an object's data. For versioned objects, previous versions can be retrieved by using the X-Object-Meta-Version header.

As in the Cloud Files API, conditional GET requests can be carried out by using the following headers:

  • If-Match
  • If-None-Match
  • If-Modified-Since
  • If-Unmodified-Since

Return Codes

Code Description
200 (OK) Success
401 (Unauthorized) Invalid access token
404 (Not Found) No such object exists

Retrieve Object Differentially

To retrieve an object differentially, that is, to retrieve only the changes that have happened to an object, the user has to send to the server a signature file that describes the contents of the file. The signature file is calculated by the rsync algorithm. Hence, using rdiff:

$ rdiff signature foo sigfile

The signature file is sent to the server via a PUT operation. The PUT operation carries the HTTP header Content-Type with application/signature as its value and the signature file in the body of the message. The server receives the signature file and uses it to calculate a delta file containing the changes between the client object and the server object. Using rdiff this is:

$ rdiff delta sigfile foo > deltafile

The server sends back the name of the delta file in the response to the PUT operation. To patch the file, the client issues a GET operation on the delta file and patches the file. Using rdiff this is:

$ rdiff patch foo deltafile foo.new && mv foo.new foo

The signature file and the delta file are addressable and can be removed after appropriate requests by the client.

Return Codes

As differential retrieval is actually a sequence of operations, return codes are returned as defined for each individual operation.

Large Object Retrieval

Large objects can be retrieved in parts by using the Range header, which is fully supported by Pithos. For differential retrieval of large objects, the client and the server may go through the following:

  • The signature file can be sent in parts. Each part is sent in a separate PUT operation. The parts must share a common prefix and their names must sort in the correct order, were they to be concatenated. Then a manifest file is uploaded. The manifest file is simply a zero-byte file with the extra X-Object-Manifest:<container>/<prefix> header, where <container> is the container containing the object segments and <prefix> is the common prefix for all the segments. Note that after the manifest has been uploaded, the individual segments and the manifest file are addressable and can be removed by the client with the appropriate requests. The signature file is then reconstituted as <container>/<prefix>, and it is addressable.
  • When the signature is reconstituted, the server produces the delta file. The delta file is addressable, and its filename is returned in the body of the PUT operation response. The client can download the delta file in parts by using the Range header and then remove the signature and the delta file.
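
Downloading in parts with the Range header amounts to splitting the object size into consecutive, inclusive byte ranges. The following sketch computes the header values for a given part size; the function name and the sizes used are illustrative assumptions.

```python
def byte_ranges(total_size, part_size):
    """Return Range header values that cover total_size bytes in order."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # HTTP byte ranges are inclusive at both ends.
        end = min(start + part_size, total_size) - 1
        ranges.append("bytes={}-{}".format(start, end))
        start = end + 1
    return ranges


for value in byte_ranges(10, 4):
    print("Range:", value)
```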

Return Codes

As large object retrieval is actually a sequence of operations, return codes are returned as defined for each individual operation.

Update Object Differentially

To update an object differentially is to send to the server only the changes that have been made to the object. The process is initiated by a GET operation with the X-Pithos-Signature header in the request. The server then calculates and returns the signature as described in Retrieve Object Differentially, together with the MD5 hash of the complete file in the ETag header. The client calculates the delta file and issues a PUT operation carrying the MD5 hash it received in the ETag header, the X-Pithos-Delta header, and the delta file in the body of the message. If the MD5 hash received matches the MD5 hash of the object the server currently stores, the server updates the object and returns the MD5 hash of the updated object in the ETag header of the response.
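
The MD5 comparison acts as a guard against applying a delta to an object that has changed since its signature was issued. The following is a toy model of that server-side check, not the Pithos implementation: the delta is simply a byte string to append rather than a real rsync delta, and the exception is hypothetical, since how Pithos reports a mismatch is not specified here.

```python
import hashlib


class StaleObjectError(Exception):
    """Hypothetical error: the client's MD5 does not match the stored object."""


def md5_hex(data):
    return hashlib.md5(data).hexdigest()


def apply_differential_update(stored, client_etag, delta, apply_delta):
    """Apply delta only if client_etag still matches the stored object."""
    if md5_hex(stored) != client_etag:
        raise StaleObjectError("object changed since the signature was issued")
    updated = apply_delta(stored, delta)
    # The new hash is what would be returned in the ETag header.
    return updated, md5_hex(updated)


def append_delta(old, delta):
    """Toy stand-in for an rsync patch: the delta is bytes to append."""
    return old + delta


stored = b"hello"
updated, new_etag = apply_differential_update(
    stored, md5_hex(stored), b" world", append_delta
)
print(updated, new_etag)
```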

Return Codes

As differential object update is actually a sequence of operations, return codes are returned as defined for each individual operation.

Large Object Creation and Updating

As described in the Cloud Files API, it is possible to send large files in parts. This can be done in both plain PUT operations and differential PUT operations.

  • In plain PUT operations, the segments are uploaded with the proviso that they must share a common prefix and their names must sort in the correct order, were they to be concatenated. Then a manifest file is uploaded. The manifest file is simply a zero-byte file with the extra X-Object-Manifest:<container>/<prefix> header, where <container> is the container containing the object segments and <prefix> is the common prefix for all the segments. Note that after the manifest has been uploaded, the individual segments and the manifest file are addressable and can be removed by the client issuing the appropriate requests. The concatenated file is available as <container>/<prefix>.
  • In differential PUT operations, parts of the delta file are uploaded as described above. Then the client issues a PUT operation with the MD5 value it has received in the ETag header, the X-Pithos-Delta header in the request, and the X-Object-Manifest header with the delta file parts as the <prefix>. The delta file is then reconstituted on the server and processed as in Update Object Differentially. The delta parts and the manifest file are addressable and can be removed by the client issuing the appropriate requests.
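
One way to satisfy the naming requirement for segments is to append a zero-padded index to the common prefix, so that the names sort in upload order. The sketch below splits data into segments, names them, and builds the manifest headers. The naming scheme <prefix>/<index> and the helper names are assumptions; any scheme whose names sort correctly would do.

```python
def split_into_segments(data, segment_size):
    """Split data into consecutive chunks of at most segment_size bytes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def segment_names(prefix, count):
    """Return count names under prefix that sort in upload order."""
    # Zero-pad the index so that lexicographic order equals numeric order.
    width = len(str(max(count - 1, 0)))
    return [f"{prefix}/{index:0{width}d}" for index in range(count)]


def manifest_headers(container, prefix):
    """Headers for the zero-byte manifest object."""
    return {
        "X-Object-Manifest": f"{container}/{prefix}",
        "Content-Length": "0",
    }


segments = split_into_segments(b"abcdefghij", 4)
names = segment_names("big.bin", len(segments))
print(list(zip(names, segments)))
print(manifest_headers("home", "big.bin"))
```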

Return Codes

As large object creation and updating is actually a sequence of operations, return codes are returned as defined for each individual operation.

Retrieve Container Metadata

HEAD operations against a container are used to determine the number of objects, and the total bytes of all objects stored in the container. The object count and utilization are returned in the X-Container-Object-Count and in the X-Container-Bytes-Used headers respectively.

Return Codes

Code Description
204 (No Content) The container exists
401 (Unauthorized) Invalid access token
404 (Not Found) The requested container does not exist

Retrieve Object Metadata

HEAD operations on an object are used to retrieve object metadata.

Return Codes

Code Description
200 (OK) Success
401 (Unauthorized) Invalid access token
404 (Not Found) The requested object does not exist

Copy Operation

There are two ways to copy an object to another object. One way is to do a PUT to the new object (the target) location, adding the X-Copy-From header to designate the source of the data. The other way is to do a COPY to the existing object and include the Destination header to specify the target of the copy.

If a pseudo-directory object is copied, Pithos will copy all its contents.
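
The two variants can be sketched as request builders that do not send anything. The helper names are hypothetical, and the /<container>/<object> form used for the X-Copy-From and Destination values follows the Cloud Files convention, which is assumed here rather than confirmed for Pithos.

```python
from urllib.request import Request


def copy_via_put(target_uri, source_path, auth_headers):
    """PUT to the target location, naming the source in X-Copy-From."""
    headers = dict(auth_headers)
    headers["X-Copy-From"] = source_path
    headers["Content-Length"] = "0"
    return Request(target_uri, method="PUT", headers=headers)


def copy_via_copy(source_uri, destination_path, auth_headers):
    """COPY on the existing object, naming the target in Destination."""
    headers = dict(auth_headers)
    headers["Destination"] = destination_path
    return Request(source_uri, method="COPY", headers=headers)


auth = {"Authorization": "placeholder-signed-value"}  # not a real signature
by_put = copy_via_put(
    "https://hostname/v2/aaitest/home/copy.txt", "/home/doc.txt", auth
)
by_copy = copy_via_copy(
    "https://hostname/v2/aaitest/home/doc.txt", "/home/copy.txt", auth
)
print(by_put.get_method(), by_put.header_items())
print(by_copy.get_method(), by_copy.header_items())
```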

Return Codes

Code Description
201 (Created) Successful write
401 (Unauthorized) Invalid access token or cannot write to destination
404 (Not Found) The requested object does not exist

Move Operation

There are two ways to move an object to another object. One way is to do a PUT to the new object (the target) location, adding the X-Move-From header to designate the source of the data. The other way is to do a MOVE to the existing object and include the Destination header to specify the target of the move.

If a pseudo-directory object is moved, Pithos will move all its contents.

Return Codes

Code Description
201 (Created) Successful write
401 (Unauthorized) Invalid access token or cannot write to destination
404 (Not Found) The requested object does not exist

Trash Operation

An object can be moved to a trash area by doing a move operation to the special trash container. An object can be restored from the trash by moving it out of the trash container. An object can be permanently deleted from the trash by a DELETE operation on it. When an object is moved to the trash, its URI metadata field is not changed (see below for metadata).

Return Codes

Code Description
204 (No Content) The request succeeds
401 (Unauthorized) Invalid access token or insufficient permissions
404 (Not Found) The object does not exist

Delete Operation

DELETE operations on an object are used to permanently remove that object from the storage system (metadata and data). Deleting an object is processed immediately.

Return Codes

Code Description
204 (No Content) The request succeeds
401 (Unauthorized) Invalid access token or insufficient permissions
404 (Not Found) The object does not exist

Search for Users Operation

Searching for users can be carried out by issuing a GET request to URIs of the following form:

https://hostname/v2/users/<prefix>

where <prefix> is the beginning of a username, for example aaitest.

The system will respond with a list (possibly empty) of users whose username starts with the given prefix. In JSON format:


[
  {"username":"<prefix><rest_1>", "home":"<homedir_1>"},
  {"username":"<prefix><rest_2>", "home":"<homedir_2>"}
]
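
A client might build the search URI and read the JSON response as in the following sketch. The helper names, the percent-encoding of the prefix, and the sample usernames and home values are assumptions made for illustration.

```python
import json
from urllib.parse import quote


def user_search_uri(host, prefix):
    """Build the user search URI for a username prefix."""
    return "https://{}/v2/users/{}".format(host, quote(prefix, safe=""))


def parse_user_search(body):
    """Return (username, home) pairs from a JSON user search response."""
    return [(entry["username"], entry["home"]) for entry in json.loads(body)]


# Hypothetical response body for the prefix "aai".
SAMPLE = (
    '[{"username":"aaitest_1","home":"homedir_1"},'
    ' {"username":"aaitest_2","home":"homedir_2"}]'
)
print(user_search_uri("hostname", "aai"))
print(parse_user_search(SAMPLE))
```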

In XML format:


<?xml version="1.0" encoding="UTF-8"?>

<users>
  <user>
    <username>username_1</username>
    <home>homedir_1</home>    
  </user>
  <user>
    <username>username_2</username>
    <home>homedir_2</home>
  </user>
</users>

Special Containers

In addition to the Cloud Files API, the Pithos service uses special containers:

  • A container called trash contains files that have been marked for deletion, but can still be recovered by the user. The uri metadata field of the file is its original URI.
  • A container called shared contains objects of size zero or one and Content-Type of application/directory or application/file. The names of the objects are the names of files shared by the user to other users of the system. The uri metadata field of the file is the URI of the original file.
  • A container called others contains objects of size zero or one and Content-Type of application/directory or application/file. The names of the objects are the names of files that other users share with the user. The uri metadata field of the file is the URI of the original file.
  • A container called tags contains objects of size zero or one and Content-Type of application/tag. The names of the objects are the names of tags the user has defined. Tags are created and deleted by actions on the tags storage container.
  • A container called groups contains objects of Content-Type application/group. The names of the objects are the names of groups the user has defined. The contents of each object are the users belonging to the group, with one user per line. Group creation, deletion, and manipulation are carried out by actions in the groups container.
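
Since a group object lists its members one per line, reading and writing its body is straightforward. The sketch below assumes that blank lines and surrounding whitespace can be ignored and that the body ends with a newline; the helper names and usernames are illustrative.

```python
def parse_group_members(body):
    """Return the usernames listed in a group object's body, one per line."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def format_group_members(users):
    """Serialize usernames into a group object's body, one per line."""
    return "".join(user + "\n" for user in users)


members = parse_group_members("alice\nbob\n\n carol \n")
print(members)
print(repr(format_group_members(members)))
```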

Home Container

For each user, Pithos automatically creates a container called home, acting as the user's Home Container. The Home Container is used as the default entry point for interfaces to the user's Pithos account. A DELETE request against a user's Home Container will return 409 (Conflict).