Pithos is a storage service implemented by GRNET (http://www.grnet.gr). Data is stored as objects, organized in containers, belonging to an account. This hierarchy of storage layers has been inspired by the OpenStack Object Storage (OOS) API and similar CloudFiles API by Rackspace. The Pithos API follows the OOS API as closely as possible. One of the design requirements has been to be able to use Pithos with clients built for the OOS, without changes.
-However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated, or appended to. Automatic version management, allows taking account and container listings back in time, as well as reading previous instances of objects.
+However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated, or appended to. Pithos will store sharing permissions per object and enforce corresponding authorization policies. Automatic version management allows taking account and container listings back in time, as well as reading previous instances of objects.
The storage backend of Pithos is block oriented, permitting efficient, deduplicated data placement. The block structure of objects is exposed at the API layer, in order to encourage external software to implement advanced data management operations.
 ========================= ================================
 Revision                  Description
 ========================= ================================
-0.4 (June 22, 2011)       Support updating/deleting individual metadata with ``POST``.
+0.4 (June 30, 2011)       Object permissions and account groups.
+\                         Control versioning behavior and container quotas with container policy directives.
+\                         Support updating/deleting individual metadata with ``POST``.
 0.3 (June 14, 2011)       Large object support with ``X-Object-Manifest``.
 \                         Allow for publicly available objects via ``https://hostname/public``.
 \                         Support time-variant account/container listings.
 X-Account-Bytes-Remaining  The total number of bytes remaining (**TBD**)
 X-Account-Last-Login       The last login (**TBD**)
 X-Account-Until-Timestamp  The last account modification date until the timestamp provided
+X-Account-Group-*          Optional user defined groups
 X-Account-Meta-*           Optional user defined metadata
 Last-Modified              The last account modification date (regardless of ``until``)
 ========================== =====================
 bytes                       The total size of the objects inside the container
 last_modified               The last container modification date (regardless of ``until``)
 x_container_until_timestamp The last container modification date until the timestamp provided
+x_container_policy_*        Container behavior and limits
 x_container_meta_*          Optional user defined metadata
 =========================== ============================
 ====================== ============================================
 Request Parameter Name Value
 ====================== ============================================
-update                 Do not replace metadata (no value parameter)
+update                 Do not replace metadata/groups (no value parameter)
 ====================== ============================================
|
 ==================== ===========================
 Request Header Name  Value
 ==================== ===========================
+X-Account-Group-*    Optional user defined groups
 X-Account-Meta-*     Optional user defined metadata
 ==================== ===========================
No reply content/headers.
The operation will overwrite all user defined metadata, except if ``update`` is defined.
+To create a group, include an ``X-Account-Group-*`` header with the group name in the key and a comma-separated list of user names in the value. If no ``X-Account-Group-*`` header is present, no changes will be applied to groups. The ``update`` parameter also applies to groups. To delete a specific group, use ``update`` and an empty header value.
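
A minimal sketch of how a client might assemble these group headers, written for Python 3. The helper name ``build_group_headers`` and the sample group and user names are hypothetical and not part of Pithos; only the header format follows the description above, and sending the request is left out.

```python
def build_group_headers(groups):
    """Map {group_name: [user, ...]} to X-Account-Group-* request headers.

    Hypothetical client-side helper, not part of Pithos. An empty member
    list yields an empty header value, which is the documented way to
    delete a group when the request also carries ``update``.
    """
    headers = {}
    for name, members in groups.items():
        headers['X-Account-Group-' + name] = ','.join(members)
    return headers


# Create (or replace) the group "devs" and delete the group "old".
headers = build_group_headers({'devs': ['alice', 'bob'], 'old': []})
```

Sent in an account ``POST`` together with the ``update`` parameter, these headers would, per the description above, leave any other existing groups untouched.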
================ ===============================
Return Code Description
 X-Container-Block-Hash      The hash algorithm used for block identifiers in object hashmaps
 X-Container-Until-Timestamp The last container modification date until the timestamp provided
 X-Container-Object-Meta     A list with all meta keys used by objects
+X-Container-Policy-*        Container behavior and limits
 X-Container-Meta-*          Optional user defined metadata
 Last-Modified               The last container modification date (regardless of ``until``)
 =========================== ===============================
-The keys returned in ``X-Container-Object-Meta`` are all the unique strings after the ``X-Object-Meta-`` prefix.
+The keys returned in ``X-Container-Object-Meta`` are all the unique strings after the ``X-Object-Meta-`` prefix. See container ``PUT`` for a reference of policy directives.
================ ===============================
Return Code Description
 x_object_version_timestamp The object's version timestamp
 x_object_modified_by       The user that committed the object's version
 x_object_manifest          Object parts prefix in ``<container>/<object>`` form (optional)
-x_object_public            Object is publicly accessible (optional) (**TBD**)
+x_object_sharing           Object permissions (optional)
+x_object_shared_by         Object inheriting permissions (optional)
+x_object_public            Object's publicly accessible URI (optional)
 x_object_meta_*            Optional user defined metadata
 ========================== ======================================
 ==================== ================================
 Request Header Name  Value
 ==================== ================================
+X-Container-Policy-* Container behavior and limits
 X-Container-Meta-*   Optional user defined metadata
 ==================== ================================
No reply content/headers.
+
+If no policy is defined, the container will be created with the default values.
+Available policy directives:
+
+* ``versioning``: Set to ``auto``, ``manual`` or ``none`` (default is ``manual``)
+* ``quota``: Size limit in KB (default is ``0`` - unlimited)
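
As an illustration of the directives listed above, the following Python 3 sketch validates a set of directives on the client side and turns them into request headers. The function name ``policy_headers`` and its error messages are assumptions made for this example; the accepted names and values are the ones listed above.

```python
VERSIONING_VALUES = ('auto', 'manual', 'none')


def policy_headers(**directives):
    """Validate policy directives and return X-Container-Policy-* headers.

    Hypothetical client-side helper, not part of Pithos. Raises ValueError
    for unknown directives or invalid values.
    """
    headers = {}
    for key, value in directives.items():
        value = str(value)
        if key == 'versioning':
            if value not in VERSIONING_VALUES:
                raise ValueError('Invalid versioning value: %r' % value)
        elif key == 'quota':
            # int() itself raises ValueError for non-numeric input.
            if int(value) < 0:
                raise ValueError('Quota must not be negative')
        else:
            raise ValueError('Unknown policy directive: %r' % key)
        headers['X-Container-Policy-' + key.capitalize()] = value
    return headers


headers = policy_headers(versioning='auto', quota=1024)
```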
================ ===============================
Return Code Description
 ====================== ============================================
 Request Parameter Name Value
 ====================== ============================================
-update                 Do not replace metadata (no value parameter)
+update                 Do not replace metadata/policy (no value parameter)
 ====================== ============================================
|
 ==================== ================================
 Request Header Name  Value
 ==================== ================================
+X-Container-Policy-* Container behavior and limits
 X-Container-Meta-*   Optional user defined metadata
 ==================== ================================
No reply content/headers.
The operation will overwrite all user defined metadata, except if ``update`` is defined.
+To change policy, include an ``X-Container-Policy-*`` header with the name in the key. If no ``X-Container-Policy-*`` header is present, no changes will be applied to policy. The ``update`` parameter also applies to policy - deleted values will revert to defaults. To delete/revert a specific policy directive, use ``update`` and an empty header value. See container ``PUT`` for a reference of policy directives.
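
The update rules above can be modeled with a small Python 3 sketch. It is only an illustration of the described semantics, not the server code: the function name ``apply_policy`` is hypothetical, the default values are assumed to be the ones listed under container ``PUT``, and validation of directive names and values is left out.

```python
# Assumed defaults, taken from the directive list under container PUT.
DEFAULTS = {'quota': '0', 'versioning': 'manual'}


def apply_policy(current, changes, update):
    """Return the policy that results from a container POST.

    `current` is the stored policy, `changes` holds the values of the
    X-Container-Policy-* request headers (prefix stripped, keys lowercased)
    and `update` tells whether the ``update`` parameter was given.
    """
    if not changes:
        return dict(current)  # No policy headers: policy stays as it is.
    # Without ``update`` every directive not supplied reverts to its default.
    result = dict(current) if update else dict(DEFAULTS)
    for key, value in changes.items():
        # An empty value reverts the directive to its default.
        result[key] = value if value != '' else DEFAULTS[key]
    return result


current = {'quota': '2048', 'versioning': 'auto'}
updated = apply_policy(current, {'versioning': 'none'}, update=True)
replaced = apply_policy(current, {'versioning': 'none'}, update=False)
reverted = apply_policy(current, {'quota': ''}, update=True)
```

Here ``updated`` keeps the stored quota, ``replaced`` falls back to the default quota, and ``reverted`` only resets the quota.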
================ ===============================
Return Code Description
 X-Object-Version-Timestamp The object's version timestamp
 X-Object-Modified-By       The user that committed the object's version
 X-Object-Manifest          Object parts prefix in ``<container>/<object>`` form (optional)
-X-Object-Public            Object is publicly accessible (optional) (**TBD**)
+X-Object-Sharing           Object permissions (optional)
+X-Object-Shared-By         Object inheriting permissions (optional)
+X-Object-Public            Object's publicly accessible URI (optional)
 X-Object-Meta-*            Optional user defined metadata
 ========================== ===============================
 X-Object-Version-Timestamp The object's version timestamp
 X-Object-Modified-By       The user that committed the object's version
 X-Object-Manifest          Object parts prefix in ``<container>/<object>`` form (optional)
-X-Object-Public            Object is publicly accessible (optional) (**TBD**)
+X-Object-Sharing           Object permissions (optional)
+X-Object-Shared-By         Object inheriting permissions (optional)
+X-Object-Public            Object's publicly accessible URI (optional)
 X-Object-Meta-*            Optional user defined metadata
 ========================== ===============================
 Content-Encoding     The encoding of the object (optional)
 Content-Disposition  The presentation style of the object (optional)
 X-Object-Manifest    Object parts prefix in ``<container>/<object>`` form (optional)
-X-Object-Public      Object is publicly accessible (optional) (**TBD**)
+X-Object-Sharing     Object permissions (optional)
+X-Object-Public      Object is publicly accessible (optional)
 X-Object-Meta-*      Optional user defined metadata
 ==================== ================================
ETag The MD5 hash of the object (on create)
========================== ===============================
-|
+The ``X-Object-Sharing`` header may include either a ``read=...`` comma-separated user/group list, or a ``write=...`` comma-separated user/group list, or both separated by a semicolon (``;``). To publish the object, set ``X-Object-Public`` to ``true``. To unpublish, set to ``false``, or use an empty header value.
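
To make the header format concrete, here is a minimal Python 3 sketch of a parser for ``X-Object-Sharing`` values. It is not the Pithos implementation: the function name ``parse_sharing``, the returned dictionary layout and the choice of ``ValueError`` are assumptions for this example.

```python
def parse_sharing(value):
    """Parse an X-Object-Sharing value into {'read': [...], 'write': [...]}.

    Hypothetical sketch. Only the kinds present in the header appear in the
    result; malformed input raises ValueError.
    """
    permissions = {}
    # Whitespace is ignored; the two lists are separated by a semicolon.
    for part in value.replace(' ', '').split(';'):
        if not part:
            continue
        kind, sep, names = part.partition('=')
        if sep != '=' or kind not in ('read', 'write') or kind in permissions:
            raise ValueError('Bad X-Object-Sharing header value')
        # Each list holds comma-separated user or group names.
        permissions[kind] = [name for name in names.split(',') if name]
    return permissions


parsed = parse_sharing('read=alice,devs; write=bob')
```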
=========================== ==============================
Return Code Description
 Content-Disposition  The presentation style of the object (optional)
 X-Source-Version     The source version to copy from
 X-Object-Manifest    Object parts prefix in ``<container>/<object>`` form (optional)
-X-Object-Public      Object is publicly accessible (optional) (**TBD**)
+X-Object-Sharing     Object permissions (optional)
+X-Object-Public      Object is publicly accessible (optional)
 X-Object-Meta-*      Optional user defined metadata
 ==================== ================================
-Refer to ``POST`` for a description of request headers. Metadata is also copied, updated with any values defined.
+Refer to ``PUT``/``POST`` for a description of request headers. Metadata is also copied, updated with any values defined. Sharing/publishing options are not copied.
No reply content/headers.
 Content-Encoding     The encoding of the object (optional)
 Content-Disposition  The presentation style of the object (optional)
 X-Object-Manifest    Object parts prefix in ``<container>/<object>`` form (optional)
-X-Object-Public      Object is publicly accessible (optional) (**TBD**)
+X-Object-Sharing     Object permissions (optional)
+X-Object-Public      Object is publicly accessible (optional)
 X-Object-Meta-*      Optional user defined metadata
 ==================== ================================
-The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``, ``X-Object-Public`` (**TBD**) and ``X-Object-Meta-*`` headers are considered to be user defined metadata. An operation without the ``update`` parameter will overwrite all previous values and remove any keys not supplied. When using ``update`` any metadata with an empty value will be deleted.
+The ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest`` and ``X-Object-Meta-*`` headers are considered to be user defined metadata. An operation without the ``update`` parameter will overwrite all previous values and remove any keys not supplied. When using ``update`` any metadata with an empty value will be deleted.
+
+To change permissions, include an ``X-Object-Sharing`` header (as defined in ``PUT``). To publish, include an ``X-Object-Public`` header, with a value of ``true``. If no such headers are defined, no changes will be applied to sharing/public. Use empty values to remove permissions/unpublish (unpublishing also works with ``false`` as a header value). Sharing options are applied to the object - not its versions.
To update an object's data:
204 (No Content) The request succeeded
=========================== ==============================
-Public Objects
-^^^^^^^^^^^^^^
+Sharing and Public Objects
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Read and write control in Pithos is managed by setting appropriate permissions with the ``X-Object-Sharing`` header. The permissions are applied using prefix-based inheritance: each set of authorization directives applies to all objects whose names start with the name of the object where the corresponding ``X-Object-Sharing`` header is defined. For simplicity, nested/overlapping permissions are not allowed. Setting ``X-Object-Sharing`` will fail if the object is already "covered", or if another object with a longer name sharing the same prefix already has permissions. When retrieving an object, the ``X-Object-Shared-By`` header reports where it gets its permissions from. If not present, the object is the actual source of authorization directives.
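
The prefix-based rules above can be sketched in a few lines of Python 3. This is an illustrative model only: the function names and the in-memory ``shared`` mapping (object name to permissions) are assumptions, and the real backend is not shown here.

```python
def find_permission_source(name, shared):
    """Return the object name whose permissions cover `name`, or None.

    `shared` maps names of objects carrying X-Object-Sharing to their
    permissions. Because nested/overlapping permissions are not allowed,
    at most one key can be a prefix of `name`.
    """
    for source in shared:
        if name.startswith(source):
            return source
    return None


def can_set_sharing(name, shared):
    """Tell whether permissions may be set on `name` without overlapping."""
    for source in shared:
        if source == name:
            continue  # Changing the object's own permissions is fine.
        # Fail if `name` is already covered, or if it would cover `source`.
        if name.startswith(source) or source.startswith(name):
            return False
    return True


shared = {'docs/': {'read': ['devs']}}
source = find_permission_source('docs/report.txt', shared)
```

In this model ``source`` plays the role of the value reported in ``X-Object-Shared-By``.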
-Objects that are marked as public, via the ``X-Object-Public`` meta (**TBD**), are also available at the corresponding URI ``https://hostname/public/<account>/<container>/<object>`` for ``HEAD`` or ``GET``. Requests for public objects do not need to include an ``X-Auth-Token``. Pithos will ignore request parameters and only include the following headers in the reply (all ``X-Object-*`` meta is hidden).
+Objects that are marked as public, via the ``X-Object-Public`` meta, are also available at the corresponding URI returned for ``HEAD`` or ``GET``. Requests for public objects do not need to include an ``X-Auth-Token``. Pithos will ignore request parameters and only include the following headers in the reply (all ``X-Object-*`` meta is hidden).
========================== ===============================
Reply Header Name Value
List of differences from the OOS API:
* Support for ``X-Account-Meta-*`` style headers at the account level. Use ``POST`` to update.
-* Support for ``X-Container-Meta-*`` style headers at the account level. Can be set when creating via ``PUT``. Use ``POST`` to update.
+* Support for ``X-Container-Meta-*`` style headers at the container level. Can be set when creating via ``PUT``. Use ``POST`` to update.
* Header ``X-Container-Object-Meta`` at the container level and parameter ``meta`` in container listings.
+* Container policies to manage behavior and limits.
* Headers ``X-Container-Block-*`` at the container level, exposing the underlying storage characteristics.
* All metadata replies, at all levels, include latest modification information.
* At all levels, a ``GET`` request may use ``If-Modified-Since`` and ``If-Unmodified-Since`` headers.
* Container/object lists include all associated metadata if the reply is of type json/xml. Some names are kept to their OOS API equivalents for compatibility.
-* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``, ``X-Object-Public`` (**TBD**). These are all replaced with every update operation, except if using the ``update`` parameter (in which case individual keys can also be deleted). Deleting meta by providing empty values also works when copying/moving an object.
+* Object metadata allowed, in addition to ``X-Object-Meta-*``: ``Content-Encoding``, ``Content-Disposition``, ``X-Object-Manifest``. These are all replaced with every update operation, except if using the ``update`` parameter (in which case individual keys can also be deleted). Deleting meta by providing empty values also works when copying/moving an object.
* Multi-range object GET support as outlined in RFC2616.
* Object hashmap retrieval through GET and the ``format`` parameter.
* Partial object updates through POST, using the ``Content-Length``, ``Content-Type``, ``Content-Range`` and ``Transfer-Encoding`` headers.
* Object ``MOVE`` support.
* Time-variant account/container listings via the ``until`` parameter.
* Object versions - parameter ``version`` in HEAD/GET (list versions with GET), ``X-Object-Version-*`` meta in replies, ``X-Source-Version`` in PUT/COPY.
-* Publicly accessible objects via ``https://hostname/public``. Control with ``X-Object-Public`` (**TBD**).
+* Sharing/publishing with ``X-Object-Sharing``, ``X-Object-Public`` at the object level. Permissions may include groups defined with ``X-Account-Group-*`` at the account level. These apply to the object - not its versions.
+* Support for prefix-based inheritance when enforcing permissions. The parent object carrying the authorization directives is reported in ``X-Object-Shared-By``.
* Large object support with ``X-Object-Manifest``.
* Trace the user that created/modified an object with ``X-Object-Modified-By``.
     put_account_headers, get_container_headers, put_container_headers, get_object_headers, put_object_headers,
     update_manifest_meta, update_sharing_meta, validate_modification_preconditions,
     validate_matching_preconditions, split_container_object_string, copy_or_move_object,
-    get_int_parameter, get_content_length, get_content_range, get_sharing, raw_input_socket,
+    get_int_parameter, get_content_length, get_content_range, raw_input_socket,
     socket_read_iterator, object_data_response, put_object_block, hashmap_hash, api_method)
 from pithos.backends import backend
 from pithos.backends.base import NotAllowedError
             if x[1] is not None:
                 try:
                     meta = backend.get_container_meta(request.user, v_account, x[0], until)
+                    policy = backend.get_container_policy(request.user, v_account, x[0])
+                    for k, v in policy.iteritems():
+                        meta['X-Container-Policy-' + k] = v
                     container_meta.append(printable_header_dict(meta))
                 except NotAllowedError:
                     raise Unauthorized('Access denied')
     try:
         meta = backend.get_container_meta(request.user, v_account, v_container, until)
         meta['object_meta'] = backend.list_object_meta(request.user, v_account, v_container, until)
+        policy = backend.get_container_policy(request.user, v_account, v_container)
     except NotAllowedError:
         raise Unauthorized('Access denied')
     except NameError:
         raise ItemNotFound('Container does not exist')
     response = HttpResponse(status=204)
-    put_container_headers(response, meta)
+    put_container_headers(response, meta, policy)
     return response
 @api_method('PUT')
     # unauthorized (401),
     # badRequest (400)
-    meta = get_container_headers(request)
+    meta, policy = get_container_headers(request)
     try:
-        backend.put_container(request.user, v_account, v_container)
+        backend.put_container(request.user, v_account, v_container, policy)
         ret = 201
     except NotAllowedError:
         raise Unauthorized('Access denied')
     # unauthorized (401),
     # badRequest (400)
-    meta = get_container_headers(request)
+    meta, policy = get_container_headers(request)
     replace = True
     if 'update' in request.GET:
         replace = False
+    if policy:
+        try:
+            backend.update_container_policy(request.user, v_account, v_container, policy, replace)
+        except NotAllowedError:
+            raise Unauthorized('Access denied')
+        except NameError:
+            raise ItemNotFound('Container does not exist')
+        except ValueError:
+            raise BadRequest('Invalid policy header')
     try:
         backend.update_container_meta(request.user, v_account, v_container, meta, replace)
     except NotAllowedError:
     try:
         meta = backend.get_container_meta(request.user, v_account, v_container, until)
         meta['object_meta'] = backend.list_object_meta(request.user, v_account, v_container, until)
+        policy = backend.get_container_policy(request.user, v_account, v_container)
     except NotAllowedError:
         raise Unauthorized('Access denied')
     except NameError:
     validate_modification_preconditions(request, meta)
     response = HttpResponse()
-    put_container_headers(response, meta)
+    put_container_headers(response, meta, policy)
     path = request.GET.get('path')
     prefix = request.GET.get('prefix')
copy_or_move_object(request, v_account, src_container, src_name, v_container, v_object, move=False)
return HttpResponse(status=201)
- meta = get_object_headers(request)
- permissions = get_sharing(request)
+ meta, permissions, public = get_object_headers(request)
     content_length = -1
     if request.META.get('HTTP_TRANSFER_ENCODING') != 'chunked':
         content_length = get_content_length(request)
     # unauthorized (401),
     # badRequest (400)
-    meta = get_object_headers(request)
-    permissions = get_sharing(request)
+    meta, permissions, public = get_object_headers(request)
     content_type = meta.get('Content-Type')
     if content_type:
         del(meta['Content-Type']) # Do not allow changing the Content-Type.
"""Get all prefix-* request headers in a dict. Reformat keys with format_header_key()."""
     prefix = 'HTTP_' + prefix.upper().replace('-', '_')
-    return dict([(format_header_key(k[5:]), v.replace('_', '')) for k, v in request.META.iteritems() if k.startswith(prefix) and len(k) > len(prefix)])
+    # TODO: Document or remove '~' replacing.
+    return dict([(format_header_key(k[5:]), v.replace('~', '')) for k, v in request.META.iteritems() if k.startswith(prefix) and len(k) > len(prefix)])
 def get_account_headers(request):
     meta = get_header_prefix(request, 'X-Account-Meta-')
 def get_container_headers(request):
     meta = get_header_prefix(request, 'X-Container-Meta-')
-    return meta
+    policy = dict([(k[19:].lower(), v.replace(' ', '')) for k, v in get_header_prefix(request, 'X-Container-Policy-').iteritems()])
+    return meta, policy
-def put_container_headers(response, meta):
+def put_container_headers(response, meta, policy):
     response['X-Container-Object-Count'] = meta['count']
     response['X-Container-Bytes-Used'] = meta['bytes']
     response['Last-Modified'] = http_date(int(meta['modified']))
     response['X-Container-Block-Hash'] = backend.hash_algorithm
     if 'until_timestamp' in meta:
         response['X-Container-Until-Timestamp'] = http_date(int(meta['until_timestamp']))
+    for k, v in policy.iteritems():
+        response[format_header_key('X-Container-Policy-' + k).encode('utf-8')] = v.encode('utf-8')
 def get_object_headers(request):
     meta = get_header_prefix(request, 'X-Object-Meta-')
         meta['Content-Disposition'] = request.META['HTTP_CONTENT_DISPOSITION']
     if request.META.get('HTTP_X_OBJECT_MANIFEST'):
         meta['X-Object-Manifest'] = request.META['HTTP_X_OBJECT_MANIFEST']
-    return meta
+    return meta, get_sharing(request), get_public(request)
-def put_object_headers(response, meta, public=False):
+def put_object_headers(response, meta, restricted=False):
     response['ETag'] = meta['hash']
     response['Content-Length'] = meta['bytes']
     response['Content-Type'] = meta.get('Content-Type', 'application/octet-stream')
     response['Last-Modified'] = http_date(int(meta['modified']))
-    if not public:
+    if not restricted:
         response['X-Object-Modified-By'] = meta['modified_by']
         response['X-Object-Version'] = meta['version']
         response['X-Object-Version-Timestamp'] = http_date(int(meta['version_timestamp']))
 def copy_or_move_object(request, v_account, src_container, src_name, dest_container, dest_name, move=False):
     """Copy or move an object."""
-    meta = get_object_headers(request)
-    permissions = get_sharing(request)
+    meta, permissions, public = get_object_headers(request)
     src_version = request.META.get('HTTP_X_SOURCE_VERSION')
     try:
         if move:
raise BadRequest('Bad X-Object-Sharing header value')
return ret
+def get_public(request):
+ """Parse an X-Object-Public header from the request.
+
+ Raises BadRequest on error.
+ """
+
+ public = request.META.get('HTTP_X_OBJECT_PUBLIC')
+ if public is None:
+ return None
+
+ public = public.replace(' ', '').lower()
+ if public == 'true':
+ return True
+ elif public == 'false' or public == '':
+ return False
+ raise BadRequest('Bad X-Object-Public header value')
+
 def raw_input_socket(request):
     """Return the socket for reading the rest of the request."""
         self.hash_algorithm = 'sha1'
         self.block_size = 128 * 1024 # 128KB
+        self.default_policy = {'quota': 0, 'versioning': 'auto'}
+
         basepath = os.path.split(db)[0]
         if basepath and not os.path.exists(basepath):
             os.makedirs(basepath)
"""Return a dictionary with the container policy."""
         logger.debug("get_container_policy: %s %s", account, container)
-        return {}
+        if user != account:
+            raise NotAllowedError
+        path = self._get_containerinfo(account, container)[0]
+        return self._get_policy(path)
     def update_container_policy(self, user, account, container, policy, replace=False):
         """Update the policy associated with the container."""
         logger.debug("update_container_policy: %s %s %s %s", account, container, policy, replace)
-        return
+        if user != account:
+            raise NotAllowedError
+        path = self._get_containerinfo(account, container)[0]
+        self._check_policy(policy)
+        if replace:
+            for k, v in self.default_policy.iteritems():
+                if k not in policy:
+                    policy[k] = v
+        for k, v in policy.iteritems():
+            sql = 'insert or replace into policy (name, key, value) values (?, ?, ?)'
+            self.con.execute(sql, (path, k, v))
+        self.con.commit()
     def put_container(self, user, account, container, policy=None):
         """Create a new container with the given name."""
         try:
             path, version_id, mtime = self._get_containerinfo(account, container)
         except NameError:
-            path = os.path.join(account, container)
-            version_id = self._put_version(path, user)
+            pass
         else:
             raise NameError('Container already exists')
+        if policy is None:
+            policy = {}  # Guard the default: the loops below need a dict.
+        self._check_policy(policy)
+        path = os.path.join(account, container)
+        version_id = self._put_version(path, user)
+        for k, v in self.default_policy.iteritems():
+            if k not in policy:
+                policy[k] = v
+        for k, v in policy.iteritems():
+            sql = 'insert or replace into policy (name, key, value) values (?, ?, ?)'
+            self.con.execute(sql, (path, k, v))
+        self.con.commit()
     def delete_container(self, user, account, container):
         """Delete the container with the given name."""
         c = self.con.execute(sql, (account,))
         return dict([(x[0], x[1].split(',')) for x in c.fetchall()])
+ def _check_policy(self, policy):
+ for k in policy.keys():
+ if policy[k] == '':
+ policy[k] = self.default_policy.get(k)
+ for k, v in policy.iteritems():
+ if k == 'quota':
+ q = int(v) # May raise ValueError.
+ if q < 0:
+ raise ValueError
+ elif k == 'versioning':
+ if v not in ['auto', 'manual', 'none']:
+ raise ValueError
+ else:
+ raise ValueError
+
+ def _get_policy(self, path):
+ sql = 'select key, value from policy where name = ?'
+ c = self.con.execute(sql, (path,))
+ return dict(c.fetchall())
+
     def _is_allowed(self, user, account, container, name, op='read'):
         if user == account:
             return True