Revision dad708b4
b/docs/dev-guide.rst | ||
---|---|---|
44 | 44 |
Storage API (Pithos+) |
45 | 45 |
===================== |
46 | 46 |
|
47 |
This is the Pithos+ File Storage API:
|
|
47 |
This is the Pithos+ Object Storage API:
|
|
48 | 48 |
|
49 | 49 |
.. toctree:: |
50 | 50 |
:maxdepth: 2 |
51 | 51 |
|
52 |
File Storage API <pithos-api-guide>
|
|
52 |
Object Storage API <pithos-api-guide>
|
|
53 | 53 |
|
54 | 54 |
Implementing new clients |
55 | 55 |
======================== |
... | ... | |
218 | 218 |
* Updating a state (either local or remote) implies downloading, uploading or |
219 | 219 |
deleting the appropriate file. |
220 | 220 |
|
221 |
Recommended Practices and Examples |
|
222 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
223 |
|
|
224 |
Assuming an authentication token is obtained, the following high-level |
|
225 |
operations are available - shown with ``curl``: |
|
226 |
|
|
227 |
* Get account information :: |
|
228 |
|
|
229 |
curl -X HEAD -D - \ |
|
230 |
-H "X-Auth-Token: 0000" \ |
|
231 |
https://pithos.dev.grnet.gr/v1/user |
|
232 |
|
|
233 |
* List available containers :: |
|
234 |
|
|
235 |
curl -X GET -D - \ |
|
236 |
-H "X-Auth-Token: 0000" \ |
|
237 |
https://pithos.dev.grnet.gr/v1/user |
|
238 |
|
|
239 |
* Get container information :: |
|
240 |
|
|
241 |
curl -X HEAD -D - \ |
|
242 |
-H "X-Auth-Token: 0000" \ |
|
243 |
https://pithos.dev.grnet.gr/v1/user/pithos |
|
244 |
|
|
245 |
* Add a new container :: |
|
246 |
|
|
247 |
curl -X PUT -D - \ |
|
248 |
-H "X-Auth-Token: 0000" \ |
|
249 |
https://pithos.dev.grnet.gr/v1/user/test |
|
250 |
|
|
251 |
* Delete a container :: |
|
252 |
|
|
253 |
curl -X DELETE -D - \ |
|
254 |
-H "X-Auth-Token: 0000" \ |
|
255 |
https://pithos.dev.grnet.gr/v1/user/test |
|
256 |
|
|
257 |
* List objects in a container :: |
|
258 |
|
|
259 |
curl -X GET -D - \ |
|
260 |
-H "X-Auth-Token: 0000" \ |
|
261 |
https://pithos.dev.grnet.gr/v1/user/pithos |
|
262 |
|
|
263 |
* List objects in a container (extended reply) :: |
|
264 |
|
|
265 |
curl -X GET -D - \ |
|
266 |
-H "X-Auth-Token: 0000" \ |
|
267 |
https://pithos.dev.grnet.gr/v1/user/pithos?format=json |
|
268 |
|
|
269 |
It is recommended that extended replies are cached and subsequent requests |
|
270 |
utilize the ``If-Modified-Since`` header. |
|
271 |
|
|
272 |
* List metadata keys used by objects in a container |
|
273 |
|
|
274 |
Will be in the ``X-Container-Object-Meta`` reply header, included in |
|
275 |
container information or object list (``HEAD`` or ``GET``). (**TBD**) |
|
276 |
|
|
277 |
* List objects in a container having a specific meta defined :: |
|
278 |
|
|
279 |
curl -X GET -D - \ |
|
280 |
-H "X-Auth-Token: 0000" \ |
|
281 |
https://pithos.dev.grnet.gr/v1/user/pithos?meta=favorites |
|
282 |
|
|
283 |
* Retrieve an object :: |
|
284 |
|
|
285 |
curl -X GET -D - \ |
|
286 |
-H "X-Auth-Token: 0000" \ |
|
287 |
https://pithos.dev.grnet.gr/v1/user/pithos/README.txt |
|
288 |
|
|
289 |
* Retrieve an object (specific ranges of data) :: |
|
290 |
|
|
291 |
curl -X GET -D - \ |
|
292 |
-H "X-Auth-Token: 0000" \ |
|
293 |
-H "Range: bytes=0-9" \ |
|
294 |
https://pithos.dev.grnet.gr/v1/user/pithos/README.txt |
|
295 |
|
|
296 |
This will return the first 10 bytes. To get the first 10, bytes 30-39 and the |
|
297 |
last 100 use ``Range: bytes=0-9,30-39,-100``. |
|
298 |
|
|
299 |
* Add a new object (folder type) (**TBD**) :: |
|
300 |
|
|
301 |
curl -X PUT -D - \ |
|
302 |
-H "X-Auth-Token: 0000" \ |
|
303 |
-H "Content-Type: application/directory" \ |
|
304 |
https://pithos.dev.grnet.gr/v1/user/pithos/folder |
|
305 |
|
|
306 |
* Add a new object :: |
|
307 |
|
|
308 |
curl -X PUT -D - \ |
|
309 |
-H "X-Auth-Token: 0000" \ |
|
310 |
-H "Content-Type: text/plain" \ |
|
311 |
-T EXAMPLE.txt |
|
312 |
https://pithos.dev.grnet.gr/v1/user/pithos/folder/EXAMPLE.txt |
|
313 |
|
|
314 |
* Update an object :: |
|
315 |
|
|
316 |
curl -X POST -D - \ |
|
317 |
-H "X-Auth-Token: 0000" \ |
|
318 |
-H "Content-Length: 10" \ |
|
319 |
-H "Content-Type: application/octet-stream" \ |
|
320 |
-H "Content-Range: bytes 10-19/*" \ |
|
321 |
-d "0123456789" \ |
|
322 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
323 |
|
|
324 |
This will update bytes 10-19 with the data specified. |
|
325 |
|
|
326 |
* Update an object (append) :: |
|
327 |
|
|
328 |
curl -X POST -D - \ |
|
329 |
-H "X-Auth-Token: 0000" \ |
|
330 |
-H "Content-Length: 10" \ |
|
331 |
-H "Content-Type: application/octet-stream" \ |
|
332 |
-H "Content-Range: bytes */*" \ |
|
333 |
-d "0123456789" \ |
|
334 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
335 |
|
|
336 |
* Update an object (truncate) :: |
|
337 |
|
|
338 |
curl -X POST -D - \ |
|
339 |
-H "X-Auth-Token: 0000" \ |
|
340 |
-H "X-Source-Object: /folder/EXAMPLE.txt" \ |
|
341 |
-H "Content-Range: bytes 0-0/*" \ |
|
342 |
-H "X-Object-Bytes: 0" \ |
|
343 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
344 |
|
|
345 |
This will truncate the object to 0 bytes. |
|
346 |
|
|
347 |
* Add object metadata :: |
|
348 |
|
|
349 |
curl -X POST -D - \ |
|
350 |
-H "X-Auth-Token: 0000" \ |
|
351 |
-H "X-Object-Meta-First: first_meta_value" \ |
|
352 |
-H "X-Object-Meta-Second: second_meta_value" \ |
|
353 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
354 |
|
|
355 |
* Delete object metadata :: |
|
356 |
|
|
357 |
curl -X POST -D - \ |
|
358 |
-H "X-Auth-Token: 0000" \ |
|
359 |
-H "X-Object-Meta-First: first_meta_value" \ |
|
360 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
361 |
|
|
362 |
Metadata can only be "set". To delete ``X-Object-Meta-Second``, reset all |
|
363 |
metadata. |
|
364 |
|
|
365 |
* Delete an object :: |
|
366 |
|
|
367 |
curl -X DELETE -D - \ |
|
368 |
-H "X-Auth-Token: 0000" \ |
|
369 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
370 |
|
b/docs/index.rst | ||
---|---|---|
14 | 14 |
:maxdepth: 1 |
15 | 15 |
|
16 | 16 |
Identity Management (codename: astakos) <astakos> |
17 |
File Storage Service (codename: pithos+) <pithos>
|
|
17 |
Object Storage Service (codename: pithos+) <pithos>
|
|
18 | 18 |
Compute/Network Service (codename: cyclades) <cyclades> |
19 | 19 |
Image Registry (codename: plankton) <plankton> |
20 | 20 |
Billing Service (codename: aquarium) <http://docs.dev.grnet.gr/aquarium/latest/index.html> |
b/docs/pithos-api-guide.rst | ||
---|---|---|
1 | 1 |
Pithos+ API |
2 | 2 |
=========== |
3 | 3 |
|
4 |
This is the Pithos+ API guide. |
|
4 |
Introduction |
|
5 |
------------ |
|
5 | 6 |
|
7 |
Pithos is a storage service implemented by GRNET (http://www.grnet.gr). Data is stored as objects, organized in containers, belonging to an account. This hierarchy of storage layers has been inspired by the OpenStack Object Storage (OOS) API and similar CloudFiles API by Rackspace. The Pithos API follows the OOS API as closely as possible. One of the design requirements has been to be able to use Pithos with clients built for the OOS, without changes. |
|
6 | 8 |
|
7 |
Overview |
|
8 |
-------- |
|
9 |
However, to be able to take full advantage of the Pithos infrastructure, client software should be aware of the extensions that differentiate Pithos from OOS. Pithos objects can be updated or appended to. Pithos will store sharing permissions per object and enforce corresponding authorization policies. Automatic version management allows taking account and container listings back in time, as well as reading previous instances of objects. |
|
9 | 10 |
|
10 |
Pithos+ data is stored as objects, organized in containers, belonging to an |
|
11 |
account. This hierarchy of storage layers has been inspired by the OpenStack |
|
12 |
Object Storage (OOS) API and similar CloudFiles API by Rackspace. The Pithos |
|
13 |
API follows the OOS API as closely as possible. One of the design requirements |
|
14 |
has been to be able to use Pithos with clients built for the OOS, without |
|
15 |
changes. |
|
16 |
|
|
17 |
However, to be able to take full advantage of the Pithos infrastructure, client |
|
18 |
software should be aware of the extensions that differentiate Pithos from OOS. |
|
11 |
The storage backend of Pithos is block oriented, permitting efficient, deduplicated data placement. The block structure of objects is exposed at the API layer, in order to encourage external software to implement advanced data management operations. |
|
19 | 12 |
|
20 | 13 |
This document's goals are: |
21 | 14 |
|
22 |
* Define the Pithos ReST API that allows the storage and retrieval of data and
|
|
23 |
metadata via HTTP calls
|
|
24 |
* Specify metadata semantics and user interface guidelines for a common |
|
25 |
experience across client software implementations
|
|
15 |
* Define the Pithos ReST API that allows the storage and retrieval of data and metadata via HTTP calls
|
|
16 |
* Specify metadata semantics and user interface guidelines for a common experience across client software implementations
|
|
17 |
|
|
18 |
The present document is meant to be read alongside the OOS API documentation. Thus, it is suggested that the reader is familiar with associated technologies, the OOS API as well as the first version of the Pithos API. This document refers to the second version of Pithos. Information on the first version of the storage API can be found at http://code.google.com/p/gss.
|
|
26 | 19 |
|
27 |
The present document is meant to be read alongside the OOS API documentation. |
|
28 |
Thus, it is suggested that the reader is familiar with associated technologies, |
|
29 |
the OOS API as well as the first version of the Pithos API. This document |
|
30 |
refers to the version of Pithos+. Information on the first version of the |
|
31 |
storage API can be found at http://code.google.com/p/gss. |
|
20 |
Anything marked as to be determined (**TBD**) should not be considered by implementors. |
|
32 | 21 |
|
33 |
Whatever marked as to be determined (**TBD**), should not be considered by |
|
34 |
implementors. |
|
22 |
More info about Pithos can be found here: https://code.grnet.gr/projects/pithos |
|
35 | 23 |
|
36 | 24 |
Document Revisions |
37 | 25 |
^^^^^^^^^^^^^^^^^^ |
... | ... | |
93 | 81 |
0.1 (May 17, 2011) Initial release. Based on OpenStack Object Storage Developer Guide API v1 (Apr. 15, 2011). |
94 | 82 |
========================= ================================ |
95 | 83 |
|
96 |
Users and Authentication |
|
97 |
------------------------ |
|
84 |
Pithos Users and Authentication
|
|
85 |
-------------------------------
|
|
98 | 86 |
|
99 |
In Pithos+, each user is uniquely identified by a token. All API requests |
|
100 |
require a token and each token is internally resolved to an account string. The |
|
101 |
API uses the account string to identify the user's own files, thus whether a |
|
102 |
request is local or cross-account. |
|
87 |
In Pithos, each user is uniquely identified by a token. All API requests require a token and each token is internally resolved to an account string. The API uses the account string to identify the user's own files, and thus to determine whether a request is local or cross-account. |
|
103 | 88 |
|
104 |
Pithos+ does not keep a user database. For development and testing purposes, |
|
105 |
user identifiers and their corresponding tokens can be defined in the settings |
|
106 |
file. However, Pithos is designed with an external authentication service in |
|
107 |
mind. This service must handle the details of validating user credentials and |
|
108 |
communicate with Pithos via a middleware software component that, given a |
|
109 |
token, fills in the internal request account variable. |
|
89 |
Pithos does not keep a user database. For development and testing purposes, user identifiers and their corresponding tokens can be defined in the settings file. However, Pithos is designed with an external authentication service in mind. This service must handle the details of validating user credentials and communicate with Pithos via a middleware software component that, given a token, fills in the internal request account variable. |
|
110 | 90 |
|
111 |
Client software using Pithos+, if not already knowing a user's identifier and |
|
112 |
token, should forward to the ``/login`` URI. The Pithos server, depending on |
|
113 |
its configuration will redirect to the appropriate login page. |
|
91 |
Client software using Pithos that does not already know a user's identifier and token should forward to the ``/login`` URI. The Pithos server, depending on its configuration, will redirect to the appropriate login page. |
|
114 | 92 |
|
115 | 93 |
The login URI accepts the following parameters: |
116 | 94 |
|
... | ... | |
126 | 104 |
|
127 | 105 |
A user management service that implements a login URI according to these conventions is Astakos (https://code.grnet.gr/projects/astakos), by GRNET. |
128 | 106 |
|
129 |
API Operations
|
|
107 |
The Pithos API
|
|
130 | 108 |
-------------- |
131 | 109 |
|
132 |
The URI requests supported by the Pithos+ API follow one of the following forms:
|
|
110 |
The URI requests supported by the Pithos API follow one of the following forms: |
|
133 | 111 |
|
134 | 112 |
* Top level: ``https://hostname/v1/`` |
135 | 113 |
* Account level: ``https://hostname/v1/<account>`` |
... | ... | |
1120 | 1098 |
* The ``Last-Modified`` header value always reflects the actual latest change timestamp, regardless of time control parameters and version requests. Time precondition checks with ``If-Modified-Since`` and ``If-Unmodified-Since`` headers are applied to this value. |
1121 | 1099 |
* A copy/move using ``PUT``/``COPY``/``MOVE`` will always update metadata, keeping all old values except the ones redefined in the request headers. |
1122 | 1100 |
* A ``HEAD`` or ``GET`` for an ``X-Object-Manifest`` object, will include modified ``Content-Length`` and ``ETag`` headers, according to the characteristics of the objects under the specified prefix. The ``Etag`` will be the MD5 hash of the corresponding ETags concatenated. In extended container listings there is no metadata processing. |
1101 |
|
|
1102 |
Recommended Practices and Examples |
|
1103 |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
1104 |
|
|
1105 |
Assuming an authentication token is obtained, the following high-level operations are available - shown with ``curl``: |
|
1106 |
|
|
1107 |
* Get account information :: |
|
1108 |
|
|
1109 |
curl -X HEAD -D - \ |
|
1110 |
-H "X-Auth-Token: 0000" \ |
|
1111 |
https://pithos.dev.grnet.gr/v1/user |
|
1112 |
|
|
1113 |
* List available containers :: |
|
1114 |
|
|
1115 |
curl -X GET -D - \ |
|
1116 |
-H "X-Auth-Token: 0000" \ |
|
1117 |
https://pithos.dev.grnet.gr/v1/user |
|
1118 |
|
|
1119 |
* Get container information :: |
|
1120 |
|
|
1121 |
curl -X HEAD -D - \ |
|
1122 |
-H "X-Auth-Token: 0000" \ |
|
1123 |
https://pithos.dev.grnet.gr/v1/user/pithos |
|
1124 |
|
|
1125 |
* Add a new container :: |
|
1126 |
|
|
1127 |
curl -X PUT -D - \ |
|
1128 |
-H "X-Auth-Token: 0000" \ |
|
1129 |
https://pithos.dev.grnet.gr/v1/user/test |
|
1130 |
|
|
1131 |
* Delete a container :: |
|
1132 |
|
|
1133 |
curl -X DELETE -D - \ |
|
1134 |
-H "X-Auth-Token: 0000" \ |
|
1135 |
https://pithos.dev.grnet.gr/v1/user/test |
|
1136 |
|
|
1137 |
* List objects in a container :: |
|
1138 |
|
|
1139 |
curl -X GET -D - \ |
|
1140 |
-H "X-Auth-Token: 0000" \ |
|
1141 |
https://pithos.dev.grnet.gr/v1/user/pithos |
|
1142 |
|
|
1143 |
* List objects in a container (extended reply) :: |
|
1144 |
|
|
1145 |
curl -X GET -D - \ |
|
1146 |
-H "X-Auth-Token: 0000" \ |
|
1147 |
https://pithos.dev.grnet.gr/v1/user/pithos?format=json |
|
1148 |
|
|
1149 |
It is recommended that extended replies are cached and subsequent requests utilize the ``If-Modified-Since`` header. |
|
1150 |
|
|
1151 |
* List metadata keys used by objects in a container |
|
1152 |
|
|
1153 |
The keys will be returned in the ``X-Container-Object-Meta`` reply header, included in container information or object list replies (``HEAD`` or ``GET``). (**TBD**) |
|
1154 |
|
|
1155 |
* List objects in a container having a specific meta defined :: |
|
1156 |
|
|
1157 |
curl -X GET -D - \ |
|
1158 |
-H "X-Auth-Token: 0000" \ |
|
1159 |
https://pithos.dev.grnet.gr/v1/user/pithos?meta=favorites |
|
1160 |
|
|
1161 |
* Retrieve an object :: |
|
1162 |
|
|
1163 |
curl -X GET -D - \ |
|
1164 |
-H "X-Auth-Token: 0000" \ |
|
1165 |
https://pithos.dev.grnet.gr/v1/user/pithos/README.txt |
|
1166 |
|
|
1167 |
* Retrieve an object (specific ranges of data) :: |
|
1168 |
|
|
1169 |
curl -X GET -D - \ |
|
1170 |
-H "X-Auth-Token: 0000" \ |
|
1171 |
-H "Range: bytes=0-9" \ |
|
1172 |
https://pithos.dev.grnet.gr/v1/user/pithos/README.txt |
|
1173 |
|
|
1174 |
This will return the first 10 bytes. To get the first 10 bytes, bytes 30-39, and the last 100 bytes, use ``Range: bytes=0-9,30-39,-100``. |
|
1175 |
|
|
1176 |
* Add a new object (folder type) (**TBD**) :: |
|
1177 |
|
|
1178 |
curl -X PUT -D - \ |
|
1179 |
-H "X-Auth-Token: 0000" \ |
|
1180 |
-H "Content-Type: application/directory" \ |
|
1181 |
https://pithos.dev.grnet.gr/v1/user/pithos/folder |
|
1182 |
|
|
1183 |
* Add a new object :: |
|
1184 |
|
|
1185 |
curl -X PUT -D - \ |
|
1186 |
-H "X-Auth-Token: 0000" \ |
|
1187 |
-H "Content-Type: text/plain" \ |
|
1188 |
-T EXAMPLE.txt \ |
|
1189 |
https://pithos.dev.grnet.gr/v1/user/pithos/folder/EXAMPLE.txt |
|
1190 |
|
|
1191 |
* Update an object :: |
|
1192 |
|
|
1193 |
curl -X POST -D - \ |
|
1194 |
-H "X-Auth-Token: 0000" \ |
|
1195 |
-H "Content-Length: 10" \ |
|
1196 |
-H "Content-Type: application/octet-stream" \ |
|
1197 |
-H "Content-Range: bytes 10-19/*" \ |
|
1198 |
-d "0123456789" \ |
|
1199 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
1200 |
|
|
1201 |
This will update bytes 10-19 with the data specified. |
|
1202 |
|
|
1203 |
* Update an object (append) :: |
|
1204 |
|
|
1205 |
curl -X POST -D - \ |
|
1206 |
-H "X-Auth-Token: 0000" \ |
|
1207 |
-H "Content-Length: 10" \ |
|
1208 |
-H "Content-Type: application/octet-stream" \ |
|
1209 |
-H "Content-Range: bytes */*" \ |
|
1210 |
-d "0123456789" \ |
|
1211 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
1212 |
|
|
1213 |
* Update an object (truncate) :: |
|
1214 |
|
|
1215 |
curl -X POST -D - \ |
|
1216 |
-H "X-Auth-Token: 0000" \ |
|
1217 |
-H "X-Source-Object: /folder/EXAMPLE.txt" \ |
|
1218 |
-H "Content-Range: bytes 0-0/*" \ |
|
1219 |
-H "X-Object-Bytes: 0" \ |
|
1220 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
1221 |
|
|
1222 |
This will truncate the object to 0 bytes. |
|
1223 |
|
|
1224 |
* Add object metadata :: |
|
1225 |
|
|
1226 |
curl -X POST -D - \ |
|
1227 |
-H "X-Auth-Token: 0000" \ |
|
1228 |
-H "X-Object-Meta-First: first_meta_value" \ |
|
1229 |
-H "X-Object-Meta-Second: second_meta_value" \ |
|
1230 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
1231 |
|
|
1232 |
* Delete object metadata :: |
|
1233 |
|
|
1234 |
curl -X POST -D - \ |
|
1235 |
-H "X-Auth-Token: 0000" \ |
|
1236 |
-H "X-Object-Meta-First: first_meta_value" \ |
|
1237 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
|
1238 |
|
|
1239 |
Metadata can only be "set". To delete ``X-Object-Meta-Second``, reset all metadata. |
|
1240 |
|
|
1241 |
* Delete an object :: |
|
1242 |
|
|
1243 |
curl -X DELETE -D - \ |
|
1244 |
-H "X-Auth-Token: 0000" \ |
|
1245 |
https://pithos.dev.grnet.gr/v1/user/folder/EXAMPLE.txt |
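The multi-range request in the examples above (``Range: bytes=0-9,30-39,-100``) can also be assembled programmatically. The following is a minimal sketch in Python; the helper name ``build_range_header`` and the ``(first, last)`` tuple convention are assumptions made for illustration and are not part of the Pithos API.

```python
from typing import Iterable, Optional, Tuple

# (first, last) byte positions, both inclusive; None leaves a bound open.
ByteRange = Tuple[Optional[int], Optional[int]]


def build_range_header(ranges: Iterable[ByteRange]) -> str:
    """Format (first, last) pairs as an HTTP Range header value.

    (0, 9)      -> "0-9"    bytes 0 through 9 inclusive
    (100, None) -> "100-"   from byte 100 to the end
    (None, 100) -> "-100"   the last 100 bytes (suffix range)
    """
    parts = []
    for first, last in ranges:
        if first is None and last is None:
            raise ValueError("a range needs at least one bound")
        if first is not None and last is not None and last < first:
            raise ValueError("last byte precedes first byte")
        left = "" if first is None else str(first)
        right = "" if last is None else str(last)
        parts.append(f"{left}-{right}")
    if not parts:
        raise ValueError("at least one range is required")
    return "bytes=" + ",".join(parts)


# The three ranges from the curl example: first 10 bytes, bytes 30-39,
# and the last 100 bytes.
print(build_range_header([(0, 9), (30, 39), (None, 100)]))
```

The resulting string is meant to be passed as the value of the ``Range`` request header, exactly as in the ``curl -H`` form shown earlier.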
b/docs/pithos.rst | ||
---|---|---|
1 |
.. _pithos: |
|
1 |
Object Storage Service (Pithos+) |
|
2 |
================================ |
|
2 | 3 |
|
3 |
File Storage Service (pithos+) |
|
4 |
Pithos+ is an online storage service based on the OpenStack Object |
|
5 |
Storage API with several important extensions. It uses a |
|
6 |
block-based mechanism to allow users to upload, download, and share |
|
7 |
files, keep different versions of a file, and attach policies to them. |
|
8 |
It follows a layered, modular implementation. Pithos+ was designed to |
|
9 |
be used as a storage service by the total set of the Greek research |
|
10 |
and academic community (counting tens of thousands of users) but is |
|
11 |
free and open to use by anybody, under a BSD-2 clause license. |
|
12 |
|
|
13 |
A presentation of Pithos+ features and architecture is :download:`here <pithos-plus.pdf>`. |
|
14 |
|
|
15 |
Introduction |
|
16 |
------------ |
|
17 |
|
|
18 |
In 2008 the Greek Research and Technology Network (GRNET) decided |
|
19 |
to offer an online storage service to the Greek research and academic |
|
20 |
community. The service, called Pithos, was implemented in 2008-2009, |
|
21 |
and was made available in spring 2009. It now has more than |
|
22 |
12,000 users. |
|
23 |
|
|
24 |
In 2011 GRNET decided to offer a new, evolved online storage |
|
25 |
service, to be called Pithos+. Pithos+ is designed to address the |
|
26 |
main requirements expressed by the Pithos users in the first two years of |
|
27 |
operation: |
|
28 |
|
|
29 |
* Provide both a web-based client and native desktop clients for |
|
30 |
the most common operating systems. |
|
31 |
* Allow not only uploading, downloading, and sharing, but also |
|
32 |
synchronization capabilities so that users are able to select folders |
|
33 |
and have them synchronized automatically with their online accounts. |
|
34 |
* Allow uploading of large files, regardless of browser |
|
35 |
capabilities (depending on the version, browsers may place a 2 |
|
36 |
GBytes upload limit). |
|
37 |
* Improve upload speed; not an issue as long as the user is on a |
|
38 |
computer connected to the GRNET backbone, but it becomes important |
|
39 |
over ADSL connections. |
|
40 |
* Allow access by |
|
41 |
non-Shibboleth (http://shibboleth.internet2.edu/) |
|
42 |
accounts. Pithos delegates user authentication to the Greek |
|
43 |
Shibboleth federation, to which all research and academic |
|
44 |
institutions belong. However, it is desirable to have the option to |
|
45 |
open up Pithos to non-Shibboleth authenticated users as well. |
|
46 |
* Use open standards as far as possible. |
|
47 |
|
|
48 |
In what follows we describe the main features of Pithos+, the elements |
|
49 |
of its design and the capabilities it affords. We touch on related |
|
50 |
work and we provide some discussion on our experiences and thoughts on |
|
51 |
the future. |
|
52 |
|
|
53 |
Pithos+ Features |
|
54 |
---------------- |
|
55 |
|
|
56 |
Pithos+ is based on the OpenStack Object Storage API (Pithos |
|
57 |
used a home-grown API). We decided to adopt an open standard |
|
58 |
API in order to leverage existing clients that implement the |
|
59 |
API. In this way, a user can access Pithos+ with a standard |
|
60 |
OpenStack client - although users will want to use a Pithos+ client to |
|
61 |
use features going beyond those offered by the OpenStack API. |
|
62 |
The strategy paid off during Pithos+ development itself, as we were |
|
63 |
able to access and test the service with existing clients, while also |
|
64 |
developing new clients based on open source OpenStack clients. |
|
65 |
|
|
66 |
The major extensions on the OpenStack API are: |
|
67 |
|
|
68 |
* The use of block-based storage in lieu of an object-based one. |
|
69 |
OpenStack stores objects, which may be files, but this is not |
|
70 |
necessary - large files (larger than 5 GBytes), for instance, must be |
|
71 |
stored as a series of distinct objects accompanied by a manifest. |
|
72 |
Pithos+ stores blocks, so objects can be of unlimited size. |
|
73 |
* Permissions on individual files and folders. Note that folders |
|
74 |
do not exist in the OpenStack API, but are simulated by |
|
75 |
appropriate conventions, an approach we have kept in Pithos+ to |
|
76 |
avoid incompatibility. |
|
77 |
* Fully-versioned objects. |
|
78 |
* Metadata-based queries. Users are free to set metadata on their |
|
79 |
objects, and they can list objects meeting metadata criteria. |
|
80 |
* Policies, such as whether to enable object versioning and to |
|
81 |
enforce quotas. This is particularly important for sharing object |
|
82 |
containers, since the user may want to avoid running out of space |
|
83 |
because of collaborators writing in the shared storage. |
|
84 |
* Partial upload and download based on HTTP request |
|
85 |
headers and parameters. |
|
86 |
* Object updates, where data may even come from other objects |
|
87 |
already stored in Pithos+. This allows users to compose objects from |
|
88 |
other objects without uploading data. |
|
89 |
* All objects are assigned UUIDs on creation, which can be |
|
90 |
used to reference them regardless of their path location. |
|
91 |
|
|
92 |
Pithos+ Design |
|
93 |
-------------- |
|
94 |
|
|
95 |
Pithos+ is built on a layered architecture (see Figure). |
|
96 |
The Pithos+ server speaks HTTP with the outside world. The HTTP |
|
97 |
operations implement an extended OpenStack Object Storage API. |
|
98 |
The back end is a library meant to be used by internal code and |
|
99 |
other front ends. For instance, the back end library, apart from being |
|
100 |
used in Pithos+ for implementing the OpenStack Object Storage API, |
|
101 |
is also used in our implementation of the OpenStack Image |
|
102 |
Service API. Moreover, the back end library allows specification |
|
103 |
of different namespaces for metadata, so that the same object can be |
|
104 |
viewed by different front end APIs with different sets of |
|
105 |
metadata. Hence the same object can be viewed as a file in Pithos+, |
|
106 |
with one set of metadata, or as an image with a different set of |
|
107 |
metadata, in our implementation of the OpenStack Image Service. |
|
108 |
|
|
109 |
The data component provides storage of blocks and the information |
|
110 |
needed to retrieve them, while the metadata component is a database of |
|
111 |
nodes and permissions. In the current implementation, data is saved to |
|
112 |
the filesystem and metadata in an SQL database. In the future, |
|
113 |
data will be saved to some distributed block storage (we are currently |
|
114 |
evaluating RADOS - http://ceph.newdream.net/category/rados), and metadata to a NoSQL database. |
|
115 |
|
|
116 |
.. image:: images/pithos-layers.png |
|
117 |
|
|
118 |
Block-based Storage for the Client |
|
119 |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
120 |
|
|
121 |
Since an object is saved as a set of blocks in Pithos+, object |
|
122 |
operations are no longer required to refer to the whole object. We can |
|
123 |
handle parts of objects as needed when uploading, downloading, or |
|
124 |
copying and moving data. |
|
125 |
|
|
126 |
In particular, a client, provided it has access permissions, can |
|
127 |
download data from Pithos+ by issuing a ``GET`` request on an |
|
128 |
object. If the request includes the ``hashmap`` parameter, then the |
|
129 |
request refers to a hashmap, that is, a set containing the |
|
130 |
object's block hashes. The reply is of the form:: |
|
131 |
|
|
132 |
{"block_hash": "sha1", |
|
133 |
"hashes": ["7295c41da03d7f916440b98e32c4a2a39351546c", ...], |
|
134 |
"block_size":131072, |
|
135 |
"bytes": 242} |
|
136 |
|
|
137 |
The client can then compare the hashmap with the hashmap computed from |
|
138 |
the local file. Any missing parts can be downloaded with ``GET`` |
|
139 |
requests with an additional ``Range`` header containing the hashes |
|
140 |
of the blocks to be retrieved. The integrity of the file can be |
|
141 |
checked against the ``X-Object-Hash`` header, returned by the |
|
142 |
server and containing the root Merkle hash of the object's |
|
143 |
hashmap. |
|
144 |
|
|
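The comparison described above can be sketched in a few lines of Python. This is an illustration only: it assumes each block is hashed exactly as it is stored (including a shorter final block), which may differ from how the server normalizes blocks, and the names ``compute_hashmap`` and ``blocks_to_fetch`` are hypothetical helpers, not Pithos+ client functions.

```python
import hashlib
from typing import List

BLOCK_SIZE = 131072  # same value as "block_size" in the sample reply


def compute_hashmap(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return the hex SHA-1 digest of each fixed-size block of data.

    Assumption: blocks are hashed as raw bytes, with no padding or
    trimming; a real server may treat blocks differently.
    """
    return [
        hashlib.sha1(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def blocks_to_fetch(local: List[str], remote: List[str]) -> List[int]:
    """Indices of remote blocks that are absent or different locally."""
    return [
        index
        for index, digest in enumerate(remote)
        if index >= len(local) or local[index] != digest
    ]


# Two 20-byte "files" split into 10-byte blocks; only the second differs.
remote = compute_hashmap(b"a" * 10 + b"b" * 10, block_size=10)
local = compute_hashmap(b"a" * 10 + b"c" * 10, block_size=10)
print(blocks_to_fetch(local, remote))  # [1]
```

Each returned index ``i`` corresponds to the bytes from ``i * block_size`` up to, but not including, ``(i + 1) * block_size`` of the object. How those blocks are then requested from the server is left out of this sketch.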
145 |
When uploading a file to Pithos+, only the missing blocks will be |
|
146 |
submitted to the server, with the following algorithm: |
|
147 |
|
|
148 |
* Calculate the hash value for each block of the object to be |
|
149 |
uploaded. |
|
150 |
* Send a hashmap ``PUT`` request for the object. This is a |
|
151 |
``PUT`` request with a ``hashmap`` request parameter appended |
|
152 |
to it. If the parameter is not present, the object's data (or part |
|
153 |
of it) is provided with the request. If the parameter is present, |
|
154 |
the object hashmap is provided with the request. |
|
155 |
* If the server responds with status 201 (Created), the blocks are |
|
156 |
already on the server and we do not need to do anything more. |
|
157 |
* If the server responds with status 409 (Conflict), the server’s |
|
158 |
response body contains the hashes of the blocks that do not exist on |
|
159 |
the server. Then, for each hash value in the server’s response (or all |
|
160 |
hashes together) send a ``POST`` request to the server with the |
|
161 |
block's data. |
|
162 |
|
|
163 |
In effect, we are deduplicating data based on their block hashes, |
|
164 |
transparently to the users. This results in perceived instantaneous |
|
165 |
uploads when material is already present in Pithos+ storage. |
|
166 |
|
|
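The upload steps listed above can be sketched against an in-memory stand-in for the server. Everything here is an assumption made for illustration: ``FakeBlockStore`` is not the Pithos+ back end, the method names are hypothetical, and the final retry of the hashmap ``PUT`` after the missing blocks have been sent is a step this sketch adds to complete the flow rather than one stated in the text.

```python
import hashlib
from typing import Dict, List, Tuple


def sha1_hex(block: bytes) -> str:
    return hashlib.sha1(block).hexdigest()


class FakeBlockStore:
    """In-memory stand-in for the server (illustration only)."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}
        self.objects: Dict[str, List[str]] = {}

    def put_hashmap(self, path: str, hashes: List[str]) -> Tuple[int, List[str]]:
        """Mimic the hashmap PUT: 201 if every block is known, else 409
        together with the (deduplicated) hashes that are missing."""
        missing = list(dict.fromkeys(h for h in hashes if h not in self.blocks))
        if missing:
            return 409, missing
        self.objects[path] = list(hashes)
        return 201, []

    def post_block(self, block: bytes) -> None:
        """Mimic the POST that carries one block's data."""
        self.blocks[sha1_hex(block)] = block


def upload(server: FakeBlockStore, path: str, data: bytes, block_size: int) -> int:
    """Upload data, sending only the blocks the server lacks.

    Returns the number of blocks actually transmitted.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    hashes = [sha1_hex(b) for b in blocks]
    by_hash = dict(zip(hashes, blocks))

    status, missing = server.put_hashmap(path, hashes)
    if status == 201:
        return 0  # every block was already stored
    for digest in missing:
        server.post_block(by_hash[digest])
    # Assumption: repeat the hashmap PUT once the blocks are in place.
    status, _ = server.put_hashmap(path, hashes)
    if status != 201:
        raise RuntimeError("server still reports missing blocks")
    return len(missing)


server = FakeBlockStore()
print(upload(server, "first", b"aaaabbbb", 4))   # 2
print(upload(server, "second", b"bbbbaaaa", 4))  # 0
```

The second call transmits nothing because both of its blocks were stored by the first upload, which is the deduplication effect described above.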
167 |
Block-based Storage Processing |
|
4 | 168 |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5 | 169 |
|
6 |
Pithos+ is the synnefo File Storage Service and implements the OpenStack Object |
|
7 |
Storage API + synnefo extensions. |
|
170 |
Hashmaps themselves are saved in blocks. All blocks are persisted to |
|
171 |
storage using content-based addressing. It follows that to read a |
|
172 |
file, Pithos+ performs the following operations: |
|
8 | 173 |
|
174 |
* The client issues a request to get a file, via HTTP ``GET``. |
|
175 |
* The API front end asks the back end for the metadata |
|
176 |
of the object. |
|
177 |
* The back end checks the permissions of the object and, if they |
|
178 |
allow access to it, returns the object's metadata. |
|
179 |
* The front end evaluates any HTTP headers (such as |
|
180 |
``If-Modified-Since``, ``If-Match``, etc.). |
|
181 |
* If the preconditions are met, the API front end requests |
|
182 |
from the back end the object's hashmap (hashmaps are indexed by the |
|
183 |
full path). |
|
184 |
* The back end will read and return to the API front end the |
|
185 |
object's hashmap from the underlying storage. |
|
186 |
* Depending on the HTTP ``Range`` header, the |
|
187 |
API front end asks the back end for the required blocks, giving |
|
188 |
their corresponding hashes. |
|
189 |
* The back end fetches the blocks from the underlying storage, |
|
190 |
passes them to the API front end, which returns them to the client. |
|
9 | 191 |
|
10 |
Introduction |
|
11 |
============ |
|
12 |
|
|
13 |
Pithos is a storage service implemented by GRNET (http://www.grnet.gr). Data is |
|
14 |
stored as objects, organized in containers, belonging to an account. This |
|
15 |
hierarchy of storage layers has been inspired by the OpenStack Object Storage |
|
16 |
(OOS) API and similar CloudFiles API by Rackspace. The Pithos API follows the |
|
17 |
OOS API as closely as possible. One of the design requirements has been to be |
|
18 |
able to use Pithos with clients built for the OOS, without changes. |
|
19 |
|
|
20 |
However, to be able to take full advantage of the Pithos infrastructure, client |
|
21 |
software should be aware of the extensions that differentiate Pithos from OOS. |
|
22 |
Pithos objects can be updated, or appended to. Pithos will store sharing |
|
23 |
permissions per object and enforce corresponding authorization policies. |
|
24 |
Automatic version management, allows taking account and container listings back |
|
25 |
in time, as well as reading previous instances of objects. |
|
26 |
|
|
27 |
The storage backend of Pithos is block oriented, permitting efficient, |
|
28 |
deduplicated data placement. The block structure of objects is exposed at the |
|
29 |
API layer, in order to encourage external software to implement advanced data |
|
30 |
management operations. |
|
31 |
|
|
32 |
|
|
33 |
Pithos Users and Authentication |
|
34 |
=============================== |
|
35 |
|
|
36 |
In Pithos, each user is uniquely identified by a token. All API requests |
|
37 |
require a token and each token is internally resolved to an account string. The |
|
38 |
API uses the account string to identify the user's own files, and thus whether a
|
39 |
request is local or cross-account. |
|
40 |
|
|
41 |
Pithos does not keep a user database. For development and testing purposes, |
|
42 |
user identifiers and their corresponding tokens can be defined in the settings |
|
43 |
file. However, Pithos is designed with an external authentication service in |
|
44 |
mind. This service must handle the details of validating user credentials and |
|
45 |
communicate with Pithos via a middleware software component that, given a |
|
46 |
token, fills in the internal request account variable. |
|
47 |
|
|
48 |
Client software using Pithos, if not already knowing a user's identifier and |
|
49 |
token, should forward to the ``/login`` URI. The Pithos server, depending on |
|
50 |
its configuration, will redirect to the appropriate login page.
|
51 |
|
|
52 |
The login URI accepts the following parameters: |
|
53 |
|
|
54 |
====================== ========================= |
|
55 |
Request Parameter Name Value |
|
56 |
====================== ========================= |
|
57 |
next The URI to redirect to when the process is finished |
|
58 |
renew Force token renewal (no value parameter) |
|
59 |
force Force logout current user (no value parameter) |
|
60 |
====================== ========================= |
|
61 |
|
|
62 |
When done with logging in, the service's login URI should redirect to the URI |
|
63 |
provided with ``next``, adding ``user`` and ``token`` parameters, which contain |
|
64 |
the account and token fields respectively. |
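
A client-side view of this exchange might look like the sketch below. It assumes only what is stated above (the ``/login`` URI, the ``next`` and ``renew`` request parameters, and the ``user`` and ``token`` parameters on the redirect); the helper names and the example host names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_login_url(base_url, next_uri, renew=False):
    """Build the login URI the client should forward the user to."""
    query = urlencode({"next": next_uri})
    if renew:
        # "renew" is a parameter without a value.
        query += "&renew"
    return f"{base_url}/login?{query}"


def parse_login_redirect(redirect_url):
    """Extract the account and token from the URI the service redirects back to."""
    params = parse_qs(urlsplit(redirect_url).query)
    return params["user"][0], params["token"][0]


# Hypothetical hosts, used only for illustration.
login_url = build_login_url("https://pithos.example", "https://client.example/done")
user, token = parse_login_redirect("https://client.example/done?user=alice&token=0000")
```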
|
65 |
|
|
66 |
A user management service that implements a login URI according to these |
|
67 |
conventions is Astakos. |
|
68 |
|
|
69 |
|
|
70 |
Pithos+ Architecture |
|
71 |
==================== |
|
192 |
Saving data from the client to the server is done in several different |
|
193 |
ways. |
|
194 |
|
|
195 |
First, a regular HTTP ``PUT`` is the reverse of the HTTP ``GET``. |
|
196 |
The client sends the full object to the API front end. |
|
197 |
The API front end splits the object into blocks. It sends each
|
198 |
block to the back end, which calculates its hash and saves it to |
|
199 |
storage. When the hashmap is complete, the API front end commands |
|
200 |
the back end to create a new object with the created hashmap and any |
|
201 |
associated metadata. |
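
The splitting and hashing step can be illustrated with a short sketch. The block size and the use of SHA-256 are assumptions chosen for the example; the actual values are configuration details of a Pithos+ deployment, and ``build_hashmap`` is a hypothetical helper name.

```python
import hashlib


def build_hashmap(data, block_size):
    """Split data into fixed-size blocks and return the list of block hashes."""
    hashes = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Assumption: SHA-256 stands in for whatever hash the back end is configured to use.
        hashes.append(hashlib.sha256(block).hexdigest())
    return hashes


# A 10-byte object with 4-byte blocks yields three blocks (4 + 4 + 2 bytes).
hashmap = build_hashmap(b"abcdefghij", block_size=4)
```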
|
202 |
|
|
203 |
Secondly, the client may send to the API front end a hashmap and |
|
204 |
any associated metadata, with a specially formatted HTTP ``PUT``,
|
205 |
using an appropriate URL parameter. In this case, if the |
|
206 |
back end can find the requested blocks, the object will be created as |
|
207 |
previously, otherwise it will report back the list of missing blocks, |
|
208 |
which will be passed back to the client. The client then may send the |
|
209 |
missing blocks by issuing an HTTP ``POST`` and then retry the |
|
210 |
HTTP ``PUT`` for the hashmap. This allows for very fast uploads, |
|
211 |
since it may happen that no real data uploading takes place, if the |
|
212 |
blocks are already in data storage. |
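
The "send the hashmap, upload only what is missing, retry" flow can be simulated in memory as below. ``BlockStore`` and ``upload`` are hypothetical stand-ins for the back end and the client; no HTTP is involved, and SHA-256 is again an assumption.

```python
import hashlib


class BlockStore:
    """In-memory stand-in for the back end's block storage (hypothetical)."""

    def __init__(self):
        self.blocks = {}   # hash -> block bytes
        self.objects = {}  # path -> hashmap

    def put_block(self, block):
        """Corresponds to the POST that uploads one block."""
        digest = hashlib.sha256(block).hexdigest()
        self.blocks[digest] = block
        return digest

    def put_hashmap(self, path, hashmap):
        """Corresponds to the hashmap PUT: create the object or report missing hashes."""
        missing = [h for h in hashmap if h not in self.blocks]
        if not missing:
            self.objects[path] = list(hashmap)
        return missing


def upload(store, path, blocks):
    """Upload an object by hashmap; return how many distinct blocks had to be sent."""
    hashmap = [hashlib.sha256(b).hexdigest() for b in blocks]
    by_hash = dict(zip(hashmap, blocks))
    missing = set(store.put_hashmap(path, hashmap))  # first PUT attempt
    for digest in missing:                           # send only the missing blocks
        store.put_block(by_hash[digest])
    if missing:
        store.put_hashmap(path, hashmap)             # retry the PUT
    return len(missing)


store = BlockStore()
store.put_block(b"aaaa")  # this block is already known to the store
sent_first = upload(store, "container/object", [b"aaaa", b"bbbb"])
sent_again = upload(store, "container/copy", [b"aaaa", b"bbbb"])
```

In the second call no block data is transferred at all, which is the fast path described above.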
|
213 |
|
|
214 |
Copying objects does not involve data copying, but is performed by |
|
215 |
associating the object's hashmap with the new path. Moving objects, as |
|
216 |
in OpenStack, is a copy followed by a delete, again with no real data |
|
217 |
being moved. |
|
218 |
|
|
219 |
Updates to an existing object, which are not offered by OpenStack, are |
|
220 |
implemented by issuing an HTTP ``POST`` request including the |
|
221 |
offset and the length of the data. The API front end requests |
|
222 |
from the back end the hashmap of the existing object. Depending on the |
|
223 |
offset of the update (whether it falls within block boundaries or not) |
|
224 |
the front end will ask the back end to update or create new blocks. At |
|
225 |
the end, the front end will save the updated hashmap. It is also |
|
226 |
possible to pass a parameter to HTTP ``POST`` to specify that the |
|
227 |
data will come from another object, instead of being uploaded by the |
|
228 |
client. |
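
Which blocks an update touches follows directly from the offset and length. The sketch below is a simplified illustration under the assumption of fixed-size blocks; ``classify_update`` is a hypothetical helper, not part of the Pithos+ code base.

```python
def classify_update(num_blocks, offset, length, block_size):
    """Return (updated, created): indices of existing blocks to rewrite and of new blocks."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length > 0")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    touched = range(first, last + 1)
    updated = [i for i in touched if i < num_blocks]
    created = [i for i in touched if i >= num_blocks]
    return updated, created


# An object of 3 blocks (4 bytes each) receives 6 bytes at offset 10:
# block 2 is partially rewritten and block 3 is newly created.
updated, created = classify_update(num_blocks=3, offset=10, length=6, block_size=4)
```

Blocks outside the touched range keep their existing hashes, so only the affected entries of the hashmap change.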
|
229 |
|
|
230 |
Pithos+ Back End Nodes |
|
231 |
^^^^^^^^^^^^^^^^^^^^^^ |
|
232 |
|
|
233 |
Pithos+ organizes entities in a tree hierarchy, with one tree node per |
|
234 |
path entry (see Figure). Nodes can be accounts, |
|
235 |
containers, and objects. A user may have multiple |
|
236 |
accounts, each account may have multiple containers, and each |
|
237 |
container may have multiple objects. An object may have multiple |
|
238 |
versions, and each version of an object has properties (a set of fixed |
|
239 |
metadata, like size and mtime) and arbitrary metadata. |
|
240 |
|
|
241 |
.. image:: images/pithos-backend-nodes.png |
|
242 |
|
|
243 |
The tree hierarchy has up to three levels, since, following the |
|
244 |
OpenStack API, everything is stored as an object in a container. |
|
245 |
The notion of folders or directories is supported through conventions that
|
246 |
simulate pseudo-hierarchical folders. In particular, object names that |
|
247 |
contain the forward slash character and have an accompanying marker |
|
248 |
object with a ``Content-Type: application/directory`` as part of |
|
249 |
their metadata can be treated as directories by Pithos+ clients. Each |
|
250 |
node corresponds to a unique path, and we keep its parent in the |
|
251 |
account/container/object hierarchy (that is, all objects have a |
|
252 |
container as their parent). |
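
How a client might present a flat list of object names as a directory listing can be sketched as follows. The function name and the sample object names are assumptions; the only convention taken from the text is the forward slash as separator.

```python
def list_directory(object_names, prefix, delimiter="/"):
    """Return the immediate children of a pseudo-directory given flat object names."""
    entries = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue  # the name is the prefix itself
        head, sep, _ = rest.partition(delimiter)
        # Keep the delimiter on entries that have further children below them.
        entries.add(head + sep)
    return sorted(entries)


# Hypothetical container contents; "docs" is the directory marker object.
names = ["docs", "docs/a.txt", "docs/img/b.png", "readme"]
children = list_directory(names, "docs/")
```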
|
253 |
|
|
254 |
Pithos+ Back End Versions |
|
255 |
^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
256 |
|
|
257 |
For each object version we keep the root Merkle hash of the object it |
|
258 |
refers to, the size of the object, the last modification time and the |
|
259 |
user that modified the file, and its cluster. A version belongs |
|
260 |
to one of the following three clusters (see Figure): |
|
261 |
|
|
262 |
* normal, which are the current versions |
|
263 |
* history, which contain the previous versions of an object |
|
264 |
* deleted, which contain objects that have been deleted |
|
265 |
|
|
266 |
.. image:: images/pithos-backend-versions.png |
|
267 |
|
|
268 |
This versioning allows Pithos+ to offer its users time-based
|
269 |
contents listing of their accounts. In effect, this also allows them |
|
270 |
to take their containers back in time. This is implemented |
|
271 |
conceptually by taking a vertical line in the Figure and |
|
272 |
presenting to the user the state on the left side of the line. |
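
The "vertical line" idea can be expressed as a small sketch: for each path, take the latest version at or before the chosen time and list it unless that version marks a deletion. The ``Version`` record below is a simplification of the clusters described above, and all names in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Version:
    path: str
    mtime: float
    deleted: bool = False  # simplification: True stands for the "deleted" cluster


def listing_at(versions, until):
    """Return the paths visible at time `until`."""
    latest = {}
    for version in sorted(versions, key=lambda v: v.mtime):
        if version.mtime <= until:
            latest[version.path] = version  # later versions overwrite earlier ones
    return sorted(path for path, v in latest.items() if not v.deleted)


versions = [
    Version("a", 1),
    Version("a", 5),
    Version("b", 2),
    Version("b", 4, deleted=True),
]
at_three = listing_at(versions, until=3)
at_four = listing_at(versions, until=4)
```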
|
273 |
|
|
274 |
Pithos+ Back End Permissions |
|
275 |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
276 |
|
|
277 |
Pithos+ recognizes read and write permissions, which can be granted to |
|
278 |
individual users or groups of users. Groups are collections of users
|
279 |
created at the account level by users themselves, and are flat - a |
|
280 |
group cannot contain or reference another group. Ownership of a file |
|
281 |
cannot be delegated. |
|
282 |
|
|
283 |
Pithos+ also recognizes a "public" permission, which means that the |
|
284 |
object is readable by all. When an object is made public, it is |
|
285 |
assigned a URL that can be used to access the object from |
|
286 |
outside Pithos+ even by non-Pithos+ users. |
|
287 |
|
|
288 |
Permissions can be assigned to objects, which may be actual files, or |
|
289 |
directories. When listing objects, the back end uses the permissions as |
|
290 |
filters for what to display, so that users will see only objects to |
|
291 |
which they have access. Depending on the type of the object, the |
|
292 |
filter may be exact (plain object), or a prefix (like ``path/*`` for |
|
293 |
a directory). When accessing objects, the same rules are used to |
|
294 |
decide whether to allow the user to read or modify the object or |
|
295 |
directory. If no permissions apply to a specific object, the back end |
|
296 |
searches for permissions on the closest directory sharing a common |
|
297 |
prefix with the object. |
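
The lookup rule (exact match first, then the closest enclosing directory) can be sketched as below. The permission table layout and the function names are assumptions made for illustration; groups and the "public" permission are left out to keep the example short.

```python
def find_permissions(permissions, path):
    """Return the permissions set on path, or on its closest enclosing directory."""
    if path in permissions:
        return permissions[path]
    parts = path.split("/")
    # Walk up from the nearest directory to the top-level one.
    for end in range(len(parts) - 1, 0, -1):
        directory = "/".join(parts[:end])
        if directory in permissions:
            return permissions[directory]
    return None


def can_read(permissions, path, user):
    """Check read access for user, using the lookup rule above."""
    entry = find_permissions(permissions, path)
    return entry is not None and user in entry.get("read", set())


# Hypothetical permission table: path -> {"read": users, "write": users}.
permissions = {
    "docs": {"read": {"alice"}},
    "docs/secret.txt": {"read": {"bob"}},
}
```

Here ``can_read(permissions, "docs/a/b.txt", "alice")`` is true because ``docs`` is the closest directory with permissions, while ``docs/secret.txt`` is readable only by ``bob`` because the exact entry takes precedence.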
|
298 |
|
|
299 |
Related Work |
|
300 |
------------ |
|
301 |
|
|
302 |
Commercial cloud providers have been offering online storage for quite |
|
303 |
some time, but the code is not published and we do not know the |
|
304 |
details of their implementation. Rackspace has used the OpenStack |
|
305 |
Object Storage in its Cloud Files product. Swift is an open source |
|
306 |
implementation of the OpenStack Object Storage API. As we have |
|
307 |
pointed out, our implementation maintains compatibility with |
|
308 |
OpenStack, while offering additional capabilities. |
|
309 |
|
|
310 |
Discussion |
|
311 |
---------- |
|
312 |
|
|
313 |
Pithos+ is implemented in Python as a Django application. We use SQLAlchemy |
|
314 |
as a database abstraction layer. It is currently about |
|
315 |
17,000 lines of code, and it has taken about 50 person months of |
|
316 |
development effort. This development was done from scratch, with no |
|
317 |
reuse of the existing Pithos code. That service was written in the |
|
318 |
J2EE framework. We decided to move from J2EE to Python for |
|
319 |
two reasons: first, J2EE proved to be overkill for the original
|
320 |
Pithos service in its years of operation. Secondly, Python was |
|
321 |
strongly favored by the GRNET operations team, who are the people |
|
322 |
taking responsibility for running the service - so their voice is |
|
323 |
heard. |
|
324 |
|
|
325 |
Apart from the service implementation, which we have been describing |
|
326 |
here, we have parallel development lines for native client tools on |
|
327 |
different operating systems (MS-Windows, Mac OS X, Android, and iOS). |
|
328 |
The desktop clients allow synchronization with local directories, a |
|
329 |
feature that existing users of Pithos have been asking for, probably |
|
330 |
influenced by services like DropBox. These clients are offered in |
|
331 |
parallel to the standard Pithos+ interface, which is a web application |
|
332 |
built on top of the API front end - we treat our own web
|
333 |
application as just another client that has to go through the API |
|
334 |
front end, without granting it access to the back end directly. |
|
335 |
|
|
336 |
We are carrying the idea of our own services being clients to Pithos+ |
|
337 |
a step further, with new projects we have in our pipeline, in which a |
|
338 |
digital repository service will be built on top of Pithos+. It will |
|
339 |
again use the API front end, so that repository users will have
|
340 |
all Pithos+ capabilities, and on top of them we will build additional |
|
341 |
functionality such as full text search, Dublin Core metadata storage |
|
342 |
and querying, streaming, and so on. |
|
343 |
|
|
344 |
At the time of this writing (March 2012) Pithos+ is in alpha, |
|
345 |
available to users by invitation. We will extend our user base as we |
|
346 |
move to beta in the coming months, and to our full set of users in the |
|
347 |
second half of 2012. We are eager to see how our ideas fare as we are
|
348 |
scaling up, and we welcome any comments and suggestions. |
|
349 |
|
|
350 |
Acknowledgments |
|
351 |
--------------- |
|
352 |
|
|
353 |
Pithos+ is financially supported by Grant 296114, "Advanced Computing |
|
354 |
Services for the Research and Academic Community", of the Greek |
|
355 |
National Strategic Reference Framework. |
|
356 |
|
|
357 |
Availability |
|
358 |
------------ |
|
359 |
|
|
360 |
The Pithos+ code is available under a BSD 2-clause license from: |
|
361 |
https://code.grnet.gr/projects/pithos/repository |
|
362 |
|
|
363 |
The code can also be accessed from its source repository: |
|
364 |
https://code.grnet.gr/git/pithos/ |
|
365 |
|
|
366 |
More information and documentation is available at: |
|
367 |
http://docs.dev.grnet.gr/pithos/latest/index.html |
b/docs/quick-install-admin-guide.rst | ||
---|---|---|
11 | 11 |
have the following services running: |
12 | 12 |
|
13 | 13 |
* Identity Management (Astakos) |
14 |
* File Storage Service (Pithos+)
|
|
14 |
* Object Storage Service (Pithos+)
|
|
15 | 15 |
* Compute Service (Cyclades) |
16 | 16 |
* Image Registry Service (Plankton) |
17 | 17 |
|
... | ... | |
20 | 20 |
The Volume Storage Service (Archipelago) and the Billing Service (Aquarium) are |
21 | 21 |
not released yet. |
22 | 22 |
|
23 |
If you just want to install the File Storage Service (Pithos+), follow the guide
|
|
23 |
If you just want to install the Object Storage Service (Pithos+), follow the guide
|
|
24 | 24 |
and just stop after the "Testing of Pithos+" section. |
25 | 25 |
|
26 | 26 |
|
b/docs/quick-install-intgrt-guide.rst | ||
---|---|---|
15 | 15 |
installation, you will have the following services running: |
16 | 16 |
|
17 | 17 |
* Identity Management (Astakos) |
18 |
* File Storage Service (Pithos+)
|
|
18 |
* Object Storage Service (Pithos+)
|
|
19 | 19 |
* Compute Service (Cyclades) |
20 | 20 |
* Image Registry Service (Plankton) |
21 | 21 |
|
... | ... | |
24 | 24 |
The Volume Storage Service (Archipelago) and the Billing Service (Aquarium) are |
25 | 25 |
not released yet. |
26 | 26 |
|
27 |
If you just want to install the File Storage Service (Pithos+), follow the guide
|
|
27 |
If you just want to install the Object Storage Service (Pithos+), follow the guide
|
|
28 | 28 |
and just stop after the "Testing of Pithos+" section. |
29 | 29 |
|
30 | 30 |
Building a dev environment |