Revision 5bd53e3b
b/docs/source/backends.rst
@@ -6,3 +6,4 @@
 
    base_backend
    simple_backend
+   hashfiler_backend
b/docs/source/design.rst

********************
Pithos Server Design
********************

Up-to-date: 2011-06-14

Key Target Features
===================

OpenStack Object Storage API
----------------------------

Originally derived from Rackspace's Cloudfiles API,
it features, like Amazon S3, a two-level real hierarchy
with containers and objects per account, while objects
within each container are hosted in a flat namespace,
albeit with options for 'virtual' hierarchy listings.
Accounts, containers and objects may host user-defined metadata.

The use of a well-known API will, hopefully, provide immediate
compatibility with existing clients and familiarity
to developers. Adoption and familiarity will also make
custom extensions more accessible to developers.

However, Pithos differs as a service, mainly because it (also)
targets end users. This inevitably forces extensions to the API,
but the plan is to maintain OOS compatibility nonetheless.
Partial Transfers
-----------------

An important feature, especially for home users, is the ability
to continue transfers after interruption by choice or failure,
without significant loss of the partial transfer completed
up to the point of interruption.

The Manifest mechanism in the CloudFiles API and
the Multipart Upload mechanism in Amazon S3
both provide a means for partial transfers.
Manifests are not (yet?) in the OpenStack specification
but are considered for support in Pithos.

The Pithos Server approach is similar, allowing appending
(actually, any modification) to existing objects via
HTTP 1.1 chunked transfers. Chunks reach stable storage
before the whole transfer is complete, and restarting
a transfer from the point it was interrupted is possible
by querying the status of the target object (e.g. its size).
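The restart logic above can be sketched as follows. This is a minimal illustration, assuming a hypothetical client with ``object_size(path)`` and ``append(path, data)`` calls; the actual Pithos API names may differ::

```python
def resume_upload(client, path, local_file, chunk_size=4 * 1024 * 1024):
    """Resume an interrupted append-style upload of local_file to path."""
    # Ask the server how much of the object already reached stable storage.
    done = client.object_size(path)
    with open(local_file, 'rb') as f:
        f.seek(done)  # skip the part that is already on the server
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            client.append(path, chunk)  # sent as an HTTP chunked transfer
            done += len(chunk)
    return done
```

Only the bytes after the committed size are sent, so an interrupted transfer loses at most one in-flight chunk.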
Atomic Updates
--------------

However, when updating existing objects, transfers do not
happen instantaneously. To support partial transfers,
the content must be committed to storage before the transfer
is complete. This creates a hazard:
the partially committed content may temporarily leave the object
in a visible, inconsistent state after the transfer has started
and before the transfer has completed.
Furthermore, failed transfers that are never retried result
in permanent corruption and data loss.

The Manifest and Multipart Upload mechanisms both provide atomicity,
but they only specify the creation of a new object and not
the updating of an existing one.

The Pithos Server approach is similar, but more flexible.
There are two types of updates to an object:
those with an HTTP body as a source, which are not atomic,
and those with another object as a source, which are atomic.
This way, clients may choose to perform atomic updates in
two stages, as with Manifests and Multipart Uploads,
or choose to make a hazardous update directly.
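The two-stage atomic update can be sketched as follows, assuming a hypothetical client API (``append``, ``copy``, ``delete``); the names are illustrative, not the actual Pithos calls::

```python
def atomic_update(client, path, data):
    """Update `path` atomically via a temporary staging object."""
    tmp = path + '.upload'     # stage 1: non-atomic upload to a temp object;
    client.append(tmp, data)   # may be chunked, interrupted and resumed
    client.copy(tmp, path)     # stage 2: atomic object-to-object update
    client.delete(tmp)         # the staging object is no longer needed
```

Readers of ``path`` never observe the partially committed staging object; they see either the old or the new content.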
Transfer bandwidth efficiency
-----------------------------

Another issue is the volume of data that needs to be transferred
when updating objects.
If multiple clients update an object regularly, each one of
them cannot just send a patch, because the state of the file
on the server is unknown.
A mechanism is needed that can compute delta patches between
two versions. The standard candidate is librsync.
However, content-hashing provides another alternative,
discussed below.
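The content-hashing alternative can be sketched like this: compare the server's block hashmap with the local file's and upload only the blocks that differ. ``server_hashmap`` is assumed to be the list of hex digests the server reports for the object; the helper name is hypothetical::

```python
import hashlib

def changed_blocks(local_file, server_hashmap, blocksize=4 * 1024 * 1024):
    """Yield (index, block) pairs for blocks that differ from the server."""
    with open(local_file, 'rb') as f:
        i = 0
        while True:
            block = f.read(blocksize)
            if not block:
                break
            h = hashlib.sha256(block).hexdigest()
            # A block must be uploaded if it is new or its hash changed.
            if i >= len(server_hashmap) or server_hashmap[i] != h:
                yield i, block
            i += 1
```

Unlike librsync, no per-pair delta computation is needed: any two clients agree on block boundaries, so the hashmap itself is the delta baseline.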
Cheap versions, snapshots and clones
------------------------------------

The Pithos Server supports cheap versions, snapshots and clones
for its hosted objects. First, to clarify the terms:

Versions
    are different objects with the same path,
    identified by an extra version identifier.
    Version identifiers are ordered by the time of the version's creation.

Snapshots
    are immutable copies of objects at a specific point in time,
    archived for future reference.

Clones
    are mutable copies of snapshots or other objects,
    and have their own private history after their creation.

Note that snapshots and clones may have a different path than
their source object, while versions always have the same path.

It is not yet decided if snapshots will be explicitly in the API,
by providing a read-only object type.
It is also not yet decided if clones will be explicitly in the API,
as objects that keep a 'source' reference back to their source object.

Effectively, the only prerequisite for cheap versions, snapshots and
clones is cheap copying of objects. This Pithos Server feature is
designed in concert with content-hashing, explained below.
Pluggable Storage Backend
-------------------------

The Pithos service is destined to run on distributed block storage servers.
Therefore, the Pithos Server design assumes a pluggable storage backend
to a reliable, redundant object storage service.
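A minimal sketch of what such a pluggable backend contract could look like: a simple get/put interface to the underlying object store. The class and method names here are illustrative, not the actual Pithos backend API::

```python
class StorageBackend(object):
    """Illustrative contract a storage backend plugin would implement."""

    def put(self, name, data):
        """Store `data` under `name` in the underlying object store."""
        raise NotImplementedError

    def get(self, name):
        """Retrieve the data stored under `name`."""
        raise NotImplementedError


class MemoryBackend(StorageBackend):
    """Trivial in-memory implementation, useful for tests."""

    def __init__(self):
        self.store = {}

    def put(self, name, data):
        self.store[name] = data

    def get(self, name):
        return self.store[name]
```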
Content-Hashing: The Case for Virtualizing Files
================================================

All of the key target features (except the OOS API) can be well serviced
by a content-hashing/blocking approach to storage.

Raw storage is becoming a basic commodity over the internet,
in many forms, and especially as a 'cloud' service.
It makes sense to virtualize files and separate
the content-storing and -serving layer
from the semantically rich and application-defined file-serving layer.

Content hashing with a consistent blocking (chunking) scheme
inherently provides both universal content identification
and logical separation of file- and content-serving.
Additionally, it offers many benefits while
its costs have minimal impact.

We iterate through the benefits and comment:
* Universal identification

  With the reservation of hash collisions, every file and every block-aligned
  part of a file can be uniquely identified by its hash digest.

* Get data from anyone who might serve it, easy sharing

  Because of universal identification,
  it does not matter where you get the actual contents from,
  just like it (almost) does not matter where you get
  your internet feed from.

  Pithos will seek to exploit that by deploying a
  separate layer for serving blocks.

* Universal storage provisioning, efficient content serving

  Because content-hashing separates the semantically and functionally rich
  file-serving layer from the content-serving layer,
  the actual storage service has a simple get/put interface
  to read-only objects.

  This enables easy deployment of diverse systems as storage backends.
  It also enables the content-serving system to be separate, simpler
  and more performant than the file-serving system.

* Cheap file copies

  Files become maps of hashes, and are "virtualized", in the sense
  that they only contain pointers to the actual content.
  The maps are much smaller (depending on the block size)
  and copying them incurs little overhead compared to copying
  the data.

* Cheap updates

  A small change in a large file will result only in small changes to
  the hashes map, therefore only the relevant blocks need to be uploaded
  and updated within the map.

* Data checksums

  Data reliability is no longer an ignorable issue and we need to
  checksum our data anyway. A content hash is precisely that.

The only drawback we have registered is the overhead of managing
blocks that are no longer used.
However, our (somewhat) educated guess is that the unused blocks will
be only a small percentage of the total.
In any case, our current consensus is that we clean them up
in maintenance operations.
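The benefits above can be demonstrated with a toy content-addressed block store (a sketch, not the hashfiler implementation): blocks are stored under their hash, files are maps of hashes, and copying a file copies only the small map, never the data::

```python
import hashlib

class BlockStore(object):
    """Toy content-addressed store: files are maps of block hashes."""

    def __init__(self):
        self.blocks = {}   # hash -> block data (deduplicated)
        self.files = {}    # path -> list of block hashes (the "map")

    def put_file(self, path, data, blocksize=4):
        hashes = []
        for i in range(0, len(data), blocksize):
            block = data[i:i + blocksize]
            h = hashlib.sha256(block).digest()
            self.blocks[h] = block          # identical blocks stored once
            hashes.append(h)
        self.files[path] = hashes

    def copy_file(self, src, dst):
        # Cheap copy: duplicate the map only, never the blocks.
        self.files[dst] = list(self.files[src])

    def get_file(self, path):
        return b''.join(self.blocks[h] for h in self.files[path])
```

Cheap versions, snapshots and clones all reduce to ``copy_file``, and the per-block hashes double as data checksums.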
Specific Issues
===============

Pseudo- vs real hierarchies
---------------------------

The main difference between real and pseudo-hierarchies is in the
way the namespace is built.
In real hierarchies, like most disk filesystems,
the namespace is built by recursive containment of directories
inside a root directory.
In pseudo-hierarchies, the namespace is flat, and hierarchy is defined
by the lexicographical ordering of the path of each file.
There are two important practical consequences:

- Pseudo-hierarchies can have less overhead and perform faster lookups,
  but cannot move files efficiently.

- Real hierarchies can move files instantly,
  but every lookup must iterate through all parents of the target file.

Since the Pithos Server, through the OpenStack Object Storage API,
has adopted a pseudo-hierarchy for its containers,
it is important not to contradict this design choice
with other, incompatible choices.
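'Virtual' hierarchy listing over such a flat namespace can be sketched as follows: given sorted paths, a prefix and a delimiter, return the objects at that level plus the common prefixes (pseudo-directories), roughly in the spirit of the OOS API's ``prefix``/``delimiter`` listing parameters::

```python
def list_level(paths, prefix='', delimiter='/'):
    """List objects and pseudo-directories directly under `prefix`."""
    objects, prefixes = [], []
    for path in sorted(paths):
        if not path.startswith(prefix):
            continue
        rest = path[len(prefix):]
        if delimiter in rest:
            # Collapse everything below the first delimiter into one prefix.
            p = prefix + rest.split(delimiter, 1)[0] + delimiter
            if p not in prefixes:
                prefixes.append(p)
        else:
            objects.append(path)
    return objects, prefixes
```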
File Properties, Attributes, Features
-------------------------------------

For the purpose of the design of the Pithos Server internals,
we define three types of metadata for files.

Properties
    are intrinsic qualities of files, such as their size or content-type.
    Properties are interpreted and handled specially by the Server code,
    and changing them means fundamentally altering the object.

Attributes
    are key-value pairs attached to each file by the user,
    and are not intrinsic qualities of files like properties.
    The system may interpret some special keys as user input,
    but never considers attributes to be a fundamental
    and trusted quality of a file.
    In the current design,
    file versions do not share one attribute set; each has its own.

X-Features
    are like attributes, but with one key difference and one key limitation.
    Unlike attributes, features are attached to paths, not versions.
    Therefore, each file "inherits" the features that are defined
    by its path, or some prefix of its path.
    The X stands for *exclusive*, which is the limitation of x-features:
    there can never be two X-Feature sets on two overlapping paths.
    Therefore, in order to set a feature set on a path,
    it is necessary to purge features from overlapping paths,
    or the operation fails.
    This limitation greatly reduces the practical overhead for the Server
    to query features and feature inheritance for arbitrary paths,
    but it does not restrict their use nearly as much as it may seem.
    One may even argue that it simplifies things for the users.
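The exclusivity rule can be sketched as follows. This is a toy model, not the hashfiler implementation: ``features`` is a plain dict from paths to feature sets, and the function names are hypothetical::

```python
def set_xfeatures(features, path, featureset, purge=False):
    """Attach a feature set to `path`, enforcing the exclusivity rule."""
    # Overlapping paths: prefixes of `path`, or paths it is a prefix of.
    overlapping = [p for p in features
                   if p.startswith(path) or path.startswith(p)]
    if overlapping and not purge:
        raise ValueError("overlapping x-feature sets: %r" % (overlapping,))
    for p in overlapping:
        del features[p]
    features[path] = featureset

def get_xfeatures(features, path):
    """Return the feature set inherited from `path` or a prefix of it."""
    # Exclusivity guarantees at most one prefix can match.
    for p in features:
        if path.startswith(p):
            return features[p]
    return None
```

Because at most one registered path can ever be a prefix of a given file path, the inheritance lookup never has to merge feature sets.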
Sharing
-------

The basic idea is that sharing is one read-list and one write-list
as x-features of a path, and that users and groups of users may
be specified in each list, granting the corresponding access rights.
Currently, the 'write' permission implies the 'read' one.
More permission types are possible with the addition of relevant lists.
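The permission check implied above (write grants read) can be sketched like this, with ``perms`` as a hypothetical mapping of a path to its read- and write-lists::

```python
def can_write(perms, path, user):
    return user in perms.get(path, {}).get('write', [])

def can_read(perms, path, user):
    # The 'write' permission implies the 'read' one.
    return (user in perms.get(path, {}).get('read', [])
            or can_write(perms, path, user))
```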
b/docs/source/hashfiler.rst

HashFiler
=========

.. automodule:: pithos.lib.hashfiler
   :members:
   :show-inheritance:
   :undoc-members:


Blocker
-------
.. automodule:: pithos.lib.hashfiler.blocker
   :members:
   :show-inheritance:
   :undoc-members:


Mapper
------
.. automodule:: pithos.lib.hashfiler.mapper
   :members:
   :show-inheritance:
   :undoc-members:


AccessController
----------------
.. automodule:: pithos.lib.hashfiler.access_controller
   :members:
   :show-inheritance:
   :undoc-members:


Noder
-----
.. automodule:: pithos.lib.hashfiler.noder
   :members:
   :show-inheritance:
   :undoc-members:


Filer
-----
.. automodule:: pithos.lib.hashfiler.filer
   :members:
   :show-inheritance:
   :undoc-members:
b/docs/source/hashfiler_backend.rst

HashFilerBack
=============

.. automodule:: pithos.backends.hashfilerback
   :members:
   :show-inheritance:
   :undoc-members:
b/docs/source/index.rst
@@ -6,9 +6,11 @@
 .. toctree::
    :maxdepth: 2
 
+   design
    devguide
    adminguide
    backends
+   hashfiler
 
 
 Indices and tables
b/pithos/backends/__init__.py
@@ -34,6 +34,7 @@
 from django.conf import settings
 
 from simple import SimpleBackend
+from hashfilerback import HashFilerBack
 
 backend = None
 options = getattr(settings, 'BACKEND', None)
b/pithos/backends/hashfilerback.py

# Copyright 2011 GRNET S.A. All rights reserved.
#
# Redistribution and use in source and binary forms, with or
# without modification, are permitted provided that the following
# conditions are met:
#
#   1. Redistributions of source code must retain the above
#      copyright notice, this list of conditions and the following
#      disclaimer.
#
#   2. Redistributions in binary form must reproduce the above
#      copyright notice, this list of conditions and the following
#      disclaimer in the documentation and/or other materials
#      provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY GRNET S.A. ``AS IS'' AND ANY EXPRESS
# OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL GRNET S.A OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
# AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#
# The views and conclusions contained in the software and
# documentation are those of the authors and should not be
# interpreted as representing official policies, either expressed
# or implied, of GRNET S.A.
from pithos.lib.hashfiler import (Filer, ROOTNODE,
                                  SERIAL, PARENT, PATH, SIZE, POPULATION,
                                  POPSIZE, SOURCE, MTIME, CLUSTER)
from base import NotAllowedError
from binascii import hexlify, unhexlify
from os import makedirs, mkdir
from os.path import exists, isdir, split
from itertools import chain
from collections import defaultdict
from heapq import merge

debug = 0

READ = 0
WRITE = 1

MAIN_CLUSTER = 0
TRASH_CLUSTER = 1
def backend_method(func=None, name=None, autocommit=1):
    if func is None:
        def fn(func):
            return backend_method(func, name=name, autocommit=autocommit)
        return fn

    name = func.__name__ if name is None else name
    thefunc = None
    if debug:
        def fn(self, *args, **kw):
            print "%s %s %s" % (name, args, kw)
            return func(self, *args, **kw)
        thefunc = fn
    else:
        thefunc = func
    #XXX: Can't set __doc__ in methods?!
    #
    #doc = func.__doc__
    #if doc is None:
    #    doc = ''
    #func.__doc__ = getattr(BaseBackend, name).__doc__ + '\n***\n' + doc
    if not autocommit:
        return func

    def method(self, *args, **kw):
        with self.hf:
            return thefunc(self, *args, **kw)

    return method
class HashFilerBack(object):
    """Backend for Pithos on top of the hashfiler library.

    The purpose of the backend is to provide the necessary functions for handling data
    and metadata. It is responsible for the actual storage and retrieval of information.

    Note that the account level is always valid as it is checked from another subsystem.

    The following variables should be available:
        'hash_algorithm': Suggested is 'sha256'
        'block_size': Suggested is 4MB
    """

    def __init__(self, db):
        head, tail = split(db)
        if not exists(head):
            makedirs(head)
        if not isdir(head):
            raise RuntimeError("Cannot open database at '%s'" % (db,))

        blocksize = 4*1024*1024
        hashtype = 'sha256'
        root = head + '/root'
        if not exists(root):
            mkdir(root)
        db = head + '/db'
        params = {'filepath': db,
                  'mappath': root,
                  'blockpath': root,
                  'hashtype': hashtype,
                  'blocksize': blocksize}

        hf = Filer(**params)
        self.hf = hf
        self.block_size = blocksize
        self.hash_algorithm = hashtype

    def quit(self):
        self.hf.quit()
    def get_account(self, account, root=ROOTNODE):
        #XXX: how are accounts created?
        r = self.hf.file_lookup(root, account, create=1)
        if r is None:
            raise NameError("account '%s' not found" % (account,))
        return r[0]

    def get_container(self, serial, name):
        r = self.hf.file_lookup(serial, name, create=0)
        if r is None:
            raise NameError("container '%s' not found" % (name,))
        return r[0]

    def get_file(self, serial, path):
        r = self.hf.file_lookup(serial, path, create=0)
        if r is None:
            raise NameError("file '%s' not found" % (path,))
        return r

    @staticmethod
    def insert_properties(meta, props):
        (serial, parent, path, size,
         population, popsize, source, mtime, cluster) = props
        meta['version'] = serial
        meta['name'] = path
        meta['bytes'] = size + popsize
        meta['count'] = population
        meta['created'] = mtime
        meta['modified'] = mtime
        return meta

    def lookup_file(self, account, container, name,
                    create=0, size=0, authorize=0, before=None):
        # XXX: account/container lookup cache?
        serial = self.get_account(account)
        serial = self.get_container(serial, container)
        if before is None:
            before = float('inf')
        props = self.hf.file_lookup(serial, name,
                                    create=create, size=size, before=before)
        if props is None:
            raise NameError("File '%s/%s/%s' does not exist" %
                            (account, container, name))
        return props
    def can_read(self, user, account, container, path):
        path = "%s:%s:%s" % (account, container, path)
        check = self.hf.access_check
        return check(READ, path, user) or check(WRITE, path, user)

    def can_write(self, user, account, container, path):
        path = "%s:%s:%s" % (account, container, path)
        check = self.hf.access_check
        return check(WRITE, path, user)
    @backend_method
    def get_account_meta(self, user, account, until=None):
        if user != account:
            raise NotAllowedError

        hf = self.hf
        before = float("inf") if until is None else until
        props = hf.file_lookup(ROOTNODE, account, create=0, before=before)
        if props is None:
            #XXX: why not return error?
            meta = {'name': account,
                    'count': 0,
                    'bytes': 0}
            if until is not None:
                meta['until_timestamp'] = 0
            return meta

        serial = props[SERIAL]
        meta = dict(hf.file_get_attributes(serial))
        meta['name'] = account
        self.insert_properties(meta, props)
        if until is not None:
            meta['until_timestamp'] = props[MTIME]
        return meta

    @backend_method
    def update_account_meta(self, user, account, meta, replace=False):
        if user != account:
            raise NotAllowedError

        hf = self.hf
        serial = self.get_account(account)
        if replace:
            hf.file_del_attributes(serial)

        hf.file_set_attributes(serial, meta.iteritems())

    @backend_method
    def get_account_groups(self, user, account):
        if user != account:
            raise NotAllowedError

        groups = defaultdict(list)
        grouplist = self.hf.group_list("%s:" % user)
        for name, member in grouplist:
            groups[name].append(member)

        return groups

    @backend_method
    def update_account_groups(self, user, account, groups, replace=False):
        if user != account:
            raise NotAllowedError

        hf = self.hf
        group_addmany = hf.group_addmany
        for name, members in groups:
            name = "%s:%s" % (user, name)
            if not members:
                members = (name,)
            group_addmany(name, members)

        if not replace:
            return

        groupnames = hf.group_list("%s:" % user)
        group_destroy = hf.group_destroy
        for name in groupnames:
            if name not in groups:
                group_destroy(name)
    @backend_method
    def delete_account(self, user, account):
        if user != account:
            raise NotAllowedError

        hf = self.hf
        r = hf.file_lookup(ROOTNODE, account)
        if r is None or r[SERIAL] is None:
            raise NameError

        if r[POPULATION]:
            raise IndexError

        hf.file_remove(r[SERIAL])

    @backend_method
    def put_container(self, user, account, name, policy=None):
        if user != account:
            raise NotAllowedError

        parent = self.get_account(account)
        hf = self.hf
        r = hf.file_lookup(parent, name)
        if r is not None:
            raise NameError("%s:%s already exists" % (account, name))

        serial, mtime = hf.file_create(parent, name, 0)
        return serial, mtime

    @backend_method
    def delete_container(self, user, account, name):
        if user != account:
            raise NotAllowedError

        serial = self.get_account(account)
        serial = self.get_container(serial, name)
        r = self.hf.file_remove(serial)
        if r is None:
            raise AssertionError("This should not have happened.")
        elif r == 0:
            #XXX: if container has only trashed objects, is it empty or not?
            msg = "'%s:%s' not empty. will not delete." % (account, name)
            raise IndexError(msg)

    @backend_method
    def get_container_meta(self, user, account, name, until=None):
        if user != account:
            raise NotAllowedError

        serial = self.get_account(account)
        hf = self.hf
        before = float('inf') if until is None else until
        props = hf.file_lookup(serial, name, before=before)
        if props is None:
            raise NameError("%s:%s does not exist" % (account, name))
        serial = props[0]

        r = hf.file_get_attributes(serial)
        meta = dict(r)
        self.insert_properties(meta, props)
        if until:
            meta['until_timestamp'] = props[MTIME]
        return meta
    @backend_method
    def update_container_meta(self, user, account, name, meta, replace=False):
        if user != account:
            raise NotAllowedError

        serial = self.get_account(account)
        serial = self.get_container(serial, name)
        hf = self.hf
        if replace:
            hf.file_del_attributes(serial)

        hf.file_set_attributes(serial, meta.iteritems())

    @backend_method
    def get_container_policy(self, user, account, container):
        return {}

    @backend_method
    def update_container_policy(self, user, account,
                                container, policy, replace=0):
        return

    @backend_method
    def list_containers(self, user, account,
                        marker=None, limit=10000, until=None):
        if not marker:
            marker = ''
        if limit is None:
            limit = -1
        serial = self.get_account(account)
        r = self.hf.file_list(serial, '', start=marker, limit=limit)
        objects, prefixes = r
        return [(o[PATH], o[SERIAL]) for o in objects]
    @backend_method
    def list_objects(self, user, account, container, prefix='',
                     delimiter=None, marker=None, limit=10000,
                     virtual=True, keys=[], until=None):
        if user != account:
            raise NotAllowedError

        if not marker:
            #XXX: '' is a valid string, should be None instead
            marker = None
        if limit is None:
            #XXX: limit should be an integer
            limit = -1

        if not delimiter:
            delimiter = None

        serial = self.get_account(account)
        serial = self.get_container(serial, container)
        hf = self.hf
        filterq = ','.join(keys) if keys else None
        r = hf.file_list(serial, prefix=prefix, start=marker,
                         versions=0, filterq=filterq,
                         delimiter=delimiter, limit=limit)
        objects, prefixes = r
        # XXX: virtual is not needed since we get them anyway
        #if virtual:
        #    return [(p, None) for p in prefixes]

        objects = ((o[PATH], o[MTIME]) for o in objects)
        prefixes = ((p, None) for p in prefixes) if virtual else ()
        return list(merge(objects, prefixes))

    @backend_method
    def list_object_meta(self, user, account, container, until=None):
        serial = self.get_account(account)
        serial = self.get_container(serial, container)
        return self.hf.file_list_attributes(serial)
    @backend_method
    def get_object_meta(self, user, account, container, name, until=None):
        props = self.lookup_file(account, container, name,
                                 before=until, authorize=2)
        serial = props[0]
        meta = dict(self.hf.file_get_attributes(serial))
        self.insert_properties(meta, props)
        meta['version_timestamp'] = props[MTIME]
        meta['modified_by'] = user  #XXX: record the actual modifying user
        return meta
    @backend_method
    def update_object_meta(self, user, account, container,
                           name, meta, replace=False):
        props = self.lookup_file(account, container, name, authorize=1)
        serial = props[0]
        hf = self.hf
        if replace:
            hf.file_del_attributes(serial)
        hf.file_set_attributes(serial, meta.iteritems())

    @backend_method
    def get_object_permissions(self, user, account, container, name):
        path = "%s/%s/%s" % (account, container, name)
        perms = defaultdict(list)
        for access, ident in self.hf.access_list(path):
            perms[access].append(ident)
        perms['read'] = perms.pop(READ, [])
        perms['write'] = perms.pop(WRITE, [])
        return path, perms
    @backend_method
    def update_object_permissions(self, user, account, container, name, permissions):
        if user != account:
            raise NotAllowedError

        hf = self.hf
        path = "%s/%s/%s" % (account, container, name)
        feature = hf.feature_create()
        hf.access_grant(READ, path, members=permissions['read'])
        hf.access_grant(WRITE, path, members=permissions['write'])
        r = hf.xfeature_bestow(path, feature)
        assert(not r)
    @backend_method
    def get_object_public(self, user, account, container, name):
        """Return the public URL of the object if applicable."""
        return None

    @backend_method
    def update_object_public(self, user, account, container, name, public):
        """Update the public status of the object."""
        return

    @backend_method
    def get_object_hashmap(self, user, account, container, name, version=None):
        hf = self.hf
        if version is None:
            props = self.lookup_file(account, container, name, authorize=2)
        else:
            props = hf.file_get_properties(version)
        serial = props[SERIAL]
        size = props[SIZE]
        hashes = [hexlify(h) for h in hf.map_retr(serial)]
        return size, hashes

    @backend_method
    def update_object_hashmap(self, user, account, container,
                              name, size, hashmap, meta=None,
                              replace_meta=0, permissions=None):
        hf = self.hf
        props = self.lookup_file(account, container, name,
                                 create=1, size=size, authorize=1)
        serial = props[SERIAL]
        if size != props[SIZE]:
            hf.file_increment(serial, size)
        hf.map_stor(serial, (unhexlify(h) for h in hashmap))
        if replace_meta:
            hf.file_del_attributes(serial)
        if meta:
            hf.file_set_attributes(serial, meta.iteritems())

    @backend_method
    def copy_object(self, user, account, src_container, src_name,
                    dest_container, dest_name, dest_meta={},
                    replace_meta=False, permissions=None, src_version=None):
        hf = self.hf
        #FIXME: authorization
        acc = self.get_account(account)
        if src_version:
            props = hf.file_get_properties(src_version)
        else:
            srccont = self.get_container(acc, src_container)
            props = hf.file_lookup(srccont, src_name)
            if props is None:
                raise NameError
        srcobj = props[SERIAL]

        dstcont = self.get_container(acc, dest_container)
        serial, mtime = hf.file_copy(srcobj, dstcont, dest_name)
        if replace_meta:
            hf.file_del_attributes(serial)
        hf.file_set_attributes(serial, dest_meta.iteritems())

    @backend_method
    def move_object(self, user, account, src_container, src_name,
                    dest_container, dest_name, dest_meta={},
                    replace_meta=False, permissions=None, src_version=None):
        hf = self.hf
        #FIXME: authorization
        acc = self.get_account(account)
        if src_version:
            props = hf.file_get_properties(src_version)
        else:
            srccont = self.get_container(acc, src_container)
            props = hf.file_lookup(srccont, src_name)
            if props is None:
                raise NameError
        srcobj = props[SERIAL]

        dstcont = self.get_container(acc, dest_container)
        serial, mtime = hf.file_copy(srcobj, dstcont, dest_name)
        if replace_meta:
            hf.file_del_attributes(serial)
        hf.file_set_attributes(serial, dest_meta.iteritems())
        hf.file_remove(srcobj)

    @backend_method
    def delete_object(self, user, account, container, name):
        if user != account:
            raise NotAllowedError

        props = self.lookup_file(account, container, name, authorize=WRITE)
|
520 |
serial = props[SERIAL] |
|
521 |
hf = self.hf |
|
522 |
#XXX: see delete_container, for empty-trashed ambiguty |
|
523 |
# for now, we will just remove the object |
|
524 |
hf.file_remove(serial) |
|
525 |
#hf.file_recluster(serial, TRASH_CLUSTER) |
|
526 |
#XXX: Where to unset sharing? on file trash or on trash purge? |
|
527 |
# We'll do it on file trash, but should we trash all versions? |
|
528 |
hf.xfeature_disinherit("%s/%s/%s" % (account, container, name)) |
|
529 |
|
|
530 |
@backend_method |
|
531 |
def purge_trash(self, user, acount, container): |
|
532 |
if user != account: |
|
533 |
raise NotAllowedError |
|
534 |
|
|
535 |
serial = self.get_account(account) |
|
536 |
serial = self.get_container(serial, container) |
|
537 |
self.hf.file_purge_cluster(serial, cluster=TRASH_CLUSTER) |
|
538 |
|
|
539 |
@backend_method(autocommit=0) |
|
540 |
def get_block(self, blkhash): |
|
541 |
blocks = self.hf.block_retr((unhexlify(blkhash),)) |
|
542 |
if not blocks: |
|
543 |
raise NameError |
|
544 |
return blocks[0] |
|
545 |
|
|
546 |
@backend_method(autocommit=0) |
|
547 |
def put_block(self, data): |
|
548 |
hashes, absent = self.hf.block_stor((data,)) |
|
549 |
#XXX: why hexlify hashes? |
|
550 |
return hexlify(hashes[0]) |
|
551 |
|
|
552 |
@backend_method(autocommit=0) |
|
553 |
def update_block(self, blkhash, data, offset=0): |
|
554 |
h, e = self.hf.block_delta(unhexlify(blkhash), ((offset, data),)) |
|
555 |
return hexlify(h) |
|
556 |
|
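The #XXX comment above asks why hashes are hexlified at this boundary; a minimal standalone sketch of the convention (raw digests stay inside the store, printable hex strings cross the API):

```python
from binascii import hexlify, unhexlify

# Raw digests are compact bytes; hexlify turns them into printable
# strings that travel safely in URLs, JSON, and logs, and unhexlify
# recovers the raw bytes on the way back into the store.
raw = b"\x01\xab\xff"
hexed = hexlify(raw)            # b'01abff'
assert unhexlify(hexed) == raw  # lossless round-trip
```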
b/pithos/backends/tests.py

     import types
     import json

    -from simple import SimpleBackend
    +from hashfilerback import HashFilerBack as TheBackend
    +#from simple import SimpleBackend as TheBackend


     class TestAccount(unittest.TestCase):
         def setUp(self):
             self.basepath = './test/content'
    -        self.b = SimpleBackend(self.basepath)
    +        self.b = TheBackend(self.basepath)
             self.account = 'test'

         def tearDown(self):
    ...
     class TestContainer(unittest.TestCase):
         def setUp(self):
             self.basepath = './test/content'
    -        self.b = SimpleBackend(self.basepath)
    +        self.b = TheBackend(self.basepath)
             self.account = 'test'

         def tearDown(self):
    ...
     class TestObject(unittest.TestCase):
         def setUp(self):
             self.basepath = './test/content'
    -        self.b = SimpleBackend(self.basepath)
    +        self.b = TheBackend(self.basepath)
             self.account = 'test'

         def tearDown(self):
    ...
             self.assertRaises(NameError, self.b.update_object_meta, 'test', self.account, cname, name, {})

     if __name__ == "__main__":
         unittest.main()
b/pithos/lib/hashfiler/__init__.py

    from blocker import Blocker
    from mapper import Mapper
    from noder import (Noder, inf, strnextling, ROOTNODE,
                       SERIAL, PARENT, PATH, SIZE, POPULATION,
                       POPSIZE, SOURCE, MTIME, CLUSTER)
    from access_controller import AccessController
    from filer import Filer
b/pithos/lib/hashfiler/access_controller.py

    from pithos.lib.hashfiler import strnextling

    class AccessController(object):

        def __init__(self, **params):
            self.params = params
            conn = params['connection']
            cur = params['cursor']
            execute = cur.execute
            self.execute = execute
            self.executemany = cur.executemany
            self.fetchone = cur.fetchone
            self.fetchall = cur.fetchall
            self.cur = cur
            self.conn = conn

            execute(""" create table if not exists xfeatures
                        ( path    text,
                          feature integer,
                          primary key (path) ) """)
            execute(""" create unique index if not exists idx_feature
                        on xfeatures(feature, path) """)

            execute(""" create table if not exists ftvals
                        ( feature integer,
                          key     integer,
                          value   text,
                          primary key (feature, key, value) ) """)

            execute(""" create table if not exists members
                        ( name   text,
                          member text,
                          primary key (name, member) ) """)

            execute(""" create index if not exists idx_members
                        on members(member, name) """)

        def xfeature_inherit(self, path):
            """Return the feature inherited by the path, or None."""
            q = ("select path, feature from xfeatures "
                 "where path <= ? "
                 "order by path desc limit 1")
            self.execute(q, (path,))
            r = self.fetchone()
            if r is None:
                return r

            if path.startswith(r[0]):
                return r

            return None
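The single-row query in xfeature_inherit deserves a note: because a prefix sorts immediately before every string it prefixes, the greatest stored path that sorts at or before the lookup path is the only possible inherited prefix, so one indexed lookup suffices. A standalone sketch of the trick, with hypothetical table contents:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table xfeatures (path text primary key, feature integer)")
cur.executemany("insert into xfeatures values (?, ?)",
                [("acc/photos", 1), ("acc/music", 2)])

def inherit(path):
    # the greatest stored path sorting at or before the query path...
    cur.execute("select path, feature from xfeatures "
                "where path <= ? order by path desc limit 1", (path,))
    r = cur.fetchone()
    # ...is inherited only if it is actually a prefix of the query path
    return r if r is not None and path.startswith(r[0]) else None
```

This matches the code's follow-up `startswith` check: the SQL alone can return a near miss that merely sorts before the path without prefixing it.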
        def xfeature_list(self, path):
            """Return the list of (prefix, feature) pairs matching path.
            A prefix matches path if either the prefix includes the path,
            or the path includes the prefix.
            """
            inherited = self.xfeature_inherit(path)
            if inherited:
                return [inherited]

            q = ("select path, feature from xfeatures "
                 "where path > ? order by path")
            self.execute(q, (path,))
            fetchone = self.fetchone
            prefixes = []
            append = prefixes.append
            while 1:
                r = fetchone()
                if r is None:
                    break
                if not r[0].startswith(path):
                    break
                append(r)

            return prefixes

        def xfeature_bestow(self, path, feature):
            """Bestow a feature to a path.
            If the path already inherits a feature or
            bestows to paths already inheriting a feature,
            bestow no feature and return the list of matching paths.
            Otherwise, bestow the feature and return an empty iterable.
            """
            prefixes = self.xfeature_list(path)
            pl = len(prefixes)
            if (pl > 1) or (pl == 1 and prefixes[0][0] != path):
                return prefixes
            q = "insert or replace into xfeatures values (?, ?)"
            self.execute(q, (path, feature))
            return ()

        def xfeature_disinherit(self, path):
            """Remove the path and the feature it bestows.
            If the path already inherits a feature or
            bestows to paths already inheriting a feature,
            remove nothing and return the list of matching paths.
            Otherwise, remove the path and return an empty iterable.
            """
            prefixes = self.xfeature_list(path)
            if len(prefixes) != 1 or prefixes[0][0] != path:
                return prefixes

            q = "delete from xfeatures where path = ?"
            self.execute(q, (path,))
            return ()

        def feature_create(self):
            """Create an empty feature and return its identifier."""
            return self.alloc_serial()

        def feature_destroy(self, feature):
            """Destroy a feature and all its key, value pairs."""
            q = "delete from ftvals where feature = ?"
            self.execute(q, (feature,))

        def feature_list(self, feature):
            """Return the list of all key, value pairs
            associated with a feature.
            """
            q = "select key, value from ftvals where feature = ?"
            self.execute(q, (feature,))
            return self.fetchall()

        def feature_set(self, feature, key, value):
            """Associate a key, value pair with a feature."""
            q = "insert into ftvals values (?, ?, ?)"
            self.execute(q, (feature, key, value))

        def feature_setmany(self, feature, key, values):
            """Associate the given key and values with a feature."""
            q = "insert into ftvals values (?, ?, ?)"
            self.executemany(q, ((feature, key, v) for v in values))

        def feature_unset(self, feature, key, value):
            """Disassociate a key, value pair from a feature."""
            q = ("delete from ftvals where "
                 "feature = ? and key = ? and value = ?")
            self.execute(q, (feature, key, value))

        def feature_unsetmany(self, feature, key, values):
            """Disassociate the key from the given values for a feature."""
            q = ("delete from ftvals where "
                 "feature = ? and key = ? and value = ?")
            self.executemany(q, ((feature, key, v) for v in values))

        def feature_get(self, feature, key):
            """Return the list of values for a key of a feature."""
            q = "select value from ftvals where feature = ? and key = ?"
            self.execute(q, (feature, key))
            return [r[0] for r in self.fetchall()]

        def feature_clear(self, feature, key):
            """Delete all key, value pairs for a key of a feature."""
            q = "delete from ftvals where feature = ? and key = ?"
            self.execute(q, (feature, key))

        def group_names(self, prefix):
            """List the distinct group names beginning with prefix."""
            stop = strnextling(prefix)
            q = "select distinct name from members where name between ? and ?"
            self.execute(q, (prefix, stop))
            return [r[0] for r in self.fetchall()]

        def group_list(self, prefix):
            """List all (name, member) pairs for groups
            whose name begins with prefix.
            """
            stop = strnextling(prefix)
            q = "select name, member from members where name between ? and ?"
            self.execute(q, (prefix, stop))
            return self.fetchall()
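The prefix listings above lean on strnextling, which is imported from noder and not shown in this diff; the assumed behavior sketched below is that it returns the smallest string sorting strictly after every string beginning with the prefix, turning a prefix match into a single range scan:

```python
# Simplified stand-in for strnextling (assumed behavior): bump the last
# character. The real helper must also handle edge cases such as a
# trailing maximal character, which this sketch ignores.
def strnextling(prefix):
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)

names = ["alumni", "staff", "staff:devs", "students"]
prefix = "staff"
stop = strnextling(prefix)  # "stafg"
# prefix <= n < stop selects exactly the prefixed names
matched = [n for n in names if prefix <= n < stop]
```

An index on the name column then serves the `between ? and ?` query without scanning non-matching rows.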
        def group_add(self, group, member):
            """Add a member to a group."""
            q = "insert or ignore into members values (?, ?)"
            self.execute(q, (group, member))

        def group_addmany(self, group, members):
            """Add members to a group."""
            q = "insert or ignore into members values (?, ?)"
            self.executemany(q, ((group, member) for member in members))

        def group_destroy(self, group):
            """Destroy a group."""
            q = "delete from members where name = ?"
            self.execute(q, (group,))

        def group_del(self, group, member):
            """Delete a member from a group."""
            q = "delete from members where name = ? and member = ?"
            self.execute(q, (group, member))

        def group_clear(self, group):
            """Clear the members of a group."""
            q = "delete from members where name = ?"
            self.execute(q, (group,))

        def group_members(self, group):
            """Return the list of members of a group."""
            q = "select member from members where name = ?"
            self.execute(q, (group,))
            return [r[0] for r in self.fetchall()]

        def group_check(self, group, member):
            """Check if a member is in a group."""
            q = "select 1 from members where name = ? and member = ?"
            self.execute(q, (group, member))
            return bool(self.fetchone())

        def group_parents(self, member):
            """Return a list with the groups that contain a member."""
            q = "select name from members where member = ?"
            self.execute(q, (member,))
            return [r[0] for r in self.fetchall()]

        def access_grant(self, access, path, member='all', members=()):
            """Grant a member (or members) access to a path."""
            xfeatures = self.xfeature_list(path)
            xfl = len(xfeatures)
            if xfl > 1 or (xfl == 1 and xfeatures[0][0] != path):
                return xfeatures
            if xfl == 0:
                feature = self.alloc_serial()
                self.xfeature_bestow(path, feature)
            else:
                fpath, feature = xfeatures[0]

            if members:
                self.feature_setmany(feature, access, members)
            else:
                self.feature_set(feature, access, member)

            return ()

        def access_revoke(self, access, path, member='all', members=()):
            """Revoke access to path from members.
            Note that this will not revoke access for members
            that are indirectly granted access through group membership.
            """
            # XXX: Maybe provide a force_revoke that will kick out
            # all groups containing the given members?
            xfeatures = self.xfeature_list(path)
            xfl = len(xfeatures)
            if xfl != 1 or xfeatures[0][0] != path:
                return xfeatures

            fpath, feature = xfeatures[0]

            if members:
                self.feature_unsetmany(feature, access, members)
            else:
                self.feature_unset(feature, access, member)

            # XXX: provide a meaningful return value?

            return ()

        def access_check(self, access, path, member):
            """Return true if the member has this access to the path."""
            r = self.xfeature_inherit(path)
            if not r:
                return 0

            fpath, feature = r
            memberset = set(self.feature_get(feature, access))
            if member in memberset:
                return 1

            for group in self.group_parents(member):
                if group in memberset:
                    return 1

            return 0

        def access_list(self, path):
            """Return the list of (access, member) pairs for the path."""
            r = self.xfeature_inherit(path)
            if not r:
                return ()

            fpath, feature = r
            return self.feature_list(feature)

        def access_list_paths(self, member):
            """Return the list of (access, path) pairs granted to member."""
            q = ("select distinct key, path from xfeatures inner join "
                 " (select distinct feature, key from ftvals inner join "
                 "  (select name as value from members "
                 "   where member = ? union select ?) "
                 "  using (value)) "
                 "using (feature)")

            self.execute(q, (member, member))
            return self.fetchall()
b/pithos/lib/hashfiler/blocker.py

    #!/usr/bin/env python

    from sqlite3 import connect, OperationalError
    from os import chdir, makedirs, fsync, SEEK_CUR, SEEK_SET, unlink
    from os.path import isdir, realpath, exists, join
    from hashlib import new as newhasher
    from binascii import hexlify, unhexlify
    from time import time

    from pithos.lib.hashfiler.context_file import ContextFile, file_sync_read_chunks


    class Blocker(object):
        """Blocker.
        Required constructor parameters: blocksize, blockpath, hashtype.
        """

        blocksize = None
        blockpath = None
        hashtype = None

        def __init__(self, **params):
            blocksize = params['blocksize']
            blockpath = params['blockpath']
            blockpath = realpath(blockpath)
            if not isdir(blockpath):
                if not exists(blockpath):
                    makedirs(blockpath)
                else:
                    raise ValueError("blockpath '%s' is not a directory" % (blockpath,))

            hashtype = params['hashtype']
            try:
                hasher = newhasher(hashtype)
            except ValueError:
                msg = "hashtype '%s' is not available from hashlib"
                raise ValueError(msg % (hashtype,))

            hasher.update("")
            emptyhash = hasher.digest()

            self.blocksize = blocksize
            self.blockpath = blockpath
            self.hashtype = hashtype
            self.hashlen = len(emptyhash)
            self.emptyhash = emptyhash

        def get_rear_block(self, blkhash, create=0):
            name = join(self.blockpath, hexlify(blkhash))
            return ContextFile(name, create)

        def check_rear_block(self, blkhash):
            name = join(self.blockpath, hexlify(blkhash))
            return exists(name)

        def block_hash(self, data):
            """Hash a block of data, ignoring trailing NUL padding."""
            hasher = newhasher(self.hashtype)
            hasher.update(data.rstrip('\x00'))
            return hasher.digest()
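block_hash strips trailing NULs before hashing. A short sketch of the invariant this buys (sha256 stands in for the configured hashtype): a short final block hashes identically whether or not it was padded out to a full blocksize.

```python
from hashlib import sha256

def block_hash(data):
    # trailing NUL padding is ignored, mirroring Blocker.block_hash
    return sha256(data.rstrip(b"\x00")).digest()

padded = b"tail" + b"\x00" * 60  # short block padded toward blocksize
assert block_hash(b"tail") == block_hash(padded)
```

The trade-off is that blocks differing only in trailing NULs deduplicate to the same stored block.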
        def block_ping(self, hashes):
            """Check hashes for existence and
            return those missing from block storage.
            """
            missing = []
            append = missing.append
            for i, h in enumerate(hashes):
                if not self.check_rear_block(h):
                    append(i)
            return missing

        def block_retr(self, hashes):
            """Retrieve blocks from storage by their hashes."""
            blocksize = self.blocksize
            blocks = []
            append = blocks.append
            block = None

            for h in hashes:
                with self.get_rear_block(h, 0) as rbl:
                    if not rbl:
                        break
                    for block in rbl.sync_read_chunks(blocksize, 1, 0):
                        break  # there should be just one block there
                if not block:
                    break
                append(block)

            return blocks

        def block_stor(self, blocklist):
            """Store a bunch of blocks and return (hashes, missing).
            Hashes is a list of the hashes of the blocks,
            missing is a list of indices in that list indicating
            which blocks were missing from the store.
            """
            block_hash = self.block_hash
            hashlist = [block_hash(b) for b in blocklist]
            missing = self.block_ping(hashlist)
            for i in missing:
                with self.get_rear_block(hashlist[i], 1) as rbl:
                    rbl.sync_write(blocklist[i])  #XXX: verify?

            return hashlist, missing
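The hash-ping-store sequence in block_stor is what makes the store content-addressed and deduplicating. A minimal in-memory sketch of the same contract, with a dict standing in for the on-disk rear blocks:

```python
from hashlib import sha256

store = {}  # hypothetical stand-in for the rear block files

def block_stor(blocklist):
    # hash every block, probe the store, write only the missing ones
    hashes = [sha256(b).digest() for b in blocklist]
    missing = [i for i, h in enumerate(hashes) if h not in store]
    for i in missing:
        store[hashes[i]] = blocklist[i]
    return hashes, missing

h1, m1 = block_stor([b"alpha", b"beta"])  # both new: missing == [0, 1]
h2, m2 = block_stor([b"beta", b"gamma"])  # "beta" deduplicated: missing == [1]
```

Identical content always maps to the same key, so re-uploading a block costs one existence check instead of a write.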
        def block_delta(self, blkhash, offdata=()):
            """Construct and store a new block from a given block
            and a list of (offset, data) 'patches'. Return the hash
            of the new block, and 1 if it was newly stored
            (0 if it already existed).
            """
            if not offdata:
                return None, None

            blocksize = self.blocksize
            block = self.block_retr((blkhash,))
            if not block:
                return None, None

            block = block[0]
            newblock = ''
            idx = 0
            size = 0
            trunc = 0
            for off, data in offdata:
                if not data:
                    trunc = 1
                    break
                newblock += block[idx:off] + data
                size += off - idx + len(data)
                if size >= blocksize:
                    break
                off = size

            if not trunc:
                newblock += block[size:len(block)]

            h, a = self.block_stor((newblock,))
            return h[0], 1 if a else 0

        def block_hash_file(self, openfile):
            """Return the list of hashes (hashes map)
            for the blocks in a buffered file.
            Helper method, does not affect store.
            """
            hashes = []
            append = hashes.append
            block_hash = self.block_hash

            for block in file_sync_read_chunks(openfile, self.blocksize, 1, 0):
                append(block_hash(block))

            return hashes

        def block_stor_file(self, openfile):
            """Read blocks from buffered file object and store them. Return:
            (bytes read, list of hashes, list of hashes that were missing)
            """
            blocksize = self.blocksize
            block_stor = self.block_stor
            hashlist = []
            hextend = hashlist.extend
            storedlist = []
            sextend = storedlist.extend
            lastsize = 0

            for block in file_sync_read_chunks(openfile, blocksize, 1, 0):
                hl, sl = block_stor((block,))
                hextend(hl)
                sextend(sl)
                lastsize = len(block)

            size = (len(hashlist) - 1) * blocksize + lastsize if hashlist else 0
            return size, hashlist, storedlist
|
Also available in: Unified diff