Aquarium Development Guide
==========================

The development guide includes descriptions of the APIs and extension points
offered by Aquarium. It also includes design and development setup information.

Aquarium's architectural design is mainly driven by two requirements: scaling
and fault tolerance. Aquarium's functionality is based on event sourcing.
`Event sourcing <http://en.wikipedia.org/wiki/Domain-driven_design>`_
assumes that all changes to application state are stored as a
sequence of events in an immutable log. With such a log at hand, a system can
rebuild the current application state by replaying the events in order. The event
sourcing design pattern has some very interesting properties, which make it
particularly suitable as a basis for Aquarium:

- Multiple models can be used to process the events concurrently. This means
  that Aquarium can provide a limited data view to its REST API and a more
  detailed one to a helpdesk frontend.
- It is possible to perform queries on past system states by stopping the event
  replay at a certain point of interest. This would prove very useful for a
  future debugging interface.
- In a carefully implemented event sourcing system, application crashes are not
  destructive, as long as event replay is fast enough and no state is inserted
  into the application without being recorded to the event log first.
- After event log replay, new events only cause updates in the system’s
  in-memory state, which can be done very fast.
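
The replay idea described above can be illustrated with a minimal sketch. It
is not Aquarium code: the ``Event`` record, its field names and the ``replay``
function are hypothetical and only show how state can be rebuilt from an
ordered log, and how stopping the replay early yields a past state.

.. code-block:: python

  from dataclasses import dataclass


  @dataclass(frozen=True)
  class Event:
      """Hypothetical event record; field names are illustrative only."""
      user_id: str
      credits_delta: float


  def replay(events, upto=None):
      """Rebuild per-user balances by folding over the log in order.

      `upto` stops the replay early, giving a view of a past state.
      """
      balances = {}
      for event in events[:upto]:
          balances[event.user_id] = balances.get(event.user_id, 0.0) + event.credits_delta
      return balances


  log = [
      Event("alice", 100.0),
      Event("alice", -30.0),
      Event("bob", 50.0),
  ]

  current_state = replay(log)        # state after all events
  past_state = replay(log, upto=1)   # state after the first event only

Because the log is never modified, both views can be computed from the same
data at any time.
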

An overview of the Aquarium architecture is presented in the figure above. The
system is modeled as a collection of logically and functionally isolated
components, which communicate by message passing. Within each component, a
number of actors concurrently process incoming messages, which arrive through
a load balancer that acts as the gateway for requests targeted at the
component. Each component is also monitored by its own supervisor; should an
actor fail, the supervisor will automatically restart it. The architecture
allows certain application paths to fail individually while the system remains
responsive, while also enabling future distribution of multiple components on

The system receives input mainly from two sources: a queue for resource and
user events and a REST API for credits and resource state queries. The queue
component reads messages from a configurable number of queues and persists them
in the application’s immutable log store. Both input components then forward
incoming messages to a network of dispatcher handlers, which do not do any
processing by themselves but know where the user actors are located. Actual
processing of billing events is done within the user actors. Finally, a
separate network of actors takes care of scheduling periodic tasks, such as
refilling of user credits; it does so by issuing events to the appropriate
queue.

----------------------

The accounting subsystem deals with charging users for services used and
providing them with credits in order to be able to use the provided services.
As with the rest of Aquarium, the architecture is open-ended: the accounting
system does not know in advance which services it supports or what resources
are being offered. The configuration of the accounting system is done
using a Domain Specific Language (DSL) described below.

Data exchange with external systems is done through events, which are
persisted to an *immutable log*.

The accounting system is a generic event-processing engine that is configured by a
DSL. The DSL is mostly based on the
`YAML <http://en.wikipedia.org/wiki/Yaml>`_
format. The DSL supports limited algorithm definitions through integration of
the JavaScript language as defined below.

- *Credit*: A credit is the unit of currency used in Aquarium. It may or may not
  correspond to real money.
- *Resource*: A resource represents an entity that can be charged for its usage. The
  currently charged resources are: time of VM usage, bytes uploaded and
  downloaded, and bytes used for storage.
- *Resource Event*: A resource event is generated by an external source and is
  permanently appended to an immutable event log. A resource event carries
  information about changes in an external system that could affect the status
  of a user's wallet (see more about `Resource Events`_).
- *AccountingEntry*: An accounting entry is the result of processing a resource
  event and is what gets stored to the user's wallet.
- *Price List*: A price list contains information on the cost of a resource.
  A price list is only applied within a specified time frame.
- *Algorithm*: An algorithm specifies the way the charging calculation is done.
  It can vary depending on resource usage, time of the resource event or other
  information.
- *Credit Plan*: Defines a periodic operation of refilling a user's wallet with a
  configurable amount of credits.
- *Agreement*: An agreement associates price lists with algorithms and credit
  plans. An agreement is assigned to one or more users/credit holders.
- *Billing Period*: A billing period defines a recurring timeslot at the end
  of which the accumulated resource usage is accounted for and reset.

The Aquarium policy DSL allows the hierarchical definition of agreements, by
means of composing ingredients and specifying validity periods for individual
components or for the policies themselves. The DSL also allows overriding
between items of the same class (for example, an algorithm definition can
override certain fields of another algorithm definition, while both definitions
can be referenced individually).

The top-level schema for the DSL is as follows.

.. code-block:: yaml

  ... [see Creditplans]


Time frames allow the specification of applicability periods for algorithms,
pricelists and agreements. A timeframe is by default continuous and has a
starting point; if there is no ending point, the timeframe is considered open
and its ending point is taken to be the time of evaluation.

A time frame definition can contain repeating time ranges that dissect it and
consequently constrain the applicability of the time frame to the defined
ranges only. A range always has a start and an end point. A range is repeated
within a timeframe until the timeframe end point is reached. If a repeating
range ends later than the containing timeframe, its ending time is adjusted to
match that of the timeframe.
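
The two rules above (an open timeframe ends at evaluation time, and a
repeating range is cut off at the timeframe end) can be sketched as follows.
The function names and the use of plain millisecond integers are assumptions
made for illustration; they do not come from Aquarium itself.

.. code-block:: python

  def effective_end(timeframe_to, now_millis):
      """An open timeframe (no `to`) ends at the time of evaluation."""
      return now_millis if timeframe_to is None else timeframe_to


  def clip_range(range_start, range_end, timeframe_to, now_millis):
      """Adjust one occurrence of a repeating range to its timeframe.

      Returns (start, end) in milliseconds, or None when the occurrence
      starts at or after the end of the timeframe.
      """
      limit = effective_end(timeframe_to, now_millis)
      if range_start >= limit:
          return None
      # A range ending later than the timeframe is cut at the timeframe end.
      return (range_start, min(range_end, limit))


  clipped = clip_range(10, 50, timeframe_to=40, now_millis=100)
  open_ended = clip_range(10, 50, timeframe_to=None, now_millis=30)
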

The definition of the starting and ending point of a time range is done in a
syntax reminiscent of the `cron <http://en.wikipedia.org/wiki/Cron>`_ format.

.. code-block:: yaml

  from: %d         # Milliseconds since the epoch
  to:   %d         # [opt] Milliseconds since the epoch
  repeat:          # [opt] Defines a repetition list
    - every:       # [opt] A repetition entry
      start: "min hr dom moy dow" # 5-elem cron string
      end:   "min hr dom moy dow" # 5-elem cron string


The following declaration defines a timeframe starting at the designated
timestamp and ending at the time of evaluation.

.. code-block:: yaml

  from: 1293703200 #(30/12/2010 10:00)


The following declaration defines a timeframe of one year, within which the
applicability of the specified policy, agreement or pricelist is constrained to
time ranges from 12:00 Mon to 14:00 Fri (first ``every`` definition)
and 15:00 Sat to 15:00 Sun (second ``every`` definition).

.. code-block:: yaml

  from: 1293703200 #(30/12/2010 10:00)
  to:   1325239200 #(30/12/2011 10:00)
  repeat:
    - every:
      start: "00 12 * * Mon"
      end:   "00 14 * * Fri"
    - every:
      start: "00 15 * * Sat"
      end:   "00 15 * * Sun"


A resource represents an entity that can be charged for. Aquarium does not
assume a fixed set of resource types and is extensible to any number of
resources. A resource has a ``name`` and a ``unit``; both are free-form
strings. The resource name is used to uniquely identify the resource both inside
Aquarium and among external systems.

A resource definition also has two fields that define how a resource is
charged and whether a user can be assigned more than one instance of a
resource. Specifically, the ``costpolicy`` field can have the following values:

- ``continuous``: For ``continuous`` resources, the charging algorithm calculates the
  total amount of resource usage over time, per billing period. Each new
  resource event modifies the resource usage counter and forces Aquarium
  to calculate a new cost for the previous amount of resource usage. A typical
  example of a continuous resource is disk space.
- ``onoff``: ``onoff`` resources are a category of continuous resources where the
  resource can only be in two states, on or off. In such cases, maintaining a usage
  counter is not necessary; the charging algorithm uses time as the unit of
  calculation. Virtual machine time is a typical example.
- ``discrete``: ``discrete`` resources are charged instantly for the
  reported resource value. Examples are bandwidth and every resource whose usage
  is not a function of time (books, hits to an API etc).
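
To make the difference between the three cost policies more concrete, the
following sketch shows one possible way of turning a resource event into a
charge. The formulas, the hourly time unit and the ``charge`` function itself
are assumptions chosen for illustration; they are not Aquarium's actual
charging algorithm.

.. code-block:: python

  def charge(costpolicy, price, value, previous_amount=0.0, elapsed_hours=0.0):
      """Return (credits_to_charge, new_usage_amount) for one resource event.

      All formulas here are illustrative assumptions.
      """
      if costpolicy == "continuous":
          # Charge for the amount held since the previous event, then apply
          # the reported change to the usage counter.
          return price * previous_amount * elapsed_hours, previous_amount + value
      if costpolicy == "onoff":
          # A value of 0 means the resource was just switched off, so the
          # time it spent switched on is charged; no counter is needed.
          cost = price * elapsed_hours if value == 0 else 0.0
          return cost, value
      if costpolicy == "discrete":
          # Charged instantly for the reported value; no state is kept.
          return price * value, 0.0
      raise ValueError(f"unknown costpolicy: {costpolicy}")


  disk = charge("continuous", 0.5, 100.0, previous_amount=200.0, elapsed_hours=2.0)
  vm_off = charge("onoff", 2.0, 0, elapsed_hours=3.0)
  upload = charge("discrete", 0.5, 10.0)
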

Regarding resource complexity, a resource can either be labeled complex
or not. In the former case, a resource can have more than one instance per
user, and resource usage is tracked individually per instance. The
``instanceId`` field in the resource event message (see `Resource Events`_)
helps Aquarium separate resource instances at charge time.

The following resource definition defines the ``bandwidthup``

An algorithm specifies how the cost calculation is performed, by
combining the reported resource usage with the applicable pricelist. As opposed
to price lists, algorithms define behaviours, which have certain

.. code-block:: yaml

  bandwidthup: {price} times {volume}
  bandwidthdown: {price} times {volume}
  vmtime: {price} times {volume}
  diskspace: {price} times {volume}

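
As an illustration of what an expression such as ``{price} times {volume}``
amounts to, the sketch below evaluates that single form by multiplying the two
operands. The ``evaluate`` function is hypothetical, and the tiny parser
handles only the expression shown above; it says nothing about how Aquarium
actually interprets algorithm definitions.

.. code-block:: python

  def evaluate(algorithm, price, volume):
      """Evaluate a '{price} times {volume}' style expression.

      Only the single form shown in the example above is handled.
      """
      left, operator, right = algorithm.split()
      operands = {"{price}": price, "{volume}": volume}
      if operator != "times":
          raise ValueError(f"unsupported operator: {operator}")
      return operands[left] * operands[right]


  algorithms = {
      "bandwidthup": "{price} times {volume}",
      "vmtime": "{price} times {volume}",
  }

  # Hypothetical numbers: 8 units of VM time at 0.5 credits per unit.
  cost = evaluate(algorithms["vmtime"], price=0.5, volume=8.0)
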

A price list defines the prices applicable for a resource within a validity
period. Prices are attached to resource types and denote the credits that
should be deducted from an entity's wallet in response to the entity's resource
usage within a given charging period (currently, a month). The format is as
follows:

.. code-block:: yaml

  pricelist:                  # Pricelist structure definition
    name: apricelist          # Name for the price list, no spaces, must be unique
    [extends: anotherpl]      # [Optional] Inheritance operation: all optional fields
                              # are inherited from the named pricelist
    bandwidthup:              # Price for used upstream bandwidth per MB
    bandwidthdown:            # Price for used downstream bandwidth per MB
    vmtime:                   # Price for time
    diskspace:                # Price for used diskspace, per MB
    [see Timeframe format]

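
The ``extends`` operation can be pictured as a lookup that falls back to the
named parent for every field the child does not set. The sketch below assumes
that price lists have already been parsed into plain dictionaries keyed by
name; ``resolve_pricelist`` and the sample prices are hypothetical, and no
cycle detection is attempted.

.. code-block:: python

  def resolve_pricelist(name, pricelists):
      """Return the prices for `name`, with inherited fields filled in.

      `pricelists` maps a pricelist name to its raw definition (a dict).
      """
      definition = pricelists[name]
      parent_name = definition.get("extends")
      resolved = resolve_pricelist(parent_name, pricelists) if parent_name else {}
      # Fields set on the child override those inherited from the parent.
      resolved.update({k: v for k, v in definition.items() if k != "extends"})
      return resolved


  pricelists = {
      "default": {"name": "default", "bandwidthup": 0.01, "bandwidthdown": 0.01,
                  "vmtime": 0.5, "diskspace": 0.001},
      "promo": {"name": "promo", "extends": "default", "vmtime": 0.25},
  }

  promo = resolve_pricelist("promo", pricelists)

Here ``promo`` keeps its own ``vmtime`` price and takes every other price from
``default``, which is itself left unmodified.
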

Credit plans define how user accounts are refilled with credits. Apart from
the usual ``name`` and ``effective`` attributes, a credit plan has an ``at``
attribute (a five-field cron string), which defines how often the refilling
operation will run, and a ``credits`` attribute, which defines the number of
credits to add to the user's wallet.

An agreement is the result of combining an algorithm with a pricelist
and a creditplan. As the
accounting DSL's main purpose is to facilitate the construction of agreements
(which are then associated with users), the agreement is the centerpiece of
the language. An agreement is defined in full using the following template:

.. code-block:: yaml

  name: someuniqname        # Unique name for the agreement
  extends: other            # [opt] Name of inherited agreement
  pricelist: plname         # Name of declared pricelist
    resource: value         # [opt] Overriding of price for resource
  algorithm: polname        # Name of declared algorithm
    resource: value         # [opt] Overriding of algorithm for resource


An agreement definition can either reuse the pricelists, algorithms and creditplans
defined above (referenced by name) or define the effective algorithm or pricelist

If a ``pricelist`` or ``algorithm`` has not been defined explicitly (and is
therefore not referenced by name), all prices or algorithms for the declared
resources must be defined in either the ``agreement`` or one of its parents.

As with all DSL items, agreements can be overridden by other agreements.

Aquarium communicates with external systems through events published on an
`AMQP <http://en.wikipedia.org/wiki/AMQP>`_ queue. Aquarium only understands
events in the `JSON <http://www.json.org/>`_ format.

Aquarium events share a common base format consisting of the following fields:

.. code-block:: javascript

  {
    occurredMillis: 12345,
    receivedMillis: 12346
  }


- *id:* [``string``] A per message unique string. Should be able to identify
  messages of the same type uniquely across Aquarium clients. Preferably a SHA-1.
- *occurredMillis:* [``long``] The timestamp at the event creation time, in
  milliseconds since the epoch.
- *receivedMillis:* [``long``] For Aquarium internal use. Clients should not set
  a value. If a value is set, it will be overwritten upon receipt.
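
A client could fill in the base fields described above along the following
lines. This is only a sketch: the ``make_event`` helper is hypothetical, and
deriving the ``id`` as a SHA-1 digest over the client name, timestamp and
payload is one assumed recipe among many, since any per-message unique string
satisfies the description.

.. code-block:: python

  import hashlib
  import json
  import time


  def make_event(client_id, payload, occurred_millis=None):
      """Build the common base fields for an event.

      The id is a SHA-1 over the client id, timestamp and payload; this
      recipe is an assumption, any per-message unique string would do.
      """
      if occurred_millis is None:
          occurred_millis = int(time.time() * 1000)
      digest_input = json.dumps([client_id, occurred_millis, payload], sort_keys=True)
      event_id = hashlib.sha1(digest_input.encode("utf-8")).hexdigest()
      # receivedMillis is left out on purpose: Aquarium sets it on receipt.
      return {"id": event_id, "occurredMillis": occurred_millis}


  event = make_event("client-1", {"resource": "vmtime"}, occurred_millis=1321020852000)
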

In the following sections, we describe the exact format of each of the
concrete messages that Aquarium can process.

A resource event is sent by Aquarium clients to signify a change in a resource's
state. This change is processed by Aquarium's accounting system according to
the provisions of the configured policy in order to create entries in the user's
wallet.

.. code-block:: javascript

  {
    occurredMillis: 1321020852,
    receivedMillis: 1321020852,
    clientID: "platform-wide-unique-ID",
    userID: "administrator@admin.grnet.gr",
    instanceId: "vmtime-01.02.123X.Z",
  }


The meaning of the fields is as follows:

- *occurredMillis:* As above.
- *receivedMillis:* As above.
- *clientID:* [``string``] A unique name for each message producer.
- *userID:* [``string``] The ID of the user that will be charged for the
  resource usage details reported in the resource event.
- *resource:* [``string``] The name of the resource as declared in the Aquarium
  DSL. See `Resources`_ for more.
- *instanceId:* [``string``] If the resource is complex, then this field is set
  to a unique identifier for the specific instance of the resource. In case of
  a non-complex resource, Aquarium does not examine this value.
- *eventVersion:* [``string``] The event version. Currently fixed to "1".
- *value:* [``double``] The value of resource usage. Depends on the cost policy
  defined for the resource as follows:

  + For ``continuous`` resources, the value indicates the amount of resource
    usage since the last resource event for the specific resource.
  + For ``onoff`` resources, it is set to 1 when the resource is actively used
    and to 0 when the resource usage has stopped.
  + For ``discrete`` resources, the field indicates the amount of resource
    usage at the time of the event.

- *details:* [``map[string, string]``] A map/dictionary indicating extra
  metadata for this resource event. Aquarium does not process this metadata.
  The field must always be present, even if it is empty.
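
A client may want to check a resource event against the field list above
before publishing it. The following sketch does so for an event that has
already been parsed into a dictionary. The ``validation_errors`` function is
hypothetical, and treating every listed field except ``receivedMillis`` as
required is an assumption, not a rule stated by Aquarium.

.. code-block:: python

  # Expected Python types for each field of a parsed resource event.
  REQUIRED_FIELDS = {
      "id": str,
      "occurredMillis": int,
      "clientID": str,
      "userID": str,
      "resource": str,
      "instanceId": str,
      "eventVersion": str,
      "value": (int, float),
      "details": dict,
  }


  def validation_errors(event):
      """Return a list of problems found in a parsed resource event."""
      errors = []
      for field, expected_type in REQUIRED_FIELDS.items():
          if field not in event:
              errors.append(f"missing field: {field}")
          elif not isinstance(event[field], expected_type):
              errors.append(f"wrong type for field: {field}")
      return errors


  sample = {
      "id": "0123456789abcdef0123456789abcdef01234567",
      "occurredMillis": 1321020852000,
      "clientID": "platform-wide-unique-ID",
      "userID": "administrator@admin.grnet.gr",
      "resource": "vmtime",
      "instanceId": "vmtime-01.02.123X.Z",
      "eventVersion": "1",
      "value": 1.0,
      "details": {},
  }

  problems = validation_errors(sample)

Note that an empty ``details`` map passes the check, while leaving the field
out does not.
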

The charging algorithm
----------------------

The Aquarium REST API
---------------------

The Aquarium REST API is used to query a

As Aquarium is a backend system, clients are trusted and therefore no
authentication is required for accessing Aquarium's API.

**GET** /user/*id*/balance

**Normal Response Code**: 200

**Error Response Codes**: itemNotFound (404), timeout (500)

The operation returns the current balance for a user.
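
A client call to this operation might look like the sketch below, which uses
only the Python standard library. The base URL, the percent-encoding of the
user ID in the path and the decision to map a 404 response to ``None`` are
assumptions made for illustration; the response body is returned unparsed.

.. code-block:: python

  from urllib.error import HTTPError
  from urllib.parse import quote
  from urllib.request import urlopen


  def balance_url(base_url, user_id):
      """Build the balance URL for a user (user ID is percent-encoded)."""
      return f"{base_url.rstrip('/')}/user/{quote(user_id, safe='')}/balance"


  def fetch_balance(base_url, user_id, timeout_seconds=5.0):
      """Return the raw response body, or None if the user is not found (404)."""
      try:
          with urlopen(balance_url(base_url, user_id), timeout=timeout_seconds) as response:
              return response.read().decode("utf-8")
      except HTTPError as error:
          if error.code == 404:
              return None
          raise


  # Hypothetical host and port; no request is sent by building the URL.
  url = balance_url("http://localhost:8080/", "administrator@admin.grnet.gr")
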

**Example get balance response**

.. code-block:: javascript


==================    ================================
0.1 (Nov 2, 2011)     Initial release. Credit and debit policy descriptions
0.2 (Feb 23, 2012)    Update definitions, remove company use case
0.3 (Feb 28, 2012)    Event and resource descriptions
==================    ================================