Revision 5cd4e64c doc/arch/aquarium.tex

\usepackage{amssymb}
\usepackage{graphicx}
\usepackage[british]{babel}
\usepackage{url}
\usepackage{listings}
\usepackage{color}

\newcommand{\cL}{{\cal L}}

\begin{document}

\titlebanner{DRAFT---Do not distribute}

\title{Aquarium: Billing for the Cloud in the Cloud}

\authorinfo{Georgios Gousios \and Christos Loverdos}
{GRNet SA}

\section{Introduction}
\section{Requirements}

\subsection{Application Environment}
Aquarium is designed and developed as part of the Okeanos project at GRNet.
The Okeanos project is building a full-stack public IaaS system for Greek
universities, along with several services on top of it. The Okeanos
infrastructure consists of the following components:

\begin{description}

    \item[Synnefo] is an IaaS management console. Users can create and start
        VMs, monitor their usage, create private internal networks among VMs
        and connect to them over the web. The service backend is based on
        Google's Ganeti for VM host management, running on hundreds of
        physical VM container nodes.

    \item[Archipelago] is a storage service based on the RADOS distributed
        object store. It is currently under development, and is planned to
        act as the single point of storage for VM images, shared volumes and
        user files, providing clonable snapshots and distributed fault
        tolerance.

    \item[Pithos] is a user-oriented file storage service. Currently in its
        second incarnation, it supports content deduplication, sharing of
        files and folders, and a multitude of clients.

    \item[Astakos] is an identity consolidation system that also acts as the
        entry point to the entire infrastructure. Users can log in using
        identities from multiple systems, such as the Shibboleth (SAML)
        federation enabled across all Greek universities, or their Twitter
        accounts.

\end{description}

While all the above systems (and several prospective ones) have different
user interfaces and provide different functionality in the context of the
GRNet IaaS, they all share a common notion of \emph{resources}, access to
and manipulation of which they offer to users.

\subsection{Configuration}

Billing systems are by nature open-ended. As new services are deployed, new
resources appear, while others might be phased out. Moreover, changes to
company policies may trigger changes to price lists for those resources,
while ad-hoc requests for large-scale computational resources may require
special pricing policies. For a billing system to successfully adapt to
changing requirements, it must be able to accommodate such changes without
requiring changes to the application itself. This means that all
information Aquarium requires in order to perform a billing operation must
be provided to it externally. Moreover, to ensure high availability, the
billing configuration should be updatable while Aquarium is running, or at
least with minimal downtime, without affecting the operation of external
systems.

  

\subsection{Scaling}

In the context of the Okeanos system, Aquarium provides billing services on
a per-user basis for all resources exposed by the other systems. As such,
it is in the critical path of user requests that modify resource state: all
supported applications must query Aquarium to ensure that the user has
enough credits before creating a new resource. This means that for a large
number of users (given the usage of previous GRNet systems by the Greek
research community, we estimate a concurrency level of 30,000 users),
Aquarium must update and maintain their credit status in a queryable form,
with soft real-time guarantees.

Being on the critical path also means that Aquarium must be highly
resilient. If Aquarium fails, all supported systems will also fail. Even if
Aquarium fails for a short period of time, it must not lose any billing
events, as that would allow users to use resources without paying for them.
Moreover, in case of failure, Aquarium must not corrupt any billing data
under any circumstances, and it should return to an operating state quickly
after a service restart.

\section{Architecture}

\section{Implementation}

\subsection{The configuration DSL}

The configuration requirements presented above were addressed by creating a
new domain-specific language ({\sc dsl}), based on the YAML format. The DSL
enables administrators to specify billable resources, billing policies and
price lists, and to combine them arbitrarily into agreements applicable to
specific users, user groups or the whole system. The DSL supports
inheritance for policies, price lists and agreements, and composition in
the case of agreements. It also facilitates the definition of generic,
repeatable debiting rules, which are then used by the system to refill the
user's account with credits on a periodic basis.

The DSL is itself based on five top-level entities, namely:

\begin{description}

    \item[Resources] specify the properties of the resources that Aquarium
        knows about. Apart from the expected ones (name, unit etc.), a
        resource has two properties that affect billing: \textsf{costpolicy}
        defines whether the billing operation is to be performed at the
        moment a billing event arrives, while the \textsf{complex} attribute
        defines whether a resource can have many instances per user.

    \item[Pricelists] assign a price tag to each resource, within a
        timeframe.

    \item[Algorithms] specify the way the billing operation is performed in
        response to a billing event. The simplest (and default) way is to
        multiply the billable quantity by the applicable price. To enable
        more complex billing scenarios, the Aquarium DSL supports a simple
        imperative language with a number of implicit variables (e.g.
        \texttt{price}, \texttt{volume}, \texttt{date}) that enable
        administrators to specify, for example, billing algorithms that
        scale with billable volume. Similarly to pricelists, algorithms
        have an applicability timeframe attached to them.

    \item[Creditplans] define a number of credits to give to users and a
        repetition period.

    \item[Agreements] assign a name to an algorithm, pricelist and
        creditplan triplet, which is then assigned to each user.

\end{description}

\begin{figure}
\lstset{language=ruby, basicstyle=\footnotesize,
stringstyle=\ttfamily,
flexiblecolumns=true, aboveskip=-0.9em, belowskip=0em, lineskip=0em}

\begin{lstlisting}
resources:
  - resource:
    name: bandwidthup
    unit: MB/hr
    complex: false
    costpolicy: continuous
pricelists:
  - pricelist:
    name: default
    bandwidthup: 0.01
    effective:
      from: 0
  - pricelist:
    name: everyTue2
    overrides: default
    bandwidthup: 0.1
    effective:
      repeat:
      - start: "00 02 * * Tue"
        end:   "00 02 * * Wed"
      from: 1326041177 #Sun, 8 Jan 2012 18:46:27 EET
algorithms:
  - algorithm:
    name: default
    bandwidthup: $price times $volume
    effective:
      from: 0
agreements:
  - agreement:
    name: scaledbandwidth
    pricelist: everyTue2
    algorithm:
      bandwidthup: |
        if $volume lt 15 then
          $volume times $price
        elsif $volume gt 15 and $volume lt 30 then
          $volume times $price times 1.2
        else
          $volume times $price times 1.4
        end
\end{lstlisting}

\caption{A simple billing policy definition.}
\label{fig:dsl}
\end{figure}

  
In Figure~\ref{fig:dsl}, we present the definition of a simple (albeit
valid) policy. Policy parsing is done top-down, so the order of definition
is important. The definition starts with a resource, whose name is then
re-used to attach a pricelist and a price calculation algorithm to it. In
the case of pricelists, we present an example of \emph{temporal
overloading}: the \texttt{everyTue2} pricelist overrides the default one,
but only for the repeating time frames between every Tuesday at 02:00 and
Wednesday at 02:00, starting from the timestamp indicated in the
\texttt{from} field. Another example of overloading is presented in the
definition of the agreement, which overloads the default algorithm
definition, using the imperative part of the Aquarium {\sc dsl} to provide
a scaling charge algorithm.
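To make these semantics concrete, the following sketch shows one plausible
interpretation of the policy in Figure~\ref{fig:dsl}. It is not part of
Aquarium (which runs on the {\sc jvm}); the tier boundaries of the scaling
algorithm and the use of {\sc utc} instead of local time are simplifying
assumptions.

```python
from datetime import datetime, timezone

# Constants taken from the example policy in Figure fig:dsl.
DEFAULT_PRICE = 0.01      # the "default" pricelist
TUE2_PRICE = 0.1          # the "everyTue2" override
TUE2_FROM = 1326041177    # everyTue2 is only in force after this timestamp


def in_tue2_window(dt: datetime) -> bool:
    """True between Tuesday 02:00 (inclusive) and Wednesday 02:00."""
    return (dt.weekday() == 1 and dt.hour >= 2) or \
           (dt.weekday() == 2 and dt.hour < 2)


def effective_price(ts: int) -> float:
    """Resolve temporal overloading: the override wins inside its window."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    if ts >= TUE2_FROM and in_tue2_window(dt):
        return TUE2_PRICE
    return DEFAULT_PRICE


def scaled_charge(volume: float, price: float) -> float:
    """One reading of the agreement's tiered algorithm (bounds assumed)."""
    if volume < 15:
        return volume * price
    elif volume < 30:
        return volume * price * 1.2
    else:
        return volume * price * 1.4
```

Under this reading, uploading 40 units early on a Tuesday morning would be
charged at the override price with the top-tier multiplier, i.e.
$40 \times 0.1 \times 1.4$.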

  
\subsection{Billing}

As in most similar systems, billing in Aquarium is the application of a
billing contract to an incoming billing event in order to produce an entry
for the user's wallet. However, in stark contrast to most other systems,
which rely on database transactions to securely modify the user's balance,
Aquarium performs account updates asynchronously and concurrently for all
known users.
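The scheme can be illustrated with a minimal sketch (Aquarium itself runs
on the {\sc jvm}; the \texttt{price} field and the \texttt{Dispatcher}
class are hypothetical): events for the same user are serialised through a
per-user queue, so a wallet is always updated in order, while different
users are processed concurrently and no database transaction is needed.

```python
import queue
import threading
from collections import defaultdict


class Dispatcher:
    """Hypothetical sketch: one worker queue per user, concurrent across
    users, serialised within a user."""

    def __init__(self):
        self.queues = {}
        self.wallets = defaultdict(float)  # written only by one worker each
        self.lock = threading.Lock()

    def _worker(self, q):
        while True:
            evt = q.get()
            if evt is None:
                break
            # Default algorithm: charge quantity times unit price.
            self.wallets[evt["userId"]] -= evt["value"] * evt["price"]
            q.task_done()

    def submit(self, evt):
        with self.lock:
            q = self.queues.get(evt["userId"])
            if q is None:
                q = queue.Queue()
                self.queues[evt["userId"]] = q
                threading.Thread(target=self._worker, args=(q,),
                                 daemon=True).start()
        q.put(evt)

    def drain(self):
        # Block until every submitted event has been applied.
        for q in self.queues.values():
            q.join()
```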

  
241
Billing events are obtained by a connection to a reliable message queue.
242
The billing event format depends on the 
243
The actual format of the event is presented in Figure~\ref{fig:resevt}.
244

  
245
\begin{figure}
246
\lstset{language=C, basicstyle=\footnotesize,
247
stringstyle=\ttfamily, 
248
flexiblecolumns=true, aboveskip=-0.9em, belowskip=0em, lineskip=0em}
249

  
250
\begin{lstlisting}
251
{
252
  "id":"4b3288b57e5c1b08a67147c495e54a68655fdab8",
253
  "occured":1314829876295,
254
  "userId":31,
255
  "cliendId":3,
256
  "resource":"vmtime",
257
  "eventVersion":1,
258
  "value": 1,
259
  "details":{
260
    "vmid":"3300",
261
    "action": "on"
262
  }
263
}
264
\end{lstlisting}
265
\caption{A billing event example} 
266
\label{fig:resevt}
267

  
45 268
\end{figure}
\section{Performance}

To evaluate the performance and scalability of Aquarium, we performed two
experiments: the first is a micro-benchmark that measures the time required
for the basic processing operation performed by Aquarium, namely billing
for an increasing number of messages. The second demonstrates Aquarium's
scalability on a single node with respect to the number of users. In both
cases, Aquarium was run on a MacBook Pro featuring a quad-core
2.33{\sc g}hz Intel i7 processor and 8{\sc gb} of {\sc ram}. We selected
Rabbit{\sc mq} and Mongo{\sc db} as the queue and database servers, both of
which were run on a virtualised 4-core Debian Linux server with 4{\sc gb}
of {\sc ram}. Both systems were run using the versions current at the time
of benchmarking (2.7.1 for Rabbit{\sc mq} and 2.6 for Mongo{\sc db}). The
two systems were connected with a full-duplex 100Mbps connection. No
particular optimisation was performed on either back-end system, nor on the
{\sc jvm} that ran Aquarium.

  
To simulate a realistic deployment, Aquarium was configured, using the
policy {\sc dsl}, to handle billing events for 4 types of resources, using
3 overloaded pricelists and 2 overloaded algorithms, all of which were
combined into 10 different agreements, which were randomly (uniformly)
assigned to users. To drive the benchmarks, we used a synthetic load
generator that worked in two stages: it first created a configurable number
of users and then produced billing events that
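The two-stage generator described above can be sketched as follows. This is
an illustrative reconstruction, not the benchmark code itself: the resource
names beyond \texttt{vmtime} and \texttt{bandwidthup} are assumed, and the
event fields mirror Figure~\ref{fig:resevt} (in the real benchmark the
events were published to Rabbit{\sc mq}).

```python
import random
import time
import uuid

# Assumed resource types; vmtime and bandwidthup appear in the paper's
# examples, the other two are placeholders.
RESOURCES = ["vmtime", "bandwidthup", "bandwidthdown", "diskspace"]


def create_users(n):
    # Stage 1: users are just identifiers for the purposes of load generation.
    return list(range(1, n + 1))


def billing_events(users, count, rng=random):
    # Stage 2: a stream of synthetic billing events for random users and
    # resources, shaped like the event in Figure fig:resevt.
    for _ in range(count):
        yield {
            "id": uuid.uuid4().hex,
            "occured": int(time.time() * 1000),
            "userId": rng.choice(users),
            "resource": rng.choice(RESOURCES),
            "eventVersion": 1,
            "value": rng.randint(1, 10),
        }
```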

  
\section{Related Work}

\section{Conclusions and Future Work}
In this paper, we presented Aquarium, a high-performance, generic
accounting system, currently tuned for cloud applications.

Aquarium is currently under development, with a first operational version
planned for early 2012. All subsystems are operational, and the system can
already support

Aquarium is available under an open source license at
\url{https://code.grnet.gr/projects/aquarium}.

\bibliographystyle{abbrvnat}
\bibliography{aquarium}