Revision 5cd4e64c doc/arch/aquarium.tex
b/doc/arch/aquarium.tex | ||
---|---|---|
3 | 3 |
\usepackage{amssymb} |
4 | 4 |
\usepackage{graphicx} |
5 | 5 |
\usepackage[british]{babel} |
6 |
\usepackage{url} |
|
7 |
\usepackage{listings} |
|
8 |
\usepackage{color} |
|
9 |
|
|
6 | 10 |
\newcommand{\cL}{{\cal L}} |
7 | 11 |
|
8 | 12 |
\begin{document} |
... | ... | |
12 | 16 |
|
13 | 17 |
\titlebanner{DRAFT---Do not distribute} |
14 | 18 |
|
15 |
\title{Aquarium: Accounting for the Cloud in the Cloud} |
|
19 |
|
|
20 |
|
|
21 |
\title{Aquarium: Billing for the Cloud in the Cloud} |
|
16 | 22 |
|
17 | 23 |
\authorinfo{Georgios Gousios \and Christos Loverdos} |
18 | 24 |
{GRNet SA} |
... | ... | |
34 | 40 |
|
35 | 41 |
\section{Introduction} |
36 | 42 |
\section{Requirements} |
43 |
|
|
44 |
|
|
45 |
\subsection{Application Environment} |
|
46 |
Aquarium is designed and developed as part of the Okeanos project at GRNet. The |
|
47 |
Okeanos project is building a full stack public IaaS system for Greek |
|
48 |
universities, and several services on top of it. Several components comprise |
|
49 |
the Okeanos infrastructure: |
|
50 |
|
|
51 |
\begin{description} |
|
52 |
|
|
53 |
\item[Synnefo] is an IaaS management console. Users can create and start |
|
54 |
VMs, monitor their usage, create private internal networks among VMs |
|
55 |
and connect to them over the web. The service backend is based on |
|
56 |
Google's Ganeti for VM host management and hundreds of physical |
|
57 |
VM container nodes. |
|
58 |
|
|
59 |
\item[Archipelago] is a storage service, based on the Rados |
|
60 |
distributed object store. It is currently under development, and the |
|
61 |
plan is to act as the single point of storage for VM images, shared |
|
62 |
volumes and user files, providing clonable snapshots and distributed |
|
63 |
fault tolerance. |
|
64 |
|
|
65 |
\item[Pithos] is a user oriented file storage service. Currently in its |
|
66 |
second incarnation, it supports content deduplication, sharing of files |
|
67 |
and folders and a multitude of clients. |
|
68 |
|
|
69 |
\item[Astakos] is an identity consolidation system that also acts as the |
|
70 |
entry point to the entire infrastructure. Users can login using |
|
71 |
identities from multiple systems, such as the Shibboleth (SAML) |
|
72 |
federation enabled across all Greek universities or their Twitter |
|
73 |
accounts. |
|
74 |
|
|
75 |
\end{description} |
|
76 |
|
|
77 |
While all the above systems (and several prospective ones) have different |
|
78 |
user interfaces and provide different functionalities in the context of |
|
79 |
the GRNet IaaS, they all share a common notion of \emph{resources}, access |
|
80 |
to and manipulation of which they offer to users. |
|
81 |
|
|
82 |
|
|
83 |
\subsection{Configuration} |
|
84 |
|
|
85 |
Billing systems are by nature open ended. As new services are deployed, new |
|
86 |
resources appear, while others might be phased out. Moreover, changes to |
|
87 |
company policies may trigger changes to price lists for those resources, while |
|
88 |
ad-hoc requests for large scale computational resources may require special |
|
89 |
pricing policies. In order for a billing system to be able to successfully |
|
90 |
adapt to changing requirements, it must be able to accommodate such changes |
|
91 |
without requiring changes to the application itself. This means that all |
|
92 |
information Aquarium requires in order to perform a billing operation |
|
93 |
must be provided to it externally. Moreover, to ensure high-availability, |
|
94 |
billing configuration should be updatable while Aquarium is running, or at |
|
95 |
least with minimal downtime, without affecting the operation of external |
|
96 |
systems. |
|
97 |
|
|
98 |
|
|
99 |
|
|
100 |
\subsection{Scaling} |
|
101 |
|
|
102 |
In the context of the Okeanos system, Aquarium provides billing services on a |
|
103 |
per user basis for all resources exposed by other systems. As such, it is in |
|
104 |
the critical path of user requests that modify resource state; all supported |
|
105 |
applications must query Aquarium in order to ensure that the user has enough |
|
106 |
credits to create a new resource. This means that for a large number of users |
|
107 |
(given previous GRNet systems usage by the Greek research community, we |
|
108 |
estimate a concurrency level of 30,000 users), Aquarium must update and |
|
109 |
maintain in a queryable form their credit status, |
|
110 |
with soft realtime guarantees. |
|
111 |
|
|
112 |
Being on the critical path also means that Aquarium must be highly resilient, |
|
113 |
too. If Aquarium fails, all supported systems will also fail. Even if Aquarium |
|
114 |
fails for a short period of time, it must not lose any billing events, as this |
|
115 |
will allow users to use resources without paying for them. Moreover, in case of |
|
116 |
failure, Aquarium must not corrupt any billing data under any circumstances, |
|
117 |
while it should reach an operating state very fast after a service restart. |
|
118 |
|
|
37 | 119 |
\section{Architecture} |
38 | 120 |
|
121 |
|
|
122 |
|
|
123 |
\section{Implementation} |
|
124 |
|
|
125 |
\subsection{The configuration DSL} |
|
126 |
|
|
127 |
The configuration requirements presented above were addressed by creating a new |
|
128 |
domain specific language ({\sc dsl}), based on the YAML format. The DSL |
|
129 |
enables administrators to specify billable resources, billing policies and |
|
130 |
price lists and combine them arbitrarily into agreements applicable to specific |
|
131 |
users, user groups or the whole system. |
|
132 |
The DSL supports inheritance for policies, price lists and agreements and composition in the case of agreements. |
|
133 |
It also facilitates the |
|
134 |
definition of generic, repeatable debiting rules, which are then used by the |
|
135 |
system to refill the user's account with credits on a periodic basis. |
|
136 |
|
|
137 |
The DSL is in itself based on five top-level entities, namely: |
|
138 |
|
|
139 |
\begin{description} |
|
140 |
|
|
141 |
\item[Resources] specify the properties of resources that Aquarium knows |
|
142 |
about. Apart from the expected ones (name, unit etc), |
|
143 |
a resource has two properties that affect billing: \textsf{costpolicy} |
|
144 |
defines whether the billing operation is to be performed at the moment |
|
145 |
a billing event has arrived, while the \textsf{complex} attribute defines |
|
146 |
whether a resource can have many instances per user. |
|
147 |
|
|
148 |
\item[Pricelists] assign a price tag to each resource, within a timeframe. |
|
149 |
|
|
150 |
\item[Algorithms] specify the way the billing operation is done in response |
|
151 |
to a billing event. The simplest (and default) way is to multiply the |
|
152 |
billable quantity with the applicable price. To enable more complex billing |
|
153 |
scenarios, the Aquarium DSL supports a simple imperative language with |
|
154 |
a number of implicit variables (e.g. \texttt{price, volume, date}) |
|
155 |
that enable administrators to specify, e.g. billing algorithms that |
|
156 |
scale with billable volume. Similarly to pricelists, algorithms |
|
157 |
have an applicability timeframe attached to them. |
|
158 |
|
|
159 |
\item[Creditplans] define a number of credits to give to users and a repetition |
|
160 |
period. |
|
161 |
|
|
162 |
\item[Agreements] assign a name to algorithm, pricelist and creditplan triplets, |
|
163 |
which is then assigned to each user. |
|
164 |
|
|
165 |
\end{description} |
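The inheritance between pricelists can be illustrated with a minimal Python sketch. It is not Aquarium's implementation: the \texttt{Pricelist} class, the \texttt{resolve\_price} helper and the \texttt{vmtime} price are hypothetical, and we assume that a pricelist declaring \texttt{overrides} falls back to its parent for any resource it does not price itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Pricelist:
    """Hypothetical in-memory form of one `pricelist` entry of the DSL."""
    name: str
    prices: Dict[str, float] = field(default_factory=dict)
    overrides: Optional[str] = None  # name of the parent pricelist, if any


def resolve_price(pricelists: Dict[str, Pricelist], name: str, resource: str) -> float:
    """Walk the `overrides` chain until a price for `resource` is found."""
    seen = set()
    current: Optional[str] = name
    while current is not None:
        if current in seen:
            raise ValueError(f"cyclic overrides chain at {current!r}")
        seen.add(current)
        pricelist = pricelists[current]
        if resource in pricelist.prices:
            return pricelist.prices[resource]
        current = pricelist.overrides
    raise KeyError(f"no price for {resource!r} reachable from {name!r}")


# The vmtime price is an illustrative value, not taken from the paper.
pricelists = {
    "default": Pricelist("default", {"bandwidthup": 0.01, "vmtime": 0.5}),
    "everyTue2": Pricelist("everyTue2", {"bandwidthup": 0.1}, overrides="default"),
}
```

Under these assumptions, \texttt{everyTue2} supplies its own price for \texttt{bandwidthup}, while a lookup for \texttt{vmtime} falls through to \texttt{default}.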
|
166 |
|
|
167 |
|
|
39 | 168 |
\begin{figure} |
40 |
\begin{center} |
|
41 |
|
|
42 |
\end{center} |
|
43 |
\caption{Foundational framework of the snork mechanism.} |
|
44 |
\label{fig-ffsm} |
|
169 |
\lstset{language=ruby, basicstyle=\footnotesize, |
|
170 |
stringstyle=\ttfamily, |
|
171 |
flexiblecolumns=true, aboveskip=-0.9em, belowskip=0em, lineskip=0em} |
|
172 |
|
|
173 |
\begin{lstlisting} |
|
174 |
resources: |
|
175 |
- resource: |
|
176 |
name: bandwidthup |
|
177 |
unit: MB/hr |
|
178 |
complex: false |
|
179 |
costpolicy: continuous |
|
180 |
pricelists: |
|
181 |
- pricelist: |
|
182 |
name: default |
|
183 |
bandwidthup: 0.01 |
|
184 |
effective: |
|
185 |
from: 0 |
|
186 |
- pricelist: |
|
187 |
name: everyTue2 |
|
188 |
overrides: default |
|
189 |
bandwidthup: 0.1 |
|
190 |
effective: |
|
191 |
repeat: |
|
192 |
- start: "00 02 * * Tue" |
|
193 |
end: "00 02 * * Wed" |
|
194 |
from: 1326041177 #Sun, 8 Jan 2012 18:46:17 EET |
|
195 |
algorithms: |
|
196 |
- algorithm: |
|
197 |
name: default |
|
198 |
bandwidthup: $price times $volume |
|
199 |
effective: |
|
200 |
from: 0 |
|
201 |
agreements: |
|
202 |
- agreement: |
|
203 |
name: scaledbandwidth |
|
204 |
pricelist: everyTue2 |
|
205 |
algorithm: |
|
206 |
bandwidthup: | |
|
207 |
if $volume lt 15 then |
|
208 |
$volume times $price |
|
209 |
elsif $volume gt 15 and $volume lt 30 then |
|
210 |
$volume times $price times 1.2 |
|
211 |
else |
|
212 |
$volume times $price times 1.4 |
|
213 |
end |
|
214 |
\end{lstlisting} |
|
215 |
|
|
216 |
\caption{A simple billing policy definition.} |
|
217 |
\label{fig:dsl} |
|
218 |
\end{figure} |
|
219 |
|
|
220 |
In Figure~\ref{fig:dsl}, we present the definition of a simple (albeit valid) |
|
221 |
policy. The policy parsing is done top down, so the order of definition |
|
222 |
is important. The definition starts with a resource, whose name is then |
|
223 |
re-used in order to attach a pricelist and a price calculation algorithm to it. |
|
224 |
In the case of pricelists, we present an example of \emph{temporal overloading}; |
|
225 |
the \texttt{everyTue2} pricelist overrides the default one, but only for |
|
226 |
all repeating time frames between every Tuesday at 02:00 and Wednesday at |
|
227 |
02:00, starting from the timestamp indicated at the \texttt{from} field. Another |
|
228 |
example of overloading is presented at the definition of the agreement, which |
|
229 |
overloads the default algorithm definition using the imperative part of the |
|
230 |
Aquarium {\sc dsl} to provide a scaling charge algorithm. |
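The two forms of overloading can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than Aquarium's evaluator: the function names are hypothetical, the repeating window is evaluated in UTC, and the intended charge brackets are taken to be "below 15", "15 up to 30" and "30 and above" (the handling of the exact boundary values is our assumption).

```python
from datetime import datetime, timedelta, timezone

FROM_TS = 1326041177  # `from` value of the everyTue2 pricelist in the figure


def in_tuesday_window(ts: int) -> bool:
    """True if ts lies in a Tue 02:00 to Wed 02:00 window (UTC assumed),
    at or after FROM_TS."""
    if ts < FROM_TS:
        return False
    # Shifting back two hours maps the window onto the calendar Tuesday.
    shifted = datetime.fromtimestamp(ts, tz=timezone.utc) - timedelta(hours=2)
    return shifted.weekday() == 1  # Monday is 0, Tuesday is 1


def price_at(ts: int) -> float:
    """Temporal overloading: everyTue2 (0.1) overrides default (0.01)."""
    return 0.1 if in_tuesday_window(ts) else 0.01


def scaled_charge(volume: float, price: float) -> float:
    """Scaling charge in the spirit of the scaledbandwidth agreement."""
    if volume < 15:
        return volume * price
    if volume < 30:
        return volume * price * 1.2
    return volume * price * 1.4
```

For example, an event stamped on a Tuesday at noon after the \texttt{from} timestamp is priced at 0.1, whereas the same event on a Wednesday at 03:00 falls back to the default price of 0.01.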
|
231 |
|
|
232 |
\subsection{Billing} |
|
233 |
|
|
234 |
As is common in most similar systems, billing in Aquarium is the application of |
|
235 |
a billing contract to an incoming billing event in order to produce an |
|
236 |
entry for the user's wallet. However, in stark contrast to most other systems, |
|
237 |
which rely on database transactions in order to securely modify the user's |
|
238 |
balance, Aquarium performs account updates asynchronously and concurrently |
|
239 |
for all known users. |
|
240 |
|
|
241 |
Billing events are obtained by a connection to a reliable message queue. |
|
242 |
The billing event format depends on the |
|
243 |
The actual format of the event is presented in Figure~\ref{fig:resevt}. |
|
244 |
|
|
245 |
\begin{figure} |
|
246 |
\lstset{language=C, basicstyle=\footnotesize, |
|
247 |
stringstyle=\ttfamily, |
|
248 |
flexiblecolumns=true, aboveskip=-0.9em, belowskip=0em, lineskip=0em} |
|
249 |
|
|
250 |
\begin{lstlisting} |
|
251 |
{ |
|
252 |
"id":"4b3288b57e5c1b08a67147c495e54a68655fdab8", |
|
253 |
"occured":1314829876295, |
|
254 |
"userId":31, |
|
255 |
"clientId":3, |
|
256 |
"resource":"vmtime", |
|
257 |
"eventVersion":1, |
|
258 |
"value": 1, |
|
259 |
"details":{ |
|
260 |
"vmid":"3300", |
|
261 |
"action": "on" |
|
262 |
} |
|
263 |
} |
|
264 |
\end{lstlisting} |
|
265 |
\caption{A billing event example} |
|
266 |
\label{fig:resevt} |
|
267 |
|
|
45 | 268 |
\end{figure} |
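A consumer of such events might validate and decode them roughly as in the following Python sketch. The \texttt{BillingEvent} class and \texttt{parse\_event} function are hypothetical names, not part of Aquarium, and we assume that the \texttt{occured} field holds milliseconds since the Unix epoch.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict

# Fields taken from the example event; treating them all as mandatory
# is an assumption of this sketch.
REQUIRED = ("id", "occured", "userId", "resource", "eventVersion", "value")


@dataclass(frozen=True)
class BillingEvent:
    id: str
    occurred: datetime
    user_id: int
    resource: str
    version: int
    value: float
    details: Dict[str, Any]


def parse_event(raw: str) -> BillingEvent:
    """Decode one JSON billing event, rejecting incomplete messages."""
    doc = json.loads(raw)
    missing = [key for key in REQUIRED if key not in doc]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return BillingEvent(
        id=doc["id"],
        # Assumed to be milliseconds since the epoch.
        occurred=datetime.fromtimestamp(doc["occured"] / 1000, tz=timezone.utc),
        user_id=int(doc["userId"]),
        resource=doc["resource"],
        version=int(doc["eventVersion"]),
        value=float(doc["value"]),
        details=doc.get("details", {}),
    )
```

Rejecting incomplete messages at the boundary keeps malformed events out of the billing path; what should happen to a rejected message (for instance, re-queueing or logging) is left open here.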
46 | 269 |
|
47 | 270 |
|
48 | 271 |
\section{Performance} |
272 |
|
|
273 |
To evaluate the performance and scalability of Aquarium, we performed two |
|
274 |
experiments: The first one is a micro-benchmark that measures the time required |
|
275 |
for the basic processing operation performed by Aquarium, which is billing for |
|
276 |
an increasing number of messages. The second one demonstrates Aquarium's |
|
277 |
scalability on a single node with respect to the number of users. In both |
|
278 |
cases, Aquarium was run on a MacBookPro featuring a quad core 2.33{\sc g}hz |
|
279 |
Intel i7 processor and 8{\sc gb} of {\sc ram}. We selected Rabbit{\sc mq} and |
|
280 |
Mongo{\sc db} as the queue and database servers, both of which were run on a |
|
281 |
virtualised 4 core with 4{\sc gb} {\sc ram} Debian Linux server. Both systems |
|
282 |
were run using current versions at the time of benchmarking (2.7.1 for |
|
283 |
Rabbit{\sc mq} and 2.6 for Mongo{\sc db}). The two systems were connected with |
|
284 |
a full duplex 100Mbps connection. No particular optimization was performed on |
|
285 |
either back-end system, or on the {\sc jvm} that ran Aquarium. |
|
286 |
|
|
287 |
To simulate a realistic deployment, Aquarium was configured, using the policy |
|
288 |
{\sc dsl} to handle billing events for 4 types of resources, using 3 overloaded |
|
289 |
pricelists, 2 overloaded algorithms, all of which were combined to 10 different |
|
290 |
agreements, which were randomly (uniformly) assigned to users. To drive the |
|
291 |
benchmarks, we used a synthetic load generator that worked in two stages: it |
|
292 |
first created a configurable number of users and then produced billing events |
|
293 |
that |
|
294 |
|
|
49 | 295 |
\section{Related Work} |
296 |
|
|
50 | 297 |
\section{Conclusions and Future Work} |
298 |
In this paper, we presented Aquarium, a high-performance, generic accounting |
|
299 |
system, currently tuned for cloud applications. |
|
300 |
|
|
301 |
Aquarium is currently under development, with a first operational version |
|
302 |
being planned for early 2012. All subsystems are operational, and the system |
|
303 |
can already support |
|
304 |
|
|
305 |
Aquarium is available under an open source license at |
|
306 |
\url{https://code.grnet.gr/projects/aquarium}. |
|
51 | 307 |
|
52 | 308 |
\bibliographystyle{abbrvnat} |
53 | 309 |
\bibliography{aquarium} |