Revision 4a3e83c6

b/.gitignore
settings.d/*-local.conf
*.egg-info
/dist
_build
b/README
Synnefo cloud management software.

Consult:
  * README.develop: for information on how to set up a development environment
  * README.deploy:  for information on how to deploy the application
  * README.ci:      for information on how to set up a Jenkins-based
                    continuous integration system
  * README.i18n:    for information on application internationalization
  * docs/develop.rst: for information on how to set up a development environment
  * docs/deploy.rst:  for information on how to deploy the application
  * docs/ci.rst:      for information on how to set up a Jenkins-based
                      continuous integration system
  * docs/i18n.rst:    for information on application internationalization


Synnefo may be distributed under the terms of the following license:

/dev/null
README.admin - Administration notes

This file contains notes related to administration of a working Synnefo
deployment. This document should be read *after* README.deploy, which contains
step-by-step Synnefo deployment instructions.


Database
========

MySQL: manage.py dbshell seems to ignore the setting of 'init_command'
       in settings.DATABASES


Reconciliation mechanism
========================

On certain occasions, such as a Ganeti or RabbitMQ failure, the VM state in the
system's database may differ from that in the Ganeti installation. The
reconciliation process is designed to bring the system's database in sync with
what Ganeti knows about each VM, and is able to detect the following three
conditions:

 * Stale DB servers without corresponding Ganeti instances
 * Orphan Ganeti instances, without corresponding DB entries
 * Out-of-sync operstate for DB entries wrt Ganeti instances

The reconciliation mechanism runs as a management command, e.g., as follows
[PYTHONPATH needs to contain the parent of the synnefo Django project
directory]:

/srv/synnefo$ export PYTHONPATH=/srv:$PYTHONPATH
vkoukis@dev67:~/synnefo [reconc]$ ./manage.py reconcile --detect-all -v 2

Please see ./manage.py reconcile --help for all the details.

The administrator can also trigger reconciliation of operating state manually,
by issuing a Ganeti OP_INSTANCE_QUERY_DATA command on a Synnefo VM, using
gnt-instance info.


Logging
=======

Logging in Synnefo uses Python's logging module. The module is configured
using dictionary configuration, whose format is described here:

http://docs.python.org/release/2.7.1/library/logging.html#logging-config-dictschema

Note that this is a feature of Python 2.7 that we have backported for use in
Python 2.6.

The logging configuration dictionary is defined in settings.d/00-logging.conf
and is broken into separate dictionaries:

  * LOGGING is the logging configuration used by the web app. By default all
    loggers fall back to the main 'synnefo' logger. The subloggers can be
    changed accordingly for finer logging control, e.g., to disable debug
    messages from the API, set the level of 'synnefo.api' to 'INFO'. (A
    minimal sketch of the dictionary format is shown after this list.)

  * DISPATCHER_LOGGING is the logging configuration of the logic/dispatcher.py
    command line tool.

  * SNFADMIN_LOGGING is the logging configuration of the snf-admin tool.
    Consider using matching configuration for snf-admin and the synnefo.admin
    logger of the web app.
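
The exact contents of settings.d/00-logging.conf are deployment-specific; the
following is only a minimal, hypothetical sketch of the dictConfig format that
LOGGING uses (the handler and formatter names are illustrative):

    LOGGING = {
        'version': 1,
        'formatters': {
            'simple': {
                'format': '%(asctime)s [%(levelname)s] %(name)s %(message)s',
            },
        },
        'handlers': {
            'console': {
                'class': 'logging.StreamHandler',
                'formatter': 'simple',
            },
        },
        'loggers': {
            # all synnefo.* subloggers fall back to this one
            'synnefo': {
                'handlers': ['console'],
                'level': 'DEBUG',
            },
            # silence DEBUG messages coming from the API
            'synnefo.api': {
                'level': 'INFO',
            },
        },
    }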

Please note the following:
  * As of Synnefo v0.7, by default the Django webapp logs to syslog, the
    dispatcher logs to /var/log/synnefo/dispatcher.log and the console, and
    snf-admin logs to the console.
  * Different handlers can be set to different logging levels:
    for example, everything may appear on the console, but only INFO and higher
    may actually be stored in a longer-term logfile.


Admin Tools
===========

snf-admin is a tool used to perform various administrative tasks. It needs to
be able to access the Django database, so the environment it runs in must be
able to import the Django settings (e.g., by setting PYTHONPATH as in the
Reconciliation section above).

Additionally, administrative tasks can be performed via the admin web interface
located at /admin. Only users of type ADMIN can access the admin pages. To
change the type of a user to ADMIN, snf-admin can be used:

   snf-admin user modify 42 --type ADMIN

/dev/null
Continuous integration with Jenkins
===================================

Preparing a Git mirror
----------------------

Jenkins cannot currently work with Git over encrypted HTTP. To solve this
problem we currently mirror the central Git repository locally on the Jenkins
installation machine. To set up such a mirror, do the following:

- Edit .netrc:

machine code.grnet.gr
login accountname
password accountpasswd

- Create the mirror:

git clone --mirror https://code.grnet.gr/git/synnefo synnefo

- Set up cron to pull from the mirror periodically. Ideally, Git mirror updates
should run just before Jenkins jobs check the mirror for changes.

4,14,24,34,44,54 * * * * cd /path/to/mirror && git fetch && git remote prune origin

Jenkins setup
-------------

The following instructions will set up Jenkins to run the Synnefo tests with the
SQLite database. To run the tests on MySQL and/or PostgreSQL, step 5 must be
replicated. Also, the correct configuration file must be copied (line 6 of the
build script).

1. Install and start Jenkins. On Debian Squeeze:

   wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | apt-key add -
   echo "deb http://pkg.jenkins-ci.org/debian binary/" >> /etc/apt/sources.list
   echo "deb http://ppa.launchpad.net/chris-lea/zeromq/ubuntu lucid main" >> /etc/apt/sources.list
   sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys C7917B12
   sudo apt-get update
   sudo apt-get install jenkins

   Also install the following packages:

   apt-get install python-virtualenv libcurl3-gnutls libcurl3-gnutls-dev
                   uuid-dev libmysqlclient-dev libpq-dev libsqlite-dev
                   python-dev libzmq-dev

2. After Jenkins starts, go to

   http://$HOST:8080/pluginManager/

   and install the following plug-ins:

   - Jenkins Cobertura Plugin
   - Jenkins Email Extension Plugin
   - Jenkins GIT plugin
   - Jenkins SLOCCount Plug-in
   - Hudson/Jenkins Violations plugin

3. Configure the Jenkins user's Git details:
   su jenkins
   git config --global user.email "buildbot@lists.grnet.gr"
   git config --global user.name "Buildbot"

4. Make sure that all system-level dependencies specified in README.develop
   are correctly installed.

5. Create a new "free-style software" job and set the following values:

- Project name: synnefo
- Source Code Management: Git
- URL of repository: Jenkins Git does not support HTTPS for checking out directly
                     from the repository. The temporary solution is to check out
                     with a cron script into a directory and set the checkout path
                     in this field.
- Branches to build: master and perhaps others
- Git->Advanced->Local subdirectory for repo (optional): synnefo
- Git->Advanced->Prune remote branches before build: check
- Repository browser: redmineweb,
                      URL: https://code.grnet.gr/projects/synnefo/repository/
- Build Triggers->Poll SCM: check
                  Schedule: # every five minutes
                            0,5,10,15,20,25,30,35,40,45,50,55 * * * *

- Build -> Add build step -> Execute shell

Command:

#!/bin/bash -ex
cd synnefo
mkdir -p reports
/usr/bin/sloccount --duplicates --wide --details api util ui logic auth > reports/sloccount.sc
cp conf/ci/manage.py .
if [ ! -e requirements.pip ]; then cp conf/ci/pip-1.2.conf requirements.pip; fi
cat settings.py.dist conf/ci/settings.py.sqlite > settings.py
python manage.py update_ve
python manage.py hudson api db logic

- Post-build Actions->Publish JUnit test result report: check
                      Test report XMLs: synnefo/reports/TEST-*.xml

- Post-build Actions->Publish Cobertura Coverage Report: check
                      Cobertura xml report pattern: synnefo/reports/coverage.xml

- Post-build Actions->Report Violations: check
                      pylint [XML filename pattern]: synnefo/reports/pylint.report

- Post-build Actions->Publish SLOCCount analysis results
                      SLOCCount reports: synnefo/reports/sloccount.sc
                      (also, remember to install sloccount at /usr/bin)
---------------
See also:

http://sites.google.com/site/kmmbvnr/home/django-hudson-tutorial

/dev/null
README.deploy -- Instructions for a basic Synnefo deployment

This document describes the basic steps to obtain a basic, working Synnefo
deployment. It begins by examining the different node roles, then moves to the
installation and setup of distinct software components.

It is current as of Synnefo v0.7.


Node types
==========

Nodes in a Synnefo deployment belong in one of the following types:

 * DB:
   A node [or more than one node, if using an HA configuration], running a DB
   engine supported by the Django ORM layer. The DB is the single source of
   truth for the servicing of API requests by Synnefo.
   Services: PostgreSQL / MySQL

 * APISERVER:
   A node running the implementation of the OpenStack API, in Django. Any number
   of APISERVERs can be used, in a load-balancing configuration, without any
   special consideration. Access to a common DB ensures consistency.
   Services: Web server, vncauthproxy

 * QUEUE:
   A node running the RabbitMQ software, which provides AMQP functionality. More
   than one QUEUE node may be deployed, in an HA configuration. Such
   deployments require shared storage, provided e.g., by DRBD.
   Services: RabbitMQ [rabbitmq-server]

 * LOGIC:
   A node running the business logic of Synnefo, in Django. It dequeues
   messages from QUEUE nodes, and provides the context in which business logic
   functions run. It uses Django ORM to connect to the common DB and update the
   state of the system, based on notifications received from the rest of the
   infrastructure, over AMQP.
   Services: the Synnefo logic dispatcher [/logic/dispatcher.py]

 * GANETI-MASTER and GANETI-NODE:
   A single GANETI-MASTER and a large number of GANETI-NODEs constitute the
   Ganeti backend for Synnefo, which undertakes all VM management functions.
   Any APISERVER can issue commands to the GANETI-MASTER, over RAPI, to effect
   changes in the state of the VMs. The GANETI-MASTER runs the Ganeti request
   queue.
   Services:
     only on GANETI-MASTER:
       the Synnefo Ganeti monitoring daemon [/ganeti/snf-ganeti-eventd]
       the Synnefo Ganeti hook [/ganeti/snf-ganeti-hook.py]
     on each GANETI-NODE:
       a deployment-specific KVM ifup script
       a properly configured NFDHCPD


Installation Process
====================

This section describes the installation process of the various node roles in a
Synnefo deployment.


0. Allocation of physical nodes:
   Determine the role of every physical node in your deployment.


1. Ganeti installation:
   Synnefo requires a working Ganeti installation at the backend. Installation
   of Ganeti is not covered by this document; please refer to
   http://docs.ganeti.org/ganeti/current/html for all the gory details. A
   successful Ganeti installation concludes with a working GANETI-MASTER and a
   number of GANETI-NODEs.


2. RabbitMQ installation:
   RabbitMQ is used as a generic message broker for the system. It should be
   installed on two separate QUEUE nodes (VMs should be enough for the moment)
   in a high availability configuration, as described here:

     http://www.rabbitmq.com/pacemaker.html

   After installation, create a user and set its permissions:
     rabbitmqctl add_user okeanos 0k3@n0s
     rabbitmqctl set_permissions -p / okeanos "^.*" ".*" ".*"

   The values set for the user and password must be mirrored in the
   RABBIT_* variables in settings.py (see step 6), as in the hypothetical
   sketch below.
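
   The exact variable names live in settings.py; the following fragment is a
   hypothetical sketch only, using the RABBIT_* naming convention mentioned
   above and the example credentials from this step:

       RABBIT_HOST = "queue.example.com"   # a QUEUE node; hostname is illustrative
       RABBIT_USERNAME = "okeanos"
       RABBIT_PASSWORD = "0k3@n0s"
       RABBIT_VHOST = "/"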


3. Web server installation:
   A Web server (e.g., Apache) needs to be installed on the APISERVERs,
   and be configured to run the Synnefo Django project appropriately. Selection
   and configuration of a Web server is outside the scope of this document.

   For testing or development purposes, Django's own development server,
   `./manage.py runserver', can be used.


4. Installation of the Synnefo Django project:
   As of v0.5 the Synnefo Django project needs to be installed on nodes
   of type APISERVER and LOGIC, with a properly configured settings.py. In
   later revisions, the specific parts of the Django project which need to run
   on each node type will be identified.

   Synnefo is written in Python 2.6 and depends on the following Python modules
   [package versions confirmed to be compatible are in brackets]:

    * django 1.2 [Django==1.2.4]
    * simplejson [simplejson==2.1.3]
    * pycurl [pycurl==7.19.0]
    * python-dateutil [python-dateutil==1.4.1]
      WARNING: version python-dateutil==2.0 downloaded by pip is known *not* to
               work with Python 2.6
    * python-ipy [IPy==0.75]
        also verified to work with python-ipy 0.70-1 as shipped with Squeeze
    * south [south==0.7.1]
      WARNING: might not work with Debian Squeeze's default south-0.7-1 package.
    * amqplib [amqplib==0.6.1]
    * lockfile [lockfile==0.8]
    * python-daemon [python-daemon==1.5.5]
    * python-prctl [python-prctl==1.3.0]

   Also, depending on the database engine of choice, one of the following:
    * MySQL-python [MySQL-python==1.2.3]
    * psycopg2 [psycopg2==2.4]

   If the invitations application is deployed, the following dependency should
   be installed:
    * pycrypto==2.1.0

   For server-side SSH key pair generation to work, the following module is
   required:
    * M2Crypto==0.20.1

   The integration test suite snf-tools/snf-test depends on:
    * python-unittest2 [unittest2==0.5.1]
    * python-paramiko  [paramiko==1.7.6]; the version included in Debian Squeeze
      is broken wrt its use of RandomPool, see Debian bug #576697
    * python-ipy [IPy==0.75]
    * python-prctl [python-prctl==1.3.0]
    * the client component of vncauthproxy, see step 12
    * the kamaki client library, please see
      https://code.grnet.gr/projects/kamaki for installation instructions.
      [FIXME: Update instructions on kamaki installation]

   To run the user interface tests, selenium must be installed:
    * selenium [?]

   The easiest method for installation of the Django project is to set up a
   working environment through virtualenv. Alternatively, you can use your
   system's package manager to install the dependencies (e.g., MacPorts has them
   all).

   * On Snow Leopard and Linux (64-bit), you have to set the following
     environment variable for pip to compile the dependencies correctly:

	   $ export ARCHFLAGS="-arch x86_64"

   * On Ubuntu, a few more packages must be installed before installing the
     prerequisite Python libraries:

	   $ sudo aptitude install libcurl3-gnutls libcurl3-gnutls-dev uuid-dev

   Check out the code and install the Python prerequisites. This assumes that
   python is already installed on the host.

    $ sudo easy_install virtualenv
    $ git clone https://user@code.grnet.gr/git/synnefo synnefo
    $ virtualenv --python=python2.6 synnefo --no-site-packages
    ...
    $ cd synnefo
    $ ./bin/pip install <list_of_dependencies>

    [WARNING]: The software must be checked out in a directory named synnefo,
    otherwise python imports will not work. Therefore, do not change or
    rename the checkout path.


5. Database installation:
   A database supported by the Django ORM layer must be installed on nodes
   of type DB. The choices are: SQLite, MySQL, PostgreSQL.

   * SQLite:
     The Python sqlite driver is available by default with Python, so no
     additional configuration is required. Also, most self-respecting systems
     have the sqlite library installed by default.

   * MySQL:
      MySQL must be installed first:

      * Ubuntu - Debian
	      $ sudo apt-get install libmysqlclient-dev

      * MacPorts
	      $ sudo port install mysql5

      Install the MySQL Python library on servers running the Django project:

	    $ bin/pip install MySQL-python

      Note: On MacOSX with MySQL installed from MacPorts the above command will
            fail, complaining that it cannot find the mysql_config command. Do
            the following and restart the installation:
	        $ echo "mysql_config = /opt/local/bin/mysql_config5" >> \
                                         ./build/MySQL-python/site.cfg

      Configure a MySQL db/account for synnefo:
	    $ mysql -u root -p

    	mysql> create database synnefo;
	    mysql> show databases;
	    mysql> GRANT ALL on synnefo.* TO username IDENTIFIED BY 'password';

     IMPORTANT:
        MySQL *must* be set in READ-COMMITTED mode, e.g. by setting

        transaction-isolation = READ-COMMITTED

        in the [mysqld] section of /etc/mysql/my.cnf.

        Alternatively, make sure the following code fragment stays enabled
        in settings.d/10-database.conf:

            if DATABASES['default']['ENGINE'].endswith('mysql'):
                DATABASES['default']['OPTIONS'] = {
                        'init_command': 'SET storage_engine=INNODB; ' +
                            'SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED',
                }

   * PostgreSQL
     You need to install the PostgreSQL binaries:
     * Ubuntu - Debian
	     $ sudo apt-get install postgresql-8.4 libpq-dev

     * MacPorts
	     $ sudo port install postgresql84

     Install the postgres Python library:
	    $ bin/pip install psycopg2

     Configure a postgres db/account for synnefo:

     Become the postgres user and connect to PostgreSQL:
       $ sudo su - postgres
       $ psql

     Run the following commands:
	   DROP DATABASE synnefo;
	   DROP USER username;
	   CREATE USER username WITH PASSWORD 'password';
	   CREATE DATABASE synnefo;
	   GRANT ALL PRIVILEGES ON DATABASE synnefo TO username;
	   ALTER DATABASE synnefo OWNER TO username;
	   ALTER USER username CREATEDB;

     The last line enables the newly created user to create databases of his
     own. This is needed for Django to create and drop the test_synnefo
     database for unit testing.


6. Setting up the Django project:
   The settings.py file for Django may be derived by concatenating the
   settings.py.dist file contained in the Synnefo distribution with a file
   containing custom modifications, which shall override all settings deviating
   from the supplied settings.py.dist. This is recommended to minimize the load
   of reconstructing settings.py from scratch, since each release currently
   brings heavy changes to settings.py.dist.

   Add the following to your custom settings.py, depending on your choice
   of DB:
   * SQLite

	 PROJECT_PATH = os.path.dirname(os.path.abspath(__file__)) + '/'

	 DATABASES = {
	     'default': {
		     'ENGINE': 'django.db.backends.sqlite3',
		     'NAME': PROJECT_PATH + 'synnefo.db' # WARN: This must be an absolute path
	     }
	 }

   * MySQL

	 DATABASES = {
	     'default': {
             'ENGINE': 'django.db.backends.mysql',
             'NAME': 'synnefo',
             'USER': 'USERNAME',
             'PASSWORD': 'PASSWORD',
             'HOST': 'HOST',
             'PORT': 'PORT',
             'OPTIONS': {
                 'init_command': 'SET storage_engine=INNODB',
             }
	     }
	 }

   * PostgreSQL

     DATABASES = {
	     'default': {
             'ENGINE': 'django.db.backends.postgresql_psycopg2',
             'NAME': 'DATABASE',
             'USER': 'USERNAME',
             'PASSWORD': 'PASSWORD',
             'HOST': 'HOST',
             'PORT': 'PORT',
	     }
     }

    Try it out. The following command will attempt to connect to the DB and
    print out DDL statements. It should not fail.

	$ ./bin/python manage.py sql db


7. Initialization of the Synnefo DB:
   You need to initialize the Synnefo DB and load the fixtures
   db/fixtures/{users,flavors,images}.json, which make the API usable by end
   users by defining a sample set of users, hardware configurations (flavors)
   and OS images.

   IMPORTANT: Be sure to modify db/fixtures/users.json and select
   a unique token for each of the initial and any other users defined in this
   file. DO NOT LEAVE THE SAMPLE AUTHENTICATION TOKENS enabled in deployed
   configurations.

     $ ./bin/python manage.py syncdb
     $ ./bin/python manage.py migrate
     $ ./bin/python manage.py loaddata db/fixtures/users.json
     $ ./bin/python manage.py loaddata db/fixtures/flavors.json
     $ ./bin/python manage.py loaddata db/fixtures/images.json


8. Finalization of settings.py:
   Set the BACKEND_PREFIX_ID variable to some unique prefix, e.g., your commit
   username, in settings.py. Several functional conventions within the system
   require this variable to include a dash at its end (e.g., snf-), as in the
   hypothetical sketch below.
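
   For example (the prefix is purely illustrative; any unique, dash-terminated
   string works):

       BACKEND_PREFIX_ID = 'vkoukis-'   # must be unique per deployment, ends in '-'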


9. Installation of the Ganeti monitoring daemon, /ganeti/snf-ganeti-eventd:
   The Ganeti monitoring daemon must run on GANETI-MASTER.

   The monitoring daemon is configured through /etc/synnefo/settings.conf.
   An example is provided under snf-ganeti-tools/.

   If run from the repository directory, make sure to have snf-ganeti-tools/
   in the PYTHONPATH.

   You may also build Debian packages directly from the repository:
   $ cd snf-ganeti-tools
   $ dpkg-buildpackage -b -uc -us
   # dpkg -i ../snf-ganeti-tools-*deb

   TBD: how to handle master migration.


10. Installation of the Synnefo dispatcher, /logic/dispatcher.py:
    The logic dispatcher is part of the Synnefo Django project and must run
    on LOGIC nodes.

    The dispatcher retrieves messages from the queue and calls the appropriate
    handler function, as defined in the queue configuration in `settings.py'.
    The default configuration should work directly without any modifications.

    For the time being, the dispatcher must be run by hand:
      $ ./bin/python ./logic/dispatcher.py

    The dispatcher should run in at least 2 instances to ensure high
    (actually, increased) availability.


11. Installation of the Synnefo Ganeti hook:
    The generic Synnefo Ganeti hook wrapper resides in the snf-ganeti-tools/
    directory of the Synnefo repository.

    The hook needs to be enabled for phases post-{add,modify,reboot,start,stop}
    by *symlinking* in
    /etc/ganeti/hooks/instance-{add,modify,reboot,start,stop}-post.d on
    GANETI-MASTER, e.g.:

    root@ganeti-master:/etc/ganeti/hooks/instance-start-post.d# ls -l
    lrwxrwxrwx 1 root root 45 May   3 13:45 00-snf-ganeti-hook -> /home/devel/synnefo/snf-ganeti-hook/snf-ganeti-hook.py

    IMPORTANT: The link name may only contain "upper and lower case, digits,
    underscores and hyphens. In other words, the regexp ^[a-zA-Z0-9_-]+$."
    See:
    http://docs.ganeti.org/ganeti/master/html/hooks.html?highlight=hooks#naming

    If run from the repository directory, make sure to have snf-ganeti-tools/
    in the PYTHONPATH.

    Alternatively, build Debian packages, which take care of building,
    installing and activating the Ganeti hook automatically; see step 9.


12. Installation of the VNC authentication proxy, vncauthproxy:
    To support OOB console access to the VMs over VNC, the vncauthproxy
    daemon must be running on every node of type APISERVER.

    Download and install vncauthproxy from its own repository,
    at https://code.grnet.gr/git/vncauthproxy (known good commit: tag v1.0).

    Download and install a specific repository commit:

    $ bin/pip install -e git+https://code.grnet.gr/git/vncauthproxy@INSERT_COMMIT_HERE#egg=vncauthproxy

    Create /var/log/vncauthproxy and set its permissions appropriately.

    Alternatively, you can build Debian packages. To do so,
    check out the "debian" branch of the vncauthproxy repository
    (known good commit: tag debian/v1.0):

    $ git checkout debian

    Then build the Debian package, and install it as root:

    $ dpkg-buildpackage -b -uc -us
    # dpkg -i ../vncauthproxy_1.0-1_all.deb

    -- Failure to build the package on the Mac:

    libevent, a requirement for gevent, which in turn is a requirement for
    vncauthproxy, is not included in MacOSX by default, and installing it with
    MacPorts does not lead to a version that can be found by the gevent
    build process. A quick workaround is to execute the following commands:

    cd $SYNNEFO
    sudo pip install -e git+https://code.grnet.gr/git/vncauthproxy@5a196d8481e171a#egg=vncauthproxy
    <the above fails>
    cd build/gevent
    sudo python setup.py -I/opt/local/include -L/opt/local/lib build
    cd $SYNNEFO
    sudo pip install -e git+https://code.grnet.gr/git/vncauthproxy@5a196d8481e171a#egg=vncauthproxy


13. Installation of the snf-image Ganeti OS provider for image deployment:
    For Synnefo to be able to launch VMs from specified Images, you need
    the snf-image OS Provider installed on *all* Ganeti nodes.

    Please see https://code.grnet.gr/projects/snf-image/wiki
    for installation instructions and documentation on the design
    and implementation of snf-image.

    Please see https://code.grnet.gr/projects/snf-image/files
    for the latest packages.

    Images should be stored in extdump format in a directory
    of your choice, configurable as IMAGE_DIR in /etc/default/snf-image.


14. Setup of Synnefo-specific networking on the Ganeti backend:
    This part is deployment-specific and must be customized based on the
    specific needs of the system administrators.

    A reference installation will use a Synnefo-specific KVM ifup script,
    NFDHCPD and pre-provisioned Linux bridges to support public and private
    network functionality. For this:

    Grab NFDHCPD from its own repository (https://code.grnet.gr/git/nfdhcpd),
    install it, and modify /etc/nfdhcpd/nfdhcpd.conf to reflect your network
    configuration.

    Install a custom KVM ifup script for use by Ganeti, as
    /etc/ganeti/kvm-vif-bridge, on GANETI-NODEs. A sample implementation is
    provided under /contrib/ganeti-hooks. Set NFDHCPD_STATE_DIR to point
    to NFDHCPD's state directory, usually /var/lib/nfdhcpd.


15. See section "Logging" in README.admin, and edit settings.d/00-logging.conf
    according to your OS and individual deployment characteristics.


16. Optionally, read the okeanos_site/README file to set up the ~okeanos
    introductory site (intro, video/info pages). Please see
    okeanos_site/90-okeanos.sample for a sample configuration file which
    overrides site-specific variables, to be placed under settings.d/, after
    customization.


17. (Hopefully) Done

/dev/null
DEVELOP.txt - Information on how to set up a development environment.

This file documents the installation of a development environment for Synnefo.
It should be read alongside README.deploy.

It contains development-specific amendments to the basic deployment steps
outlined in README.deploy, and development-specific notes.


Installing the development environment
======================================

For a basic development environment you need to follow steps 0-15
of README.deploy, which should be read in its entirety *before* this document.

Development-specific guidelines on each step:


0. Allocation of physical nodes:
   Node types DB, APISERVER and LOGIC may all be run on the same physical
   machine, usually your development workstation.

   Nodes of type GANETI-MASTER, GANETI-NODES and QUEUE are already provided
   by the development Ganeti backend. Access credentials are provided in
   settings.py.dist.


1. You do not need to install your own Ganeti installation.
   Use the RAPI endpoint as contained in settings.py.dist.


2. You do not need to set up your own RabbitMQ nodes; use the AMQP endpoints
   contained in settings.py.dist.

3. For development purposes, Django's own development server,
   `./manage.py runserver', will suffice.


4. Use a virtual environment to install the Django project, or packages provided
   by your distribution.


5. Install a DB of your own, or use the PostgreSQL instance available on the
   development backend.


6. As is.


7. The following fixtures can be loaded optionally, depending on
   testing/development requirements, and are not needed in a production setup:

	$ ./bin/python manage.py loaddata db/fixtures/vms.json
	$ ./bin/python manage.py loaddata db/fixtures/disks.json


8. MAKE SURE you set up a distinct BACKEND_PREFIX_ID, e.g., use your commit
   username.


9. The Ganeti monitoring daemon from the latest Synnefo release is already
   running on the development Ganeti master. You may also run your own, on your
   own Ganeti backend, if you so wish.


10. As is.

11. The Synnefo Ganeti hook is already running on the development backend,
    sending notifications over AMQP.


12. The VNC authentication proxy is already running on the Ganeti development
    backend. You *cannot* run your own, unless you install your own Ganeti
    backend, because it needs direct access to the hypervisor's VNC port on
    GANETI-NODEs.

    Note: You still need to install the vncauthproxy package to satisfy
    the dependency of the API on the vncauthproxy client. See Synnefo #807
    for more details.


13. The development Ganeti backend already has a number of OS Images available.


14. The development Ganeti backend already has a number of pre-provisioned
    bridges available, one per BACKEND_PREFIX_ID.

    To set up simple NAT-based networking on a Ganeti backend on your own,
    please see the provided patches under contrib/patches/.
    You will need minor patches to the sample KVM ifup hook, kvm-vif-bridge,
    and a small patch to NFDHCPD to enable it to work with bridged tap+
    interfaces. To support bridged tap interfaces you also need to patch the
    python-nfqueue package; patches against python-nfqueue-0.3 [part of Debian
    Sid] are also provided under contrib/patches/.


15. As is.


16. As is.


17. [OPTIONAL] Create settings.d/99-local.conf and insert local overrides for
    settings.d/*. This will allow pulling new files without needing to reapply
    any local modifications, as in the hypothetical sketch below.
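
    For example (the values are purely illustrative):

        # settings.d/99-local.conf
        DEBUG = True
        BACKEND_PREFIX_ID = 'myuser-'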


South Database Migrations
=========================

* Initial Migration

First, remember to add the south app to settings.py (it is already included in
settings.py.dist).

To initialise south migrations in your database, the following commands must be
executed:

    $ ./bin/python manage.py syncdb       # Create / update the database with the south tables
    $ ./bin/python manage.py migrate db   # Perform migration in the database

Note that syncdb will create the latest models that exist in the db app, so some
migrations may fail. If you are sure a migration has already taken place, you
must use the "--fake" option to apply it.

For example:

    $ ./bin/python manage.py migrate db 0001 --fake

To be sure that all migrations are applied, type:

    $ ./bin/python manage.py migrate db --list

All starred migrations are applied.

Remember, the migration is performed mainly for the data, not for the database
schema. If you do not want to migrate the data, a syncdb and fake migrations for
all the migration versions will suffice.

* Schema migrations:

Do not use the syncdb management command. It can only be used the first time
and/or if you drop the database and must recreate it from scratch. See the
"Initial Migration" section.

Every time you make changes to the database and data migration is not required
(WARNING: always perform this with extreme care):

    $ ./bin/python manage.py schemamigration db --auto

The above will create the migration script. Now this must be applied to the
live database:

    $ ./bin/python manage.py migrate db

Consider this example (adding a field to the SynnefoUser model):

    $ ./bin/python manage.py schemamigration db --auto
     + Added field new_south_test_field on db.SynnefoUser
     Created 0002_auto__add_field_synnefouser_new_south_test_field.py.

  You can now apply this migration with: ./manage.py migrate db

    $ ./manage.py migrate db
     Running migrations for db:
     - Migrating forwards to 0002_auto__add_field_synnefouser_new_south_test_field.
     > db:0002_auto__add_field_synnefouser_new_south_test_field
     - Loading initial data for db.
    Installing json fixture 'initial_data' from '/home/bkarak/devel/synnefo/../synnefo/db/fixtures'.
    Installed 1 object(s) from 1 fixture(s)

South needs some extra definitions in the model to preserve and migrate the
existing data; for example, if we add a field to a model, we should declare its
default value. If not, South will probably fail, after indicating the error:

    $ ./bin/python manage.py schemamigration db --auto
     ? The field 'SynnefoUser.new_south_field_2' does not have a default specified, yet is NOT NULL.
     ? Since you are adding or removing this field, you MUST specify a default
     ? value to use for existing rows. Would you like to:
     ?  1. Quit now, and add a default to the field in models.py
     ?  2. Specify a one-off value to use for existing columns now
     ? Please select a choice: 1

* Data migrations:

If we need to do data migration as well, for example rename a field, we use the
'datamigration' management command.

In contrast with schemamigration, to perform complex data migration we must
write the script manually. The process is the following (a fuller sketch of a
migration script is shown after this list):

    1. Introduce the changes in the code and fixtures (initial data).
    2. Execute:

    $ ./bin/python manage.py datamigration <migration_name_here>

    For example:

    $ ./bin/python manage.py datamigration db rename_credit_wallet
    Created 0003_rename_credit_wallet.py.

    3. We edit the generated script. It contains two methods: forwards and
    backwards.

    For database operations (column additions, alter tables etc.) we use the
    South database API (http://south.aeracode.org/docs/databaseapi.html).

    To access the data, we use the database reference (orm) provided as a
    parameter in the forwards and backwards method declarations in the
    migration script. For example:

    class Migration(DataMigration):

    def forwards(self, orm):
        orm.SynnefoUser.objects.all()

    4. To migrate the database to the latest version, we execute:

    ./manage.py migrate db
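
Continuing the rename_credit_wallet example, the generated script might be
filled in roughly as follows (the field names credit and credit_wallet are
hypothetical; the frozen orm argument exposes the models as they existed at
this migration):

    from south.v2 import DataMigration

    class Migration(DataMigration):

        def forwards(self, orm):
            # copy data from the old column into the new one
            for user in orm.SynnefoUser.objects.all():
                user.credit_wallet = user.credit
                user.save()

        def backwards(self, orm):
            # undo the copy, so the migration can be reversed
            for user in orm.SynnefoUser.objects.all():
                user.credit = user.credit_wallet
                user.save()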

To see which migrations are applied:

    $ ./bin/python manage.py migrate db --list

      db
        (*) 0001_initial
        (*) 0002_auto__add_field_synnefouser_new_south_test_field
        (*) 0003_rename_credit_wallet

More information and more thorough examples can be found on the South web
site:

http://south.aeracode.org/


UI Testing
==========
The functional UI tests require the Selenium server and the Synnefo app to
be running.

    $ wget http://selenium.googlecode.com/files/selenium-server-standalone-2.0b2.jar
    $ java -jar selenium-server-standalone-2.0b2.jar &
    $ ./bin/python manage.py runserver &
    $ ./bin/python manage.py test ui


Test coverage
=============

In order to get code coverage reports, you need to install django-test-coverage:

   $ ./bin/pip install django-test-coverage

Then edit your settings.py and configure the test runner:

   TEST_RUNNER = 'django-test-coverage.runner.run_tests'

/dev/null
DJANGO TRANSLATIONS OF STATIC TEXT

0) From our project's base, we add the directory locale:

     $ mkdir locale

   then we add the language codes to settings.py, e.g.:

     LANGUAGES = (
      ('el', u'Ελληνικά'),
      ('en', 'English'),
     )

1) For each language we want to add, we run makemessages from our project's
   base:

     $ ./bin/django-admin.py makemessages -l el -e html,txt,py
     (./bin/django-admin.py makemessages -l el -e html,txt,py --ignore=lib/*)

   This will add the Greek language, and we specify that html, txt and python
   files contain translatable strings.

2) We translate our strings.

   In .py files (e.g., views.py), we add at the beginning of the file
   `from django.utils.translation import gettext_lazy as _', and then each
   string that needs translation becomes like this: _('string'),
   e.g.:

     help_text=_("letters and numbers only"))
     'title': _('Ubuntu 10.10 server 64bit'),

   In Django templates (html files), at the beginning of the file we add
   {% load i18n %}; then each string that needs to be translated is put in
   {% trans "string" %}, for example {% trans "Home" %}.

3) When we have put our strings to be translated, from the project's base we run

     $ django-admin.py makemessages -l el -e html,txt,py

   processing language el. This creates (or updates) the po file for the Greek
   language. We run this command each time we add new strings to be translated.
   After that, we can translate our strings in the po file
   (locale/el/LC_MESSAGES/django.po).

4) When we are ready, we run the following command from the project's base:

     $ ./bin/django-admin.py compilemessages

   This compiles the po files to mo. Our strings will appear translated once we
   change the language (e.g., from a dropdown menu in the page).

More info:
http://docs.djangoproject.com/en/dev/topics/i18n/internationalization/

/dev/null
README.storage -- Instructions for RADOS cluster deployment and administration

This document describes the basic steps to obtain a working RADOS cluster /
object store installation, to be used as a storage backend for Synnefo, and
provides information about its administration.

It begins by providing general information on the RADOS object store, describing
the different nodes in a RADOS cluster, and then moves to the installation and
setup of the distinct software components. Finally, it provides some basic
information about cluster administration and debugging.

RADOS is the object storage component of the Ceph project
(http://ceph.newdream.net). For more documentation, see the official wiki
(http://ceph.newdream.net/wiki), and the official documentation
(http://ceph.newdream.net/docs). Usage information for the userspace tools, used
to administer the cluster, is also available in the respective manpages.


RADOS Intro
===========
RADOS is the object storage component of Ceph.

An object, in this context, means a named entity that has

 * a name: a sequence of bytes, unique within its container, that is used to
   locate and access the object
 * content: a sequence of bytes
 * metadata: a mapping from keys to values

RADOS takes care of distributing the objects across the whole storage cluster
and replicating them for fault tolerance.
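
Assuming the python-rados bindings shipped with Ceph are installed, a client
can store and retrieve such objects with a few lines of Python (the pool and
object names are illustrative):

    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    ioctx = cluster.open_ioctx('data')             # open the 'data' pool
    ioctx.write_full('hello_object', 'hello')      # object content
    ioctx.set_xattr('hello_object', 'lang', 'en')  # object metadata
    print ioctx.read('hello_object')

    ioctx.close()
    cluster.shutdown()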


Node types
==========

Nodes in a RADOS deployment belong in one of the following types:

 * Monitor:
   A lightweight daemon (ceph-mon) that provides a consensus for distributed
   decision-making in a Ceph/RADOS cluster. It is also the initial point of
   contact for new clients, and will hand out information about the topology of
   the cluster, such as the osdmap.

   You normally run 3 ceph-mon daemons, on 3 separate physical machines,
   isolated from each other; for example, in different racks or rows. You could
   run just 1 instance, but that means giving up on high availability.

   Any decision requires the majority of the ceph-mon processes to be healthy
   and communicating with each other. For this reason, you never want an even
   number of ceph-mons; there is no unambiguous majority subgroup for an even
   number.

 * OSD:
   A storage daemon (ceph-osd) that provides the RADOS service. It uses the
   monitor servers for cluster membership, services object read/write/etc.
   requests from clients, and peers with other ceph-osds for data replication.

   The data model is fairly simple on this level. There are multiple named
   pools, and within each pool there are named objects, in a flat namespace (no
   directories). Each object has both data and metadata.

   By default, three pools are created (data, metadata, rbd).

   The data for an object is a single, potentially big, series of bytes.
   Additionally, the series may be sparse: it may have holes that contain binary
   zeros, and take up no actual storage.

   The metadata is an unordered set of key-value pairs. Its semantics are
   completely up to the client.

   Multiple OSDs can run on one node, one for each disk included in the object
   store. This might impose a performance overhead, due to peering/replication.
   Alternatively, disks can be pooled together (either with RAID or with btrfs),
   requiring only one OSD to manage the pool.

   In the case of multiple OSDs, care must be taken to generate a CRUSH map
   which doesn't replicate objects across OSDs on the same host (see the next
   section).

 * Clients:
   Clients can access the RADOS cluster either directly, at an object
   granularity, by using librados and the rados userspace tool, or by using
   librbd, and the rbd tool, which creates an image / volume abstraction over
   the object store.

   RBD images are striped over the object store daemons, to provide higher
   throughput, and can be accessed either via the in-kernel Rados Block Device
   (RBD) driver, which maps RBD images to block devices, or directly via Qemu,
   and the Qemu-RBD driver.


Replication and Fault tolerance
===============================

The objects in each pool are partitioned into a (per-pool configurable) number
of placement groups (PGs), and each placement group is mapped to a number of
OSDs, according to the (per-pool configurable) replication level, and a
(per-pool configurable) CRUSH map, which defines how objects are replicated
across OSDs. For example, in a pool with pg_num = 8 and replication size 2,
each of the 8 PGs is stored on 2 OSDs chosen by CRUSH.

The CRUSH map is generated with hints from the config file (e.g., hostnames,
racks etc.), so that the objects are replicated across OSDs in different
'failure domains'. However, in order to be on the safe side, the CRUSH map
should be examined to verify that, for example, PGs are not replicated across
OSDs on the same host, and corrected if needed (see the Administration Notes
section).

Information about objects, pools, and PGs is included in the osdmap, which
the clients fetch initially from the monitor servers. Using the osdmap,
clients learn which OSD is the primary for each PG, and therefore know which
OSD to contact when they want to interact with a specific object.

More information about the internals of the replication / fault tolerance /
peering inside the RADOS cluster can be found in the original RADOS paper
(http://dl.acm.org/citation.cfm?id=1374606).


Journaling
==========

The OSD maintains a journal to help keep all on-disk data in a consistent state
while still keeping write latency low. That is, each OSD normally has a back-end
file system (ideally btrfs) and a journal device or file.

When the journal is enabled, all writes are written both to the journal and to
the file system. This is somewhat similar to ext3's data=journal mode, with a
few differences. There are two basic journaling modes:

 * In writeahead mode, every write transaction is written first to the journal.
   Once that is safely on disk, we can ack the write and then apply it to the
   back-end file system. This will work with any file system (with a few
   caveats).

 * In parallel mode, every write transaction is written to the journal and the
   file system in parallel. The write is acked when either one safely commits
   (usually the journal). This will only work on btrfs, as it relies on
   btrfs-specific snapshot ioctls to roll back to a consistent state before
   replaying the journal.


Authentication
==============

Ceph supports cephx secure authentication between the nodes; this makes your
cluster more secure. There are some issues with the cephx authentication,
especially with clients (Qemu-RBD), and it complicates the cluster deployment.
Future revisions of this document will include documentation on setting up
fine-grained cephx authentication across the cluster.


RADOS Cluster design and configuration
======================================

This section proposes and describes a sample cluster configuration.

0. Monitor servers:
	* 3 mon servers on separate 'failure domains' (e.g., racks)
	* Monitor servers are named mon.a, mon.b, mon.c respectively
	* Monitor data is stored in /rados/mon.$id (should be created)
	* Monitor servers bind on TCP port 6789, which should not be blocked by
	  a firewall
	* Ceph configuration section for monitors:
		[mon]
			mon data = /rados/mon.$id

		[mon.a]
			host = [hostname]
			mon addr = [ip]:6789
		[mon.b]
			host = [hostname]
			mon addr = [ip]:6789
		[mon.c]
			host = [hostname]
			mon addr = [ip]:6789

	* Debugging options which can be included in the monitor configuration:
		[mon]
			;show monitor messaging traffic
			debug ms = 1
			;show monitor debug messages
			debug mon = 20
			;show Paxos debug messages (consensus protocol)
			debug paxos = 20

1. OSD servers:
	* A numeric id is used to name the osds (osd.0, osd.1, ..., osd.n)
	* OSD servers bind on TCP ports 6800+, which should not be blocked by
	  a firewall
	* OSD data is stored in /rados/osd.$id (should be created and mounted if
	  needed)
	* /rados/osd.$id can be either a directory on the rootfs, or a separate
	  partition, on a dedicated fast disk (recommended)

	  The upstream recommended filesystem is btrfs. btrfs will use the parallel
	  mode for OSD journaling.

	  Alternatively, ext4 can be used. ext4 will use the writeahead mode for OSD
	  journaling. ext4 itself can also use an external journal device
	  (preferably a fast, e.g., SSD, disk). In that case, the filesystem can be
	  mounted with the data=journal,commit=9999,noatime,nodiratime options, to
	  improve performance (proof?):

		mkfs.ext4 /dev/sdyy
		mke2fs -O journal_dev /dev/sdxx
		tune2fs -O ^has_journal /dev/sdyy
		tune2fs -o journal_data -j -J device=/dev/sdxx /dev/sdyy
		mount /dev/sdyy /rados/osd.$id -o noatime,nodiratime,data=journal,commit=9999

	* The OSD journal can be either on a raw block device, a separate
	  partition, or a file.

	  A fast disk (SSD) is recommended as a journal device.

	  If a file is used, the journal size must also be specified in the
	  configuration.

	* Ceph configuration section for OSDs:
		[osd]
			osd data = /rados/osd.$id
			osd journal = /dev/sdzz
			;if a file is used as a journal
			;osd journal size = N (in MB)

		[osd.0]
			;host and rack directives are used to generate a CRUSH map for PG
			;placement
			host = [hostname]
			rack = [rack]

			;public addr is the one the clients will use to contact the osd
			public_addr = [public ip]
			;cluster addr is the one used for osd-to-osd replication/peering etc
			cluster_addr = [cluster ip]

		[osd.1]
			...

	* Debug options which can be included in the osd configuration:
		[osd]
			;show OSD messaging traffic
			debug ms = 1
			;show OSD debug information
			debug osd = 20
			;show OSD journal debug information
			debug journal = 20
			;show filestore debug information
			debug filestore = 20
			;show monitor client debug information
			debug monc = 20

2. Clients:
	* Client configuration only needs the monitor servers' addresses
	* Configuration section for clients:
		[mon.a]
			mon addr = [ip]:6789
		[mon.b]
			mon addr = [ip]:6789
		[mon.c]
			mon addr = [ip]:6789
	* Debug options which can be included in the client configuration:
			;show client messaging traffic
			debug ms = 1
			;show RADOS debug information
			debug rados = 20
			;show objecter debug information
			debug objecter = 20
			;show filer debug information
			debug filer = 20
			;show objectcacher debug information
			debug object cacher = 20

3. Tips:
	* Mount all the filesystems with the noatime,nodiratime options
	* Even without any debug options, RADOS generates lots of logs. Make sure
	  the log files are on a fast disk, with little I/O traffic, and that the
	  partition is mounted with noatime.


Installation Process
====================

This section describes the installation process of the various software
components in a RADOS cluster.

0. Add the Ceph Debian repository in /etc/apt/sources.list on every node (mon,
   osd, clients):
	 deb http://ceph.newdream.net/debian/ squeeze main
	 deb-src http://ceph.newdream.net/debian/ squeeze main

1. Monitor and OSD servers:
	* Install the ceph package
	* Upgrade to an up-to-date kernel (>=3.x)
	* Edit /etc/ceph/ceph.conf to include the mon and osd configuration
	  sections, shown previously.
	* Create the corresponding dirs in /rados (mon.$id and osd.$id)
	* (optionally) Format and mount the osd.$id partition in /rados/osd.$id
	* Make sure the journal device specified in the conf exists.
	* (optionally) Make sure everything is mounted with the noatime,nodiratime
	  options
	* Make sure monitor and osd servers can freely ssh to each other, using only
	  hostnames.
	* Create the object store:
		mkcephfs -a -c /etc/ceph/ceph.conf
	* Start the servers:
		service ceph -a start
	* Verify that the object store is healthy and running:
		ceph health
		ceph -s

2. Clients:
	* Install the ceph-common package
	* Upgrade to an up-to-date kernel (>=3.x)
	* Install linux-headers for the new kernel
	* Check out the latest ceph-client git repo:
		git clone git://github.com/NewDreamNetwork/ceph-client.git
	* Copy the necessary ceph header files to linux-headers:
		cp -r ceph-client/include/linux/ceph/* /usr/src/linux-$(uname -r)/include/linux/ceph/
	* Build the modules:
		cd ~/ceph-client/net/ceph/
		make -C /usr/src/linux-headers-3.0.0-2-amd64/ M=$(pwd) libceph.ko
		cp Modules.symvers ../../drivers/block/
		cd ~/ceph-client/drivers/block/
		make -C /usr/src/linux-headers-3.0.0-2-amd64/ M=$(pwd) rbd.ko
	* Optionally, copy rbd.ko and libceph.ko to /lib/modules/
	* Load the modules:
		modprobe rbd


Administration Notes
====================

This section includes some notes on RADOS cluster administration.

0. Starting / Stopping servers:
	* service ceph -a start/stop (affects all the servers in the cluster)
	* service ceph start/stop osd (affects only the osds on the current node)
	* service ceph start/stop mon (affects only the mons on the current node)
	* service ceph start/stop osd.$id/mon.$id (affects only the specified daemon)

	* service ceph cleanlogs/cleanalllogs

1. Stop the cluster cleanly:
	ceph stop

2. Increase the replication level for a given pool:
	ceph osd pool set $poolname size $size

   Note that when increasing the replication level, the overhead of the
   replication will impact performance.

3. Adjust the number of placement groups per pool:
	ceph osd pool set $poolname pg_num $num

   The default number of PGs per pool is determined by the number of OSDs in the
   cluster, and the replication level of the pool (for 4 OSDs and replication
   size 2, the default value is 8). The default pools (data, metadata, rbd) are
   assigned 256 PGs.

   After the splitting is complete, the number of PGs in the system must be
   changed. Warning: this is not considered safe on PGs in use (with objects),
   and should be changed only when the PG is created, and before being used:
	ceph osd pool set $poolname pgp_num $num

4. Replacing the journal for osd.$id:
	Edit the osd.$id journal configuration section
	ceph-osd -i osd.$id --mkjournal
	ceph-osd -i osd.$id --osd-journal /path/to/journal

5. Add a new OSD:
	Edit /etc/ceph/ceph.conf to include the new OSD
	ceph mon getmap -o /tmp/monmap
	ceph-osd --mkfs -i osd.$id --monmap /tmp/monmap
	ceph osd setmaxosd [maxosd+1] (ceph osd getmaxosd to get the number of osds if needed)
	service ceph start osd.$id

	Generate the CRUSH map to include the new osd in PGs:
		osdmaptool --createsimple [maxosd] --clobber /tmp/osdmap --export-crush /tmp/crush
		ceph osd setcrushmap -i /tmp/crush
	Or edit the CRUSH map by hand:
		ceph osd getcrushmap -o /tmp/crush
		crushmaptool -d /tmp/crush -o crushmap
		vim crushmap
		crushmaptool -c crushmap -o /tmp/crush
		ceph osd setcrushmap -i /tmp/crush

6. General ceph tool commands:
	* ceph mon stat (stat mon servers)
	* ceph mon getmap (get the monmap, use monmaptool to edit)
	* ceph osd dump (dump osdmap -> pool info, osd info)
	* ceph osd getmap (get osdmap -> use osdmaptool to edit)
	* ceph osd lspools
	* ceph osd stat (stat osd servers)
	* ceph osd tree (osd server info)
	* ceph pg dump/stat (show info about PGs)

7. rados userspace tool:

   The rados userspace tool (included in the ceph-common package) uses librados
   to communicate with the object store.

	* rados mkpool [pool]
	* rados rmpool [pool]
	* rados df (show usage per pool)
	* rados lspools (list pools)
	* rados ls -p [pool] (list objects in [pool])
	* rados bench [secs] write|seq -t [concurrent operations]
	* rados import/export <pool> <dir> (import/export a local directory in a rados pool)

8. rbd userspace tool:

   The rbd userspace tool (included in the ceph-common package) uses librbd and
   librados to communicate with the object store.

	* rbd ls -p [pool] (list RBD images in [pool], default pool = rbd)
	* rbd info [image] -p [pool]
	* rbd create [image] --size n (in MB)
	* rbd rm [image]
	* rbd export/import [dir] [image]
	* rbd cp/mv [image] [dest]
	* rbd resize [image]
	* rbd map [image] (map an RBD image to a block device using the in-kernel RBD driver)
	* rbd unmap /dev/rbdX (unmap an RBD device)
	* rbd showmapped

9. In-kernel RBD driver:

   The in-kernel RBD driver can be used to map and unmap RBD images as block
   devices. Once mapped, they will appear as /dev/rbdX, and a symlink will be
   created in /dev/rbd/[poolname]/[imagename]:[bdev id].

   It also exports a sysfs interface, under /sys/bus/rbd/, which can be used to
   add / remove / list devices, although the rbd map/unmap/showmapped commands
   are preferred.
... This diff was truncated because it exceeds the maximum size that can be displayed.
