============================
RADOS/Ceph support in Ganeti
============================

.. contents:: :depth: 4

Objective
=========

The project aims to improve Ceph RBD support in Ganeti. It can be
primarily divided into the following tasks:

- Use the Qemu/KVM RBD driver to provide instances with direct RBD
  support.
- Allow configuration of Ceph RBDs through Ganeti.
- Write a data collector to monitor Ceph nodes.

Background
==========

Ceph RBD
--------

Ceph is a distributed storage system which provides data access as
files, objects and blocks. As part of this project, we're interested in
integrating Ceph's block device (RBD) directly with Qemu/KVM.

The primary components/daemons of Ceph are:

- Monitor - serves as the authentication point for clients.
- Metadata - stores all the filesystem metadata (not configured here, as
  it is not required for RBD).
- OSD - object storage devices; one daemon for each drive/location.

RBD support in Ganeti
---------------------

Currently, Ganeti supports RBD volumes on a pre-configured Ceph cluster.
This is enabled through RBD disk templates. These templates allow access
to RBD volumes through the RBD Linux kernel driver. The volumes are
mapped to the host as local block devices, which are then attached to
the instances. This method incurs an additional overhead. We plan to
remove this overhead by using Qemu's RBD driver to enable direct access
to RBD volumes for KVM instances.

Also, Ganeti currently only uses RBD volumes on a pre-configured Ceph
cluster. Allowing configuration of Ceph nodes through Ganeti will be a
good addition to its prime features.

Qemu/KVM Direct RBD Integration
===============================

A new disk parameter ``access`` is introduced. It is added at the
cluster/node-group level to simplify the prototype implementation. It
specifies the access method as either ``userspace`` or ``kernelspace``.
It is accessible to StartInstance() in hv_kvm.py. The device path,
``rbd:<pool>/<vol_name>``, is generated by RADOSBlockDevice and is
added to the params dictionary as ``kvm_dev_path``.
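
As a rough illustration only, the hypervisor code could pick the device
string based on this parameter. The helper name and dictionary layout
below are assumptions made for this sketch, not existing Ganeti code::

  def _GetDiskDevicePath(disk_params, local_dev_path):
    """Return the device string passed to KVM (illustrative sketch).

    @param disk_params: disk parameters, e.g.
        {"access": "userspace", "kvm_dev_path": "rbd:rbd/vol1"}
    @param local_dev_path: the locally mapped device, e.g. "/dev/rbd0"

    """
    if disk_params.get("access") == "userspace":
      # Qemu opens the volume directly through librbd; the kernel
      # mapping is not used for instance I/O in this mode
      return disk_params["kvm_dev_path"]
    # kernelspace (default): use the local block device mapping
    return local_dev_path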

This approach ensures that no disk-template-specific changes are
required in hv_kvm.py, allowing easy integration of other distributed
storage systems (like Gluster).

Note that the RBD volume is mapped as a local block device as before.
The local mapping won't be used during instance operation in the
``userspace`` access mode, but can be used by administrators and OS
scripts.
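
For reference, the local mapping mentioned above is the standard RBD
kernel-driver mapping and can be inspected with the usual ``rbd``
tooling, e.g.::

  $ rbd map <pool>/<vol_name>    # creates a /dev/rbdN device on the host
  $ rbd showmapped               # lists the current kernel mappings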

Updated commands
----------------

::

  $ gnt-instance info

``access:userspace/kernelspace`` will be added to the Disks category.
This output applies to KVM-based instances only.
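
For illustration only (this design does not fix the exact output
format), the relevant disk entry could look roughly like::

  Disks:
    - disk/0: rbd, size 10.0G
        access: userspace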

Ceph configuration on Ganeti nodes
==================================

This document proposes the configuration of a distributed storage
pool (Ceph or Gluster) through Ganeti. Currently, this design document
focuses on configuring a Ceph cluster. A prerequisite of this setup
is the installation of the Ceph packages on all the concerned nodes.

At Ganeti cluster init, the user will set distributed-storage-specific
options which will be stored at the cluster level. The storage cluster
will be initialized using ``gnt-storage``. For the prototype, only a
single storage pool/node-group is configured.

The following steps take place when a node-group is initialized as a
storage cluster.

  - Check for an existing Ceph cluster through the /etc/ceph/ceph.conf
    file on each node (a minimal detection sketch follows this list).
  - Fetch the cluster configuration parameters and create a distributed
    storage object accordingly.
  - Issue an 'init distributed storage' RPC to the group nodes (if any).
  - On each node, the ``ceph`` cli tool will run the appropriate
    services.
  - Mark nodes as well as the node-group as distributed-storage-enabled.
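
A minimal sketch of the first two steps, assuming a hypothetical helper
named ``_DetectExistingCephCluster`` and that the relevant parameters
live in the ``[global]`` section of ``/etc/ceph/ceph.conf``::

  import os
  from configparser import ConfigParser

  CEPH_CONF = "/etc/ceph/ceph.conf"

  def _DetectExistingCephCluster():
    """Return existing cluster parameters, or None if none are found."""
    if not os.path.exists(CEPH_CONF):
      return None
    config = ConfigParser()
    config.read(CEPH_CONF)
    # e.g. fsid and monitor addresses; these would feed the distributed
    # storage object created in the next step
    if config.has_section("global"):
      return dict(config.items("global"))
    return {}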

The storage cluster will operate at the node-group level. The Ceph
cluster will be initialized using gnt-storage. A new sub-command,
``init-distributed-storage``, will be added to it.

The configuration of the nodes will be handled through an init function
called by the node daemons running on the respective nodes. A new RPC is
introduced to handle these calls.

A new object will be created to send the storage parameters to the
node: storage_type, devices, node_role (mon/osd), etc.
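
A rough sketch of what such an object could carry; the class name and
serialization below are illustrative, not an existing Ganeti object::

  class DistributedStorageParams(object):
    """Parameters shipped with the 'init distributed storage' RPC."""

    def __init__(self, storage_type, devices, node_role):
      self.storage_type = storage_type  # e.g. "ceph"
      self.devices = devices            # e.g. ["/dev/sdb"]
      self.node_role = node_role        # "mon" and/or "osd"

    def ToDict(self):
      """Serialize the parameters for the RPC payload."""
      return {
        "storage_type": self.storage_type,
        "devices": self.devices,
        "node_role": self.node_role,
      }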

A new node can be directly assigned to the storage-enabled node-group.
During the 'gnt-node add' process, the required Ceph daemons will be
started and the node will be added to the Ceph cluster.
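
For example, assuming the storage-enabled node-group is named
``storage`` and the new node is ``node4.example.com`` (both names are
illustrative)::

  $ gnt-node add -g storage node4.example.com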

Only an offline node can be assigned to a storage-enabled node-group.
``gnt-node add --readd`` needs to be performed to issue the RPCs for
spawning the appropriate services on the newly assigned node.

Updated Commands
----------------

The following are the affected commands.::

  $ gnt-cluster init -S ceph:disk=/dev/sdb,option=value...

During cluster initialization, Ceph-specific options are provided which
apply at the cluster level.::

  $ gnt-cluster modify -S ceph:option=value2...

For now, cluster modification will only be allowed when there is no
initialized storage cluster.::

  $ gnt-storage init-distributed-storage -s{--storage-type} ceph \
    <node-group>

This ensures that no other node-group is configured as a distributed
storage cluster and configures Ceph on the specified node-group. If
there is no node in the node-group, it will only be marked as
distributed storage enabled and no action will be taken.::

  $ gnt-group assign-nodes <group> <node>

This ensures that the node is offline if the specified node-group is
distributed storage capable. Ceph configuration on the newly assigned
node is not performed at this step.::

  $ gnt-node modify --offline yes <node>

If the node is part of a storage node-group, an offline call will
stop/remove the Ceph daemons.::

  $ gnt-node add --readd <node>

If the node is now part of the storage node-group, an init distributed
storage RPC is issued to the respective node. This step is required
after assigning a node to the storage-enabled node-group.::

  $ gnt-node remove <node>

A warning will be issued stating that the node is part of distributed
storage; mark it offline before removal.

Data collector for Ceph
-----------------------

TBD

Future Work
-----------

Due to the loopback bug in Ceph, one may run into daemon hang issues
while performing writes to an RBD volume through the block device
mapping. This bug applies only when the RBD volume is stored on an OSD
running on the local node. In order to mitigate this issue, we can
create storage pools on different node-groups and access RBD volumes
on different pools.
http://tracker.ceph.com/issues/3076

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: