============================
RADOS/Ceph support in Ganeti
============================

.. contents:: :depth: 4

Objective
=========

The project aims to improve Ceph RBD support in Ganeti. It can be
primarily divided into the following tasks:

- Use the Qemu/KVM RBD driver to provide instances with direct RBD
  support.
- Allow configuration of Ceph RBDs through Ganeti.
- Write a data collector to monitor Ceph nodes.

Background
==========

Ceph RBD
--------

Ceph is a distributed storage system that provides data access as
files, objects and blocks. As part of this project, we are interested
in integrating Ceph's block device (RBD) directly with Qemu/KVM.

The primary components/daemons of Ceph are:

- Monitor - serves as the authentication point for clients.
- Metadata - stores all the filesystem metadata (not configured here,
  as it is not required for RBD).
- OSD - object storage devices, one daemon per drive/location.

RBD support in Ganeti
---------------------

Currently, Ganeti supports RBD volumes on a pre-configured Ceph
cluster. This is enabled through the RBD disk template, which provides
access to RBD volumes via the RBD Linux (kernel) driver: the volumes
are mapped to the host as local block devices, which are then attached
to the instances. This method incurs an additional overhead. We plan to
remove that overhead by using Qemu's RBD driver to give KVM instances
direct access to RBD volumes.

Also, Ganeti currently requires the Ceph cluster to be configured
outside of Ganeti. Allowing configuration of Ceph nodes through Ganeti
would be a good addition to its prime features.

Qemu/KVM Direct RBD Integration
===============================

A new disk parameter ``access`` is introduced. It is added at
cluster/node-group level to simplify the prototype implementation. It
specifies the access method, either ``userspace`` or ``kernelspace``,
and is accessible to ``StartInstance()`` in ``hv_kvm.py``. The device
path, ``rbd:<pool>/<vol_name>``, is generated by ``RADOSBlockDevice``
and is added to the params dictionary as ``kvm_dev_path``.

This approach ensures that no disk-template-specific changes are
required in ``hv_kvm.py``, allowing easy integration of other
distributed storage systems (like Gluster).
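
A minimal sketch of the intended selection logic, assuming a
hypothetical helper inside ``hv_kvm.py`` (illustrative only, not the
actual implementation)::

  def _GetDiskDevPath(disk_params, kernel_dev_path):
    """Choose the path QEMU should open for one instance disk.

    ``access`` and ``kvm_dev_path`` are the parameters described
    above; ``kernel_dev_path`` is the local block device created by
    the kernel RBD mapping (e.g. /dev/rbd0).
    """
    if disk_params.get("access") == "userspace":
      # Direct librbd access, e.g. "rbd:<pool>/<vol_name>" as
      # generated by RADOSBlockDevice.
      return disk_params["kvm_dev_path"]
    # "kernelspace" (default): use the locally mapped block device.
    return kernel_dev_path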

Note that the RBD volume is mapped as a local block device as before.
The local mapping won't be used during instance operation in the
``userspace`` access mode, but can be used by administrators and OS
scripts.

Updated commands
----------------

::

  $ gnt-instance info

``access:userspace/kernelspace`` will be added to the Disks category.
This output applies to KVM-based instances only.
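
For illustration only (the exact formatting is not fixed by this
design), the relevant part of the output might look like::

  Disks:
    - disk/0: rbd, size 10.0G
      access: userspace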

Ceph configuration on Ganeti nodes
==================================

This document proposes the configuration of a distributed storage
pool (Ceph or Gluster) through Ganeti. Currently, this design document
focuses on configuring a Ceph cluster. A prerequisite of this setup is
the installation of the Ceph packages on all the nodes concerned.

At Ganeti cluster init, the user will set distributed-storage-specific
options which will be stored at cluster level. The storage cluster
will be initialized using ``gnt-storage``. For the prototype, only a
single storage pool/node-group is configured.

The following steps take place when a node-group is initialized as a
storage cluster (a rough sketch of the flow follows the list):

- Check for an existing Ceph cluster through the /etc/ceph/ceph.conf
  file on each node.
- Fetch the cluster configuration parameters and create a distributed
  storage object accordingly.
- Issue an 'init distributed storage' RPC to the group nodes (if any).
- On each node, the ``ceph`` CLI tool will run the appropriate
  services.
- Mark the nodes as well as the node-group as
  distributed-storage-enabled.
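
A rough sketch of this flow (all names below are hypothetical; the
real implementation would go through the Ganeti RPC layer and config
objects)::

  CEPH_CONF = "/etc/ceph/ceph.conf"

  def InitDistributedStorage(nodegroup, rpc, storage_params):
    """Initialize a node-group as a Ceph storage cluster (sketch)."""
    for node in nodegroup.nodes:
      # Refuse nodes that already belong to a Ceph cluster.
      if rpc.call_file_exists(node, CEPH_CONF):
        raise RuntimeError("Node %s already has %s" %
                           (node.name, CEPH_CONF))

    for node in nodegroup.nodes:
      # Send the storage parameters to the node daemon, which runs the
      # ``ceph`` CLI tool to start the appropriate services.
      rpc.call_init_distributed_storage(node, storage_params)
      node.distributed_storage = True

    # Finally, mark the node-group itself as storage-enabled.
    nodegroup.distributed_storage = True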

The storage cluster will operate at node-group level. The Ceph cluster
will be initialized using ``gnt-storage``, to which a new sub-command
``init-distributed-storage`` will be added.

The configuration of the nodes will be handled through an init function
called by the node daemons running on the respective nodes. A new RPC
is introduced to handle these calls.

A new object will be created to send the storage parameters to the
node: storage_type, devices, node_role (mon/osd), etc.
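
A minimal sketch of such an object (the class name is an assumption;
only the attributes listed above come from this design)::

  class DistributedStorageParams(object):
    """Storage parameters sent to a node (illustrative sketch).

    The real implementation would presumably be a serializable Ganeti
    config object rather than a plain class.
    """
    def __init__(self, storage_type, devices, node_role):
      self.storage_type = storage_type  # e.g. "ceph"
      self.devices = devices            # e.g. ["/dev/sdb"]
      self.node_role = node_role        # "mon" or "osd"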

A new node can be directly assigned to a storage-enabled node-group.
During the 'gnt-node add' process, the required Ceph daemons will be
started and the node will be added to the Ceph cluster.

Only an offline node can be assigned to a storage-enabled node-group.
``gnt-node add --readd`` needs to be performed to issue the RPCs for
spawning the appropriate services on the newly assigned node.

Updated Commands
----------------

Following are the affected commands::

  $ gnt-cluster init -S ceph:disk=/dev/sdb,option=value...

During cluster initialization, Ceph-specific options are provided which
apply at cluster level::

  $ gnt-cluster modify -S ceph:option=value2...

For now, cluster modification will only be allowed when there is no
initialized storage cluster::

  $ gnt-storage init-distributed-storage -s{--storage-type} ceph \
      <node-group>

Ensure that no other node-group is configured as a distributed storage
cluster and configure Ceph on the specified node-group. If there is no
node in the node-group, it will only be marked as distributed storage
enabled and no action will be taken::

  $ gnt-group assign-nodes <group> <node>

It ensures that the node is offline if the specified node-group is
distributed storage capable. Ceph configuration of the newly assigned
node is not performed at this step::

  $ gnt-node --offline

If the node is part of a storage node-group, an offline call will
stop/remove the Ceph daemons::

  $ gnt-node add --readd

If the node is now part of the storage node-group, the init distributed
storage RPC is issued to the respective node. This step is required
after assigning a node to the storage-enabled node-group::

  $ gnt-node remove

A warning will be issued stating that the node is part of distributed
storage; mark it offline before removal.
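
The ``-S`` values above follow a simple ``type:key=value,...`` format.
As a rough illustration (a hypothetical helper, not existing Ganeti
code), parsing such a value could look like::

  def ParseStorageOption(value):
    """Split "-S" values like "ceph:disk=/dev/sdb,option=value".

    Returns the storage type and a dict of its options.
    """
    storage_type, _, rest = value.partition(":")
    options = {}
    if rest:
      for item in rest.split(","):
        key, _, val = item.partition("=")
        options[key] = val
    return storage_type, options

  # ParseStorageOption("ceph:disk=/dev/sdb,option=value")
  # => ("ceph", {"disk": "/dev/sdb", "option": "value"})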

Data collector for Ceph
-----------------------

TBD

Future Work
-----------

Due to the loopback bug in Ceph, one may run into daemon hangs while
performing writes to an RBD volume through the block device mapping.
This bug applies only when the RBD volume is stored on an OSD running
on the local node. To mitigate this issue, we can create storage pools
on different node-groups and access RBD volumes on different pools:
http://tracker.ceph.com/issues/3076

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: