X-Git-Url: https://code.grnet.gr/git/ganeti-local/blobdiff_plain/00b157522469058680f231f51880e720ee48d12c..0c936d24dccef20cdca0b5fb96c786731e284173:/README

diff --git a/README b/README
index ff5936e..3e799c6 100644
--- a/README
+++ b/README
@@ -1,13 +1,44 @@
-Ganeti Cluster tools (htools)
-=============================
+Ganeti Cluster tools (ganeti-htools)
+====================================

-These are some simple cluster tools for fixing common problems. Right
-now N+1 and rebalancing are included. Starting with version 0.1.0,
-only Ganeti 2.0 is supported.
+These are some simple cluster tools for fixing common allocation
+problems on Ganeti 2.0 clusters.

+Note that these tools are most useful for bigger cluster sizes
+(e.g. more than five or ten machines); at lower sizes, the
+computations they do can also be done manually.
+
+Most of the tools revolve around the concept of keeping the cluster
+N+1 compliant: this means that in case of failure of any node, the
+instances affected can be failed over (via ``gnt-node failover`` or
+``gnt-instance failover``) to their secondary node, and there is
+enough memory reserved for this operation without needing to shutdown
+other instances or rebalance the cluster.
+
+**Quick start** (see the installation section for more details):
+
+- (have the ghc compiler and the prerequisite libraries installed)
+- make
+- ./hbal -m $cluster -C -p
+- look at the original and final cluster layout, and if acceptable,
+  execute the given commands
+
+
+Available tools
+---------------
+
+Cluster rebalancer
+~~~~~~~~~~~~~~~~~~
+
+The rebalancer uses a simple algorithm to try to get the nodes of the
+cluster as equal as possible in their resource usage. It tries to
+repeatedly move each instance one step, so that the cluster score
+becomes better. We stop when no further move can improve the score.
+
+For algorithm details and usage, see the man page hbal(1).

 Cluster N+1 solver
-------------------
+~~~~~~~~~~~~~~~~~~

 This program runs a very simple brute force algorithm over the instance
 placement space in order to determine the shortest number of replace-disks
@@ -16,34 +47,53 @@ just one that passes N+1 checks.

 For algorithm details and usage, see the man page hn1(1).

-Cluster rebalancer
-------------------
+.. note:: This program is deprecated, hbal should be used instead.

-Compared to the N+1 solver, the rebalancer uses a very simple algorithm:
-repeatedly try to move each instance one step, so that the cluster score
-becomes better. We stop when no further move can improve the score.
+IAllocator plugin
+~~~~~~~~~~~~~~~~~
+
+The ``hail`` iallocator plugin can be used for allocations of mirrored
+and non-mirrored instances and for relocations of mirrored
+instances. It needs to be installed in Ganeti's iallocator search
+path—usually ``/usr/lib/ganeti/iallocators`` or
+``/usr/local/lib/ganeti/iallocators``, and after that it can be used
+via ganeti's ``--iallocator`` option (in various gnt-node/gnt-instance
+commands). See the man page hail(1) for more details.
+
+Cluster capacity estimator
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``hspace`` program will, given an input instance specification,
+estimate how many instances of those type can be place on the cluster
+before it will become full (as in any new allocation would fail N+1
+checks). For more details, see the man page hspace(1).

-For algorithm details and usage, see the man page hbal(1).

 Integration with Ganeti
 -----------------------

-The programs can either get their input from text files, or online
-from a cluster via RAPI. For online collection via RAPI, the "-m"
-argument to both hn1 and hbal should specify the cluster or master
-node name.
+The ``hbal``, ``hspace`` and ``hn1`` programs can either get their
+input from text files, or online from a cluster via RAPI. For online
+collection via RAPI, the "-m" argument to both hn1 and hbal should
+specify the cluster or master node name. ``hail`` uses the standard
+iallocator API and thus doesn't need any special setup (just needs to
+be installed in the right directory).

-For text files, a separate tool (hscan) is provided to automate their
-gathering if RAPI is available, which is better since it can extract
-more precise information. In case RAPI is not usable for whatever
-reason, the following two commands should be run::
+For generating the text files, a separate tool (``hscan``) is provided
+to automate their gathering if RAPI is available, which is better
+since it can extract more precise information. In case RAPI is not
+usable for whatever reason, the following two commands should be run::

   gnt-node list -oname,mtotal,mnode,mfree,dtotal,dfree,offline \
     --separator '|' --no-headers > nodes
-  gnt-instance list -oname,admin_ram,sda_size,status,pnode,snodes \
+  gnt-instance list -oname,be/memory,sda_size,be/vcpus,status,pnode,snodes \
     --separator '|' --no-head > instances

-These two files should be saved under the names of *nodes* and *instances*.
+These two files should be saved under the names of *nodes* and
+*instances*.
+
+The ``hail`` program gets its data automatically from Ganeti when used
+as described in its section.

 Installation
 ------------
@@ -52,11 +102,21 @@ If installing from source, you need a working ghc compiler (6.8 at
 least) and some extra Haskell libraries which usually need to be
 installed manually:

-- json
-- curl
+- json (http://hackage.haskell.org/cgi-bin/hackage-scripts/package/json)
+- curl (http://hackage.haskell.org/cgi-bin/hackage-scripts/package/curl)
+
+Once these are installed, just typing *make* in the top-level
+directory should be enough.

-One these are available, just typing *make* in the top-level directory
-should be enough.
+Only the ``hail`` program needs to be installed in a specific place,
+the other tools are not location-dependent.
+
+For running the (admittedly small) unittest suite (via *make check*),
+the QuickCheck version 1 library is needed.

 Internal (implementation) documentation is available in the ``apidoc``
 directory.
+
+.. Local Variables:
+.. mode: rst
+.. End:
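
The N+1 rule and the *nodes* / *instances* text files described in the README above can be illustrated with a small Python sketch. It is not part of the patch and is not the algorithm used by ``hbal`` or ``hn1``; it only checks whether each node has enough free memory to absorb the instances of any single failed peer. The field order follows the two ``gnt-node list`` / ``gnt-instance list`` commands in the diff. The sample data, the ``Y``/``N`` rendering of the offline flag, integer memory values, and the helper names ``parse_rows`` and ``n1_failures`` are all assumptions made for illustration.

```python
from collections import defaultdict

# Field order taken from the -o options of the two commands in the README.
NODE_FIELDS = ["name", "mtotal", "mnode", "mfree", "dtotal", "dfree", "offline"]
INST_FIELDS = ["name", "memory", "sda_size", "vcpus", "status", "pnode", "snodes"]

# Made-up sample data in the '|'-separated, header-less format.
SAMPLE_NODES = """\
node1|4096|512|1024|100000|50000|N
node2|4096|512|256|100000|50000|N
"""
SAMPLE_INSTANCES = """\
inst1|512|10240|1|running|node1|node2
inst2|512|10240|1|running|node2|node1
inst3|1024|10240|1|running|node2|
"""


def parse_rows(text, fields):
    """Split '|'-separated lines into dicts keyed by the given field names."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        values = line.split("|")
        if len(values) != len(fields):
            raise ValueError("expected %d fields in %r" % (len(fields), line))
        rows.append(dict(zip(fields, values)))
    return rows


def n1_failures(nodes, instances):
    """Return the names of nodes that could not absorb one peer failure."""
    # reserved[secondary][primary] = memory the secondary needs if primary fails
    reserved = defaultdict(lambda: defaultdict(int))
    for inst in instances:
        if not inst["snodes"]:
            continue  # non-mirrored instance: nothing to fail over
        secondary = inst["snodes"].split(",")[0]
        reserved[secondary][inst["pnode"]] += int(inst["memory"])

    failing = []
    for node in nodes:
        if node["offline"] == "Y":  # assumed rendering of the offline flag
            continue
        # Only one peer fails at a time, so the worst single peer matters.
        worst = max(reserved[node["name"]].values(), default=0)
        if worst > int(node["mfree"]):
            failing.append(node["name"])
    return failing


if __name__ == "__main__":
    nodes = parse_rows(SAMPLE_NODES, NODE_FIELDS)
    instances = parse_rows(SAMPLE_INSTANCES, INST_FIELDS)
    print(n1_failures(nodes, instances))
```

With the sample data, ``node2`` is reported: it is secondary for ``inst1`` (512 of memory) but has only 256 free. The real tools account for more than this sketch does, so treat it as a reading aid for the file format rather than a replacement for ``hbal``.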