===============================
Huge Pages Support for Ganeti
===============================
This is a design document about implementing support for huge pages in
Ganeti. (Please note that Ganeti works with Transparent Huge Pages,
i.e. THP, and any reference in this document to Huge Pages refers to
explicit Huge Pages.)

Current State and Shortcomings:
-------------------------------
The Linux kernel allows using pages of larger size by setting aside a
portion of the memory. Using a larger page size may improve the
performance of memory-intensive applications by reducing TLB misses.
To use huge pages, memory has to be reserved beforehand. This portion
of memory is subtracted from free memory and is considered to be in
use. Currently, Ganeti cannot take proper advantage of huge pages. If
huge pages are reserved on a node and are available to fulfill a VM
request, Ganeti fails to recognize them and treats the memory reserved
for huge pages as used memory. As a result, VMs fail to launch on a
node where memory is available in the form of huge pages rather than
normal pages.
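The accounting gap described above can be illustrated with a minimal
Python sketch. It parses text in the format of ``/proc/meminfo`` and
compares free memory with and without unused huge pages. The sample
values and the helper names ``parse_meminfo`` and ``free_memory_mib``
are assumptions made for illustration; they are not part of Ganeti.

```python
# Sample text in /proc/meminfo format; the numbers are made up.
SAMPLE_MEMINFO = """\
MemTotal:       16384000 kB
MemFree:          512000 kB
HugePages_Total:    4096
HugePages_Free:     4000
Hugepagesize:       2048 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each field name to its integer value."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def free_memory_mib(fields, count_huge_pages):
    """Free memory in MiB, optionally including unused huge pages."""
    free_kib = fields["MemFree"]
    if count_huge_pages:
        # Reserved huge pages are not part of MemFree, so add them here.
        free_kib += fields["HugePages_Free"] * fields["Hugepagesize"]
    return free_kib // 1024


info = parse_meminfo(SAMPLE_MEMINFO)
print(free_memory_mib(info, count_huge_pages=False))  # normal pages only
print(free_memory_mib(info, count_huge_pages=True))   # plus free huge pages
```

With these sample numbers, a request for a few GiB would be rejected if
only normal pages are counted, even though the unused huge pages could
satisfy it.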

Proposed Changes:
-----------------
The following components will be changed in order for Ganeti to take
advantage of Huge Pages.

Hypervisor Parameters:
----------------------
Currently, it is possible to set or modify the huge pages mount point
at the cluster level via the hypervisor parameter ``mem_path``::

	$ gnt-cluster init \
	> --enabled-hypervisors=kvm --nic-parameters link=br100 \
	> -H kvm:mem_path=/mount/point/for/hugepages

This hypervisor parameter is inherited by all instances as the default,
although it can be overridden at the instance level.

The following changes will be made to the inheritance behaviour.

-  The hypervisor parameter ``mem_path`` and all other hypervisor
   parameters will be made available at the node group level (in
   addition to the cluster level), so that users can set defaults for
   the node group::

	$ gnt-group add/modify \
	> -H hv:parameter=value

   This changes the hypervisor inheritance order to::

     cluster -> group -> OS -> instance

-  Furthermore, the hypervisor parameter ``mem_path`` will be changeable
   only at the cluster or node group level, and users must not be able
   to override it at the OS or instance level. The following command
   must produce an error message stating that ``mem_path`` may only be
   set at either the cluster or the node group level::

	$ gnt-instance add -H kvm:mem_path=/mount/point/for/hugepages
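The proposed inheritance order and the restriction on ``mem_path`` can
be sketched in Python as follows. This is only an illustration of the
rules stated above, not Ganeti code: the names ``LEVELS``,
``RESTRICTED`` and ``resolve_hvparams`` as well as the sample paths are
hypothetical.

```python
# Levels from least to most specific, following the proposed order.
LEVELS = ("cluster", "group", "os", "instance")

# Parameters that may only be set at the listed levels (per the proposal).
RESTRICTED = {"mem_path": ("cluster", "group")}


def resolve_hvparams(settings):
    """Merge per-level dicts; more specific levels override earlier ones."""
    merged = {}
    for level in LEVELS:
        for name, value in settings.get(level, {}).items():
            allowed = RESTRICTED.get(name, LEVELS)
            if level not in allowed:
                raise ValueError(
                    "%s may only be set at the %s level"
                    % (name, " or ".join(allowed)))
            merged[name] = value
    return merged


# Hypothetical settings: the group overrides the cluster-wide mount point,
# and the instance overrides an unrestricted parameter.
params = resolve_hvparams({
    "cluster": {"mem_path": "/mnt/hugepages", "kernel_path": "/boot/vmlinuz"},
    "group": {"mem_path": "/mnt/hugepages-group"},
    "instance": {"kernel_path": "/boot/vmlinuz-custom"},
})
print(params)
```

Passing ``mem_path`` at the ``os`` or ``instance`` level makes the
sketch raise ``ValueError``, mirroring the error the design asks for.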

Memory Pools:
-------------
Memory management of Ganeti will be improved by creating separate pools
for memory used by the node itself, memory used by the hypervisor and
the memory reserved for huge pages, as follows:

- mtotal/xen (Xen memory)
- mfree/xen (Xen unused memory)
- mtotal/hp (Memory reserved for Huge Pages)
- mfree/hp (Memory available from unused huge pages)
- mpgsize/hp (Size of a huge page)

mtotal and mfree will be changed to mean "the total and free memory for
the default method in this cluster/nodegroup". Note that the default
method depends on both the default hypervisor and its parameters.
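One way to picture how the default method could select a pool is the
Python sketch below. The selection rule (KVM with a non-empty
``mem_path`` uses the ``hp`` pool, Xen hypervisors use the ``xen``
pool) is an assumption made for illustration, as are the function
names, the ``mtotal/node`` and ``mfree/node`` keys for plain node
memory, and the sample values; the design itself does not specify them.

```python
def default_method(hypervisor, hvparams):
    """Pick the pool suffix for the default method (assumed rule)."""
    if hypervisor == "kvm" and hvparams.get("mem_path"):
        return "hp"    # KVM guests backed by a huge pages mount point
    if hypervisor.startswith("xen"):
        return "xen"
    return "node"      # hypothetical name for plain node memory


def default_memory(node, hypervisor, hvparams):
    """Return (mtotal, mfree) for the default method on this node."""
    method = default_method(hypervisor, hvparams)
    return node["mtotal/" + method], node["mfree/" + method]


# Hypothetical per-node values in MiB.
node = {
    "mtotal/node": 16000, "mfree/node": 500,
    "mtotal/hp": 8192, "mfree/hp": 8000, "mpgsize/hp": 2,
}
print(default_memory(node, "kvm", {"mem_path": "/mnt/hugepages"}))
print(default_memory(node, "kvm", {"mem_path": ""}))
```

The same node reports very different totals depending on the default
method, which is why the meaning of mtotal and mfree has to follow the
hypervisor and its parameters.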

iAllocator Changes:
-------------------
If huge pages are set as default for a cluster or node group, then
iAllocator must consider the huge pages memory on the nodes as a
parameter when trying to find the best node for the VM. Note that
iAllocator will also be changed to use the correct parameter depending
on the cluster/group.
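The effect on node selection can be sketched in Python. This is a
simplified stand-in, not the actual iAllocator algorithm: ranking nodes
by free memory, the function names, the ``mfree/node`` key and the
sample values are all assumptions. The sketch also rounds a request up
to whole huge pages, which a real allocator would need to account for.

```python
def pages_needed(request_mib, page_mib):
    """Number of huge pages required to back request_mib (rounded up)."""
    return -(-request_mib // page_mib)


def candidate_nodes(nodes, request_mib, use_huge_pages):
    """Names of nodes that can host the VM, most free memory first."""
    fits = []
    for name, mem in nodes.items():
        if use_huge_pages:
            page_mib = mem["mpgsize/hp"]
            needed = pages_needed(request_mib, page_mib) * page_mib
            free = mem["mfree/hp"]
        else:
            needed = request_mib
            free = mem["mfree/node"]  # hypothetical key for normal pages
        if free >= needed:
            fits.append((free, name))
    return [name for free, name in sorted(fits, reverse=True)]


# Hypothetical per-node values in MiB.
nodes = {
    "node1": {"mfree/node": 500, "mfree/hp": 8000, "mpgsize/hp": 2},
    "node2": {"mfree/node": 6000, "mfree/hp": 0, "mpgsize/hp": 2},
    "node3": {"mfree/node": 300, "mfree/hp": 4096, "mpgsize/hp": 1024},
}
print(candidate_nodes(nodes, 4097, use_huge_pages=True))
print(candidate_nodes(nodes, 4097, use_huge_pages=False))
```

In this example ``node3`` has enough huge page memory in total for a
4096 MiB request, but not for 4097 MiB, because the request must be
rounded up to five 1024 MiB pages.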

hbal Changes:
-------------
The cluster balancer (hbal) will be changed to use the default memory
pool and recognize memory reserved for huge pages when trying to
rebalance the cluster.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: