.. Source: doc/design-cpu-pinning.rst @ 54f834df (author: Tsachy Shacham)

Ganeti CPU Pinning
==================

Objective
---------

This document defines Ganeti's support for CPU pinning (aka CPU
affinity).

CPU pinning enables mapping and unmapping an entire virtual machine, or
a specific virtual CPU (vCPU), to a physical CPU or a range of CPUs.

At this stage, pinning will be implemented for Xen and KVM.

Command Line
------------

Suggested command line parameters for controlling CPU pinning are as
follows::

  gnt-instance modify -H cpu_mask=<cpu-pinning-info> <instance>

cpu-pinning-info can be any of the following:

* One vCPU mapping, which can be the word "all" or a combination
  of CPU numbers and ranges separated by commas. In this case, all
  vCPUs will be mapped to the indicated list.
* A list of vCPU mappings, separated by a colon ':'. In this case
  each vCPU is mapped to an entry in the list, and the size of the
  list must match the number of vCPUs defined for the instance. This
  is enforced when setting CPU pinning or when setting the number of
  vCPUs using ``-B vcpus=#``.

  The mapping list is matched to consecutive virtual CPUs, so the first entry
  would be the CPU pinning information for vCPU 0, the second entry
  for vCPU 1, etc.

The default setting for new instances is "all", which maps the entire
instance to all CPUs, thus effectively turning off CPU pinning.

Here are some usage examples::

  # Map vCPU 0 to physical CPU 1 and vCPU 1 to CPU 3 (assuming 2 vCPUs)
  gnt-instance modify -H cpu_mask=1:3 my-inst

  # Pin vCPU 0 to CPUs 1 or 2, and vCPU 1 to any CPU
  gnt-instance modify -H cpu_mask=1-2:all my-inst

  # Pin vCPU 0 to any CPU, vCPU 1 to CPUs 1, 3, 4 or 5, and vCPU 2 to
  # CPU 0
  gnt-instance modify -H cpu_mask=all:1\\,3-5:0 my-inst

  # Pin entire VM to CPU 0
  gnt-instance modify -H cpu_mask=0 my-inst

  # Turn off CPU pinning (default setting)
  gnt-instance modify -H cpu_mask=all my-inst

Assuming an instance has 3 vCPUs, the following commands will fail::

  # not enough mappings
  gnt-instance modify -H cpu_mask=0:1 my-inst

  # too many
  gnt-instance modify -H cpu_mask=2:1:1:all my-inst

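As an illustration only, the cpu-pinning-info syntax described in the
Command Line section could be parsed along the following lines. This is
a minimal sketch, not Ganeti's implementation: the names
``parse_cpu_list``, ``parse_cpu_mask`` and ``ALL_CPUS`` are hypothetical,
and the input is assumed to be the mask string after any escaping of
commas has already been removed.

```python
ALL_CPUS = -1  # hypothetical marker for "any physical CPU"


def parse_cpu_list(text):
    """Parse one vCPU mapping such as "all", "3" or "1,3-5" into a list of ints."""
    if text == "all":
        return [ALL_CPUS]
    cpus = []
    for part in text.split(","):
        if "-" in part:
            # A range "low-high" expands to every CPU number in between.
            low, high = (int(n) for n in part.split("-", 1))
            if low > high:
                raise ValueError("invalid CPU range: %r" % part)
            cpus.extend(range(low, high + 1))
        else:
            cpus.append(int(part))  # int() rejects empty or non-numeric parts
    return cpus


def parse_cpu_mask(mask, vcpus):
    """Parse a full cpu_mask; returns one CPU list per colon-separated entry."""
    entries = [parse_cpu_list(entry) for entry in mask.split(":")]
    # A single entry applies to every vCPU; otherwise the counts must match.
    if len(entries) > 1 and len(entries) != vcpus:
        raise ValueError("cpu_mask has %d entries but the instance has %d vCPUs"
                         % (len(entries), vcpus))
    return entries
```

With this sketch, ``parse_cpu_mask("all:1,3-5:0", 3)`` yields one list per
vCPU, while ``parse_cpu_mask("0:1", 3)`` raises ``ValueError`` because the
number of entries does not match the number of vCPUs.
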
Validation
----------

CPU pinning information is validated by making sure it matches the
number of vCPUs. This validation happens when changing either the
``cpu_mask`` or ``vcpus`` parameters.
Changing either parameter in a way that conflicts with the other will
fail with an appropriate error message.
To make such a change, both parameters should be modified at the same
time. For example:
``gnt-instance modify -B vcpus=4 -H cpu_mask=1:1:2-3:4\\,6 my-inst``

Besides validating the CPU configuration, i.e. that the number of vCPUs
matches the requested CPU pinning, Ganeti will also verify that the
number of physical CPUs is enough to support the required configuration.
For example, trying to run a configuration of ``vcpus=2,cpu_mask=0:4``
on a node with 4 cores will fail (note: CPU numbers are 0-based).

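The physical CPU check could look roughly like the sketch below. It
assumes the pinning information has already been turned into one list of
integers per vCPU, with -1 standing for "all" (the representation
described under Data); the function name ``check_mask_fits_node`` and its
parameters are hypothetical.

```python
def check_mask_fits_node(entries, node_cpus):
    """Raise ValueError if any pinned CPU number does not exist on the node.

    entries   -- per-vCPU lists of physical CPU numbers, -1 meaning "all"
    node_cpus -- number of physical CPUs on the node (numbered from 0)
    """
    for vcpu, cpus in enumerate(entries):
        for cpu in cpus:
            if cpu == -1:
                continue  # "all" fits any node
            if not 0 <= cpu < node_cpus:
                raise ValueError(
                    "vCPU %d is pinned to CPU %d, but the node only has "
                    "CPUs 0-%d" % (vcpu, cpu, node_cpus - 1))
```

For the ``cpu_mask=0:4`` example, ``check_mask_fits_node([[0], [4]], 4)``
raises ``ValueError``, since a 4-core node only has CPUs 0 to 3.
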
This validation should repeat every time an instance is started or
migrated live. See more details under Migration below.

Cluster verification should also test the compatibility of other nodes in
the cluster with the required configuration and alert if a minimum
requirement is not met.

Failover
--------

CPU pinning configuration can be transferred from node to node, unless
the number of physical CPUs is smaller than what the configuration calls
for.  It is suggested that unless this is the case, all transfers and
migrations will succeed.

In case the number of physical CPUs is smaller than the numbers
indicated by CPU pinning information, instance failover will fail.

In case of emergency, to force failover to ignore mismatching CPU
information, the following switch can be used:
``gnt-instance failover --fix-cpu-mismatch my-inst``.
This command will try to fail over the instance with the current CPU mask,
but if that fails, it will change the mask to be "all".

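The failover rule can be sketched as a small decision function. This is
a simplification under stated assumptions: it checks the mask against
the target node's CPU count up front instead of attempting the failover
first, and the name ``mask_for_target`` and its parameters are
hypothetical.

```python
import re


def mask_for_target(cpu_mask, target_cpus, fix_cpu_mismatch=False):
    """Return the cpu_mask to use on a target node with target_cpus CPUs."""
    # Every number in the mask (single CPUs and range endpoints) must exist
    # on the target node; CPU numbers are 0-based.
    numbers = [int(n) for n in re.findall(r"\d+", cpu_mask)]
    if not numbers or max(numbers) < target_cpus:
        return cpu_mask
    if fix_cpu_mismatch:
        return "all"  # fall back to no pinning
    raise ValueError("cpu_mask %r needs more than %d physical CPUs"
                     % (cpu_mask, target_cpus))
```

Without the override, ``mask_for_target("0:4", 4)`` fails; with
``fix_cpu_mismatch=True`` it returns ``"all"``.
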
Migration
---------

In case of live migration, and in addition to failover considerations,
it is required to remap CPU pinning after migration. This can be done in
real time for instances for both Xen and KVM, and only depends on the
number of physical CPUs being sufficient to support the migrated
instance.

Data
----

Pinning information will be kept as a list of integers per vCPU.
To mark a mapping of any CPU, we will use -1.
A single entry, no matter what the number of vCPUs is, will always mean
that all vCPUs have the same mapping.

Configuration file
------------------

The pinning information is kept in each instance's hypervisor
params section of the configuration file, as the original string.

Xen
---

There are 2 ways to control pinning in Xen, either via the command line
or through the configuration file.

The commands to make direct pinning changes are the following::

  # To pin a vCPU to a specific CPU
  xm vcpu-pin <domain> <vcpu> <cpu>

  # To unpin a vCPU
  xm vcpu-pin <domain> <vcpu> all

  # To get the current pinning status
  xm vcpu-list <domain>

Since currently controlling Xen in Ganeti is done in the configuration
file, it is straightforward to use the same method for CPU pinning.
There are 2 different parameters that control Xen's CPU pinning and
configuration:

vcpus
  controls the number of vCPUs
cpus
  maps vCPUs to physical CPUs

When no pinning is required (pinning information is "all"), the
"cpus" entry is removed from the configuration file.

For all other cases, the configuration is "translated" to Xen, which
expects either ``cpus = "a"`` or ``cpus = [ "a", "b", "c", ...]``,
where each of a, b or c is a physical CPU number, a CPU range, or a
combination. If a list is used, the number of entries must match
the number of vCPUs, and the entries are mapped in order.

For example, CPU pinning information of ``1:2,4-7:0-1`` is translated
to this entry in Xen's configuration: ``cpus = [ "1", "2,4-7", "0-1" ]``.

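The translation to Xen's ``cpus`` entry could be sketched as follows.
The function name ``xen_cpus_line`` is hypothetical, and the sketch
assumes that an "all" entry inside a per-vCPU list can be passed through
to Xen unchanged, which should be verified against the Xen version in
use.

```python
def xen_cpus_line(cpu_mask):
    """Return the Xen config line for cpu_mask, or None when no pinning is needed."""
    if cpu_mask == "all":
        return None  # the "cpus" entry is left out of the configuration file
    entries = cpu_mask.split(":")
    if len(entries) == 1:
        # One mapping shared by all vCPUs.
        return 'cpus = "%s"' % entries[0]
    # One quoted mapping per vCPU, in vCPU order.
    return "cpus = [ %s ]" % ", ".join('"%s"' % entry for entry in entries)
```

For instance, ``xen_cpus_line("1:2,4-7:0-1")`` returns
``cpus = [ "1", "2,4-7", "0-1" ]``.
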
KVM
---

Controlling pinning in KVM is a little more complicated as there is no
configuration to control pinning before instances are started.

The way to change or assign CPU pinning under KVM is to use ``taskset`` or
its underlying system call ``sched_setaffinity``. Setting the affinity for
the VM process will change CPU pinning for the entire VM, and setting it
for specific vCPU threads will control specific vCPUs.

The sequence of commands to control pinning is this: start the instance
with the ``-S`` switch, so it halts before starting execution, get the
process ID or identify thread IDs of each vCPU by sending ``info cpus``
to the monitor, map vCPUs as required by the cpu-pinning information,
and issue a ``cont`` command on the KVM monitor to allow the instance
to start execution.

For example, a sequence of commands to control CPU affinity under KVM
may be:

* Start KVM: ``/usr/bin/kvm … <kvm-command-line-options> … -S``
* Use socat to connect to the monitor
* Send ``info cpus`` to the monitor to get thread/vCPU information
* Call ``sched_setaffinity`` for each thread with the CPU mask
* Send ``cont`` to KVM's monitor

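The step that extracts thread IDs from the monitor's ``info cpus`` reply
might be sketched like this. The exact reply format depends on the
KVM/QEMU version; the sample text below is an assumed illustration, not
captured output, and ``parse_info_cpus`` is a hypothetical helper name.

```python
import re

# Matches lines such as "* CPU #0: ... thread_id=2862" (assumed format).
_CPU_LINE = re.compile(r"CPU #(\d+):.*thread_id=(\d+)")


def parse_info_cpus(output):
    """Map each vCPU number to its thread ID, given the monitor's reply text."""
    threads = {}
    for line in output.splitlines():
        match = _CPU_LINE.search(line)
        if match:
            threads[int(match.group(1))] = int(match.group(2))
    return threads


# Illustrative reply for a 2-vCPU instance (values are made up).
SAMPLE_REPLY = """\
* CPU #0: pc=0x00000000000fd374 (halted) thread_id=2862
  CPU #1: pc=0x00000000000fd374 (halted) thread_id=2863
"""

print(parse_info_cpus(SAMPLE_REPLY))
```

Each thread ID in the resulting mapping is what would then be passed to
``sched_setaffinity`` together with the CPU mask for that vCPU.
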
A CPU mask is a hexadecimal bit mask where each bit represents one
physical CPU. See the :manpage:`sched_setaffinity(2)` man page for more
details.

For example, to run a specific thread-id on CPUs 1 or 3 the mask is
0x0000000A.

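The conversion from a list of physical CPU numbers to such a bit mask
is a one-line loop. The sketch below uses a hypothetical helper name,
``cpu_list_to_mask``, and reproduces the example value for CPUs 1 and 3.

```python
def cpu_list_to_mask(cpus):
    """Return an integer bit mask with bit N set for each CPU number N."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


# CPUs 1 and 3 set bits 1 and 3: binary 1010.
print("0x%08X" % cpu_list_to_mask([1, 3]))  # prints 0x0000000A
```

On Linux with Python 3.3 or later, the standard library's
``os.sched_setaffinity`` takes a set of CPU numbers rather than a bit
mask, so this conversion is only needed for interfaces that expect the
raw mask.
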
We will control process and thread affinity using the Python ``affinity``
package (http://pypi.python.org/pypi/affinity). This package is a Python
wrapper around the two affinity system calls, and has no other
requirements.

Alternative Design Options
--------------------------

1. There's an option to ignore the limitations of the underlying
   hypervisor and instead of requiring explicit pinning information
   for *all* vCPUs, assume a mapping of "all" to vCPUs not mentioned.
   This can lead to inadvertently missing information, but either way,
   since using cpu-pinning options is probably not going to be
   frequent, there's no real advantage.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: