Revision ae9b5e0f

b/Makefile.am
278 278   doc/design-2.4.rst \
279 279   doc/design-draft.rst \
280 280   doc/design-oob.rst \
    281   doc/design-cpu-pinning.rst \
281 282   doc/design-query2.rst \
282 283   doc/design-x509-ca.rst \
283 284   doc/design-http-server.rst \
b/doc/design-cpu-pinning.rst
Ganeti CPU Pinning
==================

Objective
---------

This document defines Ganeti's support for CPU pinning (aka CPU
affinity).

  
CPU pinning enables mapping and unmapping an entire virtual machine, or
a specific virtual CPU (vCPU), to a physical CPU or a range of CPUs.

At this stage, pinning will be implemented for Xen and KVM.

  
Command Line
------------

Suggested command line parameters for controlling CPU pinning are as
follows::

  gnt-instance modify -H cpu_mask=<cpu-pinning-info> <instance>

cpu-pinning-info can be any of the following:

* One vCPU mapping, which can be the word "all" or a combination
  of CPU numbers and ranges separated by commas. In this case, all
  vCPUs will be mapped to the indicated list.
* A list of vCPU mappings, separated by a colon ':'. In this case
  each vCPU is mapped to an entry in the list, and the size of the
  list must match the number of vCPUs defined for the instance. This
  is enforced when setting CPU pinning or when setting the number of
  vCPUs using ``-B vcpus=#``.

  The mapping list is matched to consecutive virtual CPUs, so the first
  entry would be the CPU pinning information for vCPU 0, the second
  entry for vCPU 1, etc.

The default setting for new instances is "all", which maps the entire
instance to all CPUs, thus effectively turning off CPU pinning.

  
Here are some usage examples::

  # Map vCPU 0 to physical CPU 1 and vCPU 1 to CPU 3 (assuming 2 vCPUs)
  gnt-instance modify -H cpu_mask=1:3 my-inst

  # Pin vCPU 0 to CPUs 1 or 2, and vCPU 1 to any CPU
  gnt-instance modify -H cpu_mask=1-2:all my-inst

  # Pin vCPU 0 to any CPU, vCPU 1 to CPUs 1, 3 or 4, and vCPU 2 to
  # CPU 0
  gnt-instance modify -H cpu_mask=all:1\\,3-4:0 my-inst

  # Pin entire VM to CPU 0
  gnt-instance modify -H cpu_mask=0 my-inst

  # Turn off CPU pinning (default setting)
  gnt-instance modify -H cpu_mask=all my-inst

  
Assuming an instance has 3 vCPUs, the following commands will fail::

  # not enough mappings
  gnt-instance modify -H cpu_mask=0:1 my-inst

  # too many
  gnt-instance modify -H cpu_mask=2:1:1:all my-inst

  
Validation
----------

CPU pinning information is validated by making sure it matches the
number of vCPUs. This validation happens when changing either the
cpu_mask or vcpus parameters.
Changing either parameter in a way that conflicts with the other will
fail with a proper error message.
To make such a change, both parameters should be modified at the same
time. For example:
``gnt-instance modify -B vcpus=4 -H cpu_mask=1:1:2-3:4\\,6 my-inst``

Besides validating the CPU configuration, i.e. that the number of vCPUs
matches the requested CPU pinning, Ganeti will also verify that the
number of physical CPUs is enough to support the required
configuration. For example, trying to run a configuration of
``vcpus=2,cpu_mask=0:4`` on a node with 4 cores will fail (note: CPU
numbers are 0-based).
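The two checks described in this section could be sketched roughly as
follows. This is only an illustration: the function name
``validate_cpu_mask``, its arguments and the error texts are
hypothetical, and the mask is assumed to be already parsed into one
list of CPU numbers per entry, with -1 standing for "any CPU" as
described in the Data section.

```python
CPU_ANY = -1  # "any CPU" marker, as in the Data section of this document


def validate_cpu_mask(cpu_mask, vcpus, node_cpus):
    """Return a list of error strings; an empty list means "valid".

    cpu_mask is a list with one list of physical CPU numbers per entry.
    This helper is hypothetical and not part of Ganeti.
    """
    errors = []
    # A single entry applies to all vCPUs; otherwise the number of
    # entries has to match the number of vCPUs.
    if len(cpu_mask) not in (1, vcpus):
        errors.append("%d mappings given for %d vCPUs"
                      % (len(cpu_mask), vcpus))
    # CPU numbers are 0-based, so a node with N CPUs offers 0..N-1.
    for index, cpus in enumerate(cpu_mask):
        for cpu in cpus:
            if cpu != CPU_ANY and not 0 <= cpu < node_cpus:
                errors.append("mapping %d: CPU %d does not exist on a node "
                              "with %d CPUs" % (index, cpu, node_cpus))
    return errors


# vcpus=2,cpu_mask=0:4 on a node with 4 cores: CPU 4 is out of range.
print(validate_cpu_mask([[0], [4]], vcpus=2, node_cpus=4))
```

Returning a list of messages rather than raising on the first problem
is a choice made for this sketch only; it lets a caller report all
conflicts at once.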

  
This validation should repeat every time an instance is started or
migrated live. See more details under Migration below.

Cluster verification should also test the compatibility of other nodes
in the cluster with the required configuration and alert if a minimum
requirement is not met.

  
Failover
--------

CPU pinning configuration can be transferred from node to node, unless
the number of physical CPUs is smaller than what the configuration calls
for. Unless this is the case, all transfers and migrations are expected
to succeed.

In case the number of physical CPUs is smaller than the numbers
indicated by CPU pinning information, instance failover will fail.

In case of emergency, to force failover to ignore mismatching CPU
information, the following switch can be used:
``gnt-instance failover --ignore-cpu-mismatch my-inst``.
This command will try to fail over the instance with the current CPU
mask, but if that fails, it will change the mask to "all".

  
Migration
---------

In case of live migration, and in addition to failover considerations,
it is required to remap CPU pinning after migration. This can be done in
real time for both Xen and KVM instances, and only depends on the
number of physical CPUs being sufficient to support the migrated
instance.

  
Data
----

Pinning information will be kept as a list of integers per vCPU.
To mark a mapping of any CPU, we will use (-1).
A single entry, no matter what the number of vCPUs is, will always mean
that all vCPUs have the same mapping.
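As an illustration of this representation, the sketch below turns a
``cpu_mask`` string into one list of integers per entry. The names
``parse_cpu_list``, ``parse_cpu_mask`` and ``CPU_ANY`` are hypothetical
and chosen for this example; this is not Ganeti's actual parsing code.

```python
CPU_ANY = -1  # marks a mapping to any CPU, as described above


def parse_cpu_list(text):
    """Parse one vCPU mapping such as "all", "3" or "1,3-4"."""
    if text == "all":
        return [CPU_ANY]
    cpus = []
    for part in text.split(","):
        if "-" in part:
            # A range such as "3-4" includes both ends.
            low, high = (int(n) for n in part.split("-", 1))
            if low > high:
                raise ValueError("invalid CPU range: %r" % part)
            cpus.extend(range(low, high + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_cpu_mask(mask):
    """Parse a full cpu_mask value into one list of integers per entry."""
    return [parse_cpu_list(entry) for entry in mask.split(":")]


print(parse_cpu_mask("all:1,3-4:0"))  # [[-1], [1, 3, 4], [0]]
print(parse_cpu_mask("0"))            # [[0]]: one entry for all vCPUs
```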

  
Configuration file
------------------

The pinning information is kept in each instance's hypervisor
params section of the configuration file as
``cpu_mask: [ [ a ], [ b, c ], [ d ] ]``

  
Xen
---

There are two ways to control pinning in Xen, either via the command
line or through the configuration file.

The commands to make direct pinning changes are the following::

  # To pin a vCPU to a specific CPU
  xm vcpu-pin <domain> <vcpu> <cpu>

  # To unpin a vCPU
  xm vcpu-pin <domain> <vcpu> all

  # To get the current pinning status
  xm vcpu-list <domain>

  
Since currently controlling Xen in Ganeti is done in the configuration
file, it is straightforward to use the same method for CPU pinning.
There are two different parameters that control Xen's CPU pinning and
configuration:

vcpus
  controls the number of vCPUs
cpus
  maps vCPUs to physical CPUs

When no pinning is required (pinning information is "all"), the
"cpus" entry is removed from the configuration file.

For all other cases, the configuration is "translated" to Xen, which
expects either ``cpus = "a"`` or ``cpus = [ "a", "b", "c", ...]``,
where each of a, b and c is a physical CPU number, a CPU range, or a
combination. If a list is used, the number of entries must match the
number of vCPUs, and the entries are mapped to vCPUs in order.

  
For example, CPU pinning information of ``1:2,4-7:0-1`` is translated
to this entry in Xen's configuration: ``cpus = [ "1", "2,4-7", "0-1" ]``
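A minimal sketch of this translation, working directly on the
``cpu_mask`` string, might look like the following. The function name
``xen_cpus_line`` is hypothetical. How an "all" entry inside a longer
list should be written for Xen is not specified in this document, so
the sketch deliberately leaves that case out.

```python
def xen_cpus_line(cpu_mask):
    """Return the "cpus" line for a Xen config file, or None for no pinning.

    Hypothetical helper for illustration; not Ganeti's implementation.
    """
    if cpu_mask == "all":
        # No pinning: the "cpus" entry is omitted from the configuration.
        return None
    entries = cpu_mask.split(":")
    if "all" in entries:
        # Assumption: mixed masks are out of scope for this sketch.
        raise NotImplementedError('"all" inside a list is not covered here')
    if len(entries) == 1:
        # A single mapping applies to every vCPU.
        return 'cpus = "%s"' % entries[0]
    return "cpus = [ %s ]" % ", ".join('"%s"' % entry for entry in entries)


print(xen_cpus_line("1:2,4-7:0-1"))  # cpus = [ "1", "2,4-7", "0-1" ]
```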

  
KVM
---

Controlling pinning in KVM is a little more complicated as there is no
configuration to control pinning before instances are started.

The way to change or assign CPU pinning under KVM is to use ``taskset`` or
its underlying system call ``sched_setaffinity``. Setting the affinity for
the VM process will change CPU pinning for the entire VM, and setting it
for specific vCPU threads will control specific vCPUs.

The sequence of commands to control pinning is this: start the instance
with the ``-S`` switch, so it halts before starting execution, get the
process ID or identify thread IDs of each vCPU by sending ``info cpus``
to the monitor, map vCPUs as required by the cpu-pinning information,
and issue a ``cont`` command on the KVM monitor to allow the instance
to start execution.

  
For example, a sequence of commands to control CPU affinity under KVM
may be:

* Start KVM: ``/usr/bin/kvm … <kvm-command-line-options> … -S``
* Use socat to connect to the monitor
* Send ``info cpus`` to the monitor to get thread/vCPU information
* Call ``sched_setaffinity`` for each thread with the CPU mask
* Send ``cont`` to KVM's monitor
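The step that extracts thread IDs from the ``info cpus`` reply could be
sketched as below. The sample text is an assumption about what the
monitor prints; the exact format depends on the KVM/QEMU version and is
not taken from this document. The names ``SAMPLE_INFO_CPUS`` and
``parse_info_cpus`` are hypothetical.

```python
import re

# Assumed sample of monitor output, for illustration only.
SAMPLE_INFO_CPUS = """\
* CPU #0: pc=0x00000000000fd374 thread_id=3027
  CPU #1: pc=0x00000000000fd374 thread_id=3028
"""

# Captures the vCPU number and the host thread ID from one output line.
_CPU_LINE_RE = re.compile(r"CPU #(\d+):.*?thread_id=(\d+)")


def parse_info_cpus(text):
    """Return a dict mapping vCPU number to host thread ID."""
    threads = {}
    for line in text.splitlines():
        match = _CPU_LINE_RE.search(line)
        if match:
            threads[int(match.group(1))] = int(match.group(2))
    return threads


print(parse_info_cpus(SAMPLE_INFO_CPUS))
```

Each thread ID found this way is what the affinity call would be
applied to before ``cont`` is sent.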

  
A CPU mask is a hexadecimal bit mask where each bit represents one
physical CPU. See the :manpage:`sched_setaffinity(2)` man page for more
details.

For example, to run a specific thread-id on CPUs 1 or 3 the mask is
0x0000000A.
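The arithmetic behind that value can be checked with a few lines of
Python. The helper name ``cpus_to_mask`` is hypothetical; it simply sets
bit N for every physical CPU N in the list.

```python
def cpus_to_mask(cpus):
    """Build an affinity bit mask with one bit set per physical CPU."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # CPU numbers are 0-based bit positions
    return mask


# CPUs 1 and 3 set bits 1 and 3: binary 1010, hexadecimal A.
print("0x%08X" % cpus_to_mask([1, 3]))  # 0x0000000A
```

As an aside, on Linux with Python 3.3 or later the standard library's
``os.sched_setaffinity(pid, cpus)`` takes a set of CPU numbers directly
rather than a bit mask; it is mentioned here only for comparison and is
not what this design proposes.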

  
We will control process and thread affinity using the Python affinity
package (http://pypi.python.org/pypi/affinity). This package is a Python
wrapper around the two affinity system calls, and has no other
requirements.

  
Alternative Design Options
--------------------------

1. There's an option to ignore the limitations of the underlying
   hypervisor and, instead of requiring explicit pinning information
   for *all* vCPUs, assume a mapping of "all" to vCPUs not mentioned.
   This can lead to information being left out inadvertently, but
   either way, since cpu-pinning options are probably not going to be
   used frequently, there's no real advantage.

  
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End:
b/doc/design-draft.rst
10 10    design-impexp2.rst
11 11    design-lu-generated-jobs.rst
12 12    design-multi-reloc.rst
   13    design-cpu-pinning.rst
13 14
14 15 .. vim: set textwidth=72 :
15 16 .. Local Variables:
