Ganeti CPU Pinning
==================

Objective
---------

This document defines Ganeti's support for CPU pinning (also known as
CPU affinity).

CPU pinning maps an entire virtual machine, or a specific virtual CPU
(vCPU), to a physical CPU or a range of CPUs, and allows such a mapping
to be removed again.

At this stage, pinning will be implemented for Xen and KVM.

Command Line
------------

The suggested command line parameters for controlling CPU pinning are
as follows::

  gnt-instance modify -H cpu_mask=<cpu-pinning-info> <instance>

``cpu-pinning-info`` can be any of the following:

* One vCPU mapping, which can be the word "all" or a combination
  of CPU numbers and ranges separated by commas. In this case, all
  vCPUs will be mapped to the indicated list.
* A list of vCPU mappings, separated by a colon ':'. In this case
  each vCPU is mapped to an entry in the list, and the size of the
  list must match the number of vCPUs defined for the instance. This
  is enforced when setting CPU pinning or when setting the number of
  vCPUs using ``-B vcpus=#``.

  The mapping list is matched to consecutive virtual CPUs, so the
  first entry is the CPU pinning information for vCPU 0, the second
  entry for vCPU 1, etc.

The default setting for new instances is "all", which maps the entire
instance to all CPUs, thus effectively turning off CPU pinning.

Here are some usage examples::

  # Map vCPU 0 to physical CPU 1 and vCPU 1 to CPU 3 (assuming 2 vCPUs)
  gnt-instance modify -H cpu_mask=1:3 my-inst

  # Pin vCPU 0 to CPUs 1 or 2, and vCPU 1 to any CPU
  gnt-instance modify -H cpu_mask=1-2:all my-inst

  # Pin vCPU 0 to any CPU, vCPU 1 to CPUs 1, 3 or 4, and vCPU 2 to
  # CPU 0
  gnt-instance modify -H cpu_mask=all:1\\,3-4:0 my-inst

  # Pin entire VM to CPU 0
  gnt-instance modify -H cpu_mask=0 my-inst

  # Turn off CPU pinning (default setting)
  gnt-instance modify -H cpu_mask=all my-inst

Assuming an instance has 3 vCPUs, the following commands will fail::

  # not enough mappings
  gnt-instance modify -H cpu_mask=0:1 my-inst

  # too many
  gnt-instance modify -H cpu_mask=2:1:1:all my-inst

Validation
----------

CPU pinning information is validated by making sure it matches the
number of vCPUs. This validation happens when changing either the
``cpu_mask`` or ``vcpus`` parameters.
Changing either parameter in a way that conflicts with the other will
fail with a descriptive error message.
To make such a change, both parameters should be modified at the same
time. For example:
``gnt-instance modify -B vcpus=4 -H cpu_mask=1:1:2-3:4\\,6 my-inst``

Besides validating the CPU configuration, i.e. that the number of vCPUs
matches the requested CPU pinning, Ganeti will also verify that the
number of physical CPUs is enough to support the required
configuration. For example, trying to run a configuration of
``vcpus=2,cpu_mask=0:4`` on a node with 4 cores will fail (note: CPU
numbers are 0-based).
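The two checks described above can be sketched in a few lines of
Python. This is only an illustration, not Ganeti's implementation: the
function name ``validate_pinning`` and its arguments are hypothetical,
and the pinning information is assumed to be already parsed into one
list of CPU numbers per entry, with -1 standing for "any CPU".

```python
def validate_pinning(cpu_lists, vcpus, physical_cpus):
    """Raise ValueError if the pinning cannot work on this node.

    cpu_lists     -- one list of CPU numbers per mask entry (-1 = any CPU)
    vcpus         -- number of vCPUs configured for the instance
    physical_cpus -- number of physical CPUs on the node (0-based numbering)
    """
    # A single entry applies to all vCPUs; otherwise one entry per vCPU.
    if len(cpu_lists) not in (1, vcpus):
        raise ValueError("%d mappings given for %d vCPUs"
                         % (len(cpu_lists), vcpus))

    # The highest CPU number referenced must exist on the node.
    highest = max(max(cpus) for cpus in cpu_lists)
    if highest >= physical_cpus:
        raise ValueError("CPU %d requested, but the node only has CPUs 0-%d"
                         % (highest, physical_cpus - 1))
```

With this sketch, ``validate_pinning([[0], [4]], 2, 4)`` raises an
error, matching the ``vcpus=2,cpu_mask=0:4`` example on a 4-core node.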

This validation should be repeated every time an instance is started or
migrated live. See more details under Migration below.

Cluster verification should also test whether the other nodes in the
cluster are compatible with the required configuration, and alert if a
minimum requirement is not met.

Failover
--------

CPU pinning configuration can be transferred from node to node, unless
the number of physical CPUs on the target node is smaller than what the
configuration calls for. Unless this is the case, all transfers and
migrations are expected to succeed.

If the number of physical CPUs is smaller than the numbers indicated by
the CPU pinning information, instance failover will fail.

In case of emergency, to force failover to ignore mismatching CPU
information, the following switch can be used:
``gnt-instance failover --ignore-cpu-mismatch my-inst``.
This command will try to fail over the instance with the current CPU
mask, but if that fails, it will change the mask to "all".

Migration
---------

In case of live migration, and in addition to the failover
considerations, CPU pinning must be remapped after migration. This can
be done in real time for both Xen and KVM instances, and only depends on
the number of physical CPUs being sufficient to support the migrated
instance.

Data
----

Pinning information will be kept as a list of integers per vCPU.
A mapping to any CPU is marked with the value -1.
A single entry, no matter what the number of vCPUs is, will always mean
that all vCPUs have the same mapping.
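As an illustration of this representation, the following sketch turns a
``cpu_mask`` string into one list of integers per entry. The names
``parse_cpu_mask`` and ``CPU_ANY`` are hypothetical and not taken from
Ganeti's code; the input is assumed to be the mask after shell and
option unescaping (plain commas, no backslashes).

```python
CPU_ANY = -1  # marker for "any CPU", as described above


def parse_cpu_mask(mask):
    """Parse a cpu_mask string into one sorted list of CPUs per entry.

    Entries are separated by ':'; inside an entry, CPU numbers and
    ranges are separated by ','. The word "all" becomes [CPU_ANY].
    """
    result = []
    for entry in mask.split(":"):
        if entry == "all":
            result.append([CPU_ANY])
            continue
        cpus = set()
        for part in entry.split(","):
            low, sep, high = part.partition("-")
            first = int(low)  # raises ValueError on empty or bad input
            last = int(high) if sep else first
            if first > last:
                raise ValueError("invalid CPU range: %r" % part)
            cpus.update(range(first, last + 1))
        result.append(sorted(cpus))
    return result
```

For instance, ``parse_cpu_mask("1-2:all")`` yields ``[[1, 2], [-1]]``,
and a single-entry mask such as ``"0"`` yields ``[[0]]``, which applies
to every vCPU.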

Configuration file
------------------

The pinning information is kept in the hypervisor params section of
each instance in the configuration file, as
``cpu_mask: [ [ a ], [ b, c ], [ d ] ]``

Xen
---

There are 2 ways to control pinning in Xen, either via the command line
or through the configuration file.

The commands to make direct pinning changes are the following::

  # To pin a vCPU to a specific CPU
  xm vcpu-pin <domain> <vcpu> <cpu>

  # To unpin a vCPU
  xm vcpu-pin <domain> <vcpu> all

  # To get the current pinning status
  xm vcpu-list <domain>

Since Ganeti currently controls Xen through the configuration file, it
is straightforward to use the same method for CPU pinning.
There are 2 different parameters that control Xen's CPU pinning and
configuration:

vcpus
  controls the number of vCPUs
cpus
  maps vCPUs to physical CPUs

When no pinning is required (pinning information is "all"), the
"cpus" entry is removed from the configuration file.

For all other cases, the configuration is "translated" to Xen, which
expects either ``cpus = "a"`` or ``cpus = [ "a", "b", "c", ...]``,
where each of a, b and c is a physical CPU number, a CPU range, or a
combination of these. If a list is used, the number of entries must
match the number of vCPUs, and the entries are mapped to vCPUs in order.

For example, CPU pinning information of ``1:2,4-7:0-1`` is translated
to this entry in Xen's configuration: ``cpus = [ "1", "2,4-7", "0-1" ]``
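A minimal sketch of this translation follows. The function name
``xen_cpus_entry`` is hypothetical. Note that the sketch passes an
"all" entry inside a multi-entry mask through unchanged; whether Xen
accepts that inside a list is an assumption that would need checking
against the Xen version in use.

```python
def xen_cpus_entry(cpu_mask):
    """Return the "cpus" line for a Xen config file, or None.

    cpu_mask is the unescaped pinning string, e.g. "1:2,4-7:0-1".
    None means that no "cpus" entry should be written at all.
    """
    entries = cpu_mask.split(":")
    if entries == ["all"]:
        # No pinning required: the entry is left out of the file.
        return None
    if len(entries) == 1:
        # One mapping shared by all vCPUs.
        return 'cpus = "%s"' % entries[0]
    # One mapping per vCPU, in vCPU order.
    quoted = ", ".join('"%s"' % entry for entry in entries)
    return "cpus = [ %s ]" % quoted
```

Calling ``xen_cpus_entry("1:2,4-7:0-1")`` reproduces the entry from the
example above.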

KVM
---

Controlling pinning in KVM is a little more complicated, as there is no
configuration to control pinning before instances are started.

The way to change or assign CPU pinning under KVM is to use ``taskset`` or
its underlying system call ``sched_setaffinity``. Setting the affinity for
the VM process will change CPU pinning for the entire VM, and setting it
for specific vCPU threads will control specific vCPUs.

The sequence of commands to control pinning is this: start the instance
with the ``-S`` switch, so it halts before starting execution; get the
process ID, or identify the thread ID of each vCPU by sending
``info cpus`` to the monitor; map vCPUs as required by the cpu-pinning
information; and issue a ``cont`` command on the KVM monitor to allow
the instance to start execution.

For example, a sequence of commands to control CPU affinity under KVM
may be:

* Start KVM: ``/usr/bin/kvm … <kvm-command-line-options> … -S``
* Use socat to connect to the monitor
* Send ``info cpus`` to the monitor to get thread/vCPU information
* Call ``sched_setaffinity`` for each thread with the CPU mask
* Send ``cont`` to KVM's monitor
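The middle two steps can be sketched as follows. This is a hedged
illustration only: it uses Python's standard ``os.sched_setaffinity``
(Linux only) as a stand-in for the affinity wrapper, the helper names
are hypothetical, and it assumes that each line of the ``info cpus``
output contains ``CPU #<n>:`` and ``thread_id=<tid>``, which may differ
between KVM versions.

```python
import os
import re

# Assumed line format: "* CPU #0: pc=0x... thread_id=1234"
_THREAD_ID_RE = re.compile(r"CPU #(\d+):.*thread_id=(\d+)")


def parse_info_cpus(output):
    """Map each vCPU number to its thread ID from "info cpus" output."""
    return {int(m.group(1)): int(m.group(2))
            for m in _THREAD_ID_RE.finditer(output)}


def pin_vcpus(thread_ids, cpu_lists):
    """Pin each vCPU thread to its list of physical CPUs.

    cpu_lists holds one list of CPU numbers per vCPU, or a single list
    shared by all vCPUs; [-1] means "any CPU".
    """
    for vcpu, tid in sorted(thread_ids.items()):
        cpus = cpu_lists[0] if len(cpu_lists) == 1 else cpu_lists[vcpu]
        if cpus == [-1]:
            continue  # any CPU: leave the inherited affinity untouched
        os.sched_setaffinity(tid, cpus)
```

Skipping "any CPU" entries is enough for a freshly started instance;
remapping an already pinned instance would instead need to reset the
affinity to all CPUs of the node.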

A CPU mask is a hexadecimal bit mask where each bit represents one
physical CPU. See the :manpage:`sched_setaffinity(2)` man page for more
details.

For example, to run a specific thread ID on CPUs 1 or 3, the mask is
0x0000000A.
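The mask can be computed by setting one bit per CPU number. The helper
below is a small sketch with a hypothetical name, shown only to make
the arithmetic explicit.

```python
def cpu_list_to_mask(cpus):
    """Return the affinity bit mask for a list of 0-based CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit N stands for physical CPU N
    return mask
```

For CPUs 1 and 3 this sets bits 1 and 3, i.e. binary ``1010``, so
``"0x%08X" % cpu_list_to_mask([1, 3])`` gives ``0x0000000A``.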

We will control process and thread affinity using the Python affinity
package (http://pypi.python.org/pypi/affinity). This package is a Python
wrapper around the two affinity system calls, and has no other
requirements.

Alternative Design Options
--------------------------

1. There's an option to ignore the limitations of the underlying
   hypervisor and, instead of requiring explicit pinning information
   for *all* vCPUs, assume a mapping of "all" for vCPUs not mentioned.
   This can lead to pinning information being left out inadvertently,
   and since cpu-pinning options are probably not going to be used
   frequently, there's no real advantage.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: