X-Git-Url: https://code.grnet.gr/git/ganeti-local/blobdiff_plain/ea158460cc38665a587b0ea39b6905340a9ad9bf..99c7cd5be025e86745aa46003ca0962609e0b4e2:/doc/iallocator.rst

diff --git a/doc/iallocator.rst b/doc/iallocator.rst
index 50df153..957c2bb 100644
--- a/doc/iallocator.rst
+++ b/doc/iallocator.rst
@@ -1,7 +1,7 @@
 Ganeti automatic instance allocation
 ====================================
 
-Documents Ganeti version 2.1
+Documents Ganeti version 2.7
 
 .. contents::
 
@@ -41,7 +41,7 @@ using the first one whose filename matches the one given by the user.
 Command line interface changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The node selection options in instanece add and instance replace disks
+The node selection options in instance add and instance replace disks
 can be replace by the new ``--iallocator=NAME`` option (shortened to
 ``-I``), which will cause the auto-assignement of nodes with the
 passed iallocator. The selected node(s) will be show as part of the
@@ -68,7 +68,14 @@ Input message
 ~~~~~~~~~~~~~
 
 The input message will be the JSON encoding of a dictionary containing
-the following:
+all the required information to perform the operation. We explain the
+contents of this dictionary in two parts: common information that every
+type of operation requires, and operation-specific information.
+
+Common information
+++++++++++++++++++
+
+All input dictionaries to the IAllocator must carry the following keys:
 
 version
   the version of the protocol; this document
@@ -83,93 +90,14 @@ cluster_tags
 enabled_hypervisors
   the list of enabled hypervisors
 
-request
-  a dictionary containing the request data:
-
-  type
-    the request type; this can be either ``allocate``, ``relocate`` or
-    ``multi-evacuate``; the ``allocate`` request is used when a new
-    instance needs to be placed on the cluster, while the ``relocate``
-    request is used when an existing instance needs to be moved within
-    the cluster; the ``multi-evacuate`` protocol requests that the
-    script computes the optimal relocate solution for all secondary
-    instances of the given nodes
-
-  The following keys are needed in allocate/relocate mode:
-
-  name
-    the name of the instance; if the request is a realocation, then this
-    name will be found in the list of instances (see below), otherwise
-    is the FQDN of the new instance
-
-  required_nodes
-    how many nodes should the algorithm return; while this information
-    can be deduced from the instace's disk template, it's better if
-    this computation is left to Ganeti as then allocator scripts are
-    less sensitive to changes to the disk templates
-
-  disk_space_total
-    the total disk space that will be used by this instance on the
-    (new) nodes; again, this information can be computed from the list
-    of instance disks and its template type, but Ganeti is better
-    suited to compute it
-
-  If the request is an allocation, then there are extra fields in the
-  request dictionary:
-
-  disks
-    list of dictionaries holding the disk definitions for this
-    instance (in the order they are exported to the hypervisor):
-
-    mode
-      either ``r`` or ``w`` denoting if the disk is read-only or
-      writable
-
-    size
-      the size of this disk in mebibytes
-
-  nics
-    a list of dictionaries holding the network interfaces for this
-    instance, containing:
-
-    ip
-      the IP address that Ganeti know for this instance, or null
-
-    mac
-      the MAC address for this interface
+ipolicy
+  the cluster-wide instance policy (for information; the per-node group
+  values take precedence and should be used instead)
 
-    bridge
-      the bridge to which this interface will be connected
-
-  vcpus
-    the number of VCPUs for the instance
-
-  disk_template
-    the disk template for the instance
-
-  memory
-    the memory size for the instance
-
-  os
-    the OS type for the instance
-
-  tags
-    the list of the instance's tags
-
-  hypervisor
-    the hypervisor of this instance
-
-
-  If the request is of type relocate, then there is one more entry in
-  the request dictionary, named ``relocate_from``, and it contains a
-  list of nodes to move the instance away from; note that with Ganeti
-  2.0, this list will always contain a single node, the current
-  secondary of the instance.
-
-  The multi-evacuate mode has instead a single request argument:
-
-  evac_nodes
-    the names of the nodes to be evacuated
+request
+  a dictionary containing the details of the request; the keys vary
+  depending on the type of operation that's being requested, as
+  explained in `Operation-specific input`_ below.
 
 nodegroups
   a dictionary with the data for the cluster's node groups; it is keyed
@@ -179,14 +107,17 @@ nodegroups
   name
     the node group name
   alloc_policy
-    the allocation policy of the node group
+    the allocation policy of the node group (consult the semantics of
+    this attribute in the :manpage:`gnt-group(8)` manpage)
+  ipolicy
+    the instance policy of the node group
 
 instances
   a dictionary with the data for the current existing instance on the
   cluster, indexed by instance name; the contents are similar to the
   instance definitions for the allocate mode, with the addition of:
 
-  admin_up
+  admin_state
     if this instance is set to run (but not the actual status of the
     instance)
 
@@ -253,6 +184,142 @@ nodes
    reserved_memory, free_memory, total_disk, free_disk, total_cpus,
    i_pri_memory and i_pri_up memory will be absent
 
+Operation-specific input
+++++++++++++++++++++++++
+
+All input dictionaries to the IAllocator carry, in the ``request``
+dictionary, detailed information about the operation that's being
+requested. The required keys vary depending on the type of operation, as
+follows.
+
+In all cases, it includes:
+
+  type
+    the request type; this can be either ``allocate``, ``relocate``,
+    ``change-group`` or ``node-evacuate``. The
+    ``allocate`` request is used when a new instance needs to be placed
+    on the cluster. The ``relocate`` request is used when an existing
+    instance needs to be moved within its node group.
+
+    The ``multi-evacuate`` protocol used to request that the script
+    computes the optimal relocate solution for all secondary instances
+    of the given nodes. It is now deprecated and needs only be
+    implemented if backwards compatibility with Ganeti 2.4 and lower is
+    needed.
+
+    The ``change-group`` request is used to relocate multiple instances
+    across multiple node groups. ``node-evacuate`` evacuates instances
+    off their node(s). These are described in a separate :ref:`design
+    document `.
+
+    The ``multi-allocate`` request is used to allocate multiple
+    instances on the cluster. Apart from that, the request is very
+    similar to the ``allocate`` one. For more details look at
+    :doc:`Ganeti bulk create `.
+
+For both allocate and relocate mode, the following extra keys are needed
+in the ``request`` dictionary:
+
+  name
+    the name of the instance; if the request is a relocation, then this
+    name will be found in the list of instances (see below), otherwise
+    is the FQDN of the new instance; type *string*
+
+  required_nodes
+    how many nodes should the algorithm return; while this information
+    can be deduced from the instance's disk template, it's better if
+    this computation is left to Ganeti as then allocator scripts are
+    less sensitive to changes to the disk templates; type *integer*
+
+  disk_space_total
+    the total disk space that will be used by this instance on the
+    (new) nodes; again, this information can be computed from the list
+    of instance disks and its template type, but Ganeti is better
+    suited to compute it; type *integer*
+
+.. pyassert::
+
+   constants.DISK_ACCESS_SET == set([constants.DISK_RDONLY,
+     constants.DISK_RDWR])
+
+Allocation needs, in addition:
+
+  disks
+    list of dictionaries holding the disk definitions for this
+    instance (in the order they are exported to the hypervisor):
+
+    mode
+      either :pyeval:`constants.DISK_RDONLY` or
+      :pyeval:`constants.DISK_RDWR` denoting if the disk is read-only or
+      writable
+
+    size
+      the size of this disk in mebibytes
+
+  nics
+    a list of dictionaries holding the network interfaces for this
+    instance, containing:
+
+    ip
+      the IP address that Ganeti knows for this instance, or null
+
+    mac
+      the MAC address for this interface
+
+    bridge
+      the bridge to which this interface will be connected
+
+  vcpus
+    the number of VCPUs for the instance
+
+  disk_template
+    the disk template for the instance
+
+  memory
+    the memory size for the instance
+
+  os
+    the OS type for the instance
+
+  tags
+    the list of the instance's tags
+
+  hypervisor
+    the hypervisor of this instance
+
+Relocation:
+
+  relocate_from
+    a list of nodes to move the instance away from; for DRBD-based
+    instances, this will contain a single node, the current secondary
+    of the instance, whereas for shared-storage instances, this will
+    also contain a single node, the current primary of the instance;
+    type *list of strings*
+
+As for ``node-evacuate``, it needs the following request arguments:
+
+  instances
+    a list of instance names to evacuate; type *list of strings*
+
+  evac_mode
+    specify which instances to evacuate; one of ``primary-only``,
+    ``secondary-only``, ``all``; type *string*
+
+``change-group`` needs the following request arguments:
+
+  instances
+    a list of instance names whose group to change; type
+    *list of strings*
+
+  target_groups
+    must either be the empty list, or contain a list of group UUIDs that
+    should be considered for relocating instances to; type
+    *list of strings*
+
+``multi-allocate`` needs the following request arguments:
+
+  instances
+    a list of request dicts
 Response message
 ~~~~~~~~~~~~~~~~
 
@@ -276,8 +343,16 @@ result
   entry in the input message, otherwise Ganeti will consider the result
   as failed
 
-  for multi-evacuation mode, this is a list of lists; each element of
-  the list is a list of instance name and the new secondary node
+  for the ``node-evacuate`` and ``change-group`` modes, this is a
+  dictionary containing, among other information, a list of lists of
+  serialized opcodes; see the :ref:`design document
+  ` for a detailed description
+
+  for the ``multi-allocate`` mode this is a tuple of 2 lists; the first
+  element of the tuple is a list of successful allocations, with the
+  instance name as first element of each entry and the node placement
+  as the second. The second element of the tuple is the list of
+  instances whose allocation failed.
 
 .. note:: Current Ganeti version accepts either ``result`` or ``nodes``
    as a backwards-compatibility measure (older versions only supported
@@ -289,42 +364,22 @@ Examples
 Input messages to scripts
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Input message, new instance allocation::
+Input message, new instance allocation (common elements are listed this
+time, but not included in further examples below)::
 
   {
+    "version": 2,
+    "cluster_name": "cluster1.example.com",
     "cluster_tags": [],
-    "request": {
-      "required_nodes": 2,
-      "name": "instance3.example.com",
-      "tags": [
-        "type:test",
-        "owner:foo"
-      ],
-      "type": "allocate",
-      "disks": [
-        {
-          "mode": "w",
-          "size": 1024
-        },
-        {
-          "mode": "w",
-          "size": 2048
-        }
-      ],
-      "nics": [
-        {
-          "ip": null,
-          "mac": "00:11:22:33:44:55",
-          "bridge": null
-        }
-      ],
-      "vcpus": 1,
-      "disk_template": "drbd",
-      "memory": 2048,
-      "disk_space_total": 3328,
-      "os": "debootstrap+default"
+    "enabled_hypervisors": [
+      "xen-pvm"
+    ],
+    "nodegroups": {
+      "f4e06e0d-528a-4963-a5ad-10f3e114232d": {
+        "name": "default",
+        "alloc_policy": "preferred"
+      }
     },
-    "cluster_name": "cluster1.example.com",
     "instances": {
       "instance1.example.com": {
         "tags": [],
@@ -384,13 +439,13 @@ Input message, new instance allocation::
         "os": "debootstrap+default"
       }
     },
-    "version": 1,
     "nodes": {
       "node1.example.com": {
         "total_disk": 858276,
         "primary_ip": "198.51.100.1",
         "secondary_ip": "192.0.2.1",
         "tags": [],
+        "group": "f4e06e0d-528a-4963-a5ad-10f3e114232d",
         "free_memory": 3505,
         "free_disk": 856740,
         "total_memory": 4095
@@ -400,6 +455,7 @@ Input message, new instance allocation::
         "primary_ip": "198.51.100.2",
         "secondary_ip": "192.0.2.2",
         "tags": ["test"],
+        "group": "f4e06e0d-528a-4963-a5ad-10f3e114232d",
         "free_memory": 3505,
         "free_disk": 848320,
         "total_memory": 4095
@@ -409,35 +465,61 @@ Input message, new instance allocation::
         "primary_ip": "198.51.100.3",
         "secondary_ip": "192.0.2.3",
         "tags": [],
+        "group": "f4e06e0d-528a-4963-a5ad-10f3e114232d",
         "free_memory": 3505,
         "free_disk": 570648,
         "total_memory": 4095
       }
+    },
+    "request": {
+      "type": "allocate",
+      "name": "instance3.example.com",
+      "required_nodes": 2,
+      "disk_space_total": 3328,
+      "disks": [
+        {
+          "mode": "w",
+          "size": 1024
+        },
+        {
+          "mode": "w",
+          "size": 2048
+        }
+      ],
+      "nics": [
+        {
+          "ip": null,
+          "mac": "00:11:22:33:44:55",
+          "bridge": null
+        }
+      ],
+      "vcpus": 1,
+      "disk_template": "drbd",
+      "memory": 2048,
+      "os": "debootstrap+default",
+      "tags": [
+        "type:test",
+        "owner:foo"
+      ],
+      "hypervisor": "xen-pvm"
     }
   }
 
-Input message, reallocation. Since only the request entry in the input
-message is changed, we show only this changed entry::
-
-  "request": {
-    "relocate_from": [
-      "node3.example.com"
-    ],
-    "required_nodes": 1,
-    "type": "relocate",
-    "name": "instance2.example.com",
-    "disk_space_total": 832
-  },
-
+Input message, reallocation::
 
-Input message, node evacuation::
-
-  "request": {
-    "evac_nodes": [
-      "node2"
-    ],
-    "type": "multi-evacuate"
-  },
+  {
+    "version": 2,
+    ...
+    "request": {
+      "type": "relocate",
+      "name": "instance2.example.com",
+      "required_nodes": 1,
+      "disk_space_total": 832,
+      "relocate_from": [
+        "node3.example.com"
+      ]
+    }
+  }
 
 
 Response messages
@@ -445,25 +527,26 @@ Response messages
 Successful response message::
 
   {
+    "success": true,
     "info": "Allocation successful",
     "result": [
       "node2.example.com",
       "node1.example.com"
-    ],
-    "success": true
+    ]
   }
 
 Failed response message::
 
   {
+    "success": false,
     "info": "Can't find a suitable node for position 2 (already selected: node2.example.com)",
-    "result": [],
-    "success": false
+    "result": []
   }
 
 Successful node evacuation message::
 
   {
+    "success": true,
     "info": "Request successful",
     "result": [
       [
@@ -474,8 +557,7 @@ Successful node evacuation message::
         "instance2",
         "node1"
       ]
-    ],
-    "success": true
+    ]
   }
 
 
@@ -499,10 +581,9 @@ Command line messages
 Reference implementation
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
-Ganeti's default iallocator is "hail" which is part of the separate
-ganeti-htools project. In order to see its source code please clone
-``git://git.ganeti.org/htools.git``. Note that htools is implemented
-using the Haskell programming language.
+Ganeti's default iallocator is "hail" which is available when "htools"
+components have been enabled at build time (see :doc:`install-quick` for
+more details).
 
 .. vim: set textwidth=72 :
 .. Local Variables: