XBZRLE (Xor Based Zero Run Length Encoding)
===========================================

XBZRLE (Xor Based Zero Run Length Encoding) reduces VM downtime and the total
live-migration time of virtual machines.
It is particularly useful for virtual machines running memory-write-intensive
workloads that are typical of large enterprise applications such as SAP ERP
systems, and more generally for any application that uses a sparse memory
update pattern.

Instead of sending the changed guest memory page, this solution sends a
compressed version of the updates, thus reducing the amount of data sent during
live migration.
In order to calculate the update, the previous memory pages need to be stored
on the source. Those pages are stored in a dedicated cache (hash table) and are
accessed by their address.
The larger the cache, the better the chances that the page has already been
stored in it.
A small cache will result in a high cache miss rate.
The cache size can be changed before and during migration.
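
To illustrate why a small cache leads to misses, here is a minimal sketch of a
page cache keyed by guest address. The direct-mapped layout, the
evict-on-collision policy and all names (page_cache, cache_init, cache_lookup,
cache_insert, XB_PAGE_SIZE) are assumptions made for this example; they are not
taken from the QEMU implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define XB_PAGE_SIZE 4096u  /* assumed page size */

/* Hypothetical direct-mapped page cache; all names are illustrative. */
struct page_cache {
    size_t    slots;  /* number of slots, must be a power of two */
    uint64_t *addr;   /* guest address of the page held in each slot */
    uint8_t  *valid;  /* non-zero if the slot holds a page */
    uint8_t  *data;   /* slots * XB_PAGE_SIZE bytes of page content */
};

/* Allocate an empty cache; returns 0 on success, -1 on allocation failure. */
static int cache_init(struct page_cache *c, size_t slots)
{
    assert(slots != 0 && (slots & (slots - 1)) == 0);
    c->slots = slots;
    c->addr  = calloc(slots, sizeof(*c->addr));
    c->valid = calloc(slots, 1);
    c->data  = calloc(slots, XB_PAGE_SIZE);
    if (!c->addr || !c->valid || !c->data) {
        free(c->addr);
        free(c->valid);
        free(c->data);
        return -1;
    }
    return 0;
}

/* Map a page-aligned guest address to a slot index. */
static size_t cache_slot(const struct page_cache *c, uint64_t addr)
{
    return (size_t)((addr / XB_PAGE_SIZE) & (c->slots - 1));
}

/* Return the cached copy of the page at addr, or NULL on a cache miss. */
static uint8_t *cache_lookup(struct page_cache *c, uint64_t addr)
{
    size_t s = cache_slot(c, addr);

    if (c->valid[s] && c->addr[s] == addr) {
        return c->data + s * XB_PAGE_SIZE;
    }
    return NULL;
}

/* Store a page, evicting whatever previously occupied the slot. */
static void cache_insert(struct page_cache *c, uint64_t addr,
                         const uint8_t *page)
{
    size_t s = cache_slot(c, addr);

    memcpy(c->data + s * XB_PAGE_SIZE, page, XB_PAGE_SIZE);
    c->addr[s] = addr;
    c->valid[s] = 1;
}
```

In this sketch two pages that map to the same slot evict each other, so fewer
slots mean more collisions and therefore more misses.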

Format
======

The compression format performs an XOR between the previous and current content
of the page, where zero represents an unchanged value.
The page data delta is represented by zero and non-zero runs.
A zero run is represented by its length (in bytes).
A non-zero run is represented by its length (in bytes) and the new data.
The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128).
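
As a rough sketch of the ULEB128 scheme: a value is split into 7-bit groups,
least significant group first, and every byte except the last has its top bit
set. The function names below are made up for this example and the code is not
taken from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode n as ULEB128 into out and return the number of bytes written.
 * out must have room for 5 bytes, enough for any 32-bit value.
 */
static size_t uleb128_encode(uint32_t n, uint8_t *out)
{
    size_t i = 0;

    assert(out != NULL);
    do {
        uint8_t b = n & 0x7f;      /* low 7 bits of the value */
        n >>= 7;
        if (n != 0) {
            b |= 0x80;             /* continuation bit: more bytes follow */
        }
        out[i++] = b;
    } while (n != 0);
    return i;
}

/*
 * Decode a ULEB128 value from at most len bytes of in.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than 5 bytes.
 */
static size_t uleb128_decode(const uint8_t *in, size_t len, uint32_t *n)
{
    uint32_t v = 0;
    size_t i;

    assert(in != NULL && n != NULL);
    for (i = 0; i < len && i < 5; i++) {
        v |= (uint32_t)(in[i] & 0x7f) << (7 * i);
        if (!(in[i] & 0x80)) {     /* last byte of this value */
            *n = v;
            return i + 1;
        }
    }
    return 0;
}
```

For example, a run length of 1001 (0x3e9) is encoded as the two bytes e9 07,
while any length below 128 fits in a single byte.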

There can be more than one valid encoding; the sender may send a longer encoding
for the benefit of reducing computation cost.

page = zrun nzrun
       | zrun nzrun page

zrun = length

nzrun = length byte...

length = uleb128 encoded integer

On the sender side, XBZRLE is used as a compact delta encoding of page updates,
retrieving the old page content from the cache (default size of 64 MB). The
receiving side uses the existing page's content and XBZRLE to decode the new
page's content.
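
The sender and receiver roles can be sketched as a pair of C functions that
follow the grammar above. This is an illustrative sketch, not the QEMU code:
the function names, the greedy way runs are split, and the limit of 28-bit run
lengths are all assumptions made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write n as ULEB128; returns the number of bytes written (at most 4 here). */
static size_t put_len(size_t n, uint8_t *out)
{
    size_t i = 0;

    do {
        uint8_t b = n & 0x7f;
        n >>= 7;
        out[i++] = b | (n ? 0x80 : 0);
    } while (n);
    return i;
}

/* Read a ULEB128 value; returns bytes consumed, or 0 if truncated/too long. */
static size_t get_len(const uint8_t *in, size_t avail, size_t *n)
{
    size_t v = 0, i;

    for (i = 0; i < avail && i < 4; i++) {
        v |= (size_t)(in[i] & 0x7f) << (7 * i);
        if (!(in[i] & 0x80)) {
            *n = v;
            return i + 1;
        }
    }
    return 0;
}

/*
 * Sender: encode new_page against old_page (the cached copy).
 * Returns the encoded length, 0 if the pages are identical, or -1 if the
 * encoding does not fit into cap bytes (overflow).
 */
static long xbzrle_encode_sketch(const uint8_t *old_page,
                                 const uint8_t *new_page, size_t len,
                                 uint8_t *out, size_t cap)
{
    size_t pos = 0, o = 0;

    assert(len < (1u << 28));
    while (pos < len) {
        uint8_t hdr[8];
        size_t z = pos, nz, h;

        while (z < len && old_page[z] == new_page[z]) {
            z++;                      /* zero run: XOR of old and new is 0 */
        }
        if (z == len) {
            break;                    /* trailing zero run is not sent */
        }
        nz = z;
        while (nz < len && old_page[nz] != new_page[nz]) {
            nz++;                     /* non-zero run: bytes that changed */
        }
        h = put_len(z - pos, hdr);    /* zrun length */
        h += put_len(nz - z, hdr + h);/* nzrun length */
        if (h + (nz - z) > cap - o) {
            return -1;
        }
        memcpy(out + o, hdr, h);
        memcpy(out + o + h, new_page + z, nz - z);
        o += h + (nz - z);
        pos = nz;
    }
    return (long)o;
}

/*
 * Receiver: page holds the old content on entry and the new content on
 * success. Returns 0 on success, -1 if the encoded data is malformed.
 */
static int xbzrle_decode_sketch(const uint8_t *enc, size_t enc_len,
                                uint8_t *page, size_t len)
{
    size_t i = 0, pos = 0;

    while (i < enc_len) {
        size_t run, n;

        n = get_len(enc + i, enc_len - i, &run);   /* zrun: skip unchanged */
        if (n == 0 || run > len - pos) {
            return -1;
        }
        i += n;
        pos += run;
        n = get_len(enc + i, enc_len - i, &run);   /* nzrun: copy new data */
        if (n == 0) {
            return -1;
        }
        i += n;
        if (run > len - pos || run > enc_len - i) {
            return -1;
        }
        memcpy(page + pos, enc + i, run);
        i += run;
        pos += run;
    }
    return 0;
}
```

Because a non-zero run carries the new bytes themselves, the receiver in this
sketch only needs to skip and copy; no XOR is performed on the destination.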

This work was originally based on research results published in VEE 2011:
"Evaluation of Delta Compression Techniques for Efficient Live Migration of
Large Virtual Machines" by Svard, Hudzia, Tordsson and Elmroth.
Additionally, the XBRLE delta encoder described there was further improved,
resulting in XBZRLE.

XBZRLE has a sustained bandwidth of 2-2.5 GB/s for typical workloads, making it
ideal for in-line, real-time encoding such as is needed for live migration.

Example
=======

old buffer:
1001 zeros
05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 68 00 00 6b 00 6d
3074 zeros

new buffer:
1001 zeros
01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 68 00 00 67 00 69
3074 zeros

encoded buffer:

encoded length 24
e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
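
Read against the grammar in the Format section, the encoded buffer breaks down
as follows (the trailing run of 3074 unchanged bytes is not encoded):

    e9 07              zrun:  ULEB128 for 1001 unchanged bytes
    0f 01 02 .. 0f     nzrun: length 15, followed by the 15 new bytes
    03                 zrun:  3 unchanged bytes (68 00 00)
    01 67              nzrun: length 1, new byte 67
    01                 zrun:  1 unchanged byte (00)
    01 69              nzrun: length 1, new byte 69

This adds up to 2 + 16 + 1 + 2 + 1 + 2 = 24 bytes.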

Usage
=====

1. Verify the destination QEMU version is able to decode the new format.
    {qemu} info migrate_capabilities
    {qemu} xbzrle: off , ...

2. Activate xbzrle on both source and destination:
    {qemu} migrate_set_capability xbzrle on

3. Set the XBZRLE cache size (on source only). The cache size is in MBytes and
should be a power of 2. The default cache size is 64 MBytes.
    {qemu} migrate_set_cache_size 256m

4. Start outgoing migration:
    {qemu} migrate -d tcp:destination.host:4444
    {qemu} info migrate
    capabilities: xbzrle: on
    Migration status: active
    transferred ram: A kbytes
    remaining ram: B kbytes
    total ram: C kbytes
    total time: D milliseconds
    duplicate: E pages
    normal: F pages
    normal bytes: G kbytes
    cache size: H bytes
    xbzrle transferred: I kbytes
    xbzrle pages: J pages
    xbzrle cache miss: K
    xbzrle overflow : L

xbzrle cache-miss: the number of cache misses to date. A high cache-miss rate
indicates that the cache size is set too low.
xbzrle overflow: the number of overflows in the encoding, where the delta
could not be compressed. This can happen if the changes in the pages are too
large or there are many short changes; for example, changing every second byte
(half a page).
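
To make the "every second byte" case concrete: each isolated changed byte costs
one zero-run length, one non-zero-run length and one data byte, so 2048 changes
in a 4096-byte page need 2048 * 3 = 6144 bytes, more than the page itself. The
sketch below computes the encoded size for a pair of pages. It assumes runs are
split greedily and that an encoding larger than the page counts as an overflow;
the function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes ULEB128 needs to represent n. */
static size_t uleb128_size(size_t n)
{
    size_t bytes = 1;

    while ((n >>= 7) != 0) {
        bytes++;
    }
    return bytes;
}

/*
 * Size in bytes of the zrun/nzrun encoding of new_page against old_page.
 * Trailing unchanged bytes are not encoded and cost nothing.
 */
static size_t xbzrle_encoded_size(const uint8_t *old_page,
                                  const uint8_t *new_page, size_t len)
{
    size_t pos = 0, total = 0;

    assert(old_page != NULL && new_page != NULL);
    while (pos < len) {
        size_t z = pos, nz;

        while (z < len && old_page[z] == new_page[z]) {
            z++;                       /* unchanged bytes: zero run */
        }
        if (z == len) {
            break;
        }
        nz = z;
        while (nz < len && old_page[nz] != new_page[nz]) {
            nz++;                      /* changed bytes: non-zero run */
        }
        /* zrun length + nzrun length + the changed bytes themselves */
        total += uleb128_size(z - pos) + uleb128_size(nz - z) + (nz - z);
        pos = nz;
    }
    return total;
}
```

Under the same assumptions a page in which every byte changed needs
1 + 2 + 4096 = 4099 bytes, which also exceeds the page size.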

Testing: live migration with XBZRLE completed in 110 seconds, whereas without
it the migration was not able to complete.

A simple synthetic memory r/w load generator:

    #include <stdlib.h>
    #include <stdio.h>

    int main(void)
    {
        /* 4096 pages of 4096 bytes each (16 MB), zero-initialised */
        char *buf = calloc(4096, 4096);

        if (!buf) {
            return 1;
        }
        while (1) {
            int i;
            /* touch one byte every 1024 bytes, i.e. four bytes per page */
            for (i = 0; i < 4096 * 4; i++) {
                buf[i * 4096 / 4]++;
            }
            printf(".");
        }
    }