XBZRLE (Xor Based Zero Run Length Encoding)
===========================================

Using XBZRLE (Xor Based Zero Run Length Encoding) reduces VM downtime and the
total live-migration time of virtual machines.
It is particularly useful for virtual machines running memory write intensive
workloads that are typical of large enterprise applications such as SAP ERP
systems, and more generally for any application that uses a sparse memory
update pattern.

Instead of sending the changed guest memory page, this solution sends a
compressed version of the updates, thus reducing the amount of data sent during
live migration.
In order to calculate the update, the previous content of each memory page
needs to be stored on the source. Those pages are stored in a dedicated cache
(hash table) and are accessed by their address.
The larger the cache, the better the chances that a page has already been
stored in it.
A small cache size will result in a high cache miss rate.
The cache size can be changed before and during migration.

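As an illustration of the cache described above, the following is a minimal
sketch of a direct-mapped page cache keyed by page address. It assumes
4096-byte pages and a slot count that is a power of 2. The names PageCache,
cache_new, cache_lookup and cache_insert are hypothetical and are not QEMU's
API; the real implementation may be organized differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical cache slot: one previously sent page. */
typedef struct {
    uint64_t addr;             /* guest address of the cached page */
    int valid;                 /* 0 until the slot has been filled */
    uint8_t data[PAGE_SIZE];   /* page content as it was last sent */
} CacheEntry;

typedef struct {
    CacheEntry *slots;
    size_t num_slots;          /* power of 2, so masking selects a slot */
    uint64_t hits, misses;
} PageCache;

/* Allocate a cache; returns NULL if num_slots is not a power of 2 or on OOM. */
PageCache *cache_new(size_t num_slots)
{
    PageCache *c;

    if (num_slots == 0 || (num_slots & (num_slots - 1)) != 0) {
        return NULL;
    }
    c = calloc(1, sizeof(*c));
    if (c == NULL) {
        return NULL;
    }
    c->slots = calloc(num_slots, sizeof(CacheEntry));
    if (c->slots == NULL) {
        free(c);
        return NULL;
    }
    c->num_slots = num_slots;
    return c;
}

void cache_free(PageCache *c)
{
    if (c != NULL) {
        free(c->slots);
        free(c);
    }
}

/* Map a page address to its slot index. */
size_t cache_slot(const PageCache *c, uint64_t addr)
{
    return (size_t)((addr / PAGE_SIZE) & (c->num_slots - 1));
}

/* Return the cached old content of the page at addr, or NULL on a miss. */
const uint8_t *cache_lookup(PageCache *c, uint64_t addr)
{
    CacheEntry *e;

    assert(c != NULL);
    e = &c->slots[cache_slot(c, addr)];
    if (e->valid && e->addr == addr) {
        c->hits++;
        return e->data;
    }
    c->misses++;
    return NULL;
}

/* Store a page, evicting whatever page previously occupied the slot. */
void cache_insert(PageCache *c, uint64_t addr, const uint8_t *page)
{
    CacheEntry *e;

    assert(c != NULL && page != NULL);
    e = &c->slots[cache_slot(c, addr)];
    e->addr = addr;
    e->valid = 1;
    memcpy(e->data, page, PAGE_SIZE);
}
```

In this sketch a lookup that returns NULL is a cache miss: the page would have
to be sent in full and can then be inserted so that its next update can be sent
as a delta. With fewer slots, more addresses collide in the same slot, which is
one way a small cache leads to a high miss rate.
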
Format
======

The compression format performs an XOR between the previous and current content
of the page, where zero represents an unchanged value.
The page data delta is represented by zero and non-zero runs.
A zero run is represented by its length (in bytes).
A non-zero run is represented by its length (in bytes) and the new data.
The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128).

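For readers unfamiliar with ULEB128, a run length is stored 7 bits at a time,
least significant group first, with the top bit of each byte set when more
bytes follow. The sketch below shows one way to encode and decode such a
length. The function names uleb128_encode and uleb128_decode are chosen for
illustration only and are not taken from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value as ULEB128 into dst (needs up to 5 bytes for 32 bits).
 * Returns the number of bytes written. */
size_t uleb128_encode(uint8_t *dst, uint32_t value)
{
    size_t n = 0;

    assert(dst != NULL);
    do {
        uint8_t byte = (uint8_t)(value & 0x7f);   /* low 7 bits */
        value >>= 7;
        if (value != 0) {
            byte |= 0x80;                         /* more bytes follow */
        }
        dst[n++] = byte;
    } while (value != 0);
    return n;
}

/* Decode one ULEB128 value from src, of which len bytes are available.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than 5 bytes. */
size_t uleb128_decode(const uint8_t *src, size_t len, uint32_t *value)
{
    uint32_t result = 0;
    unsigned shift = 0;
    size_t i;

    assert(src != NULL && value != NULL);
    for (i = 0; i < len && shift < 32; i++, shift += 7) {
        result |= (uint32_t)(src[i] & 0x7f) << shift;
        if ((src[i] & 0x80) == 0) {               /* last byte of the value */
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this encoding, a run of 1001 bytes is stored as the two bytes e9 07, and
any run shorter than 128 bytes takes a single byte.
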
There can be more than one valid encoding; the sender may send a longer encoding
for the benefit of reducing computation cost.

page = zrun nzrun
       | zrun nzrun page

zrun = length

nzrun = length byte...

length = uleb128 encoded integer

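The grammar above can be turned into an encoder roughly as follows. This is a
sketch under stated assumptions, not QEMU's implementation: the name
xbzrle_encode, its signature and the -1 return value on overflow are
hypothetical, and trailing unchanged bytes are simply not encoded, which
matches the example later in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append value to dst as ULEB128. Returns 0, or -1 if dst is full. */
static int put_uleb128(uint8_t *dst, size_t dst_len, size_t *pos, size_t value)
{
    do {
        uint8_t byte = (uint8_t)(value & 0x7f);
        value >>= 7;
        if (value != 0) {
            byte |= 0x80;                 /* continuation bit */
        }
        if (*pos >= dst_len) {
            return -1;
        }
        dst[(*pos)++] = byte;
    } while (value != 0);
    return 0;
}

/* Encode the difference between old_page and new_page as zrun/nzrun pairs.
 * Returns the encoded length, 0 if the pages are identical, or -1 if the
 * encoding does not fit into dst_len bytes (hypothetical convention). */
int xbzrle_encode(const uint8_t *old_page, const uint8_t *new_page,
                  size_t page_len, uint8_t *dst, size_t dst_len)
{
    size_t i = 0, pos = 0;

    assert(old_page != NULL && new_page != NULL && dst != NULL);
    while (i < page_len) {
        size_t start = i, run;

        /* zero run: bytes whose XOR is zero, i.e. unchanged bytes */
        while (i < page_len && (old_page[i] ^ new_page[i]) == 0) {
            i++;
        }
        if (i == page_len) {
            break;                        /* trailing zero run is omitted */
        }
        if (put_uleb128(dst, dst_len, &pos, i - start) < 0) {
            return -1;
        }

        /* non-zero run: changed bytes, sent as length plus the new data */
        start = i;
        while (i < page_len && (old_page[i] ^ new_page[i]) != 0) {
            i++;
        }
        run = i - start;
        if (put_uleb128(dst, dst_len, &pos, run) < 0 || run > dst_len - pos) {
            return -1;
        }
        memcpy(dst + pos, new_page + start, run);
        pos += run;
    }
    return (int)pos;
}
```

A caller would typically pass a destination buffer of one page; when the sketch
returns -1 the delta is no smaller than the page itself and the page would be
sent uncompressed instead.
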
On the sender side XBZRLE is used as a compact delta encoding of page updates,
retrieving the old page content from the cache (default size of 64 MB). The
receiving side uses the existing page's content and XBZRLE to decode the new
page's content.

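The receiving side can be sketched as follows: the decoder walks the zrun/nzrun
pairs, skips over unchanged bytes and overwrites changed bytes in place in the
existing page. The name xbzrle_decode, its signature and its error convention
are assumptions made for this illustration and are not QEMU's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one ULEB128 value. Returns bytes consumed, or 0 if the input is
 * truncated or the value is longer than 4 bytes. */
static size_t get_uleb128(const uint8_t *src, size_t len, size_t *value)
{
    size_t result = 0, i;
    unsigned shift = 0;

    for (i = 0; i < len && shift < 28; i++, shift += 7) {
        result |= (size_t)(src[i] & 0x7f) << shift;
        if ((src[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}

/* Apply an encoded delta to page, which must hold the old page content.
 * Returns the offset just past the last changed byte, or -1 if the input
 * is malformed (hypothetical convention). */
int xbzrle_decode(const uint8_t *src, size_t src_len,
                  uint8_t *page, size_t page_len)
{
    size_t i = 0;   /* read position in src */
    size_t d = 0;   /* write position in page */

    assert(src != NULL && page != NULL);
    while (i < src_len) {
        size_t run, n;

        /* zrun: unchanged bytes, keep the existing page content */
        n = get_uleb128(src + i, src_len - i, &run);
        if (n == 0 || run > page_len - d) {
            return -1;
        }
        i += n;
        d += run;

        /* nzrun: length followed by the new data */
        n = get_uleb128(src + i, src_len - i, &run);
        if (n == 0) {
            return -1;
        }
        i += n;
        if (run == 0 || run > src_len - i || run > page_len - d) {
            return -1;                    /* empty, truncated or overrunning */
        }
        memcpy(page + d, src + i, run);
        i += run;
        d += run;
    }
    return (int)d;
}
```

Because only changed bytes are written, the destination must already hold the
same old content that the sender used when computing the delta.
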
This work was originally based on research results published at
VEE 2011: Evaluation of Delta Compression Techniques for Efficient Live
Migration of Large Virtual Machines by Benoit, Svard, Tordsson and Elmroth.
Additionally, the delta encoder XBRLE was improved further by using XBZRLE
instead.

XBZRLE has a sustained bandwidth of 2-2.5 GB/s for typical workloads, making it
ideal for in-line, real-time encoding such as is needed for live migration.

Example
old buffer:
1001 zeros
05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 68 00 00 6b 00 6d
3074 zeros

new buffer:
1001 zeros
01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 68 00 00 67 00 69
3074 zeros

encoded buffer:

encoded length 24
e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69

Usage
=====
1. Verify the destination QEMU version is able to decode the new format.
    {qemu} info migrate_capabilities
    {qemu} xbzrle: off , ...

2. Activate xbzrle on both source and destination:
    {qemu} migrate_set_capability xbzrle on

3. Set the XBZRLE cache size (on source only). The cache size is given in
   MBytes and should be a power of 2. The default cache size is 64 MBytes.
    {qemu} migrate_set_cache_size 256m

4. Start outgoing migration
    {qemu} migrate -d tcp:destination.host:4444
    {qemu} info migrate
    capabilities: xbzrle: on
    Migration status: active
    transferred ram: A kbytes
    remaining ram: B kbytes
    total ram: C kbytes
    total time: D milliseconds
    duplicate: E pages
    normal: F pages
    normal bytes: G kbytes
    cache size: H bytes
    xbzrle transferred: I kbytes
    xbzrle pages: J pages
    xbzrle cache miss: K
    xbzrle overflow : L

xbzrle cache-miss: the number of cache misses to date. A high cache-miss rate
indicates that the cache size is set too low.
xbzrle overflow: the number of overflows during encoding, where the delta
could not be compressed. This can happen if the changes in the pages are too
large or there are many short changes; for example, changing every second byte
(half a page).

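To see why changing every second byte overflows, the arithmetic can be sketched
as below. It assumes that an overflow is counted when the encoded delta would
be larger than the page itself, which the text above does not state explicitly.
The function names are hypothetical. Each isolated changed byte costs one byte
for the zero-run length, one byte for the non-zero-run length and one data
byte.

```c
#include <assert.h>
#include <stddef.h>

/* Encoded size when exactly one byte out of every `stride` bytes changes.
 * Valid for 2 <= stride <= 128, so each zero-run length (stride - 1)
 * fits in a single ULEB128 byte. */
size_t xbzrle_sparse_size(size_t page_len, size_t stride)
{
    size_t groups;

    assert(stride >= 2 && stride <= 128);
    groups = page_len / stride;   /* number of changed bytes in the page */
    return groups * 3;            /* zrun length + nzrun length + data byte */
}

/* Assumption: the delta overflows when it is larger than the page. */
int xbzrle_would_overflow(size_t page_len, size_t stride)
{
    return xbzrle_sparse_size(page_len, stride) > page_len;
}
```

For a 4096-byte page with every second byte changed this gives 2048 * 3 = 6144
bytes, which is larger than the page, while changing every third byte gives
1365 * 3 = 4095 bytes and still fits.
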
Testing: Testing indicated that live migration with XBZRLE completed in 110
seconds, whereas without it the migration was not able to complete.

A simple synthetic memory r/w load generator:
    #include <stdlib.h>
    #include <stdio.h>

    int main(void)
    {
        /* 4096 pages of 4096 bytes each, zero-initialized */
        char *buf = (char *) calloc(4096, 4096);
        if (buf == NULL) {
            return 1;
        }
        while (1) {
            int i;
            /* touch one byte in every 1 KiB, i.e. 4 bytes per page */
            for (i = 0; i < 4096 * 4; i++) {
                buf[i * 4096 / 4]++;
            }
            printf(".");
            fflush(stdout);
        }
    }