= Block driver correctness testing with blkverify =

== Introduction ==

This document describes how to use the blkverify protocol to test that a block
driver is operating correctly.

It is difficult to test and debug block drivers against real guests.  Often
processes inside the guest will crash because corrupt sectors were read as part
of the executable.  Other times obscure errors are raised by a program inside
the guest.  These issues are extremely hard to trace back to bugs in the block
driver.

Blkverify solves this problem by catching data corruption inside QEMU the first
time bad data is read and reporting the disk sector that is corrupted.

== How it works ==

The blkverify protocol has two child block devices, the "test" device and the
"raw" device.  Read/write operations are mirrored to both devices so their
state should always be in sync.

The "raw" device is a raw image, a flat file, that has identical starting
contents to the "test" image.  The idea is that the "raw" device will handle
read/write operations correctly and not corrupt data.  It can be used as a
reference for comparison against the "test" device.

After a mirrored read operation completes, blkverify will compare the data and
raise an error if it is not identical.  This makes it possible to catch the
first instance where corrupt data is read.
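
The mirroring and compare-on-read behaviour described above can be sketched as
a small Python model.  This is only an illustration under stated assumptions,
not QEMU's implementation: the names BlkverifyModel, VerifyError and
SECTOR_SIZE are invented here, 512-byte sectors are assumed, and in-memory
buffers stand in for the two image files.

```python
import io

SECTOR_SIZE = 512  # assumed sector size in bytes


class VerifyError(Exception):
    """Raised when the "test" device returns data that differs from "raw"."""

    def __init__(self, sector):
        super().__init__(f"contents mismatch in sector {sector}")
        self.sector = sector


class BlkverifyModel:
    """Hypothetical toy model of blkverify; not QEMU code."""

    def __init__(self, raw, test):
        self.raw = raw    # reference device, trusted to be correct
        self.test = test  # device under test

    def write(self, offset, data):
        # Mirror every write so both devices stay in sync.
        for dev in (self.raw, self.test):
            dev.seek(offset)
            dev.write(data)

    def read(self, offset, length):
        # Mirror the read, then compare the two results.
        self.raw.seek(offset)
        expected = self.raw.read(length)
        self.test.seek(offset)
        actual = self.test.read(length)
        if expected != actual:
            shorter = min(len(expected), len(actual))
            # Index of the first differing byte within this request.
            first_bad = next(
                (i for i in range(shorter) if expected[i] != actual[i]),
                shorter,
            )
            raise VerifyError((offset + first_bad) // SECTOR_SIZE)
        return expected


# Two 1024-byte "images"; the test image has a zeroed first sector.
raw = io.BytesIO(b"\xcd" * 1024)
test = io.BytesIO(b"\x00" * 512 + b"\xcd" * 512)
dev = BlkverifyModel(raw, test)

dev.read(512, 512)  # second sector matches, so this returns normally
try:
    dev.read(0, 512)
except VerifyError as err:
    print(err)  # contents mismatch in sector 0
```

Because the comparison happens on every read, the error is raised by the first
read that touches bad data, which is what makes the corrupted sector easy to
locate.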

== Example ==

Imagine raw.img has 0xcd repeated throughout its first sector:

    $ ./qemu-io -c 'read -v 0 512' raw.img
    00000000:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    00000010:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    [...]
    000001e0:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    000001f0:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (97.656 MiB/sec and 200000.0000 ops/sec)

But test.img is corrupt; its first sector is zeroed when it should not be:

    $ ./qemu-io -c 'read -v 0 512' test.img
    00000000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [...]
    000001e0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    000001f0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (81.380 MiB/sec and 166666.6667 ops/sec)

This error is caught by blkverify:

    $ ./qemu-io -c 'read 0 512' blkverify:raw.img:test.img
    blkverify: read sector_num=0 nb_sectors=4 contents mismatch in sector 0

A more realistic scenario is verifying the installation of a guest OS:

    $ ./qemu-img create raw.img 16G
    $ ./qemu-img create -f qcow2 test.qcow2 16G
    $ x86_64-softmmu/qemu-system-x86_64 -cdrom debian.iso \
                                        -drive file=blkverify:raw.img:test.qcow2

If the installation is aborted when blkverify detects corruption, use qemu-io
to explore the contents of the disk image at the sector in question.
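
To look at a reported sector with qemu-io, the sector number first has to be
converted to a byte offset.  The Python helper below is a hedged sketch of that
arithmetic.  It assumes 512-byte sectors, consistent with the 512-byte "first
sector" read earlier in this document; the function name and the sector number
2048 are made up for illustration.  It only builds the command line and does
not run it.

```python
import shlex

SECTOR_SIZE = 512  # assumed sector size in bytes


def qemu_io_read_command(image, sector, nb_sectors=1, qemu_io="./qemu-io"):
    """Build (but do not run) a qemu-io command that dumps the given sectors."""
    offset = sector * SECTOR_SIZE
    length = nb_sectors * SECTOR_SIZE
    # 'read -v OFFSET LENGTH' is the qemu-io command used earlier in this
    # document; both values are in bytes.
    return [qemu_io, "-c", f"read -v {offset} {length}", image]


# Hypothetical case: blkverify reported a mismatch in sector 2048.
print(shlex.join(qemu_io_read_command("test.qcow2", 2048)))
# ./qemu-io -c 'read -v 1048576 512' test.qcow2
```

Running the same command against raw.img shows what the sector should have
contained.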