\input texinfo @c -*- texinfo -*-

    
@settitle QEMU CPU Emulator Reference Documentation
@titlepage
@sp 7
@center @titlefont{QEMU CPU Emulator Reference Documentation}
@sp 3
@end titlepage

    
@chapter Introduction

@section Features

QEMU is a FAST! processor emulator. By using dynamic translation it
achieves a reasonable speed while being easy to port to new host
CPUs.

QEMU has two operating modes:
@itemize
@item User mode emulation. In this mode, QEMU can launch Linux processes
compiled for one CPU on another CPU. Linux system calls are converted
because of endianness and 32/64 bit mismatches. The Wine Windows API
emulator (@url{http://www.winehq.org}) and the DOSEMU DOS emulator
(@url{www.dosemu.org}) are the main targets for QEMU.

@item Full system emulation. In this mode, QEMU emulates a full
system, including a processor and various peripherals. Currently, it
is only used to launch an x86 Linux kernel on an x86 Linux system. It
enables easier testing and debugging of system code. It can also be
used to provide virtual hosting of several virtual PCs on a single
server.

@end itemize

As QEMU requires no host kernel patches to run, it is very safe and
easy to use.

QEMU generic features:

@itemize

@item User space only or full system emulation.

@item Using dynamic translation to native code for reasonable speed.

@item Working on x86 and PowerPC hosts. Being tested on ARM, Sparc32, Alpha and S390.

@item Self-modifying code support.

@item Precise exceptions support.

@item The virtual CPU is a library (@code{libqemu}) which can be used
in other projects.

@end itemize

QEMU user mode emulation features:
@itemize
@item Generic Linux system call converter, including most ioctls.

@item clone() emulation using native CPU clone() to use Linux scheduler for threads.

@item Accurate signal handling by remapping host signals to target signals.
@end itemize

QEMU full system emulation features:
@itemize
@item Using mmap() system calls to simulate the MMU
@end itemize

    
@section x86 emulation

QEMU x86 target features:

@itemize

@item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
LDT/GDT and IDT are emulated. VM86 mode is also supported to run DOSEMU.

@item Support of host page sizes bigger than 4KB in user mode emulation.

@item QEMU can emulate itself on x86.

@item An extensive Linux x86 CPU test program is included (@file{tests/test-i386}).
It can be used to test other x86 virtual CPUs.

@end itemize

Current QEMU limitations:

@itemize

@item No SSE/MMX support (yet).

@item No x86-64 support.

@item IPC syscalls are missing.

@item The x86 segment limits and access rights are not tested at every
memory access.

@item On non-x86 host CPUs, @code{double}s are used instead of the non-standard
10 byte @code{long double}s of x86 for floating point emulation to get
maximum performance.

@item Full system emulation only works if no data are mapped above the virtual address
0xc0000000 (yet).

@item Some privileged instructions or behaviors are missing. Only the ones
needed for proper Linux kernel operation are emulated.

@item No memory separation between the kernel and the user processes is done.
It will be implemented very soon.

@end itemize

@section ARM emulation

@itemize

@item ARM emulation can currently launch small programs while using the
generic dynamic code generation architecture of QEMU.

@item No FPU support (yet).

@item No automatic regression testing (yet).

@end itemize

    
@chapter QEMU User space emulator invocation

@section Quick Start

If you need to compile QEMU, please read the @file{README} which gives
the related information.

In order to launch a Linux process, QEMU needs the process executable
itself and all the target (x86) dynamic libraries used by it.

@itemize

@item On x86, you can just try to launch any process by using the native
libraries:

@example
qemu -L / /bin/ls
@end example

@code{-L /} tells QEMU to search for the x86 dynamic linker with a
@file{/} prefix.

@item Since QEMU is also a Linux process, you can launch qemu with qemu:

@example
qemu -L / qemu -L / /bin/ls
@end example

@item On non-x86 CPUs, you first need to download at least an x86 glibc
(@file{qemu-XXX-i386-glibc21.tar.gz} on the QEMU web page). Ensure that
@code{LD_LIBRARY_PATH} is not set:

@example
unset LD_LIBRARY_PATH
@end example

Then you can launch the precompiled @file{ls} x86 executable:

@example
qemu /usr/local/qemu-i386/bin/ls-i386
@end example
You can look at @file{/usr/local/qemu-i386/bin/qemu-conf.sh} to see how
QEMU can be launched automatically by the Linux kernel when you try to
launch x86 executables. It requires the @code{binfmt_misc} module in the
Linux kernel.

@item The x86 version of QEMU is also included. You can try weird things such as:
@example
qemu /usr/local/qemu-i386/bin/qemu-i386 /usr/local/qemu-i386/bin/ls-i386
@end example

@end itemize

    
@section Wine launch

@itemize

@item Ensure that you have a working QEMU with the x86 glibc
distribution (see previous section). To verify it, you must be
able to run:

@example
qemu /usr/local/qemu-i386/bin/ls-i386
@end example

@item Download the binary x86 Wine install
(@file{qemu-XXX-i386-wine.tar.gz} on the QEMU web page).

@item Configure Wine on your account. Look at the provided script
@file{/usr/local/qemu-i386/bin/wine-conf.sh}. Your previous
@code{$@{HOME@}/.wine} directory is saved to @code{$@{HOME@}/.wine.org}.

@item Then you can try the example @file{putty.exe}:

@example
qemu /usr/local/qemu-i386/wine/bin/wine /usr/local/qemu-i386/wine/c/Program\ Files/putty.exe
@end example

@end itemize

    
@section Command line options

@example
usage: qemu [-h] [-d] [-L path] [-s size] program [arguments...]
@end example

@table @option
@item -h
Print the help
@item -L path
Set the x86 ELF interpreter prefix (default=/usr/local/qemu-i386)
@item -s size
Set the x86 stack size in bytes (default=524288)
@end table

Debug options:

@table @option
@item -d
Activate log (logfile=/tmp/qemu.log)
@item -p pagesize
Act as if the host page size was 'pagesize' bytes
@end table

    
@chapter QEMU System emulator invocation

@section Quick Start

This section explains how to launch a Linux kernel inside QEMU.

@enumerate
@item
Download the archive @file{vl-test-xxx.tar.gz} containing a Linux
kernel and a disk image. The archive also contains a precompiled
version of @file{vl}, the QEMU System emulator.

@item Optional: If you want network support (for example to launch X11 examples), you
must copy the script @file{vl-ifup} to @file{/etc} and configure
@code{sudo} properly so that the command @code{ifconfig} contained in
@file{vl-ifup} can be executed as root. You must verify that your host
kernel supports the TUN/TAP network interfaces: the device
@file{/dev/net/tun} must be present.

When the network is enabled, there is a virtual network connection between
the host kernel and the emulated kernel. The emulated kernel is seen
from the host kernel at IP address 172.20.0.2 and the host kernel is
seen from the emulated kernel at IP address 172.20.0.1.

@item Launch @code{vl.sh}. You should have the following output:

    
@example
> ./vl.sh
connected to host network interface: tun0
Uncompressing Linux... Ok, booting the kernel.
Linux version 2.4.20 (fabrice@@localhost.localdomain) (gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)) #22 lun jui 7 13:37:41 CEST 2003
BIOS-provided physical RAM map:
 BIOS-e801: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e801: 0000000000100000 - 0000000002000000 (usable)
32MB LOWMEM available.
On node 0 totalpages: 8192
zone(0): 4096 pages.
zone(1): 4096 pages.
zone(2): 0 pages.
Kernel command line: root=/dev/hda ide1=noprobe ide2=noprobe ide3=noprobe ide4=noprobe ide5=noprobe
ide_setup: ide1=noprobe
ide_setup: ide2=noprobe
ide_setup: ide3=noprobe
ide_setup: ide4=noprobe
ide_setup: ide5=noprobe
Initializing CPU#0
Detected 501.285 MHz processor.
Calibrating delay loop... 989.59 BogoMIPS
Memory: 29268k/32768k available (907k kernel code, 3112k reserved, 212k data, 52k init, 0k highmem)
Dentry cache hash table entries: 4096 (order: 3, 32768 bytes)
Inode cache hash table entries: 2048 (order: 2, 16384 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer-cache hash table entries: 1024 (order: 0, 4096 bytes)
Page-cache hash table entries: 8192 (order: 3, 32768 bytes)
CPU: Intel Pentium Pro stepping 03
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS not found.
Starting kswapd
Journalled Block Device driver loaded
pty: 256 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with no serial options enabled
ttyS00 at 0x03f8 (irq = 4) is a 16450
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx
hda: QEMU HARDDISK, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 12288 sectors (6 MB) w/256KiB Cache, CHS=12/16/63
Partition check:
 hda: unknown partition table
ne.c:v1.10 9/23/94 Donald Becker (becker@@scyld.com)
Last modified Nov 1, 2000 by Paul Gortmaker
NE*000 ethercard probe at 0x300: 52 54 00 12 34 56
eth0: NE2000 found at 0x300, using IRQ 9.
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP: Hash tables configured (established 2048 bind 4096)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 52k freed
sh: can't access tty; job control turned off
#
@end example

    
@item
Then you can play with the kernel inside the virtual serial console. You
can launch @code{ls} for example. Type @key{Ctrl-a h} to get help
about the keys you can type inside the virtual serial console. In
particular, use @key{Ctrl-a x} to exit QEMU and use @key{Ctrl-a b} as
the Magic SysRq key.

@item
If the network is enabled, launch the script @file{/etc/linuxrc} in the
emulator (don't forget the leading dot):
@example
. /etc/linuxrc
@end example

Then enable X11 connections on your PC from the emulated Linux:
@example
xhost +172.20.0.2
@end example

You can now launch @file{xterm} or @file{xlogo} and verify that you have
a real Virtual Linux system!

@end enumerate

    
NOTES:
@enumerate
@item
A 2.5.74 kernel is also included in the vl-test archive. Just
replace the bzImage in vl.sh to try it.

@item
vl creates a temporary file in @var{$VLTMPDIR} (@file{/tmp} is the
default) containing all the simulated PC memory. If possible, try to use
a temporary directory using the tmpfs filesystem to avoid too many
unnecessary disk accesses.

@item
In order to exit vl cleanly, you can do a @emph{shutdown} inside
vl. vl will automatically exit when the Linux shutdown is done.

@item
You can boot slightly faster by disabling the probe of non-present IDE
interfaces. To do so, add the following options on the kernel command
line:
@example
ide1=noprobe ide2=noprobe ide3=noprobe ide4=noprobe ide5=noprobe
@end example

@item
The example disk image is a modified version of the one made by Kevin
Lawton for the plex86 Project (@url{www.plex86.org}).

@end enumerate

    
@section Invocation

@example
usage: vl [options] bzImage [kernel parameters...]
@end example

@file{bzImage} is a Linux kernel image.

General options:
@table @option
@item -hda file
@itemx -hdb file
Use 'file' as hard disk 0 or 1 image (@pxref{disk_images}).

@item -snapshot

Write to temporary files instead of disk image files. In this case,
the raw disk image you use is not written back. You can however force
the write back by pressing @key{C-a s} (@pxref{disk_images}).

@item -m megs
Set virtual RAM size to @var{megs} megabytes.

@item -n script
Set network init script [default=/etc/vl-ifup]. This script is
launched to configure the host network interface (usually tun0)
corresponding to the virtual NE2000 card.

@item -initrd file
Use 'file' as initial ram disk.
@end table

Debug options:
@table @option
@item -s
Wait for a gdb connection on port 1234.
@item -p port
Change gdb connection port.
@item -d
Output log in /tmp/vl.log
@end table

During emulation, use @key{C-a h} to get terminal commands:

@table @key
@item C-a h
Print this help
@item C-a x
Exit emulator
@item C-a s
Save disk data back to file (if -snapshot)
@item C-a b
Send break (magic sysrq)
@item C-a C-a
Send C-a
@end table

    
@node disk_images
@section Disk Images

@subsection Raw disk images

The disk images can simply be raw images of the hard disk. You can
create them with the command:
@example
dd if=/dev/zero of=myimage bs=1024 count=mysize
@end example
where @var{myimage} is the image filename and @var{mysize} is its size
in kilobytes.

@subsection Snapshot mode

If you use the option @option{-snapshot}, all disk images are
considered as read only. When sectors are written, they are written to
a temporary file created in @file{/tmp}. You can however force the
write back to the raw disk images by pressing @key{C-a s}.

NOTE: The snapshot mode only works with raw disk images.

    
@subsection Copy On Write disk images

QEMU also supports user mode Linux
(@url{http://user-mode-linux.sourceforge.net/}) Copy On Write (COW)
disk images. The COW disk images are much smaller than normal images
as they store only modified sectors. They also permit the use of the
same disk image template for many users.
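
The copy-on-write idea can be sketched in a few lines of Python. This is
only a minimal in-memory model: the class @code{CowDisk}, its methods and
the 512-byte sector size are assumptions made for the illustration, and it
does not reproduce the on-disk format used by user mode Linux or
@code{vlmkcow}.

```python
SECTOR_SIZE = 512  # assumed sector size for this sketch


class CowDisk:
    """Copy-on-write view over a read-only base image (illustrative only)."""

    def __init__(self, base):
        self.base = bytes(base)   # shared template image, never written to
        self.modified = dict()    # sector number -> private copy of the sector

    def read_sector(self, n):
        # Modified sectors come from the overlay, all others from the base.
        if n in self.modified:
            return self.modified[n]
        start = n * SECTOR_SIZE
        return self.base[start:start + SECTOR_SIZE]

    def write_sector(self, n, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("a sector write must be exactly SECTOR_SIZE bytes")
        self.modified[n] = bytes(data)


base = bytes(4 * SECTOR_SIZE)     # a 4-sector template full of zeros
alice = CowDisk(base)             # two users share the same template
bob = CowDisk(base)
alice.write_sector(2, b"\x01" * SECTOR_SIZE)
print(len(alice.modified), len(bob.modified))
```

Each user only pays for the sectors it has modified, which is why one
template image can serve many users.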

    
To create a COW disk image, use the command:

@example
vlmkcow -f myrawimage.bin mycowimage.cow
@end example

@file{myrawimage.bin} is a raw image you want to use as original disk
image. It will never be written to.

@file{mycowimage.cow} is the COW disk image which is created by
@code{vlmkcow}. You can use it directly with the @option{-hdx}
options. You must not modify the original raw disk image if you use
COW images, as COW images only store the modified sectors from the raw
disk image. QEMU stores the original raw disk image name and its
modification time in the COW disk image so that chances of mistakes are
reduced.

If the raw disk image is not read-only, by pressing @key{C-a s} you
can flush the COW disk image back into the raw disk image, as in
snapshot mode.

COW disk images can also be created without a corresponding raw disk
image. It is useful to have a big initial virtual disk image without
using much disk space. Use:

@example
vlmkcow mycowimage.cow 1024
@end example

to create a 1 gigabyte empty COW disk image.

NOTES:
@enumerate
@item
COW disk images must be created on file systems supporting
@emph{holes} such as ext2 or ext3.
@item
Since holes are used, the displayed size of the COW disk image is not
the real one. To know it, use the @code{ls -ls} command.
@end enumerate
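
The difference between the displayed size and the real size of a file with
holes can also be checked from a program. The sketch below is a generic
illustration and is not related to @code{vlmkcow}; the helper names
@code{make_sparse} and @code{sizes} are invented here, and it assumes a
Unix host where @code{st_blocks} is counted in 512-byte units.

```python
import os
import tempfile


def make_sparse(path, size_bytes):
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def sizes(path):
    """Return (apparent size, bytes actually allocated) for path."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "empty.img")
    make_sparse(image, 1024 * 1024)
    apparent, allocated = sizes(image)
    print(apparent, allocated)
```

On a file system supporting holes, the second number is expected to be much
smaller than the first one; on other file systems the two may be close.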

    
@section Linux Kernel Compilation

You should be able to use any kernel with QEMU provided you make the
following changes (only 2.4.x and 2.5.x were tested):

@enumerate
@item
The kernel must be mapped at 0x90000000 (the default is
0xc0000000). You must modify only two lines in the kernel source:

In @file{include/asm/page.h}, replace
@example
#define __PAGE_OFFSET           (0xc0000000)
@end example
by
@example
#define __PAGE_OFFSET           (0x90000000)
@end example

And in @file{arch/i386/vmlinux.lds}, replace
@example
  . = 0xc0000000 + 0x100000;
@end example
by
@example
  . = 0x90000000 + 0x100000;
@end example

@item
If you want to enable SMP (Symmetric Multi-Processing) support, you
must make the following change in @file{include/asm/fixmap.h}. Replace
@example
#define FIXADDR_TOP	(0xffffX000UL)
@end example
by
@example
#define FIXADDR_TOP	(0xa7ffX000UL)
@end example
(X is 'e' or 'f' depending on the kernel version). Although you can
use an SMP kernel with QEMU, it only supports one CPU.

@item
If your host kernel is not a 2.5 kernel but your target kernel is,
you must also ensure that the 'HZ' define is set to 100
(1000 is the default) as QEMU cannot currently emulate timers at
frequencies greater than 100 Hz on host Linux systems < 2.5. In
@file{include/asm/param.h}, replace:

@example
# define HZ		1000		/* Internal kernel timer frequency */
@end example
by
@example
# define HZ		100		/* Internal kernel timer frequency */
@end example

@end enumerate

The file config-2.x.x gives the configuration of the example kernels.

Just type
@example
make bzImage
@end example
as you would do to make a real kernel. Then you can use with QEMU
exactly the same kernel as you would boot on your PC (in
@file{arch/i386/boot/bzImage}).

    
@section PC Emulation

QEMU emulates the following PC peripherals:

@itemize
@item
PIC (interrupt controller)
@item
PIT (timers)
@item
CMOS memory
@item
Dumb VGA (to print the @code{Uncompressing Linux} message)
@item
Serial port (port=0x3f8, irq=4)
@item
NE2000 network adapter (port=0x300, irq=9)
@item
IDE disk interface (port=0x1f0, irq=14)
@end itemize

@section GDB usage

QEMU has primitive support for gdb, so that you can press
'Ctrl-C' while the kernel is running and inspect its state.

In order to use gdb, launch vl with the '-s' option. It will wait for a
gdb connection:
@example
> vl -s arch/i386/boot/bzImage initrd-2.4.20.img root=/dev/ram0 ramdisk_size=6144
Connected to host network interface: tun0
Waiting gdb connection on port 1234
@end example

Then launch gdb on the 'vmlinux' executable:
@example
> gdb vmlinux
@end example

In gdb, connect to QEMU:
@example
(gdb) target remote localhost:1234
@end example

Then you can use gdb normally. For example, type 'c' to launch the kernel:
@example
(gdb) c
@end example

WARNING: breakpoints and single stepping are not yet supported.

    
@chapter QEMU Internals

@section QEMU compared to other emulators

Like bochs [3], QEMU emulates an x86 CPU. But QEMU is much faster than
bochs as it uses dynamic compilation and because it uses the host MMU to
simulate the x86 MMU. The downside is that currently the emulation is
not as accurate as bochs (for example, you cannot currently run Windows
inside QEMU).

Like Valgrind [2], QEMU does user space emulation and dynamic
translation. Valgrind is mainly a memory debugger while QEMU has no
support for memory debugging (QEMU could be used to detect out of bound
memory accesses as Valgrind does, but it has no support to track
uninitialised data as Valgrind does). The Valgrind dynamic translator
generates better code than QEMU (in particular it does register
allocation) but it is closely tied to an x86 host and target and has no
support for precise exceptions and system emulation.

EM86 [4] is the closest project to user space QEMU (and QEMU still uses
some of its code, in particular the ELF file loader). EM86 was limited
to an alpha host and used a proprietary and slow interpreter (the
interpreter part of the FX!32 Digital Win32 code translator [5]).

TWIN [6] is a Windows API emulator like Wine. It is less accurate than
Wine but includes a protected mode x86 interpreter to launch x86 Windows
executables. Such an approach has greater potential because most of the
Windows API is executed natively but it is far more difficult to develop
because all the data structures and function parameters exchanged
between the API and the x86 code must be converted.

User mode Linux [7] was the only solution before QEMU to launch a Linux
kernel as a process while not needing any host kernel patches. However,
user mode Linux requires heavy kernel patches while QEMU accepts
unpatched Linux kernels. It would be interesting to compare the
performance of the two approaches.

The new Plex86 [8] PC virtualizer is done in the same spirit as the QEMU
system emulator. It requires a patched Linux kernel to work (you cannot
launch the same kernel on your PC), but the patches are really small. As
it is a PC virtualizer (no emulation is done except for some privileged
instructions), it has the potential of being faster than QEMU. The
downside is that a complicated (and potentially unsafe) host kernel
patch is needed.

    
@section Portable dynamic translation

QEMU is a dynamic translator. When it first encounters a piece of code,
it converts it to the host instruction set. Usually dynamic translators
are very complicated and highly CPU dependent. QEMU uses some tricks
which make it relatively easily portable and simple while achieving good
performance.

The basic idea is to split every x86 instruction into a few simpler
instructions. Each simple instruction is implemented by a piece of C
code (see @file{op-i386.c}). Then a compile time tool (@file{dyngen})
takes the corresponding object file (@file{op-i386.o}) to generate a
dynamic code generator which concatenates the simple instructions to
build a function (see @file{op-i386.h:dyngen_code()}).
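
The splitting and concatenation steps can be modelled in Python as
follows. This is a loose sketch and not the code produced by
@file{dyngen}: the micro-operation names, the temporaries @code{T0} and
@code{T1}, and the helpers @code{split_add} and @code{build_block} are all
invented for the illustration, and Python callables stand in for the
pieces of compiled C code.

```python
# Hypothetical micro-operations; each one plays the role of a small piece
# of C code. They all take the CPU state and one parameter.
def op_load_T0(cpu, reg):
    cpu["T0"] = cpu[reg]


def op_load_T1(cpu, reg):
    cpu["T1"] = cpu[reg]


def op_add_T0_T1(cpu, _unused):
    cpu["T0"] = (cpu["T0"] + cpu["T1"]) & 0xFFFFFFFF


def op_store_T0(cpu, reg):
    cpu[reg] = cpu["T0"]


def split_add(dst, src):
    """Split the target instruction 'add dst, src' into micro-operations."""
    return [(op_load_T0, dst), (op_load_T1, src),
            (op_add_T0_T1, None), (op_store_T0, dst)]


def build_block(instructions):
    """Concatenate the micro-operations of a block into one callable."""
    ops = [op for dst, src in instructions for op in split_add(dst, src)]

    def run(cpu):
        for func, param in ops:
            func(cpu, param)
    return run


cpu = dict(eax=0xFFFFFFFF, ebx=2, T0=0, T1=0)
block = build_block([("eax", "ebx"), ("eax", "ebx")])  # two 'add eax, ebx'
block(cpu)
print(hex(cpu["eax"]))
```

The real generator copies machine code instead of keeping a list of
callables, so no dispatch loop remains at run time.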

    
In essence, the process is similar to [1], but more work is done at
compile time.

A key idea to get optimal performance is that constant parameters can
be passed to the simple operations. For that purpose, dummy ELF
relocations are generated with gcc for each constant parameter. Then,
the tool (@file{dyngen}) can locate the relocations and generate the
appropriate C code to resolve them when building the dynamic code.
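
As a rough model of this mechanism, a code template can be seen as a byte
string with a hole at a known offset, and resolving the relocation amounts
to copying the template and writing the constant into the hole. The sketch
below assumes a 32-bit little-endian hole; @code{PLACEHOLDER},
@code{make_template} and @code{emit} are names invented here, not
@file{dyngen} interfaces.

```python
import struct

PLACEHOLDER = 0xDEADBEEF  # invented marker standing in for a dummy relocation


def make_template(prefix, suffix):
    """Return (code, offset): bytes with a 32-bit hole and where the hole is."""
    code = prefix + struct.pack("<I", PLACEHOLDER) + suffix
    return code, len(prefix)


def emit(template, offset, constant):
    """Copy a template and resolve its relocation with a real constant."""
    patched = bytearray(template)
    struct.pack_into("<I", patched, offset, constant & 0xFFFFFFFF)
    return bytes(patched)


# 0xB8 is the x86 opcode for 'mov eax, imm32' and 0xC3 is 'ret'.
template, hole = make_template(b"\xb8", b"\xc3")
code = emit(template, hole, 0x1234)
print(code.hex())
```

The template itself is left untouched, so it can be reused for every
occurrence of the operation with a different constant.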

    
That way, QEMU is no more difficult to port than a dynamic linker.

To go even faster, GCC static register variables are used to keep the
state of the virtual CPU.

@section Register allocation

Since QEMU uses fixed simple instructions, no efficient register
allocation can be done. However, because RISC CPUs have a lot of
registers, most of the virtual CPU state can be put in registers without
doing complicated register allocation.

@section Condition code optimisations

Good CPU condition codes emulation (@code{EFLAGS} register on x86) is a
critical point to get good performance. QEMU uses lazy condition code
evaluation: instead of computing the condition codes after each x86
instruction, it just stores one operand (called @code{CC_SRC}), the
result (called @code{CC_DST}) and the type of operation (called
@code{CC_OP}).

@code{CC_OP} is almost never explicitly set in the generated code
because it is known at translation time.
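
A minimal Python model of lazy condition code evaluation is given below.
It only covers 32-bit addition and subtraction and the zero and carry
flags. The class @code{LazyFlags} is invented for the illustration, and
the choice of storing the second operand in @code{cc_src} is an assumption
of this sketch.

```python
CC_OP_ADD = "add"
CC_OP_SUB = "sub"
MASK = 0xFFFFFFFF


class LazyFlags:
    """Record (operation, operand, result) and derive flags on demand."""

    def __init__(self):
        self.cc_op = None
        self.cc_src = 0
        self.cc_dst = 0

    def add(self, a, b):
        result = (a + b) & MASK
        self.cc_op, self.cc_src, self.cc_dst = CC_OP_ADD, b, result
        return result

    def sub(self, a, b):
        result = (a - b) & MASK
        self.cc_op, self.cc_src, self.cc_dst = CC_OP_SUB, b, result
        return result

    def zero_flag(self):
        return self.cc_dst == 0

    def carry_flag(self):
        if self.cc_op == CC_OP_ADD:
            # Unsigned overflow: the result wrapped below the stored operand.
            return self.cc_dst < self.cc_src
        if self.cc_op == CC_OP_SUB:
            # Borrow: the first operand (result + operand) was the smaller one.
            first = (self.cc_dst + self.cc_src) & MASK
            return first < self.cc_src
        raise ValueError("no flag-setting operation recorded")


flags = LazyFlags()
wrapped = flags.add(0xFFFFFFFF, 1)
print(wrapped, flags.carry_flag(), flags.zero_flag())
```

The arithmetic operations only do three stores; the cost of computing a
flag is paid only by the code that actually reads it.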

    
In order to increase performance, a backward pass is performed on the
generated simple instructions (see
@code{translate-i386.c:optimize_flags()}). When it can be proved that
the condition codes are not needed by the next instructions, no
condition codes are computed at all.
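
Such a backward pass is a small liveness analysis. The following Python
sketch shows the principle on a simplified representation where each
operation is a tuple (name, sets flags, reads flags). It is not the actual
@code{optimize_flags()} code, and it assumes that an operation which sets
the flags overwrites all of them.

```python
def optimize_flags(ops, live_at_end=True):
    """Drop flag computation wherever the flags are provably dead.

    ops is a list of (name, sets_flags, reads_flags) tuples. The result
    has the same shape, with sets_flags cleared where it is useless.
    """
    live = live_at_end          # flags may be needed after the block
    out = []
    for name, sets_flags, reads_flags in reversed(ops):
        out.append((name, sets_flags and live, reads_flags))
        if sets_flags:
            live = False        # earlier flag values are overwritten here
        if reads_flags:
            live = True         # this operation needs its predecessors' flags
    out.reverse()
    return out


block = [("add", True, False), ("sub", True, False),
         ("jz", False, True), ("mov", False, False)]
print(optimize_flags(block))
```

In this example the flags produced by @code{add} are overwritten by
@code{sub} before anything reads them, so only @code{sub} keeps its flag
computation.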

    
@section CPU state optimisations

The x86 CPU has many internal states which change the way it evaluates
instructions. In order to achieve a good speed, the translation phase
considers that some state information of the virtual x86 CPU cannot
change in it. For example, if the SS, DS and ES segments have a zero
base, then the translator does not even generate an addition for the
segment base.

[The FPU stack pointer register is not handled that way yet].

@section Translation cache

A 2MByte cache holds the most recently used translations. For
simplicity, it is completely flushed when it is full. A translation unit
contains just a single basic block (a block of x86 instructions
terminated by a jump or by a virtual CPU state change which the
translator cannot deduce statically).
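
The flush-when-full policy can be sketched as below. The class
@code{TranslationCache} and its @code{get} method are hypothetical names
chosen for this example, and a tiny capacity is used so that the flush is
easy to observe.

```python
class TranslationCache:
    """Cache of translated blocks that is emptied entirely when full."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.blocks = dict()    # simulated PC -> translated code
        self.flushes = 0

    def flush(self):
        self.blocks.clear()
        self.used = 0
        self.flushes += 1

    def get(self, pc, translate):
        """Return the translation for pc, translating it on a miss."""
        code = self.blocks.get(pc)
        if code is None:
            code = translate(pc)
            if len(code) > self.capacity:
                raise ValueError("block larger than the whole cache")
            if self.used + len(code) > self.capacity:
                self.flush()    # no partial eviction: drop everything
            self.blocks[pc] = code
            self.used += len(code)
        return code


calls = []


def translate(pc):
    calls.append(pc)
    return bytes(4)             # pretend every block translates to 4 bytes


cache = TranslationCache(capacity_bytes=8)
for pc in (0, 4, 0, 8, 0):
    cache.get(pc, translate)
print(calls, cache.flushes)
```

Flushing everything costs some retranslation (block 0 is translated twice
here) but avoids any bookkeeping about which block is the oldest.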

    
@section Direct block chaining

After each translated basic block is executed, QEMU uses the simulated
Program Counter (PC) and other CPU state information (such as the CS
segment base value) to find the next basic block.

In order to accelerate the most common cases where the new simulated PC
is known, QEMU can patch a basic block so that it jumps directly to the
next one.

The most portable code uses an indirect jump. An indirect jump makes it
easier to make the jump target modification atomic. On some
architectures (such as PowerPC), the @code{JUMP} opcode is directly
patched so that the block chaining has no overhead.
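
The effect of chaining can be illustrated with a toy model in which the
patched jump is an attribute holding a direct reference to the successor
block. The names @code{Block}, @code{chained} and @code{run} are invented
for this sketch, and every block is assumed to have a single, statically
known successor.

```python
class Block:
    def __init__(self, pc, next_pc):
        self.pc = pc
        self.next_pc = next_pc  # statically known successor PC
        self.chained = None     # direct link, patched in on first use


def run(blocks, start_pc, steps):
    """Execute `steps` blocks and return how many PC lookups were needed."""
    lookups = 1
    block = blocks[start_pc]
    for _ in range(steps - 1):
        if block.chained is None:
            # Slow path: find the successor by its PC, then patch the link.
            block.chained = blocks[block.next_pc]
            lookups += 1
        block = block.chained   # fast path: follow the patched jump
    return lookups


blocks = dict((pc, Block(pc, nxt)) for pc, nxt in [(0, 4), (4, 8), (8, 0)])
first = run(blocks, 0, 10)
second = run(blocks, 0, 10)
print(first, second)
```

Once the three blocks of the loop are chained, further iterations no
longer go through the lookup at all.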

    
@section Self-modifying code and translated code invalidation

Self-modifying code is a special challenge in x86 emulation because no
instruction cache invalidation is signaled by the application when code
is modified.

When translated code is generated for a basic block, the corresponding
host page is write protected if it is not already read-only (with the
system call @code{mprotect()}). Then, if a write access is done to the
page, Linux raises a SEGV signal. QEMU then invalidates all the
translated code in the page and enables write accesses to the page.

Correct translated code invalidation is done efficiently by maintaining
a linked list of every translated block contained in a given page. Other
linked lists are also maintained to undo direct block chaining.
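
The bookkeeping behind this scheme can be sketched without touching real
page protections. In the Python model below, a set of page numbers stands
in for @code{mprotect()}, a method call stands in for the SEGV handler,
and plain lists replace the linked lists. The class @code{CodePages} and
the 4096-byte page size are assumptions of the example.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class CodePages:
    def __init__(self):
        self.blocks_by_page = dict()  # page number -> list of block addresses
        self.protected = set()        # pages currently write protected

    def add_block(self, start, length):
        """Register a translated block and protect the pages it covers."""
        first = start // PAGE_SIZE
        last = (start + length - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.blocks_by_page.setdefault(page, []).append(start)
            self.protected.add(page)   # stands in for mprotect()

    def write(self, address):
        """Model a write access; return the blocks that were invalidated."""
        page = address // PAGE_SIZE
        if page not in self.protected:
            return []                  # ordinary write, nothing to do
        self.protected.discard(page)   # stands in for re-enabling writes
        dead = self.blocks_by_page.pop(page, [])
        # A block spanning two pages must disappear from the other page too.
        for blocks in self.blocks_by_page.values():
            blocks[:] = [b for b in blocks if b not in dead]
        return dead


pages = CodePages()
pages.add_block(0x1000, 16)
pages.add_block(0x1800, 16)
pages.add_block(0x3000, 8)
print(pages.write(0x1004), pages.write(0x1004))
```

Only the first write to a page pays the invalidation cost; later writes
find the page unprotected and proceed normally.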

    
Although the overhead of doing @code{mprotect()} calls is significant,
most MSDOS programs can be emulated at reasonable speed with QEMU and
DOSEMU.

Note that QEMU also invalidates pages of translated code when it detects
that memory mappings are modified with @code{mmap()} or @code{munmap()}.

@section Exception support

longjmp() is used when an exception such as division by zero is
encountered.

The host SIGSEGV and SIGBUS signal handlers are used to get invalid
memory accesses. The exact CPU state can be retrieved because all the
x86 registers are stored in fixed host registers. The simulated program
counter is found by retranslating the corresponding basic block and by
looking where the host program counter was at the exception point.

The virtual CPU cannot retrieve the exact @code{EFLAGS} register because
in some cases it is not computed because of condition code
optimisations. It is not a big concern because the emulated code can
still be restarted in any case.

@section Linux system call translation

QEMU includes a generic system call translator for Linux. It means that
the parameters of the system calls can be converted to fix the
endianness and 32/64 bit issues. The IOCTLs are converted with a generic
type description system (see @file{ioctls.h} and @file{thunk.c}).
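
The principle of a generic type description can be shown with the Python
@code{struct} module: a structure is described once as a list of fields,
and the same description drives the conversion in both directions. This is
only an analogy and not the mechanism of @file{thunk.c}; the description
@code{TIMEVAL_32} and the two helper functions are invented for the
example, which assumes a little-endian 32-bit target.

```python
import struct

# Hypothetical type description: (field name, struct format character).
TIMEVAL_32 = [("tv_sec", "i"), ("tv_usec", "i")]


def target_to_host(description, raw, target_order="<"):
    """Decode target memory into host values, following the description."""
    fmt = target_order + "".join(code for _, code in description)
    values = struct.unpack(fmt, raw)
    return dict(zip((name for name, _ in description), values))


def host_to_target(description, fields, target_order="<"):
    """Encode host values into the byte layout expected by the target."""
    fmt = target_order + "".join(code for _, code in description)
    return struct.pack(fmt, *(fields[name] for name, _ in description))


raw = bytes.fromhex("0100000040420f00")  # as laid out by a little-endian target
fields = target_to_host(TIMEVAL_32, raw)
print(fields["tv_sec"], fields["tv_usec"])
```

Passing @code{">"} as @code{target_order} produces the big-endian layout
of the same structure from the same description.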

    
QEMU supports host CPUs which have pages bigger than 4KB. It records all
the mappings the process does and tries to emulate the @code{mmap()}
system calls in cases where the host @code{mmap()} call would fail
because of bad page alignment.
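
The alignment problem is simple arithmetic, shown here for a host with
8KB pages and a target with 4KB pages. The two functions are illustrative
helpers written for this example and are not part of QEMU.

```python
TARGET_PAGE_SIZE = 4096


def host_mmap_would_work(address, host_page_size):
    """A fixed-address mapping needs a host-page-aligned address."""
    return address % host_page_size == 0


def covering_host_range(address, length, host_page_size):
    """Smallest host-page-aligned range [start, end) containing the request."""
    start = address - address % host_page_size
    end = address + length
    end += -end % host_page_size    # round up to the next host page boundary
    return start, end


# A target maps one 4KB page at 0x1000 on a host with 8KB pages.
print(host_mmap_would_work(0x1000, 8192))
print(covering_host_range(0x1000, TARGET_PAGE_SIZE, 8192))
```

The request is valid for the target but not for the host, so the emulator
has to work on the enclosing host page and remember which parts of it the
target really mapped.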

    
@section Linux signals

Normal and real-time signals are queued along with their information
(@code{siginfo_t}) as it is done in the Linux kernel. Then an interrupt
request is done to the virtual CPU. When it is interrupted, one queued
signal is handled by generating a stack frame in the virtual CPU as the
Linux kernel does. The @code{sigreturn()} system call is emulated to return
from the virtual signal handler.

Some signals (such as SIGALRM) directly come from the host. Other
signals are synthesized from the virtual CPU exceptions such as SIGFPE
when a division by zero is done (see @code{main.c:cpu_loop()}).

The blocked signal mask is still handled by the host Linux kernel so
that most signal system calls can be redirected directly to the host
Linux kernel. Only the @code{sigaction()} and @code{sigreturn()} system
calls need to be fully emulated (see @file{signal.c}).

@section clone() system call and threads

The Linux clone() system call is usually used to create a thread. QEMU
uses the host clone() system call so that real host threads are created
for each emulated thread. One virtual CPU instance is created for each
thread.

The virtual x86 CPU atomic operations are emulated with a global lock so
that their semantics are preserved.

Note that currently there are still some locking issues in QEMU. In
particular, the translated cache flush is not protected yet against
reentrancy.

@section Self-virtualization

QEMU was conceived so that ultimately it can emulate itself. Although
it is not very useful, it is an important test to show the power of the
emulator.

Achieving self-virtualization is not easy because there may be address
space conflicts. QEMU solves this problem by being an executable ELF
shared object like the ld-linux.so ELF interpreter. That way, it can be
relocated at load time.

@section MMU emulation

For system emulation, QEMU uses the mmap() system call to emulate the
target CPU MMU. It works as long as the emulated OS does not use an area
reserved by the host OS (such as the area above 0xc0000000 on x86
Linux).

It is planned to add a slower but more precise MMU emulation
with a software MMU.

    
@section Bibliography

@table @asis

@item [1]
@url{http://citeseer.nj.nec.com/piumarta98optimizing.html}, Optimizing
direct threaded code by selective inlining (1998) by Ian Piumarta, Fabio
Riccardi.

@item [2]
@url{http://developer.kde.org/~sewardj/}, Valgrind, an open-source
memory debugger for x86-GNU/Linux, by Julian Seward.

@item [3]
@url{http://bochs.sourceforge.net/}, the Bochs IA-32 Emulator Project,
by Kevin Lawton et al.

@item [4]
@url{http://www.cs.rose-hulman.edu/~donaldlf/em86/index.html}, the EM86
x86 emulator on Alpha-Linux.

@item [5]
@url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/full_papers/chernoff/chernoff.pdf},
DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT, by Anton
Chernoff and Ray Hookway.

@item [6]
@url{http://www.willows.com/}, Windows API library emulation from
Willows Software.

@item [7]
@url{http://user-mode-linux.sourceforge.net/},
The User-mode Linux Kernel.

@item [8]
@url{http://www.plex86.org/},
The new Plex86 project.

@end table

    
@chapter Regression Tests

In the directory @file{tests/}, various interesting testing programs
are available. They are used for regression testing.

@section @file{hello-i386}

Very simple statically linked x86 program, just to test QEMU during a
port to a new host CPU.

@section @file{hello-arm}

Very simple statically linked ARM program, just to test QEMU during a
port to a new host CPU.

@section @file{test-i386}

This program executes most of the 16 bit and 32 bit x86 instructions and
generates a text output. It can be compared with the output obtained with
a real CPU or another emulator. The target @code{make test} runs this
program and a @code{diff} on the generated output.

The Linux system call @code{modify_ldt()} is used to create x86 selectors
to test some 16 bit addressing and 32 bit with segmentation cases.

The Linux system call @code{vm86()} is used to test vm86 emulation.

Various exceptions are raised to test most of the x86 user space
exception reporting.

@section @file{sha1}

It is a simple benchmark. Care must be taken to interpret the results
because it mostly tests the ability of the virtual CPU to optimize the
@code{rol} x86 instruction and the condition code computations.