Statistics
| Branch: | Revision:

root / tcg / README @ 5ff9d6a4

History | View | Annotate | Download (11.7 kB)

1 c896fe29 bellard
Tiny Code Generator - Fabrice Bellard.
2 c896fe29 bellard
3 c896fe29 bellard
1) Introduction
4 c896fe29 bellard
5 c896fe29 bellard
TCG (Tiny Code Generator) began as a generic backend for a C
6 c896fe29 bellard
compiler. It was simplified to be used in QEMU. It also has its roots
7 c896fe29 bellard
in the QOP code generator written by Paul Brook. 
8 c896fe29 bellard
9 c896fe29 bellard
2) Definitions
10 c896fe29 bellard
11 c896fe29 bellard
The TCG "target" is the architecture for which we generate the
12 c896fe29 bellard
code. It is of course not the same as the "target" of QEMU which is
13 c896fe29 bellard
the emulated architecture. As TCG started as a generic C backend used
14 c896fe29 bellard
for cross compiling, it is assumed that the TCG target is different
15 c896fe29 bellard
from the host, although it is never the case for QEMU.
16 c896fe29 bellard
17 c896fe29 bellard
A TCG "function" corresponds to a QEMU Translated Block (TB).
18 c896fe29 bellard
19 c896fe29 bellard
A TCG "temporary" is a variable only live in a given
20 9804c8e2 bellard
function. Temporaries are allocated explicitly in each function.
21 c896fe29 bellard
22 c896fe29 bellard
A TCG "global" is a variable which is live in all the functions. They
23 c896fe29 bellard
are defined before the functions defined. A TCG global can be a memory
24 c896fe29 bellard
location (e.g. a QEMU CPU register), a fixed host register (e.g. the
25 c896fe29 bellard
QEMU CPU state pointer) or a memory location which is stored in a
26 c896fe29 bellard
register outside QEMU TBs (not implemented yet).
27 c896fe29 bellard
28 c896fe29 bellard
A TCG "basic block" corresponds to a list of instructions terminated
29 c896fe29 bellard
by a branch instruction. 
30 c896fe29 bellard
31 c896fe29 bellard
3) Intermediate representation
32 c896fe29 bellard
33 c896fe29 bellard
3.1) Introduction
34 c896fe29 bellard
35 c896fe29 bellard
TCG instructions operate on variables which are temporaries or
36 c896fe29 bellard
globals. TCG instructions and variables are strongly typed. Two types
37 c896fe29 bellard
are supported: 32 bit integers and 64 bit integers. Pointers are
38 c896fe29 bellard
defined as an alias to 32 bit or 64 bit integers depending on the TCG
39 c896fe29 bellard
target word size.
40 c896fe29 bellard
41 c896fe29 bellard
Each instruction has a fixed number of output variable operands, input
42 c896fe29 bellard
variable operands and always constant operands.
43 c896fe29 bellard
44 c896fe29 bellard
The notable exception is the call instruction which has a variable
45 c896fe29 bellard
number of outputs and inputs.
46 c896fe29 bellard
47 c896fe29 bellard
In the textual form, output operands come first, followed by input
48 c896fe29 bellard
operands, followed by constant operands. The output type is included
49 c896fe29 bellard
in the instruction name. Constants are prefixed with a '$'.
50 c896fe29 bellard
51 c896fe29 bellard
add_i32 t0, t1, t2  (t0 <- t1 + t2)
52 c896fe29 bellard
53 c896fe29 bellard
sub_i64 t2, t3, $4  (t2 <- t3 - 4)
54 c896fe29 bellard
55 c896fe29 bellard
3.2) Assumptions
56 c896fe29 bellard
57 c896fe29 bellard
* Basic blocks
58 c896fe29 bellard
59 c896fe29 bellard
- Basic blocks end after branches (e.g. brcond_i32 instruction),
60 c896fe29 bellard
  goto_tb and exit_tb instructions.
61 c896fe29 bellard
- Basic blocks end before legacy dyngen operations.
62 c896fe29 bellard
- Basic blocks start after the end of a previous basic block, at a
63 c896fe29 bellard
  set_label instruction or after a legacy dyngen operation.
64 c896fe29 bellard
65 c896fe29 bellard
After the end of a basic block, temporaries at destroyed and globals
66 c896fe29 bellard
are stored at their initial storage (register or memory place
67 c896fe29 bellard
depending on their declarations).
68 c896fe29 bellard
69 c896fe29 bellard
* Floating point types are not supported yet
70 c896fe29 bellard
71 c896fe29 bellard
* Pointers: depending on the TCG target, pointer size is 32 bit or 64
72 c896fe29 bellard
  bit. The type TCG_TYPE_PTR is an alias to TCG_TYPE_I32 or
73 c896fe29 bellard
  TCG_TYPE_I64.
74 c896fe29 bellard
75 c896fe29 bellard
* Helpers:
76 c896fe29 bellard
77 c896fe29 bellard
Using the tcg_gen_helper_x_y it is possible to call any function
78 c896fe29 bellard
taking i32, i64 or pointer types types. Before calling an helper, all
79 c896fe29 bellard
globals are stored at their canonical location and it is assumed that
80 c896fe29 bellard
the function can modify them. In the future, function modifiers will
81 c896fe29 bellard
be allowed to tell that the helper does not read or write some globals.
82 c896fe29 bellard
83 c896fe29 bellard
On some TCG targets (e.g. x86), several calling conventions are
84 c896fe29 bellard
supported.
85 c896fe29 bellard
86 c896fe29 bellard
* Branches:
87 c896fe29 bellard
88 c896fe29 bellard
Use the instruction 'br' to jump to a label. Use 'jmp' to jump to an
89 c896fe29 bellard
explicit address. Conditional branches can only jump to labels.
90 c896fe29 bellard
91 c896fe29 bellard
3.3) Code Optimizations
92 c896fe29 bellard
93 c896fe29 bellard
When generating instructions, you can count on at least the following
94 c896fe29 bellard
optimizations:
95 c896fe29 bellard
96 c896fe29 bellard
- Single instructions are simplified, e.g.
97 c896fe29 bellard
98 c896fe29 bellard
   and_i32 t0, t0, $0xffffffff
99 c896fe29 bellard
    
100 c896fe29 bellard
  is suppressed.
101 c896fe29 bellard
102 c896fe29 bellard
- A liveness analysis is done at the basic block level. The
103 c896fe29 bellard
  information is used to suppress moves from a dead temporary to
104 c896fe29 bellard
  another one. It is also used to remove instructions which compute
105 c896fe29 bellard
  dead results. The later is especially useful for condition code
106 9804c8e2 bellard
  optimization in QEMU.
107 c896fe29 bellard
108 c896fe29 bellard
  In the following example:
109 c896fe29 bellard
110 c896fe29 bellard
  add_i32 t0, t1, t2
111 c896fe29 bellard
  add_i32 t0, t0, $1
112 c896fe29 bellard
  mov_i32 t0, $1
113 c896fe29 bellard
114 c896fe29 bellard
  only the last instruction is kept.
115 c896fe29 bellard
116 c896fe29 bellard
- A macro system is supported (may get closer to function inlining
117 c896fe29 bellard
  some day). It is useful if the liveness analysis is likely to prove
118 c896fe29 bellard
  that some results of a computation are indeed not useful. With the
119 c896fe29 bellard
  macro system, the user can provide several alternative
120 c896fe29 bellard
  implementations which are used depending on the used results. It is
121 9804c8e2 bellard
  especially useful for condition code optimization in QEMU.
122 c896fe29 bellard
123 c896fe29 bellard
  Here is an example:
124 c896fe29 bellard
125 c896fe29 bellard
  macro_2 t0, t1, $1
126 c896fe29 bellard
  mov_i32 t0, $0x1234
127 c896fe29 bellard
128 c896fe29 bellard
  The macro identified by the ID "$1" normally returns the values t0
129 c896fe29 bellard
  and t1. Suppose its implementation is:
130 c896fe29 bellard
131 c896fe29 bellard
  macro_start
132 c896fe29 bellard
  brcond_i32  t2, $0, $TCG_COND_EQ, $1
133 c896fe29 bellard
  mov_i32 t0, $2
134 c896fe29 bellard
  br $2
135 c896fe29 bellard
  set_label $1
136 c896fe29 bellard
  mov_i32 t0, $3
137 c896fe29 bellard
  set_label $2
138 c896fe29 bellard
  add_i32 t1, t3, t4
139 c896fe29 bellard
  macro_end
140 c896fe29 bellard
  
141 c896fe29 bellard
  If t0 is not used after the macro, the user can provide a simpler
142 c896fe29 bellard
  implementation:
143 c896fe29 bellard
144 c896fe29 bellard
  macro_start
145 c896fe29 bellard
  add_i32 t1, t2, t4
146 c896fe29 bellard
  macro_end
147 c896fe29 bellard
148 c896fe29 bellard
  TCG automatically chooses the right implementation depending on
149 c896fe29 bellard
  which macro outputs are used after it.
150 c896fe29 bellard
151 c896fe29 bellard
  Note that if TCG did more expensive optimizations, macros would be
152 c896fe29 bellard
  less useful. In the previous example a macro is useful because the
153 c896fe29 bellard
  liveness analysis is done on each basic block separately. Hence TCG
154 c896fe29 bellard
  cannot remove the code computing 't0' even if it is not used after
155 c896fe29 bellard
  the first macro implementation.
156 c896fe29 bellard
157 c896fe29 bellard
3.4) Instruction Reference
158 c896fe29 bellard
159 c896fe29 bellard
********* Function call
160 c896fe29 bellard
161 c896fe29 bellard
* call <ret> <params> ptr
162 c896fe29 bellard
163 c896fe29 bellard
call function 'ptr' (pointer type)
164 c896fe29 bellard
165 c896fe29 bellard
<ret> optional 32 bit or 64 bit return value
166 c896fe29 bellard
<params> optional 32 bit or 64 bit parameters
167 c896fe29 bellard
168 c896fe29 bellard
********* Jumps/Labels
169 c896fe29 bellard
170 c896fe29 bellard
* jmp t0
171 c896fe29 bellard
172 c896fe29 bellard
Absolute jump to address t0 (pointer type).
173 c896fe29 bellard
174 c896fe29 bellard
* set_label $label
175 c896fe29 bellard
176 c896fe29 bellard
Define label 'label' at the current program point.
177 c896fe29 bellard
178 c896fe29 bellard
* br $label
179 c896fe29 bellard
180 c896fe29 bellard
Jump to label.
181 c896fe29 bellard
182 c896fe29 bellard
* brcond_i32/i64 cond, t0, t1, label
183 c896fe29 bellard
184 c896fe29 bellard
Conditional jump if t0 cond t1 is true. cond can be:
185 c896fe29 bellard
    TCG_COND_EQ
186 c896fe29 bellard
    TCG_COND_NE
187 c896fe29 bellard
    TCG_COND_LT /* signed */
188 c896fe29 bellard
    TCG_COND_GE /* signed */
189 c896fe29 bellard
    TCG_COND_LE /* signed */
190 c896fe29 bellard
    TCG_COND_GT /* signed */
191 c896fe29 bellard
    TCG_COND_LTU /* unsigned */
192 c896fe29 bellard
    TCG_COND_GEU /* unsigned */
193 c896fe29 bellard
    TCG_COND_LEU /* unsigned */
194 c896fe29 bellard
    TCG_COND_GTU /* unsigned */
195 c896fe29 bellard
196 c896fe29 bellard
********* Arithmetic
197 c896fe29 bellard
198 c896fe29 bellard
* add_i32/i64 t0, t1, t2
199 c896fe29 bellard
200 c896fe29 bellard
t0=t1+t2
201 c896fe29 bellard
202 c896fe29 bellard
* sub_i32/i64 t0, t1, t2
203 c896fe29 bellard
204 c896fe29 bellard
t0=t1-t2
205 c896fe29 bellard
206 c896fe29 bellard
* mul_i32/i64 t0, t1, t2
207 c896fe29 bellard
208 c896fe29 bellard
t0=t1*t2
209 c896fe29 bellard
210 c896fe29 bellard
* div_i32/i64 t0, t1, t2
211 c896fe29 bellard
212 c896fe29 bellard
t0=t1/t2 (signed). Undefined behavior if division by zero or overflow.
213 c896fe29 bellard
214 c896fe29 bellard
* divu_i32/i64 t0, t1, t2
215 c896fe29 bellard
216 c896fe29 bellard
t0=t1/t2 (unsigned). Undefined behavior if division by zero.
217 c896fe29 bellard
218 c896fe29 bellard
* rem_i32/i64 t0, t1, t2
219 c896fe29 bellard
220 c896fe29 bellard
t0=t1%t2 (signed). Undefined behavior if division by zero or overflow.
221 c896fe29 bellard
222 c896fe29 bellard
* remu_i32/i64 t0, t1, t2
223 c896fe29 bellard
224 c896fe29 bellard
t0=t1%t2 (unsigned). Undefined behavior if division by zero.
225 c896fe29 bellard
226 c896fe29 bellard
* and_i32/i64 t0, t1, t2
227 c896fe29 bellard
228 c896fe29 bellard
********* Logical
229 c896fe29 bellard
230 c896fe29 bellard
t0=t1&t2
231 c896fe29 bellard
232 c896fe29 bellard
* or_i32/i64 t0, t1, t2
233 c896fe29 bellard
234 c896fe29 bellard
t0=t1|t2
235 c896fe29 bellard
236 c896fe29 bellard
* xor_i32/i64 t0, t1, t2
237 c896fe29 bellard
238 c896fe29 bellard
t0=t1^t2
239 c896fe29 bellard
240 c896fe29 bellard
* shl_i32/i64 t0, t1, t2
241 c896fe29 bellard
242 c896fe29 bellard
********* Shifts
243 c896fe29 bellard
244 c896fe29 bellard
* shl_i32/i64 t0, t1, t2
245 c896fe29 bellard
246 c896fe29 bellard
t0=t1 << t2. Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
247 c896fe29 bellard
248 c896fe29 bellard
* shr_i32/i64 t0, t1, t2
249 c896fe29 bellard
250 c896fe29 bellard
t0=t1 >> t2 (unsigned). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
251 c896fe29 bellard
252 c896fe29 bellard
* sar_i32/i64 t0, t1, t2
253 c896fe29 bellard
254 c896fe29 bellard
t0=t1 >> t2 (signed). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
255 c896fe29 bellard
256 c896fe29 bellard
********* Misc
257 c896fe29 bellard
258 c896fe29 bellard
* mov_i32/i64 t0, t1
259 c896fe29 bellard
260 c896fe29 bellard
t0 = t1
261 c896fe29 bellard
262 c896fe29 bellard
Move t1 to t0 (both operands must have the same type).
263 c896fe29 bellard
264 c896fe29 bellard
* ext8s_i32/i64 t0, t1
265 c896fe29 bellard
ext16s_i32/i64 t0, t1
266 c896fe29 bellard
ext32s_i64 t0, t1
267 c896fe29 bellard
268 c896fe29 bellard
8, 16 or 32 bit sign extension (both operands must have the same type)
269 c896fe29 bellard
270 c896fe29 bellard
* bswap16_i32 t0, t1
271 c896fe29 bellard
272 c896fe29 bellard
16 bit byte swap on a 32 bit value. The two high order bytes must be set
273 c896fe29 bellard
to zero.
274 c896fe29 bellard
275 c896fe29 bellard
* bswap_i32 t0, t1
276 c896fe29 bellard
277 c896fe29 bellard
32 bit byte swap
278 c896fe29 bellard
279 c896fe29 bellard
* bswap_i64 t0, t1
280 c896fe29 bellard
281 c896fe29 bellard
64 bit byte swap
282 c896fe29 bellard
283 5ff9d6a4 bellard
* discard_i32/i64 t0
284 5ff9d6a4 bellard
285 5ff9d6a4 bellard
Indicate that the value of t0 won't be used later. It is useful to
286 5ff9d6a4 bellard
force dead code elimination.
287 5ff9d6a4 bellard
288 c896fe29 bellard
********* Type conversions
289 c896fe29 bellard
290 c896fe29 bellard
* ext_i32_i64 t0, t1
291 c896fe29 bellard
Convert t1 (32 bit) to t0 (64 bit) and does sign extension
292 c896fe29 bellard
293 c896fe29 bellard
* extu_i32_i64 t0, t1
294 c896fe29 bellard
Convert t1 (32 bit) to t0 (64 bit) and does zero extension
295 c896fe29 bellard
296 c896fe29 bellard
* trunc_i64_i32 t0, t1
297 c896fe29 bellard
Truncate t1 (64 bit) to t0 (32 bit)
298 c896fe29 bellard
299 c896fe29 bellard
********* Load/Store
300 c896fe29 bellard
301 c896fe29 bellard
* ld_i32/i64 t0, t1, offset
302 c896fe29 bellard
ld8s_i32/i64 t0, t1, offset
303 c896fe29 bellard
ld8u_i32/i64 t0, t1, offset
304 c896fe29 bellard
ld16s_i32/i64 t0, t1, offset
305 c896fe29 bellard
ld16u_i32/i64 t0, t1, offset
306 c896fe29 bellard
ld32s_i64 t0, t1, offset
307 c896fe29 bellard
ld32u_i64 t0, t1, offset
308 c896fe29 bellard
309 c896fe29 bellard
t0 = read(t1 + offset)
310 c896fe29 bellard
Load 8, 16, 32 or 64 bits with or without sign extension from host memory. 
311 c896fe29 bellard
offset must be a constant.
312 c896fe29 bellard
313 c896fe29 bellard
* st_i32/i64 t0, t1, offset
314 c896fe29 bellard
st8_i32/i64 t0, t1, offset
315 c896fe29 bellard
st16_i32/i64 t0, t1, offset
316 c896fe29 bellard
st32_i64 t0, t1, offset
317 c896fe29 bellard
318 c896fe29 bellard
write(t0, t1 + offset)
319 c896fe29 bellard
Write 8, 16, 32 or 64 bits to host memory.
320 c896fe29 bellard
321 c896fe29 bellard
********* QEMU specific operations
322 c896fe29 bellard
323 c896fe29 bellard
* tb_exit t0
324 c896fe29 bellard
325 c896fe29 bellard
Exit the current TB and return the value t0 (word type).
326 c896fe29 bellard
327 c896fe29 bellard
* goto_tb index
328 c896fe29 bellard
329 c896fe29 bellard
Exit the current TB and jump to the TB index 'index' (constant) if the
330 c896fe29 bellard
current TB was linked to this TB. Otherwise execute the next
331 c896fe29 bellard
instructions.
332 c896fe29 bellard
333 c896fe29 bellard
* qemu_ld_i32/i64 t0, t1, flags
334 c896fe29 bellard
qemu_ld8u_i32/i64 t0, t1, flags
335 c896fe29 bellard
qemu_ld8s_i32/i64 t0, t1, flags
336 c896fe29 bellard
qemu_ld16u_i32/i64 t0, t1, flags
337 c896fe29 bellard
qemu_ld16s_i32/i64 t0, t1, flags
338 c896fe29 bellard
qemu_ld32u_i64 t0, t1, flags
339 c896fe29 bellard
qemu_ld32s_i64 t0, t1, flags
340 c896fe29 bellard
341 c896fe29 bellard
Load data at the QEMU CPU address t1 into t0. t1 has the QEMU CPU
342 c896fe29 bellard
address type. 'flags' contains the QEMU memory index (selects user or
343 c896fe29 bellard
kernel access) for example.
344 c896fe29 bellard
345 c896fe29 bellard
* qemu_st_i32/i64 t0, t1, flags
346 c896fe29 bellard
qemu_st8_i32/i64 t0, t1, flags
347 c896fe29 bellard
qemu_st16_i32/i64 t0, t1, flags
348 c896fe29 bellard
qemu_st32_i64 t0, t1, flags
349 c896fe29 bellard
350 c896fe29 bellard
Store the data t0 at the QEMU CPU Address t1. t1 has the QEMU CPU
351 c896fe29 bellard
address type. 'flags' contains the QEMU memory index (selects user or
352 c896fe29 bellard
kernel access) for example.
353 c896fe29 bellard
354 c896fe29 bellard
Note 1: Some shortcuts are defined when the last operand is known to be
355 c896fe29 bellard
a constant (e.g. addi for add, movi for mov).
356 c896fe29 bellard
357 c896fe29 bellard
Note 2: When using TCG, the opcodes must never be generated directly
358 c896fe29 bellard
as some of them may not be available as "real" opcodes. Always use the
359 c896fe29 bellard
function tcg_gen_xxx(args).
360 c896fe29 bellard
361 c896fe29 bellard
4) Backend
362 c896fe29 bellard
363 c896fe29 bellard
tcg-target.h contains the target specific definitions. tcg-target.c
364 c896fe29 bellard
contains the target specific code.
365 c896fe29 bellard
366 c896fe29 bellard
4.1) Assumptions
367 c896fe29 bellard
368 c896fe29 bellard
The target word size (TCG_TARGET_REG_BITS) is expected to be 32 bit or
369 c896fe29 bellard
64 bit. It is expected that the pointer has the same size as the word.
370 c896fe29 bellard
371 c896fe29 bellard
On a 32 bit target, all 64 bit operations are converted to 32 bits. A
372 c896fe29 bellard
few specific operations must be implemented to allow it (see add2_i32,
373 c896fe29 bellard
sub2_i32, brcond2_i32).
374 c896fe29 bellard
375 c896fe29 bellard
Floating point operations are not supported in this version. A
376 c896fe29 bellard
previous incarnation of the code generator had full support of them,
377 c896fe29 bellard
but it is better to concentrate on integer operations first.
378 c896fe29 bellard
379 c896fe29 bellard
On a 64 bit target, no assumption is made in TCG about the storage of
380 c896fe29 bellard
the 32 bit values in 64 bit registers.
381 c896fe29 bellard
382 c896fe29 bellard
4.2) Constraints
383 c896fe29 bellard
384 c896fe29 bellard
GCC like constraints are used to define the constraints of every
385 c896fe29 bellard
instruction. Memory constraints are not supported in this
386 c896fe29 bellard
version. Aliases are specified in the input operands as for GCC.
387 c896fe29 bellard
388 c896fe29 bellard
A target can define specific register or constant constraints. If an
389 c896fe29 bellard
operation uses a constant input constraint which does not allow all
390 c896fe29 bellard
constants, it must also accept registers in order to have a fallback.
391 c896fe29 bellard
392 c896fe29 bellard
The movi_i32 and movi_i64 operations must accept any constants.
393 c896fe29 bellard
394 c896fe29 bellard
The mov_i32 and mov_i64 operations must accept any registers of the
395 c896fe29 bellard
same type.
396 c896fe29 bellard
397 c896fe29 bellard
The ld/st instructions must accept signed 32 bit constant offsets. It
398 c896fe29 bellard
can be implemented by reserving a specific register to compute the
399 c896fe29 bellard
address if the offset is too big.
400 c896fe29 bellard
401 c896fe29 bellard
The ld/st instructions must accept any destination (ld) or source (st)
402 c896fe29 bellard
register.
403 c896fe29 bellard
404 c896fe29 bellard
4.3) Function call assumptions
405 c896fe29 bellard
406 c896fe29 bellard
- The only supported types for parameters and return value are: 32 and
407 c896fe29 bellard
  64 bit integers and pointer.
408 c896fe29 bellard
- The stack grows downwards.
409 c896fe29 bellard
- The first N parameters are passed in registers.
410 c896fe29 bellard
- The next parameters are passed on the stack by storing them as words.
411 c896fe29 bellard
- Some registers are clobbered during the call. 
412 c896fe29 bellard
- The function can return 0 or 1 value in registers. On a 32 bit
413 c896fe29 bellard
  target, functions must be able to return 2 values in registers for
414 c896fe29 bellard
  64 bit return type.
415 c896fe29 bellard
416 c896fe29 bellard
5) Migration from dyngen to TCG
417 c896fe29 bellard
418 c896fe29 bellard
TCG is backward compatible with QEMU "dyngen" operations. It means
419 c896fe29 bellard
that TCG instructions can be freely mixed with dyngen operations. It
420 c896fe29 bellard
is expected that QEMU targets will be progressively fully converted to
421 9804c8e2 bellard
TCG. Once a target is fully converted to TCG, it will be possible
422 c896fe29 bellard
to apply more optimizations because more registers will be free for
423 c896fe29 bellard
the generated code.
424 c896fe29 bellard
425 c896fe29 bellard
The exception model is the same as the dyngen one.