Statistics
| Branch: | Revision:

root / tcg / README @ 5ff9d6a4

History | View | Annotate | Download (11.7 kB)

1
Tiny Code Generator - Fabrice Bellard.
2

    
3
1) Introduction
4

    
5
TCG (Tiny Code Generator) began as a generic backend for a C
6
compiler. It was simplified to be used in QEMU. It also has its roots
7
in the QOP code generator written by Paul Brook. 
8

    
9
2) Definitions
10

    
11
The TCG "target" is the architecture for which we generate the
12
code. It is of course not the same as the "target" of QEMU which is
13
the emulated architecture. As TCG started as a generic C backend used
14
for cross compiling, it is assumed that the TCG target is different
15
from the host, although it is never the case for QEMU.
16

    
17
A TCG "function" corresponds to a QEMU Translated Block (TB).
18

    
19
A TCG "temporary" is a variable only live in a given
20
function. Temporaries are allocated explicitly in each function.
21

    
22
A TCG "global" is a variable which is live in all the functions. They
23
are defined before the functions defined. A TCG global can be a memory
24
location (e.g. a QEMU CPU register), a fixed host register (e.g. the
25
QEMU CPU state pointer) or a memory location which is stored in a
26
register outside QEMU TBs (not implemented yet).
27

    
28
A TCG "basic block" corresponds to a list of instructions terminated
29
by a branch instruction. 
30

    
31
3) Intermediate representation
32

    
33
3.1) Introduction
34

    
35
TCG instructions operate on variables which are temporaries or
36
globals. TCG instructions and variables are strongly typed. Two types
37
are supported: 32 bit integers and 64 bit integers. Pointers are
38
defined as an alias to 32 bit or 64 bit integers depending on the TCG
39
target word size.
40

    
41
Each instruction has a fixed number of output variable operands, input
42
variable operands and always constant operands.
43

    
44
The notable exception is the call instruction which has a variable
45
number of outputs and inputs.
46

    
47
In the textual form, output operands come first, followed by input
48
operands, followed by constant operands. The output type is included
49
in the instruction name. Constants are prefixed with a '$'.
50

    
51
add_i32 t0, t1, t2  (t0 <- t1 + t2)
52

    
53
sub_i64 t2, t3, $4  (t2 <- t3 - 4)
54

    
55
3.2) Assumptions
56

    
57
* Basic blocks
58

    
59
- Basic blocks end after branches (e.g. brcond_i32 instruction),
60
  goto_tb and exit_tb instructions.
61
- Basic blocks end before legacy dyngen operations.
62
- Basic blocks start after the end of a previous basic block, at a
63
  set_label instruction or after a legacy dyngen operation.
64

    
65
After the end of a basic block, temporaries at destroyed and globals
66
are stored at their initial storage (register or memory place
67
depending on their declarations).
68

    
69
* Floating point types are not supported yet
70

    
71
* Pointers: depending on the TCG target, pointer size is 32 bit or 64
72
  bit. The type TCG_TYPE_PTR is an alias to TCG_TYPE_I32 or
73
  TCG_TYPE_I64.
74

    
75
* Helpers:
76

    
77
Using the tcg_gen_helper_x_y it is possible to call any function
78
taking i32, i64 or pointer types types. Before calling an helper, all
79
globals are stored at their canonical location and it is assumed that
80
the function can modify them. In the future, function modifiers will
81
be allowed to tell that the helper does not read or write some globals.
82

    
83
On some TCG targets (e.g. x86), several calling conventions are
84
supported.
85

    
86
* Branches:
87

    
88
Use the instruction 'br' to jump to a label. Use 'jmp' to jump to an
89
explicit address. Conditional branches can only jump to labels.
90

    
91
3.3) Code Optimizations
92

    
93
When generating instructions, you can count on at least the following
94
optimizations:
95

    
96
- Single instructions are simplified, e.g.
97

    
98
   and_i32 t0, t0, $0xffffffff
99
    
100
  is suppressed.
101

    
102
- A liveness analysis is done at the basic block level. The
103
  information is used to suppress moves from a dead temporary to
104
  another one. It is also used to remove instructions which compute
105
  dead results. The later is especially useful for condition code
106
  optimization in QEMU.
107

    
108
  In the following example:
109

    
110
  add_i32 t0, t1, t2
111
  add_i32 t0, t0, $1
112
  mov_i32 t0, $1
113

    
114
  only the last instruction is kept.
115

    
116
- A macro system is supported (may get closer to function inlining
117
  some day). It is useful if the liveness analysis is likely to prove
118
  that some results of a computation are indeed not useful. With the
119
  macro system, the user can provide several alternative
120
  implementations which are used depending on the used results. It is
121
  especially useful for condition code optimization in QEMU.
122

    
123
  Here is an example:
124

    
125
  macro_2 t0, t1, $1
126
  mov_i32 t0, $0x1234
127

    
128
  The macro identified by the ID "$1" normally returns the values t0
129
  and t1. Suppose its implementation is:
130

    
131
  macro_start
132
  brcond_i32  t2, $0, $TCG_COND_EQ, $1
133
  mov_i32 t0, $2
134
  br $2
135
  set_label $1
136
  mov_i32 t0, $3
137
  set_label $2
138
  add_i32 t1, t3, t4
139
  macro_end
140
  
141
  If t0 is not used after the macro, the user can provide a simpler
142
  implementation:
143

    
144
  macro_start
145
  add_i32 t1, t2, t4
146
  macro_end
147

    
148
  TCG automatically chooses the right implementation depending on
149
  which macro outputs are used after it.
150

    
151
  Note that if TCG did more expensive optimizations, macros would be
152
  less useful. In the previous example a macro is useful because the
153
  liveness analysis is done on each basic block separately. Hence TCG
154
  cannot remove the code computing 't0' even if it is not used after
155
  the first macro implementation.
156

    
157
3.4) Instruction Reference
158

    
159
********* Function call
160

    
161
* call <ret> <params> ptr
162

    
163
call function 'ptr' (pointer type)
164

    
165
<ret> optional 32 bit or 64 bit return value
166
<params> optional 32 bit or 64 bit parameters
167

    
168
********* Jumps/Labels
169

    
170
* jmp t0
171

    
172
Absolute jump to address t0 (pointer type).
173

    
174
* set_label $label
175

    
176
Define label 'label' at the current program point.
177

    
178
* br $label
179

    
180
Jump to label.
181

    
182
* brcond_i32/i64 cond, t0, t1, label
183

    
184
Conditional jump if t0 cond t1 is true. cond can be:
185
    TCG_COND_EQ
186
    TCG_COND_NE
187
    TCG_COND_LT /* signed */
188
    TCG_COND_GE /* signed */
189
    TCG_COND_LE /* signed */
190
    TCG_COND_GT /* signed */
191
    TCG_COND_LTU /* unsigned */
192
    TCG_COND_GEU /* unsigned */
193
    TCG_COND_LEU /* unsigned */
194
    TCG_COND_GTU /* unsigned */
195

    
196
********* Arithmetic
197

    
198
* add_i32/i64 t0, t1, t2
199

    
200
t0=t1+t2
201

    
202
* sub_i32/i64 t0, t1, t2
203

    
204
t0=t1-t2
205

    
206
* mul_i32/i64 t0, t1, t2
207

    
208
t0=t1*t2
209

    
210
* div_i32/i64 t0, t1, t2
211

    
212
t0=t1/t2 (signed). Undefined behavior if division by zero or overflow.
213

    
214
* divu_i32/i64 t0, t1, t2
215

    
216
t0=t1/t2 (unsigned). Undefined behavior if division by zero.
217

    
218
* rem_i32/i64 t0, t1, t2
219

    
220
t0=t1%t2 (signed). Undefined behavior if division by zero or overflow.
221

    
222
* remu_i32/i64 t0, t1, t2
223

    
224
t0=t1%t2 (unsigned). Undefined behavior if division by zero.
225

    
226
* and_i32/i64 t0, t1, t2
227

    
228
********* Logical
229

    
230
t0=t1&t2
231

    
232
* or_i32/i64 t0, t1, t2
233

    
234
t0=t1|t2
235

    
236
* xor_i32/i64 t0, t1, t2
237

    
238
t0=t1^t2
239

    
240
* shl_i32/i64 t0, t1, t2
241

    
242
********* Shifts
243

    
244
* shl_i32/i64 t0, t1, t2
245

    
246
t0=t1 << t2. Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
247

    
248
* shr_i32/i64 t0, t1, t2
249

    
250
t0=t1 >> t2 (unsigned). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
251

    
252
* sar_i32/i64 t0, t1, t2
253

    
254
t0=t1 >> t2 (signed). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
255

    
256
********* Misc
257

    
258
* mov_i32/i64 t0, t1
259

    
260
t0 = t1
261

    
262
Move t1 to t0 (both operands must have the same type).
263

    
264
* ext8s_i32/i64 t0, t1
265
ext16s_i32/i64 t0, t1
266
ext32s_i64 t0, t1
267

    
268
8, 16 or 32 bit sign extension (both operands must have the same type)
269

    
270
* bswap16_i32 t0, t1
271

    
272
16 bit byte swap on a 32 bit value. The two high order bytes must be set
273
to zero.
274

    
275
* bswap_i32 t0, t1
276

    
277
32 bit byte swap
278

    
279
* bswap_i64 t0, t1
280

    
281
64 bit byte swap
282

    
283
* discard_i32/i64 t0
284

    
285
Indicate that the value of t0 won't be used later. It is useful to
286
force dead code elimination.
287

    
288
********* Type conversions
289

    
290
* ext_i32_i64 t0, t1
291
Convert t1 (32 bit) to t0 (64 bit) and does sign extension
292

    
293
* extu_i32_i64 t0, t1
294
Convert t1 (32 bit) to t0 (64 bit) and does zero extension
295

    
296
* trunc_i64_i32 t0, t1
297
Truncate t1 (64 bit) to t0 (32 bit)
298

    
299
********* Load/Store
300

    
301
* ld_i32/i64 t0, t1, offset
302
ld8s_i32/i64 t0, t1, offset
303
ld8u_i32/i64 t0, t1, offset
304
ld16s_i32/i64 t0, t1, offset
305
ld16u_i32/i64 t0, t1, offset
306
ld32s_i64 t0, t1, offset
307
ld32u_i64 t0, t1, offset
308

    
309
t0 = read(t1 + offset)
310
Load 8, 16, 32 or 64 bits with or without sign extension from host memory. 
311
offset must be a constant.
312

    
313
* st_i32/i64 t0, t1, offset
314
st8_i32/i64 t0, t1, offset
315
st16_i32/i64 t0, t1, offset
316
st32_i64 t0, t1, offset
317

    
318
write(t0, t1 + offset)
319
Write 8, 16, 32 or 64 bits to host memory.
320

    
321
********* QEMU specific operations
322

    
323
* tb_exit t0
324

    
325
Exit the current TB and return the value t0 (word type).
326

    
327
* goto_tb index
328

    
329
Exit the current TB and jump to the TB index 'index' (constant) if the
330
current TB was linked to this TB. Otherwise execute the next
331
instructions.
332

    
333
* qemu_ld_i32/i64 t0, t1, flags
334
qemu_ld8u_i32/i64 t0, t1, flags
335
qemu_ld8s_i32/i64 t0, t1, flags
336
qemu_ld16u_i32/i64 t0, t1, flags
337
qemu_ld16s_i32/i64 t0, t1, flags
338
qemu_ld32u_i64 t0, t1, flags
339
qemu_ld32s_i64 t0, t1, flags
340

    
341
Load data at the QEMU CPU address t1 into t0. t1 has the QEMU CPU
342
address type. 'flags' contains the QEMU memory index (selects user or
343
kernel access) for example.
344

    
345
* qemu_st_i32/i64 t0, t1, flags
346
qemu_st8_i32/i64 t0, t1, flags
347
qemu_st16_i32/i64 t0, t1, flags
348
qemu_st32_i64 t0, t1, flags
349

    
350
Store the data t0 at the QEMU CPU Address t1. t1 has the QEMU CPU
351
address type. 'flags' contains the QEMU memory index (selects user or
352
kernel access) for example.
353

    
354
Note 1: Some shortcuts are defined when the last operand is known to be
355
a constant (e.g. addi for add, movi for mov).
356

    
357
Note 2: When using TCG, the opcodes must never be generated directly
358
as some of them may not be available as "real" opcodes. Always use the
359
function tcg_gen_xxx(args).
360

    
361
4) Backend
362

    
363
tcg-target.h contains the target specific definitions. tcg-target.c
364
contains the target specific code.
365

    
366
4.1) Assumptions
367

    
368
The target word size (TCG_TARGET_REG_BITS) is expected to be 32 bit or
369
64 bit. It is expected that the pointer has the same size as the word.
370

    
371
On a 32 bit target, all 64 bit operations are converted to 32 bits. A
372
few specific operations must be implemented to allow it (see add2_i32,
373
sub2_i32, brcond2_i32).
374

    
375
Floating point operations are not supported in this version. A
376
previous incarnation of the code generator had full support of them,
377
but it is better to concentrate on integer operations first.
378

    
379
On a 64 bit target, no assumption is made in TCG about the storage of
380
the 32 bit values in 64 bit registers.
381

    
382
4.2) Constraints
383

    
384
GCC like constraints are used to define the constraints of every
385
instruction. Memory constraints are not supported in this
386
version. Aliases are specified in the input operands as for GCC.
387

    
388
A target can define specific register or constant constraints. If an
389
operation uses a constant input constraint which does not allow all
390
constants, it must also accept registers in order to have a fallback.
391

    
392
The movi_i32 and movi_i64 operations must accept any constants.
393

    
394
The mov_i32 and mov_i64 operations must accept any registers of the
395
same type.
396

    
397
The ld/st instructions must accept signed 32 bit constant offsets. It
398
can be implemented by reserving a specific register to compute the
399
address if the offset is too big.
400

    
401
The ld/st instructions must accept any destination (ld) or source (st)
402
register.
403

    
404
4.3) Function call assumptions
405

    
406
- The only supported types for parameters and return value are: 32 and
407
  64 bit integers and pointer.
408
- The stack grows downwards.
409
- The first N parameters are passed in registers.
410
- The next parameters are passed on the stack by storing them as words.
411
- Some registers are clobbered during the call. 
412
- The function can return 0 or 1 value in registers. On a 32 bit
413
  target, functions must be able to return 2 values in registers for
414
  64 bit return type.
415

    
416
5) Migration from dyngen to TCG
417

    
418
TCG is backward compatible with QEMU "dyngen" operations. It means
419
that TCG instructions can be freely mixed with dyngen operations. It
420
is expected that QEMU targets will be progressively fully converted to
421
TCG. Once a target is fully converted to TCG, it will be possible
422
to apply more optimizations because more registers will be free for
423
the generated code.
424

    
425
The exception model is the same as the dyngen one.