Revision c896fe29
b/tcg/LICENSE

All the files in this directory and subdirectories are released under
a BSD like license (see header in each file). No other license is
accepted.

b/tcg/README

Tiny Code Generator - Fabrice Bellard.

1) Introduction

TCG (Tiny Code Generator) began as a generic backend for a C
compiler. It was simplified to be used in QEMU. It also has its roots
in the QOP code generator written by Paul Brook.

2) Definitions

The TCG "target" is the architecture for which we generate the
code. It is of course not the same as the "target" of QEMU, which is
the emulated architecture. As TCG started as a generic C backend used
for cross compiling, it is assumed that the TCG target is different
from the host, although it is never the case for QEMU.

A TCG "function" corresponds to a QEMU Translated Block (TB).

A TCG "temporary" is a variable which is only live in a given
function. Temporaries are allocated explicitly in each function.

A TCG "global" is a variable which is live in all the functions.
Globals are defined before the functions are defined. A TCG global can
be a memory location (e.g. a QEMU CPU register), a fixed host register
(e.g. the QEMU CPU state pointer) or a memory location which is stored
in a register outside QEMU TBs (not implemented yet).

A TCG "basic block" corresponds to a list of instructions terminated
by a branch instruction.

3) Intermediate representation

3.1) Introduction

TCG instructions operate on variables which are temporaries or
globals. TCG instructions and variables are strongly typed. Two types
are supported: 32 bit integers and 64 bit integers. Pointers are
defined as an alias to 32 bit or 64 bit integers depending on the TCG
target word size.

Each instruction has a fixed number of output variable operands, input
variable operands and constant operands (operands which are always
constants).

The notable exception is the call instruction which has a variable
number of outputs and inputs.

In the textual form, output operands come first, followed by input
operands, followed by constant operands. The output type is included
in the instruction name. Constants are prefixed with a '$'.

  add_i32 t0, t1, t2  (t0 <- t1 + t2)

  sub_i64 t2, t3, $4  (t2 <- t3 - 4)

3.2) Assumptions

* Basic blocks

- Basic blocks end after branches (e.g. brcond_i32 instruction),
  goto_tb and exit_tb instructions.
- Basic blocks end before legacy dyngen operations.
- Basic blocks start after the end of a previous basic block, at a
  set_label instruction or after a legacy dyngen operation.

After the end of a basic block, temporaries are destroyed and globals
are stored at their initial storage (register or memory place
depending on their declarations).

* Floating point types are not supported yet

* Pointers: depending on the TCG target, pointer size is 32 bit or 64
  bit. The type TCG_TYPE_PTR is an alias to TCG_TYPE_I32 or
  TCG_TYPE_I64.

* Helpers:

Using the tcg_gen_helper_x_y functions it is possible to call any
function taking i32, i64 or pointer types. Before calling a helper, all
globals are stored at their canonical location and it is assumed that
the function can modify them. In the future, function modifiers will
be allowed to tell that the helper does not read or write some globals.

On some TCG targets (e.g. x86), several calling conventions are
supported.

* Branches:

Use the instruction 'br' to jump to a label. Use 'jmp' to jump to an
explicit address. Conditional branches can only jump to labels.

3.3) Code Optimizations

When generating instructions, you can count on at least the following
optimizations:

- Single instructions are simplified, e.g.

    and_i32 t0, t0, $0xffffffff

  is suppressed.

- A liveness analysis is done at the basic block level. The
  information is used to suppress moves from a dead temporary to
  another one. It is also used to remove instructions which compute
  dead results. The latter is especially useful for condition code
  optimization in QEMU.

  In the following example:

    add_i32 t0, t1, t2
    add_i32 t0, t0, $1
    mov_i32 t0, $1

  only the last instruction is kept.

- A macro system is supported (may get closer to function inlining
  some day). It is useful if the liveness analysis is likely to prove
  that some results of a computation are indeed not useful. With the
  macro system, the user can provide several alternative
  implementations which are used depending on the used results. It is
  especially useful for condition code optimization in QEMU.

  Here is an example:

    macro_2 t0, t1, $1
    mov_i32 t0, $0x1234

  The macro identified by the ID "$1" normally returns the values t0
  and t1. Suppose its implementation is:

    macro_start
    brcond_i32 t2, $0, $TCG_COND_EQ, $1
    mov_i32 t0, $2
    br $2
    set_label $1
    mov_i32 t0, $3
    set_label $2
    add_i32 t1, t3, t4
    macro_end

  If t0 is not used after the macro, the user can provide a simpler
  implementation:

    macro_start
    add_i32 t1, t3, t4
    macro_end

  TCG automatically chooses the right implementation depending on
  which macro outputs are used after it.

  Note that if TCG did more expensive optimizations, macros would be
  less useful. In the previous example a macro is useful because the
  liveness analysis is done on each basic block separately. Hence TCG
  cannot remove the code computing 't0' even if it is not used after
  the first macro implementation.
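
The dead-result removal described above can be illustrated with a small backward pass over one basic block. This is only a sketch under simplifying assumptions: the ToyInsn type and the toy_* functions are hypothetical and are not part of TCG, every instruction has exactly one output temporary, at most 32 temporaries exist, and no instruction has side effects (real code generators must keep calls and stores even when their results are unused).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction: one output temporary and up to
   two input temporaries. A negative input means "constant or unused". */
typedef struct {
    int out;
    int in[2];
    bool dead;
} ToyInsn;

/* Backward liveness pass over one basic block. live_out is a bitmask of
   the temporaries (0..31) that are still needed after the block. */
static void toy_liveness(ToyInsn *insns, size_t n, unsigned live_out)
{
    unsigned live = live_out;
    for (size_t i = n; i-- > 0; ) {
        ToyInsn *p = &insns[i];
        assert(p->out >= 0 && p->out < 32);
        if (!(live & (1u << p->out))) {
            p->dead = true;            /* result is never read: drop it */
            continue;
        }
        p->dead = false;
        live &= ~(1u << p->out);       /* the output is redefined here */
        for (int k = 0; k < 2; k++) {
            if (p->in[k] >= 0) {
                assert(p->in[k] < 32);
                live |= 1u << p->in[k]; /* inputs become live above */
            }
        }
    }
}

/* The three-instruction example from the text, with t0 live after the
   block. Returns a bitmask of the instructions that are kept. */
static unsigned toy_readme_example(void)
{
    ToyInsn bb[3] = {
        { 0, { 1, 2 }, false },    /* add_i32 t0, t1, t2 */
        { 0, { 0, -1 }, false },   /* add_i32 t0, t0, $1 */
        { 0, { -1, -1 }, false },  /* mov_i32 t0, $1     */
    };
    unsigned kept = 0;
    toy_liveness(bb, 3, 1u << 0);
    for (int i = 0; i < 3; i++) {
        if (!bb[i].dead) {
            kept |= 1u << i;
        }
    }
    return kept;
}
```

Walking the block backwards, the final mov_i32 defines t0 without reading anything, so t0 is dead above it and both add_i32 instructions are marked dead. Because the pass never looks past the block boundary, a value that is only unused in a later block cannot be removed this way, which is the situation the macro system addresses.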

3.4) Instruction Reference

********* Function call

* call <ret> <params> ptr

call function 'ptr' (pointer type)

<ret> optional 32 bit or 64 bit return value
<params> optional 32 bit or 64 bit parameters

********* Jumps/Labels

* jmp t0

Absolute jump to address t0 (pointer type).

* set_label $label

Define label 'label' at the current program point.

* br $label

Jump to label.

* brcond_i32/i64 cond, t0, t1, label

Conditional jump if t0 cond t1 is true. cond can be:
    TCG_COND_EQ
    TCG_COND_NE
    TCG_COND_LT /* signed */
    TCG_COND_GE /* signed */
    TCG_COND_LE /* signed */
    TCG_COND_GT /* signed */
    TCG_COND_LTU /* unsigned */
    TCG_COND_GEU /* unsigned */
    TCG_COND_LEU /* unsigned */
    TCG_COND_GTU /* unsigned */

********* Arithmetic

* add_i32/i64 t0, t1, t2

t0=t1+t2

* sub_i32/i64 t0, t1, t2

t0=t1-t2

* mul_i32/i64 t0, t1, t2

t0=t1*t2

* div_i32/i64 t0, t1, t2

t0=t1/t2 (signed). Undefined behavior if division by zero or overflow.

* divu_i32/i64 t0, t1, t2

t0=t1/t2 (unsigned). Undefined behavior if division by zero.

* rem_i32/i64 t0, t1, t2

t0=t1%t2 (signed). Undefined behavior if division by zero or overflow.

* remu_i32/i64 t0, t1, t2

t0=t1%t2 (unsigned). Undefined behavior if division by zero.

********* Logical

* and_i32/i64 t0, t1, t2

t0=t1&t2

* or_i32/i64 t0, t1, t2

t0=t1|t2

* xor_i32/i64 t0, t1, t2

t0=t1^t2

********* Shifts

* shl_i32/i64 t0, t1, t2

t0=t1 << t2. Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)

* shr_i32/i64 t0, t1, t2

t0=t1 >> t2 (unsigned). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)

* sar_i32/i64 t0, t1, t2

t0=t1 >> t2 (signed). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
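
Since division and shifts are undefined for some operands, a front end that cannot rule those operands out has to test for them before emitting the operation. The following is a minimal sketch of the conditions for the 32 bit case, plus a reference model of sar_i32 that avoids relying on how the C compiler shifts negative values. All toy_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when div_i32/rem_i32 are well defined: no division by zero and
   no INT32_MIN / -1 overflow. */
static bool toy_div_i32_defined(int32_t a, int32_t b)
{
    if (b == 0) {
        return false;
    }
    if (a == INT32_MIN && b == -1) {
        return false;
    }
    return true;
}

/* True when a 32 bit shift count is in the defined range. */
static bool toy_shift_i32_defined(int32_t count)
{
    return count >= 0 && count < 32;
}

/* Reference model of shr_i32: logical right shift. */
static uint32_t toy_shr_i32(uint32_t v, int32_t count)
{
    assert(toy_shift_i32_defined(count));
    return v >> count;
}

/* Reference model of sar_i32: arithmetic right shift, written so that
   only non-negative values are ever shifted. */
static int32_t toy_sar_i32(int32_t v, int32_t count)
{
    assert(toy_shift_i32_defined(count));
    if (v >= 0) {
        return v >> count;
    }
    /* For v < 0, (-1 - v) is non-negative and equals the bitwise
       complement of v, so shifting it and complementing again gives
       the sign-filled result. */
    return -1 - ((-1 - v) >> count);
}
```

For example, toy_sar_i32(-8, 1) yields -4 while toy_shr_i32(0xfffffff8, 1) yields 0x7ffffffc, which is the difference between the signed and unsigned right shifts listed above.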

********* Misc

* mov_i32/i64 t0, t1

t0 = t1

Move t1 to t0 (both operands must have the same type).

* ext8s_i32/i64 t0, t1
ext16s_i32/i64 t0, t1
ext32s_i64 t0, t1

8, 16 or 32 bit sign extension (both operands must have the same type)

* bswap16_i32 t0, t1

16 bit byte swap on a 32 bit value. The two high order bytes must be set
to zero.

* bswap_i32 t0, t1

32 bit byte swap

* bswap_i64 t0, t1

64 bit byte swap

********* Type conversions

* ext_i32_i64 t0, t1
Convert t1 (32 bit) to t0 (64 bit) with sign extension

* extu_i32_i64 t0, t1
Convert t1 (32 bit) to t0 (64 bit) with zero extension

* trunc_i64_i32 t0, t1
Truncate t1 (64 bit) to t0 (32 bit)
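
The extension, byte swap and conversion operations can be summarised by plain C reference functions. This is a sketch of the intended semantics as described above, not the code TCG generates; the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ext8s_i32: sign-extend the low 8 bits of a 32 bit value. */
static int32_t toy_ext8s_i32(uint32_t v)
{
    int32_t low = (int32_t)(v & 0xffu);
    return (low & 0x80) ? low - 0x100 : low;
}

/* ext16s_i32: sign-extend the low 16 bits of a 32 bit value. */
static int32_t toy_ext16s_i32(uint32_t v)
{
    int32_t low = (int32_t)(v & 0xffffu);
    return (low & 0x8000) ? low - 0x10000 : low;
}

/* bswap16_i32: swap the two low bytes; the two high order bytes of the
   input must already be zero. */
static uint32_t toy_bswap16_i32(uint32_t v)
{
    assert((v >> 16) == 0);
    return ((v & 0xffu) << 8) | (v >> 8);
}

/* bswap_i32: reverse all four bytes. */
static uint32_t toy_bswap_i32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0xff00u) |
           ((v << 8) & 0xff0000u) | (v << 24);
}

/* ext_i32_i64 / extu_i32_i64 / trunc_i64_i32 */
static int64_t toy_ext_i32_i64(int32_t v)    { return (int64_t)v; }
static uint64_t toy_extu_i32_i64(uint32_t v) { return (uint64_t)v; }
static uint32_t toy_trunc_i64_i32(uint64_t v) { return (uint32_t)v; }
```

The precondition in toy_bswap16_i32 mirrors the requirement that the two high order bytes are zero; what happens otherwise is not specified by the text.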

********* Load/Store

* ld_i32/i64 t0, t1, offset
ld8s_i32/i64 t0, t1, offset
ld8u_i32/i64 t0, t1, offset
ld16s_i32/i64 t0, t1, offset
ld16u_i32/i64 t0, t1, offset
ld32s_i64 t0, t1, offset
ld32u_i64 t0, t1, offset

t0 = read(t1 + offset)
Load 8, 16, 32 or 64 bits with or without sign extension from host memory.
offset must be a constant.

* st_i32/i64 t0, t1, offset
st8_i32/i64 t0, t1, offset
st16_i32/i64 t0, t1, offset
st32_i64 t0, t1, offset

write(t0, t1 + offset)
Write 8, 16, 32 or 64 bits to host memory.

********* QEMU specific operations

* exit_tb t0

Exit the current TB and return the value t0 (word type).

* goto_tb index

Exit the current TB and jump to the TB index 'index' (constant) if the
current TB was linked to this TB. Otherwise execute the next
instructions.

* qemu_ld_i32/i64 t0, t1, flags
qemu_ld8u_i32/i64 t0, t1, flags
qemu_ld8s_i32/i64 t0, t1, flags
qemu_ld16u_i32/i64 t0, t1, flags
qemu_ld16s_i32/i64 t0, t1, flags
qemu_ld32u_i64 t0, t1, flags
qemu_ld32s_i64 t0, t1, flags

Load data at the QEMU CPU address t1 into t0. t1 has the QEMU CPU
address type. 'flags' contains, for example, the QEMU memory index
(which selects user or kernel access).

* qemu_st_i32/i64 t0, t1, flags
qemu_st8_i32/i64 t0, t1, flags
qemu_st16_i32/i64 t0, t1, flags
qemu_st32_i64 t0, t1, flags

Store the data t0 at the QEMU CPU address t1. t1 has the QEMU CPU
address type. 'flags' contains, for example, the QEMU memory index
(which selects user or kernel access).

Note 1: Some shortcuts are defined when the last operand is known to be
a constant (e.g. addi for add, movi for mov).

Note 2: When using TCG, the opcodes must never be generated directly
as some of them may not be available as "real" opcodes. Always use the
function tcg_gen_xxx(args).

4) Backend

tcg-target.h contains the target specific definitions. tcg-target.c
contains the target specific code.

4.1) Assumptions

The target word size (TCG_TARGET_REG_BITS) is expected to be 32 bit or
64 bit. It is expected that the pointer has the same size as the word.

On a 32 bit target, all 64 bit operations are converted to 32 bits. A
few specific operations must be implemented to allow it (see add2_i32,
sub2_i32, brcond2_i32).

Floating point operations are not supported in this version. A
previous incarnation of the code generator had full support of them,
but it is better to concentrate on integer operations first.

On a 64 bit target, no assumption is made in TCG about the storage of
the 32 bit values in 64 bit registers.
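
To make the 32 bit lowering concrete, here is a sketch of what operations in the spirit of add2_i32 and brcond2_i32 have to compute when a 64 bit value is held as a (low, high) pair of 32 bit words. The toy_* functions are hypothetical reference models; the operand order of the real opcodes is not described in this text and is not implied here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 64 bit addition on (low, high) word pairs: add the low words, then
   propagate the carry into the high words. */
static void toy_add2_i32(uint32_t *rl, uint32_t *rh,
                         uint32_t al, uint32_t ah,
                         uint32_t bl, uint32_t bh)
{
    uint32_t lo;
    assert(rl != 0 && rh != 0);
    lo = al + bl;
    *rh = ah + bh + (lo < al);  /* unsigned wraparound signals a carry */
    *rl = lo;
}

/* Convenience wrapper: add two 64 bit values using only 32 bit halves. */
static uint64_t toy_add64_via_halves(uint64_t a, uint64_t b)
{
    uint32_t rl, rh;
    toy_add2_i32(&rl, &rh,
                 (uint32_t)a, (uint32_t)(a >> 32),
                 (uint32_t)b, (uint32_t)(b >> 32));
    return ((uint64_t)rh << 32) | rl;
}

/* Signed 64 bit "a < b" on word pairs: the high words are compared as
   signed values; only if they are equal do the low words decide, and
   they are always compared as unsigned values. */
static bool toy_lt2_i32(uint32_t al, int32_t ah, uint32_t bl, int32_t bh)
{
    if (ah != bh) {
        return ah < bh;
    }
    return al < bl;
}
```

The unsigned comparison of the low words matters: with equal high words, a low word of 0x80000000 is larger than 1, although it would look negative if compared as a signed value.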

4.2) Constraints

GCC like constraints are used to define the constraints of every
instruction. Memory constraints are not supported in this
version. Aliases are specified in the input operands as for GCC.

A target can define specific register or constant constraints. If an
operation uses a constant input constraint which does not allow all
constants, it must also accept registers in order to have a fallback.

The movi_i32 and movi_i64 operations must accept any constants.

The mov_i32 and mov_i64 operations must accept any registers of the
same type.

The ld/st instructions must accept signed 32 bit constant offsets. It
can be implemented by reserving a specific register to compute the
address if the offset is too big.

The ld/st instructions must accept any destination (ld) or source (st)
register.

4.3) Function call assumptions

- The only supported types for parameters and return value are: 32 and
  64 bit integers and pointer.
- The stack grows downwards.
- The first N parameters are passed in registers.
- The next parameters are passed on the stack by storing them as words.
- Some registers are clobbered during the call.
- The function can return 0 or 1 value in registers. On a 32 bit
  target, functions must be able to return 2 values in registers for
  64 bit return type.

5) Migration from dyngen to TCG

TCG is backward compatible with QEMU "dyngen" operations. It means
that TCG instructions can be freely mixed with dyngen operations. It
is expected that QEMU targets will be progressively fully converted to
TCG. Once a target is fully converted to TCG, it will be possible
to apply more optimizations because more registers will be free for
the generated code.

The exception model is the same as the dyngen one.

b/tcg/TODO

- test macro system

- test conditional jumps

- test mul, div, ext8s, ext16s, bswap

- generate a global TB prologue and epilogue to save/restore registers
  to/from the CPU state and to reserve a stack frame to optimize
  helper calls. Modify cpu-exec.c so that it does not use global
  register variables (except maybe for 'env').

- fully convert the x86 target. The minimal amount of work includes:
  - add cc_src, cc_dst and cc_op as globals
  - disable its eflags optimization (the liveness analysis should
    suffice)
  - move complicated operations to helpers (in particular FPU, SSE, MMX).

- optimize the x86 target:
  - move some or all the registers as globals
  - use the TB prologue and epilogue to have QEMU target registers in
    pre assigned host registers.

Ideas:

- Move the slow part of the qemu_ld/st ops after the end of the TB.

- Experiment: change instruction storage to simplify macro handling
  and to handle dynamic allocation and see if the translation speed is
  OK.

- change exception syntax to get closer to QOP system (exception
  parameters given with a specific instruction).

b/tcg/i386/tcg-target.c

/*
 * Tiny Code Generator for QEMU
 *
 * Copyright (c) 2008 Fabrice Bellard
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */
const char *tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
    "%eax",
    "%ecx",
    "%edx",
    "%ebx",
    "%esp",
    "%ebp",
    "%esi",
    "%edi",
};

int tcg_target_reg_alloc_order[TCG_TARGET_NB_REGS] = {
    TCG_REG_EAX,
    TCG_REG_EDX,
    TCG_REG_ECX,
    TCG_REG_EBX,
    TCG_REG_ESI,
    TCG_REG_EDI,
    TCG_REG_EBP,
    TCG_REG_ESP,
};

const int tcg_target_call_iarg_regs[3] = { TCG_REG_EAX, TCG_REG_EDX, TCG_REG_ECX };
const int tcg_target_call_oarg_regs[2] = { TCG_REG_EAX, TCG_REG_EDX };

static void patch_reloc(uint8_t *code_ptr, int type,
                        tcg_target_long value)
{
    switch(type) {
    case R_386_32:
        *(uint32_t *)code_ptr = value;
        break;
    case R_386_PC32:
        *(uint32_t *)code_ptr = value - (long)code_ptr;
        break;
    default:
        tcg_abort();
    }
}

/* maximum number of registers used for input function arguments */
static inline int tcg_target_get_call_iarg_regs_count(int flags)
{
    flags &= TCG_CALL_TYPE_MASK;
    switch(flags) {
    case TCG_CALL_TYPE_STD:
        return 0;
    case TCG_CALL_TYPE_REGPARM_1:
    case TCG_CALL_TYPE_REGPARM_2:
    case TCG_CALL_TYPE_REGPARM:
        return flags - TCG_CALL_TYPE_REGPARM_1 + 1;
    default:
        tcg_abort();
    }
}

/* parse target specific constraints */
int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
{
    const char *ct_str;

    ct_str = *pct_str;
    switch(ct_str[0]) {
    case 'a':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set_reg(ct->u.regs, TCG_REG_EAX);
        break;
    case 'b':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set_reg(ct->u.regs, TCG_REG_EBX);
        break;
    case 'c':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set_reg(ct->u.regs, TCG_REG_ECX);
        break;
    case 'd':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set_reg(ct->u.regs, TCG_REG_EDX);
        break;
    case 'S':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set_reg(ct->u.regs, TCG_REG_ESI);
        break;
    case 'D':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set_reg(ct->u.regs, TCG_REG_EDI);
        break;
    case 'q':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set32(ct->u.regs, 0, 0xf);
        break;
    case 'r':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set32(ct->u.regs, 0, 0xff);
        break;

        /* qemu_ld/st address constraint */
    case 'L':
        ct->ct |= TCG_CT_REG;
        tcg_regset_set32(ct->u.regs, 0, 0xff);
        tcg_regset_reset_reg(ct->u.regs, TCG_REG_EAX);
        tcg_regset_reset_reg(ct->u.regs, TCG_REG_EDX);
        break;
    default:
        return -1;
    }
    ct_str++;
    *pct_str = ct_str;
    return 0;
}
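
As a reading aid for the constraint letters above, the following sketch maps each letter to the register bitmask it ends up selecting. It assumes that the TCG_REG_* values follow the usual x86 register encoding (EAX=0, ECX=1, EDX=2, EBX=3, ESP=4, EBP=5, ESI=6, EDI=7), which matches the order of tcg_target_reg_names but is defined in a header not shown here. The TOY_* enum and toy_constraint_mask are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register numbering (standard x86 encoding order). */
enum {
    TOY_EAX, TOY_ECX, TOY_EDX, TOY_EBX,
    TOY_ESP, TOY_EBP, TOY_ESI, TOY_EDI
};

/* Returns the allowed-register bitmask for one constraint letter,
   or 0 for a letter this sketch does not know. */
static uint32_t toy_constraint_mask(char letter)
{
    switch (letter) {
    case 'a': return 1u << TOY_EAX;
    case 'b': return 1u << TOY_EBX;
    case 'c': return 1u << TOY_ECX;
    case 'd': return 1u << TOY_EDX;
    case 'S': return 1u << TOY_ESI;
    case 'D': return 1u << TOY_EDI;
    case 'q': return 0x0fu;  /* EAX, ECX, EDX, EBX */
    case 'r': return 0xffu;  /* any of the eight registers */
    case 'L':                /* any register except EAX and EDX */
        return 0xffu & ~((1u << TOY_EAX) | (1u << TOY_EDX));
    default:
        return 0;
    }
}
```

Under that numbering, 'q' covers the four registers that have byte-sized sub-registers on 32 bit x86, and 'L' leaves out EAX and EDX, the two registers that tcg_out_qemu_ld uses as scratch registers.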

/* test if a constant matches the constraint */
static inline int tcg_target_const_match(tcg_target_long val,
                                         const TCGArgConstraint *arg_ct)
{
    int ct;
    ct = arg_ct->ct;
    if (ct & TCG_CT_CONST)
        return 1;
    else
        return 0;
}

#define ARITH_ADD 0
#define ARITH_OR  1
#define ARITH_ADC 2
#define ARITH_SBB 3
#define ARITH_AND 4
#define ARITH_SUB 5
#define ARITH_XOR 6
#define ARITH_CMP 7

#define SHIFT_SHL 4
#define SHIFT_SHR 5
#define SHIFT_SAR 7

#define JCC_JMP (-1)
#define JCC_JO  0x0
#define JCC_JNO 0x1
#define JCC_JB  0x2
#define JCC_JAE 0x3
#define JCC_JE  0x4
#define JCC_JNE 0x5
#define JCC_JBE 0x6
#define JCC_JA  0x7
#define JCC_JS  0x8
#define JCC_JNS 0x9
#define JCC_JP  0xa
#define JCC_JNP 0xb
#define JCC_JL  0xc
#define JCC_JGE 0xd
#define JCC_JLE 0xe
#define JCC_JG  0xf

#define P_EXT   0x100 /* 0x0f opcode prefix */

static const uint8_t tcg_cond_to_jcc[10] = {
    [TCG_COND_EQ] = JCC_JE,
    [TCG_COND_NE] = JCC_JNE,
    [TCG_COND_LT] = JCC_JL,
    [TCG_COND_GE] = JCC_JGE,
    [TCG_COND_LE] = JCC_JLE,
    [TCG_COND_GT] = JCC_JG,
    [TCG_COND_LTU] = JCC_JB,
    [TCG_COND_GEU] = JCC_JAE,
    [TCG_COND_LEU] = JCC_JBE,
    [TCG_COND_GTU] = JCC_JA,
};

static inline void tcg_out_opc(TCGContext *s, int opc)
{
    if (opc & P_EXT)
        tcg_out8(s, 0x0f);
    tcg_out8(s, opc);
}

static inline void tcg_out_modrm(TCGContext *s, int opc, int r, int rm)
{
    tcg_out_opc(s, opc);
    tcg_out8(s, 0xc0 | (r << 3) | rm);
}

/* rm == -1 means no register index */
static inline void tcg_out_modrm_offset(TCGContext *s, int opc, int r, int rm,
                                        int32_t offset)
{
    tcg_out_opc(s, opc);
    if (rm == -1) {
        tcg_out8(s, 0x05 | (r << 3));
        tcg_out32(s, offset);
    } else if (offset == 0 && rm != TCG_REG_EBP) {
        if (rm == TCG_REG_ESP) {
            tcg_out8(s, 0x04 | (r << 3));
            tcg_out8(s, 0x24);
        } else {
            tcg_out8(s, 0x00 | (r << 3) | rm);
        }
    } else if ((int8_t)offset == offset) {
        if (rm == TCG_REG_ESP) {
            tcg_out8(s, 0x44 | (r << 3));
            tcg_out8(s, 0x24);
        } else {
            tcg_out8(s, 0x40 | (r << 3) | rm);
        }
        tcg_out8(s, offset);
    } else {
        if (rm == TCG_REG_ESP) {
            tcg_out8(s, 0x84 | (r << 3));
            tcg_out8(s, 0x24);
        } else {
            tcg_out8(s, 0x80 | (r << 3) | rm);
        }
        tcg_out32(s, offset);
    }
}
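
The addressing-mode selection in tcg_out_modrm_offset can be tried in isolation with a version that writes into a plain byte buffer. This is a sketch, not the QEMU code: toy_modrm_offset and the TOY_* constants are hypothetical, the opcode byte and the "no base register" case are left out, and the register numbers are assumed to be the standard x86 encodings (ESP=4, EBP=5).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_ESP = 4, TOY_EBP = 5 };

/* Emits the ModRM byte, an optional SIB byte and an optional
   displacement for the operand "offset(rm)" with register field r.
   buf must have room for 6 bytes. Returns the number of bytes written. */
static size_t toy_modrm_offset(uint8_t *buf, int r, int rm, int32_t offset)
{
    size_t n = 0;
    int mod;

    assert(r >= 0 && r < 8 && rm >= 0 && rm < 8);
    if (offset == 0 && rm != TOY_EBP) {
        mod = 0x00;              /* no displacement */
    } else if (offset >= -128 && offset <= 127) {
        mod = 0x40;              /* 8 bit displacement */
    } else {
        mod = 0x80;              /* 32 bit displacement */
    }
    if (rm == TOY_ESP) {
        /* rm == 4 selects a SIB byte; 0x24 means base ESP, no index */
        buf[n++] = (uint8_t)(mod | (r << 3) | 0x04);
        buf[n++] = 0x24;
    } else {
        buf[n++] = (uint8_t)(mod | (r << 3) | rm);
    }
    if (mod == 0x40) {
        buf[n++] = (uint8_t)offset;
    } else if (mod == 0x80) {
        uint32_t u = (uint32_t)offset;
        for (int i = 0; i < 4; i++) {
            buf[n++] = (uint8_t)(u >> (8 * i));  /* little endian */
        }
    }
    return n;
}
```

For example, register field 0 (EAX) with operand 8(%ebp) yields the bytes 0x45 0x08, which after the opcode 0x8b forms the familiar encoding of a load from 8(%ebp) into %eax. EBP needs the special case because ModRM with mod 00 and rm 5 means "32 bit absolute address" rather than "(%ebp)", so a zero displacement byte must be emitted.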

static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
{
    if (arg != ret)
        tcg_out_modrm(s, 0x8b, ret, arg);
}

static inline void tcg_out_movi(TCGContext *s, TCGType type,
                                int ret, int32_t arg)
{
    if (arg == 0) {
        /* xor r0,r0 */
        tcg_out_modrm(s, 0x01 | (ARITH_XOR << 3), ret, ret);
    } else {
        tcg_out8(s, 0xb8 + ret);
        tcg_out32(s, arg);
    }
}

static inline void tcg_out_ld(TCGContext *s, int ret,
                              int arg1, int32_t arg2)
{
    /* movl */
    tcg_out_modrm_offset(s, 0x8b, ret, arg1, arg2);
}

static inline void tcg_out_st(TCGContext *s, int arg,
                              int arg1, int32_t arg2)
{
    /* movl */
    tcg_out_modrm_offset(s, 0x89, arg, arg1, arg2);
}

static inline void tgen_arithi(TCGContext *s, int c, int r0, int32_t val)
{
    if (val == (int8_t)val) {
        tcg_out_modrm(s, 0x83, c, r0);
        tcg_out8(s, val);
    } else {
        tcg_out_modrm(s, 0x81, c, r0);
        tcg_out32(s, val);
    }
}

void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
{
    if (val != 0)
        tgen_arithi(s, ARITH_ADD, reg, val);
}

static void tcg_out_jxx(TCGContext *s, int opc, int label_index)
{
    int32_t val, val1;
    TCGLabel *l = &s->labels[label_index];

    if (l->has_value) {
        val = l->u.value - (tcg_target_long)s->code_ptr;
        val1 = val - 2;
        if ((int8_t)val1 == val1) {
            if (opc == -1)
                tcg_out8(s, 0xeb);
            else
                tcg_out8(s, 0x70 + opc);
            tcg_out8(s, val1);
        } else {
            if (opc == -1) {
                tcg_out8(s, 0xe9);
                tcg_out32(s, val - 5);
            } else {
                tcg_out8(s, 0x0f);
                tcg_out8(s, 0x80 + opc);
                tcg_out32(s, val - 6);
            }
        }
    } else {
        if (opc == -1) {
            tcg_out8(s, 0xe9);
        } else {
            tcg_out8(s, 0x0f);
            tcg_out8(s, 0x80 + opc);
        }
        tcg_out_reloc(s, s->code_ptr, R_386_PC32, label_index, -4);
        tcg_out32(s, -4);
    }
}
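
The choice between the short and near jump forms in tcg_out_jxx, for a label whose address is already known, can be sketched with a buffer-based helper. toy_jxx and toy_put32 are hypothetical names; 'rel' stands for the label address minus the address of the first byte of the jump, which is what the code above computes into 'val'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores a 32 bit value in little endian order, returns 4. */
static size_t toy_put32(uint8_t *buf, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        buf[i] = (uint8_t)(v >> (8 * i));
    }
    return 4;
}

/* Encodes a jump to a known target. opc is -1 for an unconditional
   jump or an x86 condition code 0..15. buf must have room for 6 bytes.
   Returns the number of bytes written. */
static size_t toy_jxx(uint8_t *buf, int opc, int32_t rel)
{
    size_t n = 0;
    int32_t short_disp;

    assert(opc >= -1 && opc <= 15);
    assert(rel > INT32_MIN + 6);
    /* x86 displacements are relative to the end of the instruction,
       so the instruction length is subtracted from rel. */
    short_disp = rel - 2;
    if (short_disp >= -128 && short_disp <= 127) {
        buf[n++] = (opc < 0) ? 0xeb : (uint8_t)(0x70 + opc);
        buf[n++] = (uint8_t)short_disp;
    } else if (opc < 0) {
        buf[n++] = 0xe9;                       /* jmp rel32, 5 bytes */
        n += toy_put32(buf + n, (uint32_t)(rel - 5));
    } else {
        buf[n++] = 0x0f;                       /* jcc rel32, 6 bytes */
        buf[n++] = (uint8_t)(0x80 + opc);
        n += toy_put32(buf + n, (uint32_t)(rel - 6));
    }
    return n;
}
```

When the label has no value yet, the instruction length cannot be chosen this way, which is why the code above always emits the 32 bit form in that case and records a relocation to be patched later.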

static void tcg_out_brcond(TCGContext *s, int cond,
                           TCGArg arg1, TCGArg arg2, int const_arg2,
                           int label_index)
{
    int c;
    if (const_arg2) {
        if (arg2 == 0) {
            /* use test */
            switch(cond) {
            case TCG_COND_EQ:
                c = JCC_JE;
                break;
            case TCG_COND_NE:
                c = JCC_JNE;
                break;
            case TCG_COND_LT:
                c = JCC_JS;
                break;
            case TCG_COND_GE:
                c = JCC_JNS;
                break;
            default:
                goto do_cmpi;
            }
            /* test r, r */
            tcg_out_modrm(s, 0x85, arg1, arg1);
            tcg_out_jxx(s, c, label_index);
        } else {
        do_cmpi:
            tgen_arithi(s, ARITH_CMP, arg1, arg2);
            tcg_out_jxx(s, tcg_cond_to_jcc[cond], label_index);
        }
    } else {
        /* cmp r/m, r sets the flags from rm - r, so arg1 goes in rm */
        tcg_out_modrm(s, 0x01 | (ARITH_CMP << 3), arg2, arg1);
        tcg_out_jxx(s, tcg_cond_to_jcc[cond], label_index);
    }
}

/* XXX: we implement it at the target level to avoid having to
   handle cross basic blocks temporaries */
/* args[0]/args[1] are the low/high words of the first operand,
   args[2]/args[3] those of the second one. The high words decide
   unless they are equal; the low words are always compared unsigned. */
static void tcg_out_brcond2(TCGContext *s,
                            const TCGArg *args, const int *const_args)
{
    int label_next;
    label_next = gen_new_label();
    switch(args[4]) {
    case TCG_COND_EQ:
        tcg_out_brcond(s, TCG_COND_NE, args[0], args[2], const_args[2], label_next);
        tcg_out_brcond(s, TCG_COND_EQ, args[1], args[3], const_args[3], args[5]);
        break;
    case TCG_COND_NE:
        /* either half differing is enough to take the branch */
        tcg_out_brcond(s, TCG_COND_NE, args[0], args[2], const_args[2], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], args[5]);
        break;
    case TCG_COND_LT:
        tcg_out_brcond(s, TCG_COND_LT, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_LTU, args[0], args[2], const_args[2], args[5]);
        break;
    case TCG_COND_LE:
        tcg_out_brcond(s, TCG_COND_LT, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_LEU, args[0], args[2], const_args[2], args[5]);
        break;
    case TCG_COND_GT:
        tcg_out_brcond(s, TCG_COND_GT, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_GTU, args[0], args[2], const_args[2], args[5]);
        break;
    case TCG_COND_GE:
        tcg_out_brcond(s, TCG_COND_GT, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_GEU, args[0], args[2], const_args[2], args[5]);
        break;
    case TCG_COND_LTU:
        tcg_out_brcond(s, TCG_COND_LTU, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_LTU, args[0], args[2], const_args[2], args[5]);
        break;
    case TCG_COND_LEU:
        tcg_out_brcond(s, TCG_COND_LTU, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_LEU, args[0], args[2], const_args[2], args[5]);
        break;
    case TCG_COND_GTU:
        tcg_out_brcond(s, TCG_COND_GTU, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_GTU, args[0], args[2], const_args[2], args[5]);
        break;
    case TCG_COND_GEU:
        tcg_out_brcond(s, TCG_COND_GTU, args[1], args[3], const_args[3], args[5]);
        tcg_out_brcond(s, TCG_COND_NE, args[1], args[3], const_args[3], label_next);
        tcg_out_brcond(s, TCG_COND_GEU, args[0], args[2], const_args[2], args[5]);
        break;
    default:
        tcg_abort();
    }
    tcg_out_label(s, label_next, (tcg_target_long)s->code_ptr);
}

#if defined(CONFIG_SOFTMMU)
extern void __ldb_mmu(void);
extern void __ldw_mmu(void);
extern void __ldl_mmu(void);
extern void __ldq_mmu(void);

extern void __stb_mmu(void);
extern void __stw_mmu(void);
extern void __stl_mmu(void);
extern void __stq_mmu(void);

static void *qemu_ld_helpers[4] = {
    __ldb_mmu,
    __ldw_mmu,
    __ldl_mmu,
    __ldq_mmu,
};

static void *qemu_st_helpers[4] = {
    __stb_mmu,
    __stw_mmu,
    __stl_mmu,
    __stq_mmu,
};
#endif
|
450 |
|
|
451 |
/* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and |
|
452 |
EAX. It will be useful once fixed register globals are less |
|
453 |
common. */ |
|
454 |
static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, |
|
455 |
int opc) |
|
456 |
{ |
|
457 |
int addr_reg, data_reg, data_reg2, r0, r1, mem_index, s_bits, bswap; |
|
458 |
#if defined(CONFIG_SOFTMMU) |
|
459 |
uint8_t *label1_ptr, *label2_ptr; |
|
460 |
#endif |
|
461 |
#if TARGET_LONG_BITS == 64 |
|
462 |
#if defined(CONFIG_SOFTMMU) |
|
463 |
uint8_t *label3_ptr; |
|
464 |
#endif |
|
465 |
int addr_reg2; |
|
466 |
#endif |
|
467 |
|
|
468 |
data_reg = *args++; |
|
469 |
if (opc == 3) |
|
470 |
data_reg2 = *args++; |
|
471 |
else |
|
472 |
data_reg2 = 0; |
|
473 |
addr_reg = *args++; |
|
474 |
#if TARGET_LONG_BITS == 64 |
|
475 |
addr_reg2 = *args++; |
|
476 |
#endif |
|
477 |
mem_index = *args; |
|
478 |
s_bits = opc & 3; |
|
479 |
|
|
480 |
r0 = TCG_REG_EAX; |
|
481 |
r1 = TCG_REG_EDX; |
|
482 |
|
|
483 |
#if defined(CONFIG_SOFTMMU) |
|
484 |
tcg_out_mov(s, r1, addr_reg); |
|
485 |
|
|
486 |
tcg_out_mov(s, r0, addr_reg); |
|
487 |
|
|
488 |
tcg_out_modrm(s, 0xc1, 5, r1); /* shr $x, r1 */ |
|
489 |
tcg_out8(s, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS); |
|
490 |
|
|
491 |
tcg_out_modrm(s, 0x81, 4, r0); /* andl $x, r0 */ |
|
492 |
tcg_out32(s, TARGET_PAGE_MASK | ((1 << s_bits) - 1)); |
|
493 |
|
|
494 |
tcg_out_modrm(s, 0x81, 4, r1); /* andl $x, r1 */ |
|
495 |
tcg_out32(s, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS); |
|
496 |
|
|
497 |
tcg_out_opc(s, 0x8d); /* lea offset(r1, %ebp), r1 */ |
|
498 |
tcg_out8(s, 0x80 | (r1 << 3) | 0x04); |
|
499 |
tcg_out8(s, (5 << 3) | r1); |
|
500 |
tcg_out32(s, offsetof(CPUState, tlb_table[mem_index][0].addr_read)); |
|
501 |
|
|
502 |
/* cmp 0(r1), r0 */ |
|
503 |
tcg_out_modrm_offset(s, 0x3b, r0, r1, 0); |
|
504 |
|
|
505 |
tcg_out_mov(s, r0, addr_reg); |
|
506 |
|
|
507 |
#if TARGET_LONG_BITS == 32 |
|
508 |
/* je label1 */ |
|
509 |
tcg_out8(s, 0x70 + JCC_JE); |
|
510 |
label1_ptr = s->code_ptr; |
|
511 |
s->code_ptr++; |
|
512 |
#else |
|
513 |
/* jne label3 */ |
|
514 |
tcg_out8(s, 0x70 + JCC_JNE); |
|
515 |
label3_ptr = s->code_ptr; |
|
516 |
s->code_ptr++; |
|
517 |
|
|
518 |
/* cmp 4(r1), addr_reg2 */ |
|
519 |
tcg_out_modrm_offset(s, 0x3b, addr_reg2, r1, 4); |
|
520 |
|
|
521 |
/* je label1 */ |
|
522 |
tcg_out8(s, 0x70 + JCC_JE); |
|
523 |
label1_ptr = s->code_ptr; |
|
524 |
s->code_ptr++; |
|
525 |
|
|
526 |
/* label3: */ |
|
527 |
*label3_ptr = s->code_ptr - label3_ptr - 1; |
|
528 |
#endif |
|
529 |
|
|
530 |
/* XXX: move this slow-path code to the end of the TB */ |
|
531 |
#if TARGET_LONG_BITS == 32 |
|
532 |
tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_EDX, mem_index); |
|
533 |
#else |
|
534 |
tcg_out_mov(s, TCG_REG_EDX, addr_reg2); |
|
535 |
tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_ECX, mem_index); |
|
536 |
#endif |
|
537 |
tcg_out8(s, 0xe8); |
|
538 |
tcg_out32(s, (tcg_target_long)qemu_ld_helpers[s_bits] - |
|
539 |
(tcg_target_long)s->code_ptr - 4); |
|
540 |
|
|
541 |
switch(opc) { |
|
542 |
case 0 | 4: |
|
543 |
/* movsbl */ |
|
544 |
tcg_out_modrm(s, 0xbe | P_EXT, data_reg, TCG_REG_EAX); |
|
545 |
break; |
|
546 |
case 1 | 4: |
|
547 |
/* movswl */ |
|
548 |
tcg_out_modrm(s, 0xbf | P_EXT, data_reg, TCG_REG_EAX); |
|
549 |
break; |
|
550 |
case 0: |
|
551 |
case 1: |
|
552 |
case 2: |
|
553 |
default: |
|
554 |
tcg_out_mov(s, data_reg, TCG_REG_EAX); |
|
555 |
break; |
|
556 |
case 3: |
|
557 |
if (data_reg == TCG_REG_EDX) { |
|
558 |
tcg_out_opc(s, 0x90 + TCG_REG_EDX); /* xchg %edx, %eax */ |
|
559 |
tcg_out_mov(s, data_reg2, TCG_REG_EAX); |
|
560 |
} else { |
|
561 |
tcg_out_mov(s, data_reg, TCG_REG_EAX); |
|
562 |
tcg_out_mov(s, data_reg2, TCG_REG_EDX); |
|
563 |
} |
|
564 |
break; |
|
565 |
} |
|
566 |
|
|
567 |
/* jmp label2 */ |
|
568 |
tcg_out8(s, 0xeb); |
|
569 |
label2_ptr = s->code_ptr; |
|
570 |
s->code_ptr++; |
|
571 |
|
|
572 |
/* label1: */ |
|
573 |
*label1_ptr = s->code_ptr - label1_ptr - 1; |
|
574 |
|
|
575 |
/* add x(r1), r0 */ |
|
576 |
tcg_out_modrm_offset(s, 0x03, r0, r1, offsetof(CPUTLBEntry, addend) - |
|
577 |
offsetof(CPUTLBEntry, addr_read)); |
|
578 |
#else |
|
579 |
r0 = addr_reg; |
|
580 |
#endif |
|
581 |
|
|
582 |
#ifdef TARGET_WORDS_BIGENDIAN |
|
583 |
bswap = 1; |
|
584 |
#else |
|
585 |
bswap = 0; |
|
586 |
#endif |
|
587 |
switch(opc) { |
|
588 |
case 0: |
|
589 |
/* movzbl */ |
|
590 |
tcg_out_modrm_offset(s, 0xb6 | P_EXT, data_reg, r0, 0); |
|
591 |
break; |
|
592 |
case 0 | 4: |
|
593 |
/* movsbl */ |
|
594 |
tcg_out_modrm_offset(s, 0xbe | P_EXT, data_reg, r0, 0); |
|
595 |
break; |
|
596 |
case 1: |
|
597 |
/* movzwl */ |
|
598 |
tcg_out_modrm_offset(s, 0xb7 | P_EXT, data_reg, r0, 0); |
|
599 |
if (bswap) { |
|
600 |
/* rolw $8, data_reg */ |
|
601 |
tcg_out8(s, 0x66); |
|
602 |
tcg_out_modrm(s, 0xc1, 0, data_reg); |
|
603 |
tcg_out8(s, 8); |
|
604 |
} |
|
605 |
break; |
|
606 |
case 1 | 4: |
|
607 |
/* movswl */ |
|
608 |
tcg_out_modrm_offset(s, 0xbf | P_EXT, data_reg, r0, 0); |
|
609 |
if (bswap) { |
|
610 |
/* rolw $8, data_reg */ |
|
611 |
tcg_out8(s, 0x66); |
|
612 |
tcg_out_modrm(s, 0xc1, 0, data_reg); |
|
613 |
tcg_out8(s, 8); |
|
614 |
|
|
615 |
/* movswl data_reg, data_reg */ |
|
616 |
tcg_out_modrm(s, 0xbf | P_EXT, data_reg, data_reg); |
|
617 |
} |
|
618 |
break; |
|
619 |
case 2: |
|
620 |
/* movl (r0), data_reg */ |
|
621 |
tcg_out_modrm_offset(s, 0x8b, data_reg, r0, 0); |
|
622 |
if (bswap) { |
|
623 |
/* bswap */ |
|
624 |
tcg_out_opc(s, (0xc8 + data_reg) | P_EXT); |
|
625 |
} |
|
626 |
break; |
|
627 |
case 3: |
|
628 |
/* XXX: could be nicer */ |
|
629 |
if (r0 == data_reg) { |
|
630 |
r1 = TCG_REG_EDX; |
|
631 |
if (r1 == data_reg) |
|
632 |
r1 = TCG_REG_EAX; |
|
633 |
tcg_out_mov(s, r1, r0); |
|
634 |
r0 = r1; |
|
635 |
} |
|
636 |
if (!bswap) { |
|
637 |
tcg_out_modrm_offset(s, 0x8b, data_reg, r0, 0); |
|
638 |
tcg_out_modrm_offset(s, 0x8b, data_reg2, r0, 4); |
|
639 |
} else { |
|
640 |
tcg_out_modrm_offset(s, 0x8b, data_reg, r0, 4); |
|
641 |
tcg_out_opc(s, (0xc8 + data_reg) | P_EXT); |
|
642 |
|
|
643 |
tcg_out_modrm_offset(s, 0x8b, data_reg2, r0, 0); |
|
644 |
/* bswap */ |
|
645 |
tcg_out_opc(s, (0xc8 + data_reg2) | P_EXT); |
|
646 |
} |
|
647 |
break; |
|
648 |
default: |
|
649 |
tcg_abort(); |
|
650 |
} |
|
651 |
|
|
652 |
#if defined(CONFIG_SOFTMMU) |
|
653 |
/* label2: */ |
|
654 |
*label2_ptr = s->code_ptr - label2_ptr - 1; |
|
655 |
#endif |
|
656 |
} |
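The softmmu fast path in tcg_out_qemu_ld derives two values from the guest address with shr/andl: a byte offset into the TLB table (kept in r1) and a value to compare against the entry's stored address (kept in r0). The sketch below restates that arithmetic in plain C. The constants are assumptions chosen for illustration (4 KiB pages, 16-byte entries, 256 entries) and the function names are hypothetical. The real values come from the QEMU headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
enum {
    PAGE_BITS      = 12,   /* stands in for TARGET_PAGE_BITS */
    TLB_ENTRY_BITS = 4,    /* stands in for CPU_TLB_ENTRY_BITS */
    TLB_SIZE       = 256   /* stands in for CPU_TLB_SIZE */
};
#define PAGE_MASK (~(((uint32_t)1 << PAGE_BITS) - 1))

/* Byte offset of the TLB entry for addr: mirrors the shr and andl on r1. */
static uint32_t tlb_entry_offset(uint32_t addr)
{
    return (addr >> (PAGE_BITS - TLB_ENTRY_BITS)) &
           ((uint32_t)(TLB_SIZE - 1) << TLB_ENTRY_BITS);
}

/* Value compared with the entry's address field: mirrors the andl on r0.
   Keeping the low s_bits bits makes a misaligned access miss the
   page-aligned tag, so it falls through to the helper call. */
static uint32_t tlb_compare_value(uint32_t addr, int s_bits)
{
    assert(s_bits >= 0 && s_bits <= 3);
    return addr & (PAGE_MASK | (((uint32_t)1 << s_bits) - 1));
}
```

Under these assumed constants, address 0x12345678 selects entry 0x45 (byte offset 0x450), and an aligned 32-bit access compares 0x12345000 against the stored tag.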
|
657 |
|
|
658 |
|
|
659 |
static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, |
|
660 |
int opc) |
|
661 |
{ |
|
662 |
int addr_reg, data_reg, data_reg2, r0, r1, mem_index, s_bits, bswap; |
|
663 |
#if defined(CONFIG_SOFTMMU) |
|
664 |
uint8_t *label1_ptr, *label2_ptr; |
|
665 |
#endif |
|
666 |
#if TARGET_LONG_BITS == 64 |
|
667 |
#if defined(CONFIG_SOFTMMU) |
|
668 |
uint8_t *label3_ptr; |
|
669 |
#endif |
|
670 |
int addr_reg2; |
|
671 |
#endif |
|
672 |
|
|
673 |
data_reg = *args++; |
|
674 |
if (opc == 3) |
|
675 |
data_reg2 = *args++; |
|
676 |
else |
|
677 |
data_reg2 = 0; |
|
678 |
addr_reg = *args++; |
|
679 |
#if TARGET_LONG_BITS == 64 |
|
680 |
addr_reg2 = *args++; |
|
681 |
#endif |
|
682 |
mem_index = *args; |
|
683 |
|
|
684 |
s_bits = opc; |
|
685 |
|
|
686 |
r0 = TCG_REG_EAX; |
|
687 |
r1 = TCG_REG_EDX; |
|
688 |
|
|
689 |
#if defined(CONFIG_SOFTMMU) |
|
690 |
tcg_out_mov(s, r1, addr_reg); |
|
691 |
|
|
692 |
tcg_out_mov(s, r0, addr_reg); |
|
693 |
|
|
694 |
tcg_out_modrm(s, 0xc1, 5, r1); /* shr $x, r1 */ |
|
695 |
tcg_out8(s, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS); |
|
696 |
|
|
697 |
tcg_out_modrm(s, 0x81, 4, r0); /* andl $x, r0 */ |
|
698 |
tcg_out32(s, TARGET_PAGE_MASK | ((1 << s_bits) - 1)); |
|
699 |
|
|
700 |
tcg_out_modrm(s, 0x81, 4, r1); /* andl $x, r1 */ |
|
701 |
tcg_out32(s, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS); |
|
702 |
|
|
703 |
tcg_out_opc(s, 0x8d); /* lea offset(r1, %ebp), r1 */ |
|
704 |
tcg_out8(s, 0x80 | (r1 << 3) | 0x04); |
|
705 |
tcg_out8(s, (5 << 3) | r1); |
|
706 |
tcg_out32(s, offsetof(CPUState, tlb_table[mem_index][0].addr_write)); |
|
707 |
|
|
708 |
/* cmp 0(r1), r0 */ |
|
709 |
tcg_out_modrm_offset(s, 0x3b, r0, r1, 0); |
|
710 |
|
|
711 |
tcg_out_mov(s, r0, addr_reg); |
|
712 |
|
|
713 |
#if TARGET_LONG_BITS == 32 |
|
714 |
/* je label1 */ |
|
715 |
tcg_out8(s, 0x70 + JCC_JE); |
|
716 |
label1_ptr = s->code_ptr; |
|
717 |
s->code_ptr++; |
|
718 |
#else |
|
719 |
/* jne label3 */ |
|
720 |
tcg_out8(s, 0x70 + JCC_JNE); |
|
721 |
label3_ptr = s->code_ptr; |
|
722 |
s->code_ptr++; |
|
723 |
|
|
724 |
/* cmp 4(r1), addr_reg2 */ |
|
725 |
tcg_out_modrm_offset(s, 0x3b, addr_reg2, r1, 4); |
|
726 |
|
|
727 |
/* je label1 */ |
|
728 |
tcg_out8(s, 0x70 + JCC_JE); |
|
729 |
label1_ptr = s->code_ptr; |
|
730 |
s->code_ptr++; |
|
731 |
|
|
732 |
/* label3: */ |
|
733 |
*label3_ptr = s->code_ptr - label3_ptr - 1; |
|
734 |
#endif |
|
735 |
|
|
736 |
/* XXX: move this slow-path code to the end of the TB */ |
|
737 |
#if TARGET_LONG_BITS == 32 |
|
738 |
if (opc == 3) { |
|
739 |
tcg_out_mov(s, TCG_REG_EDX, data_reg); |
|
740 |
tcg_out_mov(s, TCG_REG_ECX, data_reg2); |
|
741 |
tcg_out8(s, 0x6a); /* push Ib */ |
|
742 |
tcg_out8(s, mem_index); |
|
743 |
tcg_out8(s, 0xe8); |
|
744 |
tcg_out32(s, (tcg_target_long)qemu_st_helpers[s_bits] - |
|
745 |
(tcg_target_long)s->code_ptr - 4); |
|
746 |
tcg_out_addi(s, TCG_REG_ESP, 4); |
|
747 |
} else { |
|
748 |
switch(opc) { |
|
749 |
case 0: |
|
750 |
/* movzbl */ |
|
751 |
tcg_out_modrm(s, 0xb6 | P_EXT, TCG_REG_EDX, data_reg); |
|
752 |
break; |
|
753 |
case 1: |
|
754 |
/* movzwl */ |
|
755 |
tcg_out_modrm(s, 0xb7 | P_EXT, TCG_REG_EDX, data_reg); |
|
756 |
break; |
|
757 |
case 2: |
|
758 |
tcg_out_mov(s, TCG_REG_EDX, data_reg); |
|
759 |
break; |
|
760 |
} |
|
761 |
tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_ECX, mem_index); |
|
762 |
tcg_out8(s, 0xe8); |
|
763 |
tcg_out32(s, (tcg_target_long)qemu_st_helpers[s_bits] - |
|
764 |
(tcg_target_long)s->code_ptr - 4); |
|
765 |
} |
|
766 |
#else |
|
767 |
if (opc == 3) { |
|
768 |
tcg_out_mov(s, TCG_REG_EDX, addr_reg2); |
|
769 |
tcg_out8(s, 0x6a); /* push Ib */ |
|
770 |
tcg_out8(s, mem_index); |
|
771 |
tcg_out_opc(s, 0x50 + data_reg2); /* push */ |
|
772 |
tcg_out_opc(s, 0x50 + data_reg); /* push */ |
|
773 |
tcg_out8(s, 0xe8); |
|
774 |
tcg_out32(s, (tcg_target_long)qemu_st_helpers[s_bits] - |
|
775 |
(tcg_target_long)s->code_ptr - 4); |
|
776 |
tcg_out_addi(s, TCG_REG_ESP, 12); |
|
777 |
} else { |
|
778 |
tcg_out_mov(s, TCG_REG_EDX, addr_reg2); |
|
779 |
switch(opc) { |
|
780 |
case 0: |
|
781 |
/* movzbl */ |
|
782 |
tcg_out_modrm(s, 0xb6 | P_EXT, TCG_REG_ECX, data_reg); |
|
783 |
break; |
|
784 |
case 1: |
|
785 |
/* movzwl */ |
|
786 |
tcg_out_modrm(s, 0xb7 | P_EXT, TCG_REG_ECX, data_reg); |
|
787 |
break; |
|
788 |
case 2: |
|
789 |
tcg_out_mov(s, TCG_REG_ECX, data_reg); |
|
790 |
break; |
|
791 |
} |
|
792 |
tcg_out8(s, 0x6a); /* push Ib */ |
|
793 |
tcg_out8(s, mem_index); |
|
794 |
tcg_out8(s, 0xe8); |
|
795 |
tcg_out32(s, (tcg_target_long)qemu_st_helpers[s_bits] - |
|
796 |
(tcg_target_long)s->code_ptr - 4); |
|
797 |
tcg_out_addi(s, TCG_REG_ESP, 4); |
|
798 |
} |
|
799 |
#endif |
|
800 |
|
|
801 |
/* jmp label2 */ |
|
802 |
tcg_out8(s, 0xeb); |
|
803 |
label2_ptr = s->code_ptr; |
|
804 |
s->code_ptr++; |
|
805 |
|
|
806 |
/* label1: */ |
|
807 |
*label1_ptr = s->code_ptr - label1_ptr - 1; |
|
808 |
|
|
809 |
/* add x(r1), r0 */ |
|
810 |
tcg_out_modrm_offset(s, 0x03, r0, r1, offsetof(CPUTLBEntry, addend) - |
|
811 |
offsetof(CPUTLBEntry, addr_write)); |
|
812 |
#else |
|
813 |
r0 = addr_reg; |
|
814 |
#endif |
|
815 |
|
|
816 |
#ifdef TARGET_WORDS_BIGENDIAN |
|
817 |
bswap = 1; |
|
818 |
#else |
|
819 |
bswap = 0; |
|
820 |
#endif |
|
821 |
switch(opc) { |
|
822 |
case 0: |
|
823 |
/* movb */ |
|
824 |
tcg_out_modrm_offset(s, 0x88, data_reg, r0, 0); |
|
825 |
break; |
|
826 |
case 1: |
|
827 |
if (bswap) { |
|
828 |
tcg_out_mov(s, r1, data_reg); |
|
829 |
tcg_out8(s, 0x66); /* rolw $8, r1 */ |
|
830 |
tcg_out_modrm(s, 0xc1, 0, r1); |
|
831 |
tcg_out8(s, 8); |
|
832 |
data_reg = r1; |
|
833 |
} |
|
834 |
/* movw */ |
|
835 |
tcg_out8(s, 0x66); |
|
836 |
tcg_out_modrm_offset(s, 0x89, data_reg, r0, 0); |
|
837 |
break; |
|
838 |
case 2: |
|
839 |
if (bswap) { |
|
840 |
tcg_out_mov(s, r1, data_reg); |
|
841 |
/* bswap r1 (copy of data_reg) */ |
|
842 |
tcg_out_opc(s, (0xc8 + r1) | P_EXT); |
|
843 |
data_reg = r1; |
|
844 |
} |
|
845 |
/* movl */ |
|
846 |
tcg_out_modrm_offset(s, 0x89, data_reg, r0, 0); |
|
847 |
break; |
|
848 |
case 3: |
|
849 |
if (bswap) { |
|
850 |
tcg_out_mov(s, r1, data_reg2); |
|
851 |
/* bswap r1 (copy of data_reg2) */ |
|
852 |
tcg_out_opc(s, (0xc8 + r1) | P_EXT); |
|
853 |
tcg_out_modrm_offset(s, 0x89, r1, r0, 0); |
|
854 |
tcg_out_mov(s, r1, data_reg); |
|
855 |
/* bswap r1 (copy of data_reg) */ |
|
856 |
tcg_out_opc(s, (0xc8 + r1) | P_EXT); |
|
857 |
tcg_out_modrm_offset(s, 0x89, r1, r0, 4); |
|
858 |
} else { |
|
859 |
tcg_out_modrm_offset(s, 0x89, data_reg, r0, 0); |
|
860 |
tcg_out_modrm_offset(s, 0x89, data_reg2, r0, 4); |
|
861 |
} |
|
862 |
break; |
|
863 |
default: |
|
864 |
tcg_abort(); |
|
865 |
} |
|
866 |
|
|
867 |
#if defined(CONFIG_SOFTMMU) |
|
868 |
/* label2: */ |
|
869 |
*label2_ptr = s->code_ptr - label2_ptr - 1; |
|
870 |
#endif |
|
871 |
} |
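Both tcg_out_qemu_ld and tcg_out_qemu_st emit short forward jumps by writing the opcode, remembering where the one-byte displacement lives, and filling it in once the target is known (`*label_ptr = s->code_ptr - label_ptr - 1`). The sketch below isolates that pattern. `Emitter`, `emit8`, `emit_jmp_short_placeholder` and `patch_short_jump` are hypothetical names, not part of TCG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the code buffer part of TCGContext. */
typedef struct {
    uint8_t buf[64];
    uint8_t *ptr;   /* next byte to write, like s->code_ptr */
} Emitter;

static void emit8(Emitter *e, uint8_t v)
{
    assert(e->ptr < e->buf + sizeof(e->buf));
    *e->ptr++ = v;
}

/* Emit "jmp rel8" with a dummy displacement and return its location. */
static uint8_t *emit_jmp_short_placeholder(Emitter *e)
{
    uint8_t *disp;

    emit8(e, 0xeb);   /* jmp rel8 opcode */
    disp = e->ptr;
    emit8(e, 0);      /* placeholder, patched later */
    return disp;
}

/* Make the jump land at the current output position. The displacement
   is relative to the byte following the displacement byte, hence the -1. */
static void patch_short_jump(Emitter *e, uint8_t *disp)
{
    ptrdiff_t d = e->ptr - disp - 1;

    assert(d >= 0 && d <= 127);   /* must fit a forward rel8 */
    *disp = (uint8_t)d;
}
```

The range check is an addition for the sketch. The listing above stores the difference directly and relies on the emitted sequences being short.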
|
872 |
|
|
873 |
static inline void tcg_out_op(TCGContext *s, int opc, |
|
874 |
const TCGArg *args, const int *const_args) |
|
875 |
{ |
|
876 |
int c; |
|
877 |
|
|
878 |
switch(opc) { |
|
879 |
case INDEX_op_exit_tb: |
|
880 |
tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_EAX, args[0]); |
|
881 |
tcg_out8(s, 0xc3); /* ret */ |
|
882 |
break; |
|
883 |
case INDEX_op_goto_tb: |
|
884 |
if (s->tb_jmp_offset) { |
|
885 |
/* direct jump method */ |
|
886 |
tcg_out8(s, 0xe9); /* jmp im */ |
|
887 |
s->tb_jmp_offset[args[0]] = s->code_ptr - s->code_buf; |
|
888 |
tcg_out32(s, 0); |
|
889 |
} else { |
|
890 |
/* indirect jump method */ |
|
891 |
/* jmp Ev */ |
|
892 |
tcg_out_modrm_offset(s, 0xff, 4, -1, |
|
893 |
(tcg_target_long)(s->tb_next + args[0])); |
|
894 |
} |
|
895 |
s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf; |
|
896 |
break; |
|
897 |
case INDEX_op_call: |
|
898 |
if (const_args[0]) { |
|
899 |
tcg_out8(s, 0xe8); |
|
900 |
tcg_out32(s, args[0] - (tcg_target_long)s->code_ptr - 4); |
|
901 |
} else { |
|
902 |
tcg_out_modrm(s, 0xff, 2, args[0]); |
|
903 |
} |
|
904 |
break; |
|
905 |
case INDEX_op_jmp: |
|
906 |
if (const_args[0]) { |
|
907 |
tcg_out8(s, 0xe9); |
|
908 |
tcg_out32(s, args[0] - (tcg_target_long)s->code_ptr - 4); |
|
909 |
} else { |
|
910 |
tcg_out_modrm(s, 0xff, 4, args[0]); |
|
911 |
} |
|
912 |
break; |
|
913 |
case INDEX_op_br: |
|
914 |
tcg_out_jxx(s, JCC_JMP, args[0]); |
|
915 |
break; |
|
916 |
case INDEX_op_movi_i32: |
|
917 |
tcg_out_movi(s, TCG_TYPE_I32, args[0], args[1]); |
|
918 |
break; |
|
919 |
case INDEX_op_ld8u_i32: |
|
920 |
/* movzbl */ |
|
921 |
tcg_out_modrm_offset(s, 0xb6 | P_EXT, args[0], args[1], args[2]); |
|
922 |
break; |
|
923 |
case INDEX_op_ld8s_i32: |
|
924 |
/* movsbl */ |
|
925 |
tcg_out_modrm_offset(s, 0xbe | P_EXT, args[0], args[1], args[2]); |
|
926 |
break; |
|
927 |
case INDEX_op_ld16u_i32: |
|
928 |
/* movzwl */ |
|
929 |
tcg_out_modrm_offset(s, 0xb7 | P_EXT, args[0], args[1], args[2]); |
|
930 |
break; |
|
931 |
case INDEX_op_ld16s_i32: |
|
932 |
/* movswl */ |
|
933 |
tcg_out_modrm_offset(s, 0xbf | P_EXT, args[0], args[1], args[2]); |
|
934 |
break; |
|
935 |
case INDEX_op_ld_i32: |
|
936 |
/* movl */ |
|
937 |
tcg_out_modrm_offset(s, 0x8b, args[0], args[1], args[2]); |
|
938 |
break; |
|
939 |
case INDEX_op_st8_i32: |
|
940 |
/* movb */ |
|
941 |
tcg_out_modrm_offset(s, 0x88, args[0], args[1], args[2]); |
|
942 |
break; |
|
943 |
case INDEX_op_st16_i32: |
|
944 |
/* movw */ |
|
945 |
tcg_out8(s, 0x66); |
|
946 |
tcg_out_modrm_offset(s, 0x89, args[0], args[1], args[2]); |
|
947 |
break; |
|
948 |
case INDEX_op_st_i32: |
|
949 |
/* movl */ |
|
950 |
tcg_out_modrm_offset(s, 0x89, args[0], args[1], args[2]); |
|
951 |
break; |
|
952 |
case INDEX_op_sub_i32: |
|
953 |
c = ARITH_SUB; |
|
954 |
goto gen_arith; |
|
955 |
case INDEX_op_and_i32: |
|
956 |
c = ARITH_AND; |
|
957 |
goto gen_arith; |
|
958 |
case INDEX_op_or_i32: |
|
959 |
c = ARITH_OR; |
|
960 |
goto gen_arith; |
|
961 |
case INDEX_op_xor_i32: |
|
962 |
c = ARITH_XOR; |
|
963 |
goto gen_arith; |
|
964 |
case INDEX_op_add_i32: |
|
965 |
c = ARITH_ADD; |
|
966 |
gen_arith: |
|
967 |
if (const_args[2]) { |
|
968 |
tgen_arithi(s, c, args[0], args[2]); |
|
969 |
} else { |
|
970 |
tcg_out_modrm(s, 0x01 | (c << 3), args[2], args[0]); |
|
971 |
} |
|
972 |
break; |
|
973 |
case INDEX_op_mul_i32: |
|
974 |
if (const_args[2]) { |
|
975 |
int32_t val; |
|
976 |
val = args[2]; |
|
977 |
if (val == (int8_t)val) { |
|
978 |
tcg_out_modrm(s, 0x6b, args[0], args[0]); |
|
979 |
tcg_out8(s, val); |
|
980 |
} else { |
|
981 |
tcg_out_modrm(s, 0x69, args[0], args[0]); |
|
982 |
tcg_out32(s, val); |
|
983 |
} |
|
984 |
} else { |
|
985 |
tcg_out_modrm(s, 0xaf | P_EXT, args[0], args[2]); |
|
986 |
} |
|
987 |
break; |
|
988 |
case INDEX_op_mulu2_i32: |
|
989 |
tcg_out_modrm(s, 0xf7, 4, args[3]); |
|
990 |
break; |
|
991 |
case INDEX_op_div2_i32: |
|
992 |
tcg_out_modrm(s, 0xf7, 7, args[4]); |
|
993 |
break; |
|
994 |
case INDEX_op_divu2_i32: |
|
995 |
tcg_out_modrm(s, 0xf7, 6, args[4]); |
|
996 |
break; |
|
997 |
case INDEX_op_shl_i32: |
|
998 |
c = SHIFT_SHL; |
|
999 |
gen_shift32: |
|
1000 |
if (const_args[2]) { |
|
1001 |
if (args[2] == 1) { |
|
1002 |
tcg_out_modrm(s, 0xd1, c, args[0]); |
|
1003 |
} else { |
|
1004 |
tcg_out_modrm(s, 0xc1, c, args[0]); |
|
1005 |
tcg_out8(s, args[2]); |
|
1006 |
} |
|
1007 |
} else { |
|
1008 |
tcg_out_modrm(s, 0xd3, c, args[0]); |
|
1009 |
} |
|
1010 |
break; |
|
1011 |
case INDEX_op_shr_i32: |
|
1012 |
c = SHIFT_SHR; |
|
1013 |
goto gen_shift32; |