tcg: Fix typo in comment (dependancies -> dependencies)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
tcg/i386: Fix build for systems without working cpuid.h (MacOSX, Win32)
Win32 doesn't have a cpuid.h, and MacOSX may have one but without the __cpuid() function we use, which means that commit 9d2eec20 broke the build for those platforms. Fix this by tightening up...
tcg/optimize: Handle known-zeros masks for ANDC
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg/optimize: Simplify some logical ops to NOT
Given, of course, an appropriate constant. These could be generated from the "canonical" operation for inversion on the guest, or via other optimizations.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
...
tcg/optimize: Optimize ANDC X,Y,Y to MOV X,0
Like we already do for SUB and XOR.
tcg/optimize: Add more identity simplifications
Recognize 0 operand to andc, and -1 operands to and, orc, eqv.
tcg/i386: Move TCG_CT_CONST_* to tcg-target.c
These are not needed by users of tcg-target.h. No need to recompile when we adjust them.
tcg/i386: Add tcg_out_vex_modrm
Prepare for emitting BMI insns which require VEX encoding.
tcg/i386: Use ANDN instruction
Note that the optimizer cannot simplify ANDC X,Y,C to AND X,Y,~C so we must handle constants in the implementation of andc.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg/i386: Use SHLX/SHRX/SARX instructions
These three-operand shift instructions do not require the shift count to be placed into ECX. This reduces the number of mov insns required, with the mere addition of a new register constraint.
Don't attempt to get rid of the matching constraint, as that's impossible...
tcg/optimize: fix known-zero bits for right shift ops
32-bit versions of sar and shr ops should not propagate known-zero bits from the unused 32 high bits. For sar it could even lead to wrong code being generated.
Cc: qemu-stable@nongnu.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
...
tcg/optimize: fix known-zero bits optimization
Known-zero bits optimization is a great idea that helps to generate more optimized code. However the current implementation only works in very few cases as the computed mask is not saved.
Fix this to make it really work....
tcg/optimize: improve known-zero bits for 32-bit ops
The shl_i32 op might set some bits of the unused 32 high bits of the mask. Fix that by clearing the unused 32 high bits for all 32-bit ops except load/store which operate on tl values.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>...
tcg/optimize: add known-zero bits compute for load ops
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg-arm: The shift count of op_rotl_i32 is in args2 not args1.
It's this that should be subtracted from 0x20 when converting to a right rotate.
Cc: qemu-stable@nongnu.org
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
TCG: Fix 32-bit host allocation typo
The second half register of a 64-bit temp on a 32-bit host was allocated with the wrong base_type.
The base_type of the second half register is never checked, but for consistency it should be the same as the first half....
tcg: Add TCGV_UNUSED_PTR, TCGV_IS_UNUSED_PTR, TCGV_EQUAL_PTR
We have macros for marking TCGv values as unused, checking if they are unused and comparing them to each other. However these only exist for TCGv_i32 and TCGv_i64; add them for TCGv_ptr as well....
tcg/s390: Remove sigill_handler
Commit c9baa30f42a87f61627391698f63fa4d1566d9d8 failed to delete all of the relevant code, leading to Werrors about unused symbols.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Merge remote-tracking branch 'rth/tcg-movbe' into staging
TCG: Fix I64-on-32bit-host temporaries
We have cache pools of temporaries that we can reuse later when they've already been allocated before.
These cache pools differentiate between the target TCG variable type they contain. So we have one pool for I32 and one pool for I64 variables....
tcg/i386: cleanup useless #ifdef
TCG_TARGET_HAS_movcond_i32 is always defined to 1 in tcg-target.h, so remove the corresponding #ifdef #endif sequence, left from a previous refactoring.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg/i386: use movbe instruction in qemu_ldst routines
The movbe instruction has been added on some Intel Atom CPUs and on recent Intel Haswell CPUs. It allows loading/storing a value and bswapping it at the same time.
This patch detects the availability of this instruction and when available...
tcg/i386: add support for three-byte opcodes
Add support for three-byte opcodes, starting with the 0x0f 0x38 prefix. Use P_EXT38 as the new constant, and shift all other constants so that P_EXT and P_EXT38 have neighbouring values.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>...
tcg/i386: remove hardcoded P_REXW value
P_REXW is defined as a constant at the beginning of i386/tcg-target.c, but the corresponding bit is later used in a hardcoded way, which defeats the purpose of a constant.
Fix that by using a conditional expression operator instead of a shift....
tcg/i386: fix a comment
The comments apply to 8-bit stores, not 8-byte stores.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
tcg: Use bitmaps for free temporaries
We previously allocated 32-bits per temp for the next_free_temp entry. We now allocate 4 bits per temp across the 4 bitmaps.
Using a linked list meant that if a translator is tweaked, resulting in temps being freed in a different order, that would have follow-on effects...
tcg-s390: Use qemu_getauxval in query_facilities
No need to set up a SIGILL signal handler for detection anymore.
Remove a ton of sanity checks that must be true, given that we're requiring a 64-bit build (the note about 31-bit KVM is satisfied by configuring with TCI)....
tcg-arm: Use qemu_getauxval
Allow host detection on linux systems without glibc 2.16 or later.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg-ppc64: Use qemu_getauxval
tcg-ia64: Introduce tcg_opc_bswap64_i
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg-ia64: Introduce tcg_opc_ext_i
Being able to "extend" from 64-bits (with a mov) simplifies a few places where the conditional breaks the train of thought.
tcg-ia64: Introduce tcg_opc_movi_a
tcg-ia64: Introduce tcg_opc_mov_a
tcg-ia64: Use A3 form of logical operations
We can and/or/xor/andcm small constants, saving one cycle.
tcg-ia64: Use SUB_A3 and ADDS_A4 for subtraction
We can subtract from more small constants than just 0 with one insn, and we can add the negative for most small constants.
tcg-ia64: Use ADDS for small addition
Avoids a wasted cycle loading up small constants.
Simplify the code assuming the tcg optimizer is going to work and don't expect the first operand of the add to be constant.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>...
tcg-ia64: Avoid unnecessary stop bit in tcg_out_alu
When performing an operation with two input registers, we'd leave the stop bit (and thus an extra cycle) that's only needed when one or the other input is a constant.
tcg-ia64: Move AREG0 to R32
Since the move away from the global areg0, we're no longer globally reserving areg0. Which means our use of R7 clobbers a call-saved register. Shift areg0 into the windowed registers. Indeed, choose the incoming parameter register that it comes to us by....
tcg-ia64: Simplify brcond
There was a misconception that a stop bit is required between a compare and the branch that uses the predicate set by the compare. This led to the usage of an extra bundle in which to perform the compare. The extra bundle left room for constants to be loaded for use with the compare insn....
tcg-ia64: Handle constant calls
Using only indirect calls results in 3 bundles (one to load the descriptor address), and 4 stop bits. By looking through the descriptor to the constants, we can perform the call with 2 bundles and only 1 stop bit.
tcg-ia64: Use shortcuts for nop insns
There's no need to go through the full opcode-to-insn function call to generate nops. This makes the source a bit more readable.
tcg-ia64: Use TCGMemOp within qemu_ldst routines
tcg-arm: Tidy variable naming convention in qemu_ld/st
s/addr_reg2/addrhi/
s/addr_reg/addrlo/
s/data_reg2/datahi/
s/data_reg/datalo/
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg-arm: Convert to new ldst opcodes
tcg-arm: Improve GUEST_BASE qemu_ld/st
If we pull the code to emit the actual load/store into a subroutine, we can share the reg+reg addressing mode code between softmmu and usermode. This lets us load GUEST_BASE into a temporary register rather than attempting to add it piece-wise to the address....
tcg-ppc: Use TCGMemOp within qemu_ldst routines
tcg-ppc64: Use TCGMemOp within qemu_ldst routines
tcg-ppc: Convert to le/be ldst helpers
tcg-ppc64: Convert to le/be ldst helpers
tcg-ppc: Support new ldst opcodes
tcg-ppc64: Support new ldst opcodes
tcg: Use TCGMemOp for TCGLabelQemuLdst.opc
tcg-i386: Use TCGMemOp within qemu_ldst routines
Step one in the transition, with constants passed down from tcg_out_op.
tcg-i386: Tidy softmmu routines
Pass two TCGReg to tcg_out_tlb_load, rather than idx+args.
Move ldst_optimization routines just below tcg_out_tlb_load to avoid the need for forward declarations.
Use TCGReg enum in preference to int where appropriate.
tcg-i386: Remove "cb" output restriction from qemu_st8 for i386
Once we form a combined qemu_st_i32 opcode, we won't be able to have separate constraints based on size. This one is fairly easy to work around, since eax is available as a scratch register....
tcg-i386: Support new ldst opcodes
No support for helpers with non-default endianness yet, but good enough to test the opcodes.
tcg-arm: Use TCGMemOp within qemu_ldst routines
tcg-arm: Convert to le/be ldst helpers
tcg: Add qemu_ld_st_i32/64
Step two in the transition, adding the new ldst opcodes. Keep the old opcodes around until all backends support the new opcodes.
exec: Add both big- and little-endian memory helpers
Step three in the transition: helpers not tied to the target "default" endianness. To be used when the guest uses a memory operation with non-default endianness.
tcg: Add TCGMemOp
tcg: Add tcg-be-null.h
This is a no-op backend data implementation, for those targets that are not currently using the load/store optimization path.
This is preparatory to always requiring these functions in all backends.
tcg: Add tcg-be-ldst.h
Move TCGLabelQemuLdst and related stuff out of tcg.h.
tcg: Put target helper data into an array.
One call inside of a loop to tcg_register_helper instead of hundreds of sequential calls.
Presumably more icache and branch prediction friendly; resulting binary size mostly unchanged on x86_64, as we're trading 32-bit rip-relative...
tcg: Add tcg-runtime.c helpers to all_helpers
For the few targets that actually use these, we'd not report them symbolically in the tcg opcode logs.
tcg: Merge tcg_register_helper into tcg_context_init
Eliminates the repeated checks for having created the s->helpers hash table.
tcg-aarch64: Update to helper_ret_*_mmu routines
A minimal update to use the new helpers with the return address argument.
Tested-by: Claudio Fontana <claudio.fontana@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg: Move helper registration into tcg_context_init
No longer needs to be done on a per-target basis.
tcg: Use a GHashTable for tcg_find_helper
Slightly changes the interface, in that we now return name instead of a TCGHelperInfo structure, which goes away.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg: Delete tcg_helper_get_name declaration
The function was deleted in 4dc81f2822187f4503d4bdb76785cafa5b28db0b.
tcg-hppa: Remove tcg backend
Merge remote-tracking branch 'rth/tcg-arm-pull' into staging
Merge remote-tracking branch 'sweil/tci' into staging
Message-id: 1380137693-3729-1-git-send-email-sw@weilnetz.de...
tcg-arm: Use ldrd/strd for appropriate qemu_ld/st64
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg-arm: Rearrange slow-path qemu_ld/st
Use the new helper_ret_*_mmu routines. Use a conditional call to arrange for a tail-call from the store path, and to load the return address for the helper for the load path.
tcg-arm: Use strd for tcg_out_arg_reg64
tcg-arm: Use QEMU_BUILD_BUG_ON to verify constraints on tlb
One of the two constraints we already checked via #if, but the tlb offset distance was only checked at runtime.
tcg-arm: Move load of tlb addend into tcg_out_tlb_read
This allows us to make more intelligent decisions about the relative offsets of the tlb comparator and the addend, avoiding any need of writeback addressing.
tcg-arm: Return register containing tlb addend
Preparatory to rescheduling the tlb load, and changing said register. Continues to use R1 for now.
tcg-arm: Remove restriction on qemu_ld output register
The main intent of the patch is to allow the tlb addend register to be changed, without tying that change to the constraint. But the most common side-effect seems to be to enable usage of ldrd with the r0,r1 pair....
tcg-arm: Move the tlb addend load earlier
There are free scheduling slots between the sequence of comparison instructions. This requires changing the register in use to avoid conflict with those compares.
misc: Use new rotate functions
Signed-off-by: Stefan Weil <sw@weilnetz.de>
tci: Add implementation of rotl_i64, rotr_i64
It is used by qemu-ppc64 when running Debian's busybox-static.
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
tcg-ppc64: Implement CONFIG_QEMU_LDST_OPTIMIZATION
tcg-ppc64: More use of TAI and SAI helper macros
Finish conversion of all memory operations.
tcg-ppc64: Use TCG_REG_Rn constants
Instead of bare N, for clarity. The only (intentional) exception made is for insns that encode R|0, i.e. when R0 encoded into the insn is interpreted as zero not the contents of the register.
tcg-ppc64: Use tcg_out64
tcg-ppc64: Avoid code for nop move
While these are rare from code that's been through the optimizer, it's not uncommon within the tcg backend.
tcg-ppc64: Don't load the static chain from TCG
There are no helpers that require the static chain.
tcg-ppc64: Fold constant call address into descriptor load
Eliminates one insn per call:
 :	lis	r2,4165
 :	ori	r2,r2,59616
-:	ld	r0,0(r2)
+:	ld	r0,-5920(r2)
 :	mtctr	r0
-:	ld	r2,8(r2)
+:	ld	r2,-5912(r2)
 :	bctrl
tcg-ppc64: Look through a constant function descriptor
Especially in the user-only configurations, a direct branch into the executable may be in range.
tcg-ppc64: Tidy register allocation order
Remove conditionalization from tcg_target_reg_alloc_order, relying onreserved_regs to prevent register allocation that shouldn't happen.So R11 is now present in reg_alloc_order for APPLE, but also nowreserved....
tcg-ppc64: Handle long offsets better
Previously we'd only handle 16-bit offsets from memory operand without falling back to indexed, but it's easy to use ADDIS to handle full 32-bit offsets.
This also lets us unify code that existed inline in tcg_out_op for handling...
tcg-ppc64: Implement tcg_register_jit
tcg-ppc64: Streamline tcg_out_tlb_read
Less conditional compilation. Merge an add insn with the indexed memory load insn. Load the tlb addend earlier. Avoid the address update memory form.
Fix a bug in not allowing large enough tlb offsets for some guests....
tcg-ppc64: Add _noaddr functions for emitting forward branches
... rather than open-coding this stuff through the file.
tcg-ppc: Cleanup tcg_out_qemu_ld/st_slow_path
Coding style fixes. Use TCGReg enumeration values instead of raw numbers. Don't needlessly pull the whole TCGLabelQemuLdst struct into local variables. Less conditional compilation.
No functional changes....
tcg-ppc: Use conditional branch and link to slow path
Saves one insn per slow path. Note that we can no longer use a tail call into the store helper.
tcg-ppc: Fix and cleanup tcg_out_tlb_check
The fix is that sparc has so many mmu modes that the last one overflowed the 16-bit signed offset we assumed would fit. Handle this, and check the new assumption at compile time.
Load the tlb addend earlier for the fast path....
tcg-ppc64: Reformat tcg-target.c
Whitespace and brace changes only.