target-i386: add AES-NI instructions
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
target-i386: add pclmulqdq instruction
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
target-i386: SSE4.1: fix pinsrb instruction
gen_op_mov_TN_reg() loads the value in cpu_T0, so this temporary should be used instead of cpu_tmp0.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
target-i386: Fix flags computation for ADOX
When starting from CC_OP_DYNAMIC, and issuing adox before adcx, a typo used the wrong value for the resulting CC_OP.
Cc: Blue Swirl <blauwirbel@gmail.com>
Reported-by: Torbjorn Granlund <tg@gmplib.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>...
Fix typos and misspellings
Fix various typos and misspellings. The bulk of these were found with codespell.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
gen-icount.h: Rename gen_icount_start/end to gen_tb_start/end
The gen_icount_start/end functions are now somewhat misnamed since they are useful for generic "start/end of TB" code, used for more than just icount. Rename them to gen_tb_start/end.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>...
target-i386: Use mulu2 and muls2
These correspond very closely to the insns that we're emulating.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
target-i386: Use add2 to implement the ADX extension
target-i386: Use movcond to implement shift flags.
With this being all straight-line code, it can get deleted when the cc variables die.
Signed-off-by: Richard Henderson <rth@twiddle.net>
target-i386: Use movcond to implement rotate flags.
target-i386: Discard CC_OP computation in set_cc_op also
The shift and rotate insns use movcond to set CC_OP, and thus achieve a conditional EFLAGS setting. By discarding CC_OP in a later flags setting insn, we can discard that movcond.
target-i386: Use movcond to implement shiftd.
target-i386: Implement ADX extension
target-i386: Implement tzcnt and fix lzcnt
We weren't computing flags for lzcnt at all. At the same time, adjust the implementation of bsf/bsr to avoid the local branch, using movcond instead.
target-i386: Add CC_OP_CLR
Special case xor with self. We need not even store the known zero into cc_src.
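The reason a dedicated CC_OP works here is that the flags after "xor r, r" do not depend on any run-time value. The following is a minimal sketch of that observation, not QEMU's code: the flag masks are the architectural EFLAGS bit values, while `logic_flags32` and `clr_flags` are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Architectural EFLAGS bit masks. */
#define CC_C 0x0001u
#define CC_P 0x0004u
#define CC_Z 0x0040u
#define CC_S 0x0080u
#define CC_O 0x0800u

/* Hypothetical helper: flags of a 32-bit logic op (AND/OR/XOR),
 * derived from the result alone. CF and OF are always cleared. */
static uint32_t logic_flags32(uint32_t result)
{
    uint32_t flags = 0;
    uint8_t low = (uint8_t)result;

    /* Fold the low byte down to a single parity bit. */
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    if ((low & 1) == 0) {
        flags |= CC_P;          /* even number of set bits */
    }
    if (result == 0) {
        flags |= CC_Z;
    }
    if (result & 0x80000000u) {
        flags |= CC_S;
    }
    return flags;
}

/* Hypothetical helper: "xor r, r" always yields zero, so its flags
 * are a compile-time constant and no operand needs to be saved. */
static uint32_t clr_flags(void)
{
    return CC_Z | CC_P;
}
```

For any x, `logic_flags32(x ^ x)` equals `clr_flags()`, which is why no value has to be stored for later flag evaluation.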
target-i386: Implement BZHI
target-i386: Implement MULX
target-i386: Implement PDEP, PEXT
target-i386: Implement SHLX, SARX, SHRX
target-i386: Implement RORX
target-i386: Implement BLSR, BLSMSK, BLSI
Do all of group 17 at one time for ease.
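The three instructions are close relatives, which may be why handling them together is convenient. A small sketch of their data results on 32-bit operands follows; it models only the computed value, not the flag updates or the TCG code, and the function names are made up for this example.

```c
#include <stdint.h>

/* BLSR: reset (clear) the lowest set bit. */
static uint32_t blsr32(uint32_t x)
{
    return x & (x - 1);
}

/* BLSMSK: mask of all bits up to and including the lowest set bit. */
static uint32_t blsmsk32(uint32_t x)
{
    return x ^ (x - 1);
}

/* BLSI: isolate the lowest set bit. */
static uint32_t blsi32(uint32_t x)
{
    return x & (0u - x);
}
```

All three combine x with either x - 1 or -x, so they differ only in the final logic operation.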
target-i386: Decode the VEX prefixes
No actual required uses of these encodings yet.
target-i386: Implement MOVBE
target-i386: Implement ANDN
As this is the first of the BMI insns to be implemented, this carries quite a bit more baggage than normal.
target-i386: Implement BEXTR
target-i386: Tidy prefix parsing
Avoid duplicating switch statement between 32 and 64-bit modes.
target-i386: Use CC_SRC2 for ADC and SBB
Add another slot in ENV and store two of the three inputs. This lets us do less work when carry-out is not needed, and avoids the unpredictable CC_OP after translating these insns.
target-i386: Make helper_cc_compute_{all,c} const
Pass the data in explicitly, rather than indirectly via env. This avoids all sorts of unnecessary register spillage.
target-i386: use gen_op for cmps/scas
Replace low-level ops with a higher-level "cmp %al, (A0)" in the case of scas, and "cmp T0, (A0)" in the case of cmps.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
target-i386: introduce gen_jcc1_noeob
A jump that ends a basic block or otherwise falls back to CC_OP_DYNAMIC will always have to call gen_op_set_cc_op. However, not all jumps end a basic block, so introduce a variant that does not do this.
This was partially undone earlier (i386: drop cc_op argument of gen_jcc1),...
target-i386: Update cc_op before TCG branches
Placing the CC_OP_DYNAMIC at the join is less effective than before the branch, as the branch will have forced global registers to their home locations. This way we have a chance to discard CC_SRC2 before it gets stored....
target-i386: optimize flags checking after sub using CC_SRCT
After a comparison or subtraction, the original value of the LHS will currently be reconstructed using an addition. However, in most cases it is already available: store it in a temp-local variable and save 1...
target-i386: do not call helper to compute ZF/SF
ZF, SF and PF can always be computed from CC_DST except in the CC_OP_EFLAGS case (and CC_OP_DYNAMIC, which just resolves to CC_OP_EFLAGS in gen_compute_eflags). Use setcond to compute ZF and SF.
We could also use a table lookup to compute PF....
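As a rough illustration of why a single comparison is enough, the sketch below derives ZF and SF from a saved result value for a given operand width. It is plain C rather than TCG, and `flag_zf` and `flag_sf` are hypothetical names; `bits` is assumed to be 8, 16, 32 or 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* ZF: the result, truncated to the operand width, is zero. */
static bool flag_zf(uint64_t cc_dst, unsigned bits)
{
    uint64_t mask = (bits == 64) ? UINT64_MAX
                                 : ((UINT64_C(1) << bits) - 1);
    return (cc_dst & mask) == 0;
}

/* SF: the top bit of the result at the operand width is set. */
static bool flag_sf(uint64_t cc_dst, unsigned bits)
{
    return ((cc_dst >> (bits - 1)) & 1) != 0;
}
```

Each flag is one "compare against zero" on a suitably truncated or sign-interpreted value, which is the kind of operation a setcond expresses directly.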
target-i386: use inverted setcond when computing NS or NZ
Make gen_compute_eflags_z and gen_compute_eflags_s able to compute the inverted condition, and use this in gen_setcc_slow_T0. We cannot do it yet in gen_compute_eflags_c, but prepare the code for it anyway. It is...
target-i386: convert gen_compute_eflags_c to TCG
Do the switch at translation time, converting the helper templates to TCG opcodes. In some cases CF can be computed with a single setcond, though in others it may require a little more work.
In the CC_OP_DYNAMIC case, compute the whole EFLAGS, same as for ZF/SF/PF....
target-i386: change gen_setcc_slow_T0 to gen_setcc_slow
Do not hard code the destination register.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
target-i386: optimize setbe
This is looking at EFLAGS, but it can do so more efficiently with setcond.
target-i386: optimize setle
And allow gen_setcc_slow to operate on cpu_cc_src.
target-i386: optimize setcc instructions
Reconstruct the arguments for complex conditions involving CC_OP_SUBx (BE, L, LE). In the others do it via setcond and gen_setcc_slow (which is not that slow in many cases).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>...
target-i386: introduce CCPrepare
Introduce a struct that describes how to build a cond operation that checks for a given x86 condition code. For now, just change gen_compute_eflags_ to return the new struct, generate code for the CCPrepare struct, and go on as before....
target-i386: introduce gen_prepare_cc
This makes the i386 front-end able to create CCPrepare structs for all conditions, not just those that come from a single flag. In particular, JCC_L and JCC_LE can be optimized because gen_prepare_cc is not forced to return a result in bit 0 (unlike gen_setcc_slow)....
target-i386: use CCPrepare to generate conditional jumps
This simplifies all the jump generation code. CCPrepare allows the code to create an efficient brcond always, so there is no need to duplicate the setcc and jcc code.
target-i386: inline gen_prepare_cc_slow
target-i386: cleanup temporary macros for CCPrepare
target-i386: introduce gen_cmovcc1
target-i386: expand cmov via movcond
target-i386: kill cpu_T3
It is almost unused, and it is simpler to pass a TCG value directly to gen_shiftd_rm_T1_T3. This value is then written to t2 without going through a temporary register.
target-i386: compute eflags outside rcl/rcr helper
Always compute EFLAGS first since it is needed whenever the shift is non-zero, i.e. most of the time. This makes it possible to remove some writes of CC_OP_EFLAGS to cpu_cc_op and more importantly removes cases where s->cc_op becomes CC_OP_DYNAMIC. Also, we can...
target-i386: clean up sahf
Discard CC_DST and set s->cc_op immediately after computing EFLAGS.
target-i386: use gen_jcc1 to compile loopz
target-i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around computing flags
Before computing flags we need to store the cc_op to memory. Move this to gen_compute_eflags_c and gen_compute_eflags rather than doing it all over the place.
Also, after computing the flags in cpu_cc_src we are in EFLAGS mode....
target-i386: Name the cc_op enumeration
target-i386: Introduce set_cc_op
This will provide a good hook into which we can consolidate all of the cc variable discards.
target-i386: Don't clobber s->cc_op in gen_update_cc_op
Use a dirty flag to know whether env->cc_op is up to date, rather than forcing s->cc_op to DYNAMIC and losing info.
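The dirty-flag idea can be sketched with a small stand-alone model. This is a simplified illustration under stated assumptions, not the actual translator code: the `Ctx` struct, its `env_cc_op` and `stores` fields, and the reduced `CCOp` enumeration are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical subset of the cc_op enumeration. */
typedef enum { CC_OP_DYNAMIC, CC_OP_EFLAGS, CC_OP_LOGIC, CC_OP_SUB } CCOp;

typedef struct {
    CCOp cc_op;        /* what the translator knows at translation time */
    bool cc_op_dirty;  /* true while env_cc_op is out of date */
    CCOp env_cc_op;    /* stand-in for the run-time env->cc_op */
    int stores;        /* counts simulated stores, for illustration only */
} Ctx;

/* Record a new cc_op without touching the run-time copy. */
static void set_cc_op(Ctx *s, CCOp op)
{
    if (s->cc_op != op) {
        s->cc_op = op;
        s->cc_op_dirty = true;
    }
}

/* Write the run-time copy back only if it is stale. The translation-time
 * value in s->cc_op is kept, so later insns can still exploit it. */
static void update_cc_op(Ctx *s)
{
    if (s->cc_op_dirty) {
        s->env_cc_op = s->cc_op;
        s->cc_op_dirty = false;
        s->stores++;
    }
}
```

In this model a second `update_cc_op` call in a row costs nothing, and the known `cc_op` survives the write-back instead of being reset to CC_OP_DYNAMIC.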
target-i386: Use gen_update_cc_op everywhere
All of the conditional calls to gen_op_set_cc_op go away, and gen_op_set_cc_op itself gets inlined into its only remaining caller.
target-i386: add helper functions to get other flags
Introduce new functions to extract PF, SF, OF, ZF in addition to CF. These provide single entry points for optimizing accesses to a single flag.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>...
target-i386: do not compute eflags multiple times consecutively
After calling gen_compute_eflags, leave the computed value in cc_reg_src and set cc_op to CC_OP_EFLAGS. The next few patches will remove most calls to gen_compute_eflags anyway.
As a result of this change it is more natural to remove the register...
target-i386: no need to flush out cc_op before gen_eob
This makes code more similar to the other callers of gen_eob, especially loopz/loopnz/jcxz.
target-i386: Move CC discards to set_cc_op
This gets us universal coverage, rather than scattering discards around at various places. As a bonus, we do not emit redundant discards e.g. between sequential logic insns.
target-i386: use OT_* consistently
target-i386: introduce gen_ext_tl
Introduce a function that abstracts extracting an 8, 16, 32 or 64-bit value with or without sign, generalizing gen_extu and gen_exts.
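The behaviour being abstracted can be modelled in plain C as one function taking a width and a signedness switch. This is only a sketch of the semantics, not the TCG generator itself; `ext64` is a hypothetical name, and the narrowing casts rely on the usual two's-complement wrap-around that common compilers provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extend the low `bits` (8, 16, 32 or 64) of v to 64 bits,
 * with sign extension when `sign` is true, zero extension otherwise. */
static uint64_t ext64(uint64_t v, unsigned bits, bool sign)
{
    switch (bits) {
    case 8:
        return sign ? (uint64_t)(int64_t)(int8_t)v : (uint8_t)v;
    case 16:
        return sign ? (uint64_t)(int64_t)(int16_t)v : (uint16_t)v;
    case 32:
        return sign ? (uint64_t)(int64_t)(int32_t)v : (uint32_t)v;
    default:
        return v;   /* 64-bit: nothing to extend */
    }
}
```

Folding the signed and unsigned variants into one entry point means callers pass the signedness as data instead of choosing between two functions.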
target-i386: factor setting of s->cc_op handling for string functions
Set it to the appropriate CC_OP_SUBx constant in gen_scas/gen_cmps. In the repz case it can be overridden to CC_OP_DYNAMIC after generating the code.
target-i386: drop cc_op argument of gen_jcc1
As in the gen_repz_scas/gen_repz_cmps case, delay setting CC_OP_DYNAMIC in gen_jcc until after code generation. All of gen_jcc1/is_fast_jcc/gen_setcc_slow_T0 now work on s->cc_op, which makes things a bit easier to follow and to patch....
target-i386: move carry computation for inc/dec closer to gen_op_set_cc_op
This ensures the invariant that cpu_cc_op matches s->cc_op when calling the helpers. The next patches need this because gen_compute_eflags and gen_compute_eflags_c will take care of setting cpu_cc_op....
target-i386: move eflags computation closer to gen_op_set_cc_op
qemu-log: Rename the public-facing cpu_set_log function to qemu_set_log
Rename the public-facing function cpu_set_log to qemu_set_log. This requires us to rename the internal-only qemu_set_log() to do_qemu_set_log().
exec: move include files to include/exec/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
build: kill libdis, move disassemblers to disas/
TCG: Use gen_opc_instr_start from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
TCG: Use gen_opc_pc from context instead of global variable.
TCG: Use gen_opc_icount from context instead of global variable.
TCG: Use gen_opc_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
TCG: Use gen_opc_ptr from context instead of global variable.
target-i386: avoid using cpu_single_env
Pass around CPUArchState instead of using global cpu_single_env.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
disas: avoid using cpu_single_env
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Fix popcnt in long mode
Thanks to Andriy Gapon for initial problem report.
Signed-off-by: malc <av1474@comtv.ru>
x86: Implement SMEP and SMAP
This patch implements Supervisor Mode Execution Prevention (SMEP) and Supervisor Mode Access Prevention (SMAP) for x86. The purpose of the patch, obviously, is to help kernel developers debug the support for those features....
Emit debug_insn for CPU_LOG_TB_OP_OPT as well.
For all targets that currently call tcg_gen_debug_insn_start, add CPU_LOG_TB_OP_OPT to the condition that gates it.
This is useful for comparing optimization dumps, when the pre-optimization dump is merely noise....
target-i386/translate.c: mov to/from crN/drN: ignore mod bits
This instruction is always treated as a register-to-register (MOD = 11) instruction, regardless of the encoding of the MOD field in the MODR/M byte.
Also, Microport UNIX System V/386 v 2.1 (ca 1987) runs fine on...
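To make the decoding rule concrete, here is a minimal sketch of extracting the operands of a mov to/from crN/drN from the ModR/M byte while ignoring its top two bits. The `CrOperands` struct and `decode_mov_cr` function are hypothetical and leave out REX prefix extensions.

```c
#include <stdint.h>

/* Hypothetical decoded operands of "mov crN/drN, reg". */
typedef struct {
    unsigned reg;  /* control or debug register number (bits 5:3) */
    unsigned rm;   /* general-purpose register number (bits 2:0) */
} CrOperands;

static CrOperands decode_mov_cr(uint8_t modrm)
{
    CrOperands ops;

    ops.reg = (modrm >> 3) & 7;
    ops.rm = modrm & 7;
    /* The MOD field (bits 7:6) is deliberately not examined: the
     * instruction is always treated as register-to-register. */
    return ops;
}
```

With this rule, ModR/M bytes 0xD8 (MOD = 11) and 0x18 (MOD = 00) decode to the same register pair.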
x86: avoid AREG0 for misc helpers
Add an explicit CPUX86State parameter instead of relying on AREG0.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
x86: avoid AREG0 in segmentation helpers
Rename remains of op_helper.c to seg_helper.c.
x86: switch to AREG0 free mode
Remove temporary wrappers and switch to AREG0 free mode.
x86: avoid AREG0 for FPU helpers
Make FPU helpers take a parameter for CPUState instead of relying on global env.
Introduce temporary wrappers for FPU load and store ops. Remove wrappers for non-AREG0 code. Don't call unconverted helpers directly.
x86: avoid AREG0 for condition code helpers
x86: avoid AREG0 for integer helpers
x86: avoid AREG0 for SVM helpers
x86: avoid AREG0 for SMM helpers
x86: Fixed incorrect segment base address addition in 64-bits mode
According to the Intel manual "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3", "3.4.4 Segment Loading Instructions in IA-32e Mode":
"When in compatibility mode, FS and GS overrides operate as defined by...
target-i386: make it clearer that op table accesses don't overrun
Rephrase some of the expressions used to select an entry in the SSE op table arrays so that it's clearer that they don't overrun the op table array size.
target-i386: Remove confusing X86_64_DEF macro
The X86_64_DEF macro is a confusing way of making some terms in a conditional only appear if TARGET_X86_64 is defined. We only use it in two places, and in both cases this is for making the same test, so abstract that check out into a function...
target-i386: Remove unused macros
Commit 11f8cdb removed all the uses of the X86_64_ONLY macro. The BUGGY_64() macro has been unused for a long time: it originally marked some ops which couldn't be enabled because of issues with the pre-TCG code generation scheme....
target-i386: Fix compilation with --enable-debug
Commit c4baa0503d9623f1ce891f525ccd140c598bc29a improved SSE table type safety, which now raises compiler errors when the latest QEMU is configured with --enable-debug.
Fix this by splitting the SSE tables even further to separate...
x86: avoid AREG0 for exceptions
Merge raise_exception_env() to raise_exception(), likewise with raise_exception_err_env() and raise_exception_err().
Introduce cpu_svm_check_intercept_param() and cpu_vmexit()...
x86: improve SSE table type safety
SSE function tables could easily be corrupted because of useof void pointers.
Introduce function pointer types and helper variables in order to improve type safety.
Split sse_op_table3 according to types used.
target-i386: Don't overuse CPUState
Scripted conversion:
  sed -i "s/CPUState/CPUX86State/g" target-i386/*.[hc]
  sed -i "s/#define CPUX86State/#define CPUState/" target-i386/cpu.h
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
target-i386: fix compilation with --enable-debug-tcg
Commit 2355c16e74ffa4d14e7fc2b4a23b055565ac0221 introduced a new ldmxcsr helper taking an i32 argument, but the helper is actually passed a long. Fix that by truncating the long to i32.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
target-i386: fix SSE rounding and flush to zero
SSE rounding and flush to zero control has never been implemented. However, given that softfloat-native was using a single state for FPU and SSE and given that glibc is setting both FPU and SSE state in fesetround(), this...
target-i386: fix cmpxchg instruction emulation
When the i386 cmpxchg instruction is executed with a memory operand and the comparison result is "unequal", do the memory write before changing the accumulator instead of the other way around, because otherwise the new accumulator value will incorrectly be used in the...
target-i386: Remove redundant word mask in port out instructions
T0 was already masked to 16 bits when loading it.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
target-i386: Remove data type CCTable
Remove also two assert statements which were the last remaining users.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>