qemu-cr16

Author	SHA1	Message	Date
Markus Armbruster	8a2b516ba2	cleanup: Drop pointless return at end of function A few functions now end with a label. The next commit will clean them up. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20250407082643.2310002-3-armbru@redhat.com> [Straightforward conflict with commit `988ad4cceb` (hw/loongarch/virt: Fix cpuslot::cpu set at last in virt_cpu_plug()) resolved]	2025-04-24 09:33:42 +02:00
Paolo Bonzini	9a688e70bd	target/i386: tcg: remove some more uses of temporaries Remove all uses of 32-bit temporaries in emit.c.inc. Remove uses in translate.c outside the large multiplexed generator functions. tmp3_i32 is not used anymore and can go away. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-04-17 18:23:26 +02:00
Paolo Bonzini	d387eb7fa9	target/i386: tcg: remove tmp0 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-04-17 18:23:26 +02:00
Paolo Bonzini	a440975cc3	target/i386: tcg: remove subf from SHLD/SHRD expansion It is computing 33-count but 32-count had just been used, so just shift further by one. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-04-17 18:23:26 +02:00
Paolo Bonzini	ec6019f230	target/i386: tcg: remove tmp0 and tmp4 from SHLD/SHRD Apply some of the simplifications used for RCL and RCR. tmp4 is not used anywhere else, so remove it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-04-17 18:23:26 +02:00
Paolo Bonzini	bd65810538	target/i386: special case ADC/SBB x,0 and SBB x,x Avoid the three-operand CC_OP_ADD and CC_OP_ADC in these relatively common cases. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-04-17 18:23:26 +02:00
Paolo Bonzini	22063f03a7	target/i386: avoid using s->tmp0 for add to implicit registers For updates to implicit registers (RCX in LOOP instructions, RSI or RDI in string instructions, or the stack pointer) do the add directly using the registers (with no temporary) if 32-bit or 64-bit, or use a temporary created for the occasion if 16-bit. This is more efficient and removes move instructions for the MO_TL case. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-14-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:50:53 +01:00
Paolo Bonzini	82290c7647	target/i386: extract common bits of gen_repz/gen_repz_nz Now that everything has been cleaned up, look at DF and prefixes in a single function, and call that one from gen_repz and gen_repz_nz. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:50:44 +01:00
Paolo Bonzini	4f094e27f3	target/i386: pull computation of string update value out of loop This is a common operation that is executed many times in rep movs or rep stos loops. It can improve performance by several percentage points. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241215090613.89588-13-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	456709db50	target/i386: execute multiple REP/REPZ iterations without leaving TB Use a TCG loop so that it is not necessary to go through the setup steps of REP and through the I/O check on every iteration. Interestingly, this is not a particularly effective optimization on its own, though it avoids the cost of correct RF emulation that was added in the previous patch. The main benefit lies in allowing the hoisting of loop invariants outside the loop, which will happen separately. The loop exits when the low 16 bits of CX/ECX/RCX are zero (so generally speaking the string operation runs in 65536 iteration batches) to give the main loop an opportunity to pick up interrupts. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-12-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	0360b78187	target/i386: optimize CX handling in repeated string operations In a repeated string operation, CX/ECX will be decremented until it is 0 but never underflow. Use this observation to avoid a deposit or zero-extend operation if the address size of the operation is smaller than MO_TL. As in the previous patch, the patch is structured to include some preparatory work for subsequent changes. In particular, introducing cx_next prepares for when ECX will be decremented before calling fn(s, ot), and therefore cannot yet be written back to cpu_regs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-11-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	3658116025	target/i386: do not use gen_op_jz_ecx for repeated string operations Explicitly generate a TSTEQ branch (which is optimized to NE x,0 if possible). This does not make much sense yet, but later we will add more checks and some will use a temporary to check on the decremented value of CX/ECX/RCX; it will be clearer for all checks to share the same logic using TSTEQ(reg, cx_mask). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-10-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	6986cf0032	target/i386: make cc_op handling more explicit for repeated string instructions. Since the cost of gen_update_cc_op() must be paid anyway, it's easier to place them manually and not rely on spilling that is buried under multiple levels of function calls. While at it, clarify the circumstances in which the gen_update_cc_op() is needed, and why it is not for REPxx SCAS and REPxx CMPS. And since cc_op will have been spilled at the point of a fault, just make the whole insn CC_OP_DYNAMIC. Once repz_opt is reintroduced, a fault could happen either before or after the first execution of CMPS/SCAS, and CC_OP_DYNAMIC sidesteps the complicated matter of what x86_restore_state_to_opc would do. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241215090613.89588-9-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	0d82d9e846	target/i386: fix RF handling for string instructions RF must be set on traps and interrupts from a string instruction, except if they occur after the last iteration. Ensure it is set before giving the main loop a chance to execute. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-8-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	4d7704ebc5	target/i386: tcg: move gen_set/reset_* earlier in the file Allow using them in the code that translates REP/REPZ, without forward declarations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-7-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	0eb7046e1b	target/i386: reorganize ops emitted by do_gen_rep, drop repz_opt The condition for optimizing repeat instruction is more or less the opposite of what you imagine: almost always the string instruction was _not_ optimized and optimizing the loop relied on goto_tb. This is obviously not great for performance, due to the cost of the exit-to-main-loop check, but also wrong. In fact, after expanding dc->jmp_opt and simplifying "!!x" to "x", the condition for looping used to be: ((cflags & CF_NO_GOTO_TB) \|\| (flags & (HF_RF_MASK \| HF_TF_MASK \| HF_INHIBIT_IRQ_MASK))) && !(cflags & CF_USE_ICOUNT) In other words, setting aside RF (it requires special handling for REP instructions and it was completely missing), repeat instruction were being optimized if TF or inhibit IRQ flags were set. This is certainly wrong for TF, because string instructions trap after every execution, and probably for interrupt shadow too. Get rid of repz_opt completely. The next patches will reintroduce the optimization, applying it in the common case instead of the unlikely and wrong one. While at it, place the CX/ECX/RCX=0 case is at the end of the function, which saves a label and is clearer when reading the generated ops. For clarity, mark the cc_op explicitly as DYNAMIC even if at the end of the translation block; the cc_op can come from either the previous instruction or the string instruction, and currently we rely on a gen_update_cc_op() that is hidden in the bowels of gen_jcc() to spill cc_op and mark it clean. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-6-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	d8d552d459	target/i386: unify choice between single and repeated string instructions The same "if" is present in all generator functions for string instructions. Push it inside gen_repz() and gen_repz_nz() instead. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241215090613.89588-5-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	b519556f58	target/i386: unify REP and REPZ/REPNZ generation It only differs in a single call to gen_jcc, so use a "bool" argument to distinguish the two cases; do not duplicate code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-4-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	e604be4fb4	target/i386: remove trailing 1 from gen_{j, cmov, set}cc1 This is not needed anymore now that gen_jcc has been eliminated (merged into the similarly-named gen_Jcc, where the uppercase letter gives away that it is an emission function). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-3-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	6ace2d5163	target/i386: inline gen_jcc into sole caller The code of gen_Jcc is very similar to gen_LOOP* and gen_JCXZ, but this is hidden by gen_jcc. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-2-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	ef682b08a0	target/i386: use shr to load high-byte registers into T0/T1 Using a sextract or extract operation is only necessary if a sign or zero extended value is needed. If not, a shift is enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-10 23:34:44 +01:00
Richard Henderson	e4a8e093dc	accel/tcg: Move gen_intermediate_code to TCGCPUOps.translate_core Convert all targets simultaneously, as the gen_intermediate_code function disappears from the target. While there are possible workarounds, they're larger than simply performing the conversion. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2024-12-24 08:32:15 -08:00
Philippe Mathieu-Daudé	a9ca97ea9e	accel/tcg: Un-inline translator_is_same_page() Remove the single target-specific definition used in "exec/translator.h" (TARGET_PAGE_MASK) by un-inlining is_same_page(). Rename the method as translator_is_same_page() and improve its documentation. Use it in translator_use_goto_tb(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20241218154145.71353-1-philmd@linaro.org>	2024-12-20 17:44:57 +01:00
Philippe Mathieu-Daudé	68df8c8dba	accel/tcg: Include missing 'exec/translation-block.h' header TB compile flags, tb_page_addr_t type, tb_cflags() and few other methods are defined in "exec/translation-block.h". All these files don't include "exec/translation-block.h" but include "exec/exec-all.h" which include it. Explicitly include "exec/translation-block.h" to be able to remove it from "exec/exec-all.h" later when it won't be necessary. Otherwise we'd get errors such: accel/tcg/internal-target.h:59:20: error: a parameter list without types is only allowed in a function definition 59 \| void tb_lock_page0(tb_page_addr_t); \| ^ accel/tcg/tb-hash.h:64:23: error: unknown type name 'tb_page_addr_t' 64 \| uint32_t tb_hash_func(tb_page_addr_t phys_pc, vaddr pc, \| ^ accel/tcg/tcg-accel-ops.c:62:36: error: use of undeclared identifier 'CF_CLUSTER_SHIFT' 62 \| cflags = cpu->cluster_index << CF_CLUSTER_SHIFT; \| ^ accel/tcg/watchpoint.c:102:47: error: use of undeclared identifier 'CF_NOIRQ' 102 \| cpu->cflags_next_tb = 1 \| CF_NOIRQ \| curr_cflags(cpu); \| ^ target/i386/helper.c:536:28: error: use of undeclared identifier 'CF_PCREL' 536 \| if (tcg_cflags_has(cs, CF_PCREL)) { \| ^ target/rx/cpu.c:51:21: error: incomplete definition of type 'struct TranslationBlock' 51 \| cpu->env.pc = tb->pc; \| ~~^ system/physmem.c:2977:9: error: call to undeclared function 'tb_invalidate_phys_range'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 2977 \| tb_invalidate_phys_range(addr, addr + length - 1); \| ^ plugins/api.c:96:12: error: call to undeclared function 'tb_cflags'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 96 \| return tb_cflags(tcg_ctx->gen_tb) & CF_MEMI_ONLY; \| ^ Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20241114011310.3615-5-philmd@linaro.org>	2024-12-20 17:44:57 +01:00
Philippe Mathieu-Daudé	32cad1ffb8	include: Rename sysemu/ -> system/ Headers in include/sysemu/ are not only related to system emulation, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>	2024-12-20 17:44:56 +01:00
Paolo Bonzini	44d58e938b	target/i386: add a note about gen_jcc1 Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	cea677e821	target/i386: add a few more trivial CCPrepare cases Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	37df7c4d57	target/i386: optimize TEST+Jxx sequences Mostly used for TEST+JG and TEST+JLE, but it is easy to cover also JBE/JA and JL/JGE; shaves about 0.5% TCG ops. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	ae14b33de8	target/i386: optimize computation of ZF from CC_OP_DYNAMIC Most uses of CC_OP_DYNAMIC are for CMP/JB/JE or similar sequences. We can optimize many of them to avoid computation of the flags. This eliminates both TCG ops to set up the new cc_op, and helper instructions because evaluating just ZF is much cheaper. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Richard Henderson	1f7f72bdc4	target/i386: Wrap cc_op_live with a validity check Assert that op is known and that cc_op_live_ is populated. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Richard Henderson	f359b2fb71	target/i386: Introduce cc_op_size Replace arithmetic on cc_op with a helper function. Assert that the op has a size and that it is valid for the configuration. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240701025115.1265117-6-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	e09447c39f	target/i386: remove CC_OP_CLR Just use CC_OP_EFLAGS; it is not that likely that the flags computed by CC_OP_CLR survive the end of the basic block, in which case there is no need to spill cc_op_src. cc_op_src now does need spilling if the XOR is followed by a memory operation, but this only costs 0.2% extra TCG ops. They will be recouped by simplifications in how QEMU evaluates ZF at runtime, which are even greater with this change. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	a635390f05	target/i386: use tcg_gen_ext_tl when applicable Prefer it to gen_ext_tl in the common case where the destination is known. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	ac92afd19e	target/i386: assert that cc_op* and pc_save are preserved Now all decoding has been done before any code generation. There is no need anymore to save and restore cc_op* and pc_save but, for the time being, assert that this is indeed the case. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-17 19:41:30 +02:00
Paolo Bonzini	f091a3f324	target/i386: do not check PREFIX_LOCK in old-style decoder It is already checked before getting there. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-17 19:41:30 +02:00
Paolo Bonzini	fcd16539eb	target/i386: convert CMPXCHG8B/CMPXCHG16B to new decoder The gen_cmpxchg8b and gen_cmpxchg16b functions even have the correct prototype already; the only thing that needs to be done is removing the gen_lea_modrm() call. This moves the last LOCK-enabled instructions to the new decoder. It is now possible to assume that gen_multi0F is called only after checking that PREFIX_LOCK was not specified. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-17 19:41:29 +02:00
Paolo Bonzini	a2e2c78d2a	target/i386: decode address before going back to translate.c There are now relatively few unconverted opcodes in translate.c (there are 13 of them including 8 for x87), and all of them have the same format with a mod/rm byte and no immediate. A good next step is to remove the early bail out to disas_insn_x87/disas_insn_old, instead giving these legacy translator functions the same prototype as the other gen_* functions. To do this, the X86DecodeInsn can be passed down to the places that used to fetch address bytes from the instruction stream. To make sure that everything is done cleanly, the CPUX86State* argument is removed. As part of the unification, the gen_lea_modrm() name is now free, so rename gen_load_ea() to gen_lea_modrm(). This is as good a name and it makes the changes to translate.c easier to review. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-17 19:41:29 +02:00
Paolo Bonzini	10eae89937	target/i386: convert bit test instructions to new decoder Code generation was rewritten; it reuses the same trick to use the CC_OP_SAR values for cc_op, but it tries to use CC_OP_ADCX or CC_OP_ADCOX instead of CC_OP_EFLAGS. This is a tiny bit more efficient in the common case where only CF is checked in the resulting flags. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-17 19:41:29 +02:00
Richard Henderson	83a3a20e59	target/i386: Fix carry flag for BLSI BLSI has inverted semantics for C as compared to the other two BMI1 instructions, BLSMSK and BLSR. Introduce CC_OP_BLSI* for this purpose. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2175 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240801075845.573075-3-richard.henderson@linaro.org>	2024-08-21 09:11:26 +10:00
Richard Henderson	266d6dddbd	target/i386: Split out gen_prepare_val_nz Split out the TCG_COND_TSTEQ logic from gen_prepare_eflags_z, and use it for CC_OP_BMILG* as well. Prepare for requiring both zero and non-zero senses. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20240801075845.573075-2-richard.henderson@linaro.org>	2024-08-21 09:11:26 +10:00
Richard Henderson	ac63755b20	target/i386: Fix VSIB decode With normal SIB, index == 4 indicates no index. With VSIB, there is no exception for VR4/VR12. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2474 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240805003130.1421051-3-richard.henderson@linaro.org Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-08-05 14:14:47 +02:00
Paolo Bonzini	74f73c2918	target/i386: remove unused enum Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-28 19:26:54 +02:00
Paolo Bonzini	460231ad36	target/i386: give CC_OP_POPCNT low bits corresponding to MO_TL Handle it like the other arithmetic cc_ops. This simplifies a bit the implementation of bit test instructions. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-28 14:44:52 +02:00
Paolo Bonzini	944f400134	target/i386: use cpu_cc_dst for CC_OP_POPCNT It is the only CCOp, among those that compute ZF from one of the cc_op_* registers, that uses cpu_cc_src. Do not make it the odd one off, instead use cpu_cc_dst like the others. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-28 14:44:52 +02:00
Paolo Bonzini	0c4da54883	target/i386: convert CMPXCHG to new decoder Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-17 09:47:39 +02:00
Paolo Bonzini	7b1f25ac3a	target/i386: convert XADD to new decoder Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-17 09:47:39 +02:00
Paolo Bonzini	11ffaf8c73	target/i386: convert LZCNT/TZCNT/BSF/BSR/POPCNT to new decoder Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-17 09:47:39 +02:00
Paolo Bonzini	6476902740	target/i386: convert SHLD/SHRD to new decoder Use the same flag generation code as SHL and SHR, but use the existing gen_shiftd_rm_T1 function to compute the result as well as CC_SRC. Decoding-wise, SHLD/SHRD by immediate count as a 4 operand instruction because s->T0 and s->T1 actually occupy three op slots. The infrastructure used by opcodes in the 0F 3A table works fine. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-17 09:47:39 +02:00
Paolo Bonzini	87b2037b65	target/i386: pull load/writeback out of gen_shiftd_rm_T1 Use gen_ld_modrm/gen_st_modrm, moving them and gen_shift_flags to the caller. This way, gen_shiftd_rm_T1 becomes something that the new decoder can call. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-17 09:47:39 +02:00
Paolo Bonzini	ae541c0eb4	target/i386: convert non-grouped, helper-based 2-byte opcodes These have very simple generators and no need for complex group decoding. Apart from LAR/LSL which are simplified to use gen_op_deposit_reg_v and movcond, the code is generally lifted from translate.c into the generators. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-17 09:47:39 +02:00

1 2 3 4 5 ...

311 commits