qemu-cr16

Author	SHA1	Message	Date
Peter Maydell	d168a08147	target/arm: Remove redundant advsimd float16 helpers The advsimd_addh etc helpers defined in helper-a64.c are identical to the vfp_addh etc helpers defined in helper-vfp.c: both take two float16 inputs (in a uint32_t type) plus a float_status* and are simple wrappers around the softfloat float16_* functions. (The duplication seems to be a historical accident: we added the advsimd helpers in 2018 as part of the A64 implementation, and at that time there was no f16 emulation in A32. Then later we added the A32 f16 handling by extending the existing VFP helper macros to generate f16 versions as well as f32 and f64, and didn't realise we could clean things up.) Remove the now-unnecessary advsimd helpers and make the places that generated calls to them use the vfp helpers instead. Many of the helper functions were already unused. (The remaining advsimd_ helpers are those which don't have vfp versions.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-26-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	7af64d103d	fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed Our float_flag_output_denormal exception flag is set when the fpu code flushes an output denormal to zero. Rename it to float_flag_output_denormal_flushed: * this keeps it parallel with the flag for flushing input denormals, which we just renamed * it makes it clearer that it doesn't mean "set when the output is a denormal" Commit created with for f in `git grep -l float_flag_output_denormal`; do sed -i -e 's/float_flag_output_denormal/float_flag_output_denormal_flushed/' $f; done Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-21-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	584b7aec81	fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed Our float_flag_input_denormal exception flag is set when the fpu code flushes an input denormal to zero. This is what many guest architectures (eg classic Arm behaviour) require, but it is not the only denormal-related reason we might want to set an exception flag. The x86 behaviour (which we do not currently model correctly) wants to see an exception flag when a denormal input is not flushed to zero and is actually used in an arithmetic operation. Arm's FEAT_AFP also wants these semantics. Rename float_flag_input_denormal to float_flag_input_denormal_flushed to make it clearer when it is set and to allow us to add a new float_flag_input_denormal_used next to it for the x86/FEAT_AFP semantics. Commit created with for f in `git grep -l float_flag_input_denormal`; do sed -i -e 's/float_flag_input_denormal/float_flag_input_denormal_flushed/' $f; done and manual editing of softfloat-types.h and softfloat.c to clean up the indentation afterwards and to fix a comment which wasn't using the full name of the flag. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-20-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	3847b5b1fb	target/arm: Remove now-unused vfp.fp_status_f16 and FPST_FPCR_F16 Now we have moved all the uses of vfp.fp_status_f16 and FPST_FPCR_F16 to the new A32 or A64 fields, we can remove these. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-19-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	230c2bd3f2	target/arm: Use FPST_A64_F16 in A64 decoder In the A64 decoder, use FPST_A64_F16 rather than FPST_FPCR_F16. By doing an automated conversion of the whole file we avoid possibly using more than one fpst value in a set_rmode/op/restore_rmode sequence. Patch created with perl -p -i -e 's/FPST_FPCR_F16(?!_)/FPST_A64_F16/g' target/arm/tcg/translate-{a64,sve,sme}.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-18-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	e935710bc8	target/arm: Use FPST_A32_F16 in A32 decoder In the A32 decoder, use FPST_A32_F16 rather than FPST_FPCR_F16. By doing an automated conversion of the whole file we avoid possibly using more than one fpst value in a set_rmode/op/restore_rmode sequence. Patch created with perl -p -i -e 's/FPST_FPCR_F16(?!_)/FPST_A32_F16/g' target/arm/tcg/translate-vfp.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-17-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	e4b3c388f9	target/arm: Use fp_status_f16_a64 in AArch64-only helpers We directly use fp_status_f16 in a handful of helpers that are AArch64-specific; switch to fp_status_f16_a64 for these. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-16-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	85fffc1085	target/arm: Use fp_status_f16_a32 in AArch32-only helpers We directly use fp_status_f16 in a handful of helpers that are AArch32-specific; switch to fp_status_f16_a32 for these. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-15-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	5f4ed6da85	target/arm: Define new fp_status_f16_a32 and fp_status_f16_a64 As the first part of splitting the existing fp_status_f16 into separate float_status fields for AArch32 and AArch64 (so that we can make FEAT_AFP control bits apply only for AArch64), define the two new fp_status_f16_a32 and fp_status_f16_a64 fields, but don't use them yet. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-14-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	2aa9656ebc	target/arm: Remove now-unused vfp.fp_status and FPST_FPCR Now we have moved all the uses of vfp.fp_status and FPST_FPCR to either the A32 or A64 fields, we can remove these. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-13-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	e107a7a54e	target/arm: Use FPST_A64 in A64 decoder In the A64 decoder, use FPST_A64 rather than FPST_FPCR. By doing an automated conversion of the whole file we avoid possibly using more than one fpst value in a set_rmode/op/restore_rmode sequence. Patch created with perl -p -i -e 's/FPST_FPCR(?!_)/FPST_A64/g' target/arm/tcg/translate-{a64,sve,sme}.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-12-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	961a8b3fb8	target/arm: Use FPST_A32 in A32 decoder In the A32 decoder, use FPST_A32 rather than FPST_FPCR. By doing an automated conversion of the whole file we avoid possibly using more than one fpst value in a set_rmode/op/restore_rmode sequence. Patch created with perl -p -i -e 's/FPST_FPCR(?!_)/FPST_A32/g' target/arm/tcg/translate-vfp.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-11-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	d1ce6db3b1	target/arm: Use fp_status_a32 in vfp_cmp helpers The helpers vfp_cmps, vfp_cmpes, vfp_cmpd, vfp_cmped are used only from the A32 decoder; the A64 decoder uses separate vfp_cmps_a64 etc helpers (because for A64 we update the main NZCV flags and for A32 we update the FPSCR NZCV flags). So we can make these helpers use the fp_status_a32 field instead of fp_status. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-10-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	1069d8ab30	target/arm: Use fp_status_a32 in vjcvt helper Use fp_status_a32 in the vjcvt helper function; this is called only from the A32/T32 decoder and is not used inside a set_rmode/restore_rmode sequence. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-9-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	75df4e8609	target/arm: Use fp_status_a64 or fp_status_a32 in is_ebf() In is_ebf(), we might be called for A64 or A32, but we have the CPUARMState* so we can select fp_status_a64 or fp_status_a32 accordingly. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2025-01-28 18:40:19 +00:00
Peter Maydell	57bd2f30ff	target/arm: Use vfp.fp_status_a64 in A64-only helper functions Switch from vfp.fp_status to vfp.fp_status_a64 for helpers which: * directly reference an fp_status field * are called only from the A64 decoder * are not called inside a set_rmode/restore_rmode sequence Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20250124162836.2332150-8-peter.maydell@linaro.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2025-01-28 18:40:19 +00:00
Peter Maydell	2208cb46e6	target/arm: Define new fp_status_a32 and fp_status_a64 We want to split the existing fp_status in the Arm CPUState into separate float_status fields for AArch32 and AArch64. (This is because new control bits defined by FEAT_AFP only have an effect for AArch64, not AArch32.) To make this split we will: * define new fp_status_a32 and fp_status_a64 which have identical behaviour to the existing fp_status * move existing uses of fp_status to fp_status_a32 or fp_status_a64 as appropriate * delete the old fp_status when it has no uses left In this patch we add the new float_status fields. We will also need to split fp_status_f16, but we will do that as a separate series of patches. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-7-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	eda8d53083	target/arm: Use uint32_t in vfp_exceptbits_from_host() In vfp_exceptbits_from_host(), we accumulate the FPSR flags in an "int", and our return type is also "int". However, the only callsite returns the same information as a uint32_t, and more generally we handle FPSR values in the code as uint32_t, not int. Bring this function into line with that convention. There is no behaviour change because none of the FPSR bits we set in this function are bit 31. The input argument to the function remains 'int' because that is the return type of the softfloat get_float_exception_flags(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-6-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	f10dee833f	target/arm: Use FPSR_ constants in vfp_exceptbits_from_host() Use the FPSR_ named constants in vfp_exceptbits_from_host(), rather than hardcoded magic numbers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-5-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Peter Maydell	1edc3d43f2	target/arm: arm_reset_sve_state() should set FPSR, not FPCR The pseudocode ResetSVEState() does: FPSR = ZeroExtend(0x0800009f<31:0>, 64); but QEMU's arm_reset_sve_state() called vfp_set_fpcr() by accident. Before the advent of FEAT_AFP, this was only setting a collection of RES0 bits, which vfp_set_fpsr() would then ignore, so the only effect was that we didn't actually set the FPSR the way we are supposed to do. Once FEAT_AFP is implemented, setting the bottom bits of FPSR will change the floating point behaviour. Call vfp_set_fpsr(), as we ought to. (Note for stable backports: commit `7f2a01e736` moved this function from sme_helper.c to helper.c, but it had the same bug before the move too.) Cc: qemu-stable@nongnu.org Fixes: `f84734b874` ("target/arm: Implement SMSTART, SMSTOP") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-4-peter.maydell@linaro.org	2025-01-28 18:40:19 +00:00
Stefan Hajnoczi	b5afd8c023	hppa updates * Fixes booting a Linux kernel which is provided on the command line. * Allow more than 4GB RAM on 64-bit boxes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZ5PvvgAKCRD3ErUQojoP X7JQAQCn2MR4k4lfClDZHNmAFUNw51j56SB5HC/FCUKfOx4dCQD/Tf2OV/gstMOz nfpvIH6ouXZ2/p5npzTyOt+A8fwUpw0= =qrs7 -----END PGP SIGNATURE----- Merge tag 'hppa-system-for-v10-pull-request' of https://github.com/hdeller/qemu-hppa into staging # gpg: Signature made Fri 24 Jan 2025 14:53:34 EST # gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F # gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown] # gpg: aka "Helge Deller <deller@kernel.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603 # Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F * tag 'hppa-system-for-v10-pull-request' of https://github.com/hdeller/qemu-hppa: hw/hppa: Fix booting Linux kernel with initrd hw/hppa: Support up to 256 GiB RAM on 64-bit machines Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2025-01-27 11:20:21 -05:00
Helge Deller	c656f293df	hw/hppa: Fix booting Linux kernel with initrd Commit `20f7b89017` ("hw/hppa: Reset vCPUs calling resettable_reset()") broke booting the Linux kernel with an initrd which may have been provided on the command line. The problem is that the mentioned commit zeroes out initial registers which were preset with addresses for the Linux kernel and initrd. Fix it by adding proper variables which are set shortly before starting the firmware. Signed-off-by: Helge Deller <deller@gmx.de> Fixes: `20f7b89017` ("hw/hppa: Reset vCPUs calling resettable_reset()") Cc: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2025-01-24 20:51:53 +01:00
Bibo Mao	3215fe8528	target/loongarch: Dump all generic CSR registers CSR registers are important system control registers, so it is better to dump all of them when the VM is running in system mode. Here is an example of the CSR register dump output: CSR000: CRMD b4 PRMD 4 EUEN 0 MISC 0 CSR004: ECFG 71c1c ESTAT 0 ERA 9000000002c31300 BADV 12022c0e0 CSR008: BADI 2b0000 CSR012: EENTRY 90000000046b0000 CSR016: TLBIDX ffffffff8e000228 TLBEHI 120228000 TLBELO0 400000016f19001f TLBELO1 400000016f1a401f CSR024: ASID a0004 PGDL 90000001016f0000 PGDH 9000000004680000 PGD 0 CSR028: PWCL 5e56e PWCH 2e4 STLBPS e RVACFG 0 CSR032: CPUID 0 PRCFG1 72f8 PRCFG2 3ffff000 PRCFG3 8073f2 CSR048: SAVE0 0 SAVE1 af9c SAVE2 12010d6a8 SAVE3 8300000 CSR052: SAVE4 0 SAVE5 0 SAVE6 0 SAVE7 0 CSR064: TID 0 TCFG 8f0ca15 TVAL 4cefd8b CNTC fffffffffe688aaa CSR068: TICLR 0 CSR096: LLBCTL 1 CSR136: TLBRENTRY 46ba000 TLBRBADV ffff8000130d81e2 TLBRERA 9000000003585cb8 TLBRSAVE ffff8000130d81e0 CSR140: TLBRELO0 1fe00043 TLBRELO1 40 TLBREHI ffff8000130d800e TLBRPRMD 0 CSR384: DMW0 8000000000000001 DMW1 9000000000000011 DMW2 0 DMW3 0 Signed-off-by: Bibo Mao <maobibo@loongson.cn>	2025-01-24 14:49:24 +08:00
Bibo Mao	b5b13eb712	target/loongarch: Set unused flag with CSR registers On LA464, some CSR registers are not used, such as CSR_SAVE8 - CSR_SAVE15; the CSR registers related to MCE are also not used now. The flag CSRFL_UNUSED is added for these registers so that they will not be dumped. To keep compatibility, these CSR registers are not removed, since they are already used in vmstate. Signed-off-by: Bibo Mao <maobibo@loongson.cn>	2025-01-24 14:49:24 +08:00
Bibo Mao	cb6fa4142f	target/loongarch: Add common source file for CSR register Common source file csr.c is added here; it can be used by both TCG mode and kvm mode. The common code is moved from file tcg/insn_trans/trans_privileged.c.inc to csr.c Signed-off-by: Bibo Mao <maobibo@loongson.cn>	2025-01-24 14:49:24 +08:00
Bibo Mao	d03114ea20	target/loongarch: Add common header file for CSR registers Common header file csr.h is added here, it can be used by both TCG mode and kvm mode. Signed-off-by: Bibo Mao <maobibo@loongson.cn>	2025-01-24 14:49:24 +08:00
Bibo Mao	75b2c5da94	target/loongarch: Add generic csr function type The parameter types TCGv and TCGv_ptr of the functions GenCSRRead and GenCSRWrite are not usable in non-TCG mode. A generic csr function type with void type parameters is added here, so that the code compiles in non-TCG mode. Signed-off-by: Bibo Mao <maobibo@loongson.cn>	2025-01-24 14:49:24 +08:00
Bibo Mao	3156b1c1e9	target/loongarch: Remove static CSR function setting Since CSR function setting is done dynamically in TCG mode, remove static CSR function setting here. Signed-off-by: Bibo Mao <maobibo@loongson.cn>	2025-01-24 14:49:24 +08:00
Bibo Mao	90f73c2d7f	target/loongarch: Add dynamic function access with CSR register With CSR register, dynamic function access is used for CSR register access in TCG mode, so that csr info can be used by other modules. Signed-off-by: Bibo Mao <maobibo@loongson.cn>	2025-01-24 14:49:24 +08:00
Tao Su	56e84d898f	target/i386: Add new CPU model ClearwaterForest According to table 1-2 in Intel Architecture Instruction Set Extensions and Future Features (rev 056) [1], ClearwaterForest has the following new features which have already been virtualized: - AVX-VNNI-INT16 CPUID.(EAX=7,ECX=1):EDX[bit 10] - SHA512 CPUID.(EAX=7,ECX=1):EAX[bit 0] - SM3 CPUID.(EAX=7,ECX=1):EAX[bit 1] - SM4 CPUID.(EAX=7,ECX=1):EAX[bit 2] Add above features to new CPU model ClearwaterForest. Comparing with SierraForest, ClearwaterForest bare-metal contains all features of SierraForest-v2 CPU model and adds: - PREFETCHI CPUID.(EAX=7,ECX=1):EDX[bit 14] - DDPD_U CPUID.(EAX=7,ECX=2):EDX[bit 3] - BHI_NO IA32_ARCH_CAPABILITIES[bit 20] Add above and all features of SierraForest-v2 CPU model to new CPU model ClearwaterForest. [1] https://cdrdv2.intel.com/v1/dl/getContent/671368 Tested-by: Xuelian Guo <xuelian.guo@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250121020650.1899618-4-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:50:53 +01:00
Tao Su	b611931d4f	target/i386: Export BHI_NO bit to guests Branch History Injection (BHI) is a CPU side-channel vulnerability, where an attacker may manipulate branch history before transitioning from user to supervisor mode or from VMX non-root/guest to root mode. CPUs set the BHI_NO bit in MSR IA32_ARCH_CAPABILITIES to indicate that no additional mitigation is required to prevent BHI. Make BHI_NO bit available to guests. Tested-by: Xuelian Guo <xuelian.guo@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250121020650.1899618-3-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:50:53 +01:00
Tao Su	c597ff5339	target/i386: Introduce SierraForest-v2 model Update SierraForest CPU model to add LAM, 4 bits indicating certain bits of IA32_SPEC_CTRL are supported (intel-psfd, ipred-ctrl, rrsba-ctrl, bhi-ctrl) and the missing features (ss, tsc-adjust, cldemote, movdiri, movdir64b). Also add GDS-NO and RFDS-NO to indicate the related vulnerabilities are mitigated in stepping 3. Tested-by: Xuelian Guo <xuelian.guo@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250121020650.1899618-2-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:50:53 +01:00
Paolo Bonzini	22063f03a7	target/i386: avoid using s->tmp0 for add to implicit registers For updates to implicit registers (RCX in LOOP instructions, RSI or RDI in string instructions, or the stack pointer) do the add directly using the registers (with no temporary) if 32-bit or 64-bit, or use a temporary created for the occasion if 16-bit. This is more efficient and removes move instructions for the MO_TL case. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-14-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:50:53 +01:00
Paolo Bonzini	82290c7647	target/i386: extract common bits of gen_repz/gen_repz_nz Now that everything has been cleaned up, look at DF and prefixes in a single function, and call that one from gen_repz and gen_repz_nz. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:50:44 +01:00
Paolo Bonzini	4f094e27f3	target/i386: pull computation of string update value out of loop This is a common operation that is executed many times in rep movs or rep stos loops. It can improve performance by several percentage points. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241215090613.89588-13-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	456709db50	target/i386: execute multiple REP/REPZ iterations without leaving TB Use a TCG loop so that it is not necessary to go through the setup steps of REP and through the I/O check on every iteration. Interestingly, this is not a particularly effective optimization on its own, though it avoids the cost of correct RF emulation that was added in the previous patch. The main benefit lies in allowing the hoisting of loop invariants outside the loop, which will happen separately. The loop exits when the low 16 bits of CX/ECX/RCX are zero (so generally speaking the string operation runs in 65536 iteration batches) to give the main loop an opportunity to pick up interrupts. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-12-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	0360b78187	target/i386: optimize CX handling in repeated string operations In a repeated string operation, CX/ECX will be decremented until it is 0 but never underflow. Use this observation to avoid a deposit or zero-extend operation if the address size of the operation is smaller than MO_TL. As in the previous patch, the patch is structured to include some preparatory work for subsequent changes. In particular, introducing cx_next prepares for when ECX will be decremented before calling fn(s, ot), and therefore cannot yet be written back to cpu_regs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-11-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	3658116025	target/i386: do not use gen_op_jz_ecx for repeated string operations Explicitly generate a TSTEQ branch (which is optimized to NE x,0 if possible). This does not make much sense yet, but later we will add more checks and some will use a temporary to check on the decremented value of CX/ECX/RCX; it will be clearer for all checks to share the same logic using TSTEQ(reg, cx_mask). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-10-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	6986cf0032	target/i386: make cc_op handling more explicit for repeated string instructions. Since the cost of gen_update_cc_op() must be paid anyway, it's easier to place them manually and not rely on spilling that is buried under multiple levels of function calls. While at it, clarify the circumstances in which the gen_update_cc_op() is needed, and why it is not for REPxx SCAS and REPxx CMPS. And since cc_op will have been spilled at the point of a fault, just make the whole insn CC_OP_DYNAMIC. Once repz_opt is reintroduced, a fault could happen either before or after the first execution of CMPS/SCAS, and CC_OP_DYNAMIC sidesteps the complicated matter of what x86_restore_state_to_opc would do. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241215090613.89588-9-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	0d82d9e846	target/i386: fix RF handling for string instructions RF must be set on traps and interrupts from a string instruction, except if they occur after the last iteration. Ensure it is set before giving the main loop a chance to execute. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-8-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	4d7704ebc5	target/i386: tcg: move gen_set/reset_* earlier in the file Allow using them in the code that translates REP/REPZ, without forward declarations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-7-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	0eb7046e1b	target/i386: reorganize ops emitted by do_gen_rep, drop repz_opt The condition for optimizing repeat instructions is more or less the opposite of what you imagine: almost always the string instruction was _not_ optimized and optimizing the loop relied on goto_tb. This is obviously not great for performance, due to the cost of the exit-to-main-loop check, but also wrong. In fact, after expanding dc->jmp_opt and simplifying "!!x" to "x", the condition for looping used to be: ((cflags & CF_NO_GOTO_TB) || (flags & (HF_RF_MASK | HF_TF_MASK | HF_INHIBIT_IRQ_MASK))) && !(cflags & CF_USE_ICOUNT) In other words, setting aside RF (it requires special handling for REP instructions and it was completely missing), repeat instructions were being optimized if TF or inhibit IRQ flags were set. This is certainly wrong for TF, because string instructions trap after every execution, and probably for interrupt shadow too. Get rid of repz_opt completely. The next patches will reintroduce the optimization, applying it in the common case instead of the unlikely and wrong one. While at it, place the CX/ECX/RCX=0 case at the end of the function, which saves a label and is clearer when reading the generated ops. For clarity, mark the cc_op explicitly as DYNAMIC even if at the end of the translation block; the cc_op can come from either the previous instruction or the string instruction, and currently we rely on a gen_update_cc_op() that is hidden in the bowels of gen_jcc() to spill cc_op and mark it clean. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-6-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	d8d552d459	target/i386: unify choice between single and repeated string instructions The same "if" is present in all generator functions for string instructions. Push it inside gen_repz() and gen_repz_nz() instead. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241215090613.89588-5-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	b519556f58	target/i386: unify REP and REPZ/REPNZ generation It only differs in a single call to gen_jcc, so use a "bool" argument to distinguish the two cases; do not duplicate code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-4-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	e604be4fb4	target/i386: remove trailing 1 from gen_{j, cmov, set}cc1 This is not needed anymore now that gen_jcc has been eliminated (merged into the similarly-named gen_Jcc, where the uppercase letter gives away that it is an emission function). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-3-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Paolo Bonzini	6ace2d5163	target/i386: inline gen_jcc into sole caller The code of gen_Jcc is very similar to gen_LOOP* and gen_JCXZ, but this is hidden by gen_jcc. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20241215090613.89588-2-pbonzini@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-23 11:35:33 +01:00
Stefan Hajnoczi	32a97c5d05	tcg: - Add TCGOP_TYPE, TCGOP_FLAGS. - Pass type and flags to tcg_op_supported, tcg_target_op_def. - Split out tcg-target-has.h and unexport from tcg.h. - Reorg constraint processing; constify TCGOpDef. - Make extract, sextract, deposit opcodes mandatory. - Merge ext{8,16,32}{s,u} opcodes into {s}extract. tcg/mips: Expand bswap unconditionally tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64 tcg/riscv: Use BEXTI for single-bit extractions tcg/sparc64: Use SRA, SRL for {s}extract_i64 disas/riscv: Guard dec->cfg dereference for host disassemble util/cpuinfo-riscv: Detect Zbs accel/tcg: Call tcg_tb_insert() for one-insn TBs linux-user: Add missing /proc/cpuinfo fields for sparc -----BEGIN PGP SIGNATURE----- iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmeKnzUdHHJpY2hhcmQu aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+Kvgf+LG9UjXlWF9GK923E TllBL2rLf1OOdtTXWO15VcvGMoWDwB3tVBdhihdvXmnWju+WbfMk6mct5NhzsKn9 LmuugMIZs+hMROj+bgMK8x47jRIh5N2rDYxcEgmyfIpYb2o9qvyqKecGVRlSJTCE bmt5UFbvPThBb8upoMfq3F6evuMx0szBP7wrOwSR/VGpmzIr20UTEWo6I1ALp4uj paFaysYol4em3dIhkiuV9cL7E0EIObaNa7l9RUci/BmTq+JaVxUnW1Y2i0PEwKwG FJSfYTJk3wBgAVxC2zC2g3ZM7uKuecSXMpiFopTiuyQLp7Q61i9kCNvEq0qY5tdb DaqR/g== =cv4O -----END PGP SIGNATURE----- Merge tag 'pull-tcg-20250117' of https://gitlab.com/rth7680/qemu into staging # gpg: Signature made Fri 17 Jan 2025 13:19:33 EST # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * tag 'pull-tcg-20250117' of https://gitlab.com/rth7680/qemu: (68 commits) softfloat: Constify helpers returning float_status field accel/tcg: Call tcg_tb_insert() for one-insn TBs tcg: Document tb_lookup() and tcg_tb_lookup() linux-user: Add missing /proc/cpuinfo fields for sparc tcg/riscv: Use BEXTI for single-bit extractions util/cpuinfo-riscv: Detect Zbs tcg: Remove TCG_TARGET_HAS_deposit_{i32,i64} tcg: Remove TCG_TARGET_HAS_{s}extract_{i32,i64} tcg/tci: Remove assertions for deposit and extract tcg/tci: Provide TCG_TARGET_{s}extract_valid tcg/sparc64: Use SRA, SRL for {s}extract_i64 tcg/s390x: Fold the ext{8,16,32}[us] cases into {s}extract tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64 tcg/riscv64: Fold the ext{8,16,32}[us] cases into {s}extract tcg/ppc: Fold the ext{8,16,32}[us] cases into {s}extract tcg/mips: Fold the ext{8,16,32}[us] cases into {s}extract tcg/loongarch64: Fold the ext{8,16,32}[us] cases into {s}extract tcg/arm: Add full [US]XT[BH] into {s}extract tcg/aarch64: Expand extract with offset 0 with andi tcg/aarch64: Provide TCG_TARGET_{s}extract_valid ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2025-01-21 08:28:33 -05:00
Alexey Baturo	941f76e293	target/riscv: Support Supm and Sspm as part of Zjpm v1.0 The Zjpm v1.0 spec states there should be Supm and Sspm extensions that are used in profile specification. Enabling Supm extension enables both Ssnpm and Smnpm, while Sspm enables only Smnpm. Signed-off-by: Alexey Baturo <baturo.alexey@gmail.com> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Message-ID: <20250113194410.1307494-1-baturo.alexey@gmail.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>	2025-01-19 09:44:35 +10:00
Clément Léger	2d8e825928	target/riscv: Add Smdbltrp ISA extension enable switch Add the switch to enable the Smdbltrp ISA extension and disable it for the max cpu. When Smdbltrp is present, M-mode double trap is enabled by default and MSTATUS.MDT needs to be cleared to avoid taking a double trap. OpenSBI does not currently support this, so disable it for the max cpu to avoid breaking regression tests. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Message-ID: <20250116131539.2475785-1-cleger@rivosinc.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>	2025-01-19 09:44:35 +10:00
Clément Léger	00af7d5360	target/riscv: Implement Smdbltrp behavior When the Smdbltrp ISA extension is enabled, if a trap happens while MSTATUS.MDT is already set, it will trigger an abort, or an NMI if the Smrnmi extension is available. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-ID: <20250110125441.3208676-9-cleger@rivosinc.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>	2025-01-19 09:44:35 +10:00

15794 commits