qemu-cr16

Author	SHA1	Message	Date
Paolo Bonzini	f48aaf926e	target/i386/tcg: fix a few instructions that do not support VEX.L=1 Match the contents of table 2-17 ("#UD Exception and VEX.L Field Encoding") in the SDM, for instruction in exception class 5. They were incorrectly accepting 256-bit versions that do not exist. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 2eb8d9734355ed86e162dce2a3f265ffee4005ed) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2026-01-28 12:01:22 +03:00
Philippe Mathieu-Daudé	4131a1d83c	accel/nvmm: Fix 'cpu' typo in nvmm_init_vcpu() Fix typo to avoid the following build failure: target/i386/nvmm/nvmm-all.c: In function 'nvmm_init_vcpu': target/i386/nvmm/nvmm-all.c:988:9: error: 'AccelCPUState' has no member named 'vcpu_dirty' 988 | qcpu->vcpu_dirty = true; | ^~ Cc: qemu-stable@nongnu.org Reported-by: Thomas Huth <thuth@redhat.com> Fixes: `2098164a6b` ("accel/nvmm: Replace @dirty field by generic CPUState::vcpu_dirty field") Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Tested-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Message-ID: <20260113203924.81560-1-philmd@linaro.org> (cherry picked from commit 7be4256281f430f726366c92ffdea0b72651de8a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2026-01-18 20:29:45 +03:00
Paolo Bonzini	11e286fb93	target/i386/tcg: allow VEX in 16-bit protected mode VEX is only forbidden in real and vm86 mode; 16-bit protected mode supports it for some unfathomable reason. Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit ed88bdcfbdcf9d411607cd690f93f915feff6a5b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2026-01-16 14:29:07 +03:00
Paolo Bonzini	6594e50e7e	target/i386/tcg: mask addresses for VSIB VSIB can have either 32-bit or 64-bit addresses, pass a constant mask to the helper and apply it before the load. Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 5e3572ef2e94608568b1a73eab9d382b250936eb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2026-01-16 14:28:56 +03:00
Paolo Bonzini	51bc24d427	target/i386/tcg: do not mark all SSE instructions as unaligned If the vex_special field was not initialized, it was considered to be X86_VEX_SSEUnaligned (whose value was zero). Add a new value to fix that. Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 73dd6e4a36dd8d85548292f382a4d479e2810371) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2026-01-16 14:28:45 +03:00
Paolo Bonzini	59c9137156	target/i386/tcg: ignore V3 in 32-bit mode From the manual: "In 64-bit mode all 4 bits may be used. [...] In 32-bit and 16-bit modes bit 6 must be 1 (if bit 6 is not 1, the 2-byte VEX version will generate LDS instruction and the 3-byte VEX version will ignore this bit)." Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 0db1b556e4bcd7a51f222cda9e14850f88fe3f88) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2025-12-29 10:44:56 +03:00
Andrew Cooper	b33a563281	target/i386: Fix #GP error code for INT instructions While the (intno << shift) expression is correct for indexing the IDT based on whether Long Mode is active, the error code itself was unchanged with AMD64, and is still the index with 3 bits of metadata in the bottom. Found when running a Xen unit test, all under QEMU. The unit test objected to being told there was an error with IDT index 256 when INT $0x80 (128) was the problem instruction: ... Error: Unexpected fault 0x800d0802, #GP[IDT[256]] ... Fixes: `d2fd1af767` ("x86_64 linux user emulation") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Link: https://lore.kernel.org/r/20250312000603.3666083-1-andrew.cooper3@citrix.com Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3160 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 60efba3c1bff0d78632d45c2dc927c5bc7a17ba8) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2025-12-29 10:44:47 +03:00
Nguyen Dinh Phi	3bee93b9ab	accel/hvf: Fix i386 HVF compilation failures Recent changes introduced build errors in the i386 HVF backend: - ../accel/hvf/hvf-accel-ops.c:163:17: error: no member named 'guest_debug_enabled' in 'struct AccelCPUState' 163 | cpu->accel->guest_debug_enabled = false; - ../accel/hvf/hvf-accel-ops.c:151:51 error: no member named 'unblock_ipi_mask' in 'struct AccelCPUState' - ../target/i386/hvf/hvf.c:736:5 error: use of undeclared identifier 'rip' - ../target/i386/hvf/hvf.c:737:5 error: use of undeclared identifier 'env' This patch corrects the field usage and moves the identifiers to the correct function, ensuring successful compilation of the i386 HVF backend. These issues were caused by: Fixes: `2ad756383e` (“accel/hvf: Restrict ARM-specific fields of AccelCPUState”) Fixes: `2a21c92447` (“target/i386/hvf: Factor hvf_handle_vmexit() out”) Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20251126094601.56403-1-phind.uet@gmail.com> [PMD: Keep setting vcpu_dirty on AArch64] Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Nguyen Dinh Phi <phind.uet@gmail.com> Message-Id: <20251128085854.53539-1-phind.uet@gmail.com>	2025-12-01 21:21:16 +01:00
Paolo Bonzini	106d766c9d	target/i386: fix stack size when delivering real mode interrupts The stack can be 32-bit even in real mode, and in this case the stack pointer must be updated in its entirety rather than just the bottom 16 bits. The same is true of real mode IRET, for which there was even a comment suggesting the right thing to do. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1506 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-11-17 09:49:26 +01:00
Paolo Bonzini	9c3afb9d9b	target/i386: svm: fix sign extension of exit code The exit_code parameter of cpu_vmexit is declared as uint32_t, but exit codes are 64 bits wide according to the AMD SVM specification. And because uint32_t is unsigned, this causes exit codes to be zero-extended, for example writing SVM_EXIT_ERR as 0xffff_ffff instead of the expected 0xffff_ffff_ffff_ffff. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2977 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-11-17 09:49:26 +01:00
Paolo Bonzini	ebb46ba6a4	target/i386/tcg: validate segment registers Correctly reject invalid segment registers, including CS when used as the destination of a MOV. Ignore the REX prefix as well. Fixes: `5e9e21bcc4` ("target/i386: move 60-BF opcodes to new decoder", 2024-05-07) Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3195 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-11-17 09:49:26 +01:00
Peter Maydell	ebd9ea2947	target/i386: Mark VPERMILPS as not valid with prefix 0 There are a small set of binary SSE insns which have no MMX equivalent, which we create the gen functions for with the BINARY_INT_SSE() macro. This forwards to gen_binary_int_sse() with a NULL pointer for 'mmx'. For almost all of these insns we correctly mark them in the decode table as not permitting a zero prefix byte; however we got this wrong for VPERMILPS, with the result that a bogus instruction would get through the decode checks and end up in gen_binary_int_sse() trying to call a NULL pointer. Correct the decode table entry for VPERMILPS so that we get the expected #UD exception. In the x86 SDM, table A-4 "Three-byte Opcode Map: 08H-FFH (First Two Bytes are 0F 38H)" confirms that there is no pfx 0 version of VPERMILPS. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3199 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Link: https://lore.kernel.org/r/20251114175417.2794804-1-peter.maydell@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-11-17 09:49:25 +01:00
Nguyen Dinh Phi	46b06eaeb4	target/i386: emulate: Make sure fetch_instruction exist before calling it Currently, this function is only available in MSHV. If a different accelerator is used, and the code jumps to this section, a segfault will occur. (I ran into this with HVF) Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com> Link: https://lore.kernel.org/r/20251114082915.71884-2-phind.uet@gmail.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-11-17 09:49:25 +01:00
Peter Maydell	4f503afc7e	target/x86: Correctly handle invalid 0x0f 0xc7 0xxx insns In the decode_group9() function, if we don't recognise the insn as one that we should handle, we leave the 'entry' pointer unaltered. Because the X86OpEntry struct has a union for the gen and decode pointers, this means that the top level code will call decode.e.gen() which tries to use the decode function pointer (still set to decode_group9) as a gen function pointer. This is undefined behaviour, but seems to be mostly harmless in practice (we call decode_group9() again with bogus arguments and it does nothing). If you have CFI enabled then it will trip the CFI check: ../target/i386/tcg/decode-new.c.inc:2862:9: runtime error: control flow integrity check for type 'void (struct DisasContext *, struct X86DecodedInsn *)' failed during indirect function call Set *entry to UNKNOWN_OPCODE to provoke the #UD exception, as we do in decode_group1A() and decode_group11() for similar situations. Thanks to the bug reporter for the clear description and analysis of the bug and the simple reproducer. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3172 Fixes: `fcd16539eb` ("target/i386: convert CMPXCHG8B/CMPXCHG16B to new decoder") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20251021173152.1695997-1-peter.maydell@linaro.org>	2025-11-10 12:02:45 +01:00
Bin Guo	74343a438c	migration: Don't free the reason after calling migrate_add_blocker Function migrate_add_blocker will free the reason and set it to NULL if failure is returned. Signed-off-by: Bin Guo <guobin@linux.alibaba.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/20251024205532.19883-1-guobin@linux.alibaba.com Signed-off-by: Peter Xu <peterx@redhat.com>	2025-11-03 16:04:10 -05:00
Gerd Hoffmann	593fe98d74	igvm: add support for initial register state load in native mode Add IgvmNativeVpContextX64 struct holding the register state (see igvm spec), and the qigvm_x86_load_context() function to load the register state. Wire up using two new functions: qigvm_x86_set_vp_context() is called from igvm file handling code and stores the boot processor context. qigvm_x86_bsp_reset() is called from i386 target cpu reset code and loads the context into the cpu registers. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-ID: <20251029105555.2492276-5-kraxel@redhat.com>	2025-11-03 07:38:53 +01:00
Gerd Hoffmann	13abf2fcb7	igvm: add support for igvm memory map parameter in native mode Add and wire up qigvm_x86_get_mem_map_entry function which converts the e820 table into an igvm memory map parameter. This makes igvm files for the native (non-confidential) platform with memory map parameter work. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-ID: <20251029105555.2492276-4-kraxel@redhat.com>	2025-11-03 07:38:53 +01:00
Philippe Mathieu-Daudé	5f34a5b642	accel/hvf: Guard hv_vcpu_run() between cpu_exec_start/end() calls Similarly to `1d78a3c3ab` for KVM, wrap hv_vcpu_run() with cpu_exec_start/end(), so that the accelerator can perform pending operations while all vCPUs are quiescent. See also explanation in commit `c265e976f4` ("cpus-common: lock-free fast path for cpu_exec_start/end"). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2025-10-31 16:26:46 +00:00
Philippe Mathieu-Daudé	2a21c92447	target/i386/hvf: Factor hvf_handle_vmexit() out Factor hvf_handle_vmexit() out of hvf_arch_vcpu_exec(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2025-10-31 16:26:45 +00:00
Philippe Mathieu-Daudé	1182ede151	accel/hvf: Rename hvf_put|get_registers -> hvf_arch_put|get_registers hvf_put_registers() and hvf_get_registers() are implemented per target, rename them using the 'hvf_arch_' prefix following the per target pattern. Since they call hv_vcpu_set_reg() / hv_vcpu_get_reg(), mention they must be called on the vCPU. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Mads Ynddal <mads@ynddal.dk> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2025-10-31 16:26:45 +00:00
Philippe Mathieu-Daudé	963f1576c0	accel/hvf: Rename hvf_vcpu_exec() -> hvf_arch_vcpu_exec() hvf_vcpu_exec() is implemented per target, rename it as hvf_arch_vcpu_exec(), following the per target pattern. Since it calls hv_vcpu_run(), mention it must be called on the vCPU. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Mads Ynddal <mads@ynddal.dk> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2025-10-31 16:26:45 +00:00
Julian Ganz	b5c8cd6144	target/i386: call plugin trap callbacks We recently introduced API for registering callbacks for trap related events as well as the corresponding hook functions. Due to differences between architectures, the latter need to be called from target specific code. This change places the hook for x86 targets. Signed-off-by: Julian Ganz <neither@nut.email> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20251027110344.2289945-16-alex.bennee@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2025-10-29 14:12:43 +00:00
Paolo Bonzini	d5e1d2dea1	target/i386: clear CPU_INTERRUPT_SIPI for all accelerators Similar to what commit `df32e5c5` did for TCG; fixes boot with multiple processors on WHPX and probably more accelerators Fixes: `df32e5c568` ("i386/cpu: Prevent delivering SIPI during SMM in TCG mode", 2025-10-14) Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3178 Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-28 14:50:40 +01:00
Paolo Bonzini	1557adc826	accel/mshv: use return value of handle_pio_str_read Coverity complains because we assign to ret here but then never read it again before we overwrite it with the call to set_x64_registers(). Analyzed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-28 14:50:07 +01:00
Xiaoyao Li	639a294227	i386/kvm/cpu: Init SMM cpu address space for hotplugged CPUs The SMM cpu address space is initialized in a machine_init_done notifier. It only runs once when QEMU starts up, which leads to the issue that for any hotplugged CPU after the machine is ready, SMM cpu address space doesn't get initialized. Fix the issue by initializing the SMM cpu address space in x86_cpu_plug() when the cpu is hotplugged. Fixes: `591f817d81` ("target/i386: Define enum X86ASIdx for x86's address spaces") Reported-by: Peter Maydell <peter.maydell@linaro.org> Closes: https://lore.kernel.org/qemu-devel/CAFEAcA_3kkZ+a5rTZGmK8W5K6J7qpYD31HkvjBnxWr-fGT2h_A@mail.gmail.com/ Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20251014094216.164306-2-xiaoyao.li@intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-28 12:39:59 +01:00
Bernhard Beschow	337eece9c0	hw/i386/apic: Ensure own APIC use in apic_msr_{read,write} Avoids the `current_cpu` global and seems more robust by not "forgetting" the own APIC and then re-determining it by cpu_get_current_apic() which uses the global. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20251019210303.104718-9-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-10-21 20:16:47 +02:00
Bernhard Beschow	2fd15a24ca	hw/i386/apic: Prefer APICCommonState over DeviceState Makes the APIC API more type-safe by resolving quite a few APIC_COMMON downcasts. Like PICCommonState, the APICCommonState is now a public typedef while staying an abstract datatype. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20251019210303.104718-8-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-10-21 20:16:47 +02:00
Philippe Mathieu-Daudé	17db4d61d1	target/i386/monitor: Replace legacy cpu_physical_memory_read() calls Commit `b7ecba0f6f` ("docs/devel/loads-stores.rst: Document our various load and store APIs") mentioned cpu_physical_memory_*() methods are legacy, the replacement being address_space_*(). Replace: - cpu_physical_memory_read(len=4) -> address_space_ldl() - cpu_physical_memory_read(len=8) -> address_space_ldq() inlining the little endianness conversion via the '_le' suffix. As with the previous implementation, ignore whether the memory read succeeded or failed. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20251002145742.75624-3-philmd@linaro.org>	2025-10-16 17:07:13 +02:00
Philippe Mathieu-Daudé	152820a991	target/i386/monitor: Propagate CPU address space to 'info mem' handlers We want to replace the cpu_physical_memory_read() calls by address_space_read() equivalents. Since the latter requires an address space, and these commands are run in the context of a vCPU, propagate its first address space. Next commit will do the replacements. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20251002145742.75624-2-philmd@linaro.org>	2025-10-16 17:06:57 +02:00
Philippe Mathieu-Daudé	665a8035b7	accel/kvm: Introduce KvmPutState enum Join the 3 KVM_PUT_*_STATE definitions in a single enum. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Link: https://lore.kernel.org/r/20251008040715.81513-3-philmd@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:59 +02:00
Paolo Bonzini	58aa1d08bb	target/i386: user: do not set up a valid LDT on reset In user-mode emulation, QEMU uses the default setting of the LDT base and limit, which places it at the bottom 64K of virtual address space. However, by default there is no LDT at all in Linux processes, and therefore the limit should be 0. This is visible as a NULL pointer dereference in LSL and LAR instructions when they try to read the LDT at an unmapped address. Resolves: #1376 Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:59 +02:00
Paolo Bonzini	0d22b621b7	target/i386: fix access to the T bit of the TSS The T bit is bit 0 of the 16-bit word at offset 100 of the TSS. However, accessing it with a 32-bit word is not really correct, because bytes 102-103 contain the I/O map base address (relative to the base of the TSS) and bits 1-15 are reserved. In particular, any task switch to a TSS that has a nonzero I/O map base address is broken. This fixes the eventinj and taskswitch tests in kvm-unit-tests. Cc: qemu-stable@nongnu.org Fixes: `ad441b8b79` ("target/i386: implement TSS trap bit", 2025-05-12) Reported-by: Thomas Huth <thuth@redhat.com> Closes: https://gitlab.com/qemu-project/qemu/-/issues/3101 Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:59 +02:00
Thomas Ogrisegg	5a2faa0a0a	target/i386: fix x86_64 pushw op For x86_64 a 16 bit push op (pushw) of a memory address would generate a 64 bit store on the stack instead of a 16 bit store. For example: pushw (%rax) behaves like pushq (%rax) which is incorrect. This patch fixes that. Signed-off-by: Thomas Ogrisegg <tom-bugs-qemu@fnord.at> Link: https://lore.kernel.org/r/20250715210307.GA1115@x1.fnord.at Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:59 +02:00
YiFei Zhu	cdba90ac1b	i386/tcg/smm_helper: Properly apply DR values on SMM entry / exit do_smm_enter and helper_rsm set env->dr, but do not sync the values with cpu_x86_update_dr7. A malicious kernel may control the instruction pointer in SMM by setting a breakpoint on the SMI entry point, and after do_smm_enter cpu->breakpoints contains the stale breakpoint; and because IDT is not reloaded upon SMI entry, the debug exception handler controlled by the malicious kernel is invoked. Fixes: `01df040b52` ("x86: Debug register emulation (Jan Kiszka)") Reported-by: unvariant.winter@gmail.com Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/2bacb9b24e9d337dbe48791aa25d349eb9c52c3a.1758794468.git.zhuyifei@google.com Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:59 +02:00
Paolo Bonzini	df32e5c568	i386/cpu: Prevent delivering SIPI during SMM in TCG mode [commit message by YiFei Zhu] A malicious kernel may control the instruction pointer in SMM in a multi-processor VM by sending a sequence of IPIs via APIC: CPU0 CPU1 IPI(CPU1, MODE_INIT) x86_cpu_exec_reset() apic_init_reset() s->wait_for_sipi = true IPI(CPU1, MODE_SMI) do_smm_enter() env->hflags |= HF_SMM_MASK; IPI(CPU1, MODE_STARTUP, vector) do_cpu_sipi() apic_sipi() /* s->wait_for_sipi check passes */ cpu_x86_load_seg_cache_sipi(vector) A different sequence, SMI INIT SIPI, is also buggy in TCG because INIT is not blocked or latched during SMM. However, it is not vulnerable to an instruction pointer control in the same way because x86_cpu_exec_reset clears env->hflags, exiting SMM. Fixes: `a9bad65d2c` ("target-i386: wake up processors that receive an SMI") Analyzed-by: YiFei Zhu <zhuyifei@google.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:58 +02:00
Jon Kohler	00001a22d1	i386/kvm: Expose ARCH_CAP_FB_CLEAR when invulnerable to MDS Newer Intel hardware (Sapphire Rapids and higher) sets multiple MDS immunity bits in MSR_IA32_ARCH_CAPABILITIES but lacks the hardware-level MSR_ARCH_CAP_FB_CLEAR (bit 17): ARCH_CAP_MDS_NO ARCH_CAP_TAA_NO ARCH_CAP_PSDP_NO ARCH_CAP_FBSDP_NO ARCH_CAP_SBDR_SSDP_NO This prevents VMs with fb-clear=on from migrating from older hardware (Cascade Lake, Ice Lake) to newer hardware, limiting live migration capabilities. Note fb-clear was first introduced in v8.1.0 [1]. Expose MSR_ARCH_CAP_FB_CLEAR for MDS-invulnerable systems to enable seamless migration between hardware generations. Note: There is no impact when a guest migrates to newer hardware as the existing bit combinations already mark the host as MMIO-immune and disable FB_CLEAR operations in the kernel (see Linux's arch_cap_mmio_immune() and vmx_update_fb_clear_dis()). See kernel side discussion for [2] for additional context. [1] `22e1094ca8` ("target/i386: add support for FB_CLEAR feature") [2] https://patchwork.kernel.org/project/kvm/patch/20250401044931.793203-1-jon@nutanix.com/ Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jon Kohler <jon@nutanix.com> Link: https://lore.kernel.org/r/20251008202557.4141285-1-jon@nutanix.com Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:58 +02:00
Mathias Krause	df9a3372dd	target/i386: Fix CR2 handling for non-canonical addresses Commit `3563362ddf` ("target/i386: Introduce structures for mmu_translate") accidentally modified CR2 for non-canonical address exceptions while these should lead to a #GP / #SS instead -- without changing CR2. Fix that. A KUT test for this was submitted as [1]. [1] https://lore.kernel.org/kvm/20250612141637.131314-1-minipli@grsecurity.net/ Fixes: `3563362ddf` ("target/i386: Introduce structures for mmu_translate") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20250612142155.132175-1-minipli@grsecurity.net Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:58 +02:00
Babu Moger	d8ec0baf4a	target/i386: Add TSA feature flag verw-clear Transient Scheduler Attacks (TSA) are new speculative side channel attacks related to the execution timing of instructions under specific microarchitectural conditions. In some cases, an attacker may be able to use this timing information to infer data from other contexts, resulting in information leakage. CPUID Fn8000_0021 EAX[5] (VERW_CLEAR). If this bit is 1, the memory form of the VERW instruction may be used to help mitigate TSA. Link: https://www.amd.com/content/dam/amd/en/documents/resources/bulletin/technical-guidance-for-mitigating-transient-scheduler-attacks.pdf Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/e6362672e3a67a9df661a8f46598335a1a2d2754.1752176771.git.babu.moger@amd.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:58 +02:00
Babu Moger	c79a35acad	target/i386: Add TSA attack variants TSA-SQ and TSA-L1 Transient Scheduler Attacks (TSA) are new speculative side channel attacks related to the execution timing of instructions under specific microarchitectural conditions. In some cases, an attacker may be able to use this timing information to infer data from other contexts, resulting in information leakage. AMD has identified two sub-variants of TSA. CPUID Fn8000_0021 ECX[1] (TSA_SQ_NO). If this bit is 1, the CPU is not vulnerable to TSA-SQ. CPUID Fn8000_0021 ECX[2] (TSA_L1_NO). If this bit is 1, the CPU is not vulnerable to TSA-L1. Add the new feature word FEAT_8000_0021_ECX and corresponding bits to detect TSA variants. Link: https://www.amd.com/content/dam/amd/en/documents/resources/bulletin/technical-guidance-for-mitigating-transient-scheduler-attacks.pdf Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/12881b2c03fa351316057ddc5f39c011074b4549.1752176771.git.babu.moger@amd.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:58 +02:00
Richard Henderson	1188b07e60	* i386: fix migration issues in 10.1 * target/i386/mshv: new accelerator * rust: use glib-sys-rs * rust: fixes for docker tests Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging # gpg: Signature made Thu 09 Oct 2025 12:49:00 AM PDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [unknown] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (35 commits) rust: fix path to rust_root_crate.sh tests/docker: make --enable-rust overridable with EXTRA_CONFIGURE_OPTS MAINTAINERS: Add maintainers for mshv accelerator docs: Add mshv to documentation target/i386/mshv: Use preallocated page for hvcall qapi/accel: Allow to query mshv capabilities accel/mshv: Handle overlapping mem mappings target/i386/mshv: Implement mshv_vcpu_run() target/i386/mshv: Write MSRs to the hypervisor target/i386/mshv: Integrate x86 instruction decoder/emulator target/i386/mshv: Register MSRs with MSHV target/i386/mshv: Register CPUID entries with MSHV target/i386/mshv: Set local interrupt controller state target/i386/mshv: Implement mshv_arch_put_registers() target/i386/mshv: Implement mshv_get_special_regs() target/i386/mshv: Implement mshv_get_standard_regs() target/i386/mshv: Implement mshv_store_regs() target/i386/mshv: Add CPU create and remove logic accel/mshv: Add vCPU signal handling accel/mshv: Add vCPU creation and execution loop ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-10-09 07:59:01 -07:00
Magnus Kulke	e4a20afce5	target/i386/mshv: Use preallocated page for hvcall There are hvcalls that are invoked during MMIO exits, the payload is of dynamic size. To avoid heap allocations we can use preallocated pages as in/out buffer for those calls. A page is reserved per vCPU and used for set/get register hv calls. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-26-magnuskulke@linux.microsoft.com [Use standard MAX_CONST macro; mshv.h/mshv_int.h split. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:31 +02:00
Magnus Kulke	efc4093358	accel/mshv: Handle overlapping mem mappings QEMU maps certain regions into the guest multiple times, as seen in the trace below. Currently the MSHV kernel driver will reject those mappings. To work around this, a record is kept (a static global list of "slots", inspired by what the HVF accelerator has implemented). An overlapping region is not registered at the hypervisor, and marked as mapped=false. If there is an UNMAPPED_GPA exit, we can look for a slot that is unmapped and would cover the GPA. In this case we map out the conflicting slot and map in the requested region. mshv_set_phys_mem add=1 name=pc.bios mshv_map_memory => u_a=7ffff4e00000 gpa=00fffc0000 size=00040000 mshv_set_phys_mem add=1 name=ioapic mshv_set_phys_mem add=1 name=hpet mshv_set_phys_mem add=0 name=pc.ram mshv_unmap_memory u_a=7fff67e00000 gpa=0000000000 size=80000000 mshv_set_phys_mem add=1 name=pc.ram mshv_map_memory u_a=7fff67e00000 gpa=0000000000 size=000c0000 mshv_set_phys_mem add=1 name=pc.rom mshv_map_memory u_a=7ffff4c00000 gpa=00000c0000 size=00020000 mshv_set_phys_mem add=1 name=pc.bios mshv_remap_attempt => u_a=7ffff4e20000 gpa=00000e0000 size=00020000 The mapping table is guarded by a mutex for concurrent modification and RCU mechanisms for concurrent reads. Writes occur rarely, but we'll have to verify whether an unmapped region exists for each UNMAPPED_GPA exit, which happens frequently. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-24-magnuskulke@linux.microsoft.com [Fix format strings for trace-events; mshv.h/mshv_int.h split. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:31 +02:00
Magnus Kulke	6dec60528c	target/i386/mshv: Implement mshv_vcpu_run() Add the main vCPU execution loop for MSHV using the MSHV_RUN_VP ioctl. The execution loop handles guest entry and VM exits. There are handlers for memory r/w, PIO and MMIO to which the exit events are dispatched. In case of MMIO the i386 instruction decoder/emulator is invoked to perform the operation in user space. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-23-magnuskulke@linux.microsoft.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:31 +02:00
Magnus Kulke	64118f452c	target/i386/mshv: Write MSRs to the hypervisor Push current model-specific register (MSR) values to MSHV's vCPUs as part of setting state to the hypervisor. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-22-magnuskulke@linux.microsoft.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:31 +02:00
Magnus Kulke	9bc6a1d296	target/i386/mshv: Integrate x86 instruction decoder/emulator Connect the x86 instruction decoder and emulator to the MSHV backend to handle intercepted instructions. This enables software emulation of MMIO operations in MSHV guests. MSHV has a translate_gva hypercall that is used to access the physical guest memory. A guest might read from unmapped memory regions (e.g. OVMF will probe 0xfed40000 for a vTPM). In those cases 0xFF bytes are returned instead of aborting the execution. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-21-magnuskulke@linux.microsoft.com [mshv.h/mshv_int.h split. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:31 +02:00
Magnus Kulke	f38e2a63e5	target/i386/mshv: Register MSRs with MSHV Build and register the guest vCPU's model-specific registers using the MSHV interface. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-20-magnuskulke@linux.microsoft.com [mshv.h/mshv_int.h split. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:31 +02:00
Magnus Kulke	4fa04dd162	target/i386/mshv: Register CPUID entries with MSHV Convert the guest CPU's CPUID model into MSHV's format and register it with the hypervisor. This ensures that the guest observes the correct CPU feature set during CPUID instructions. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-19-magnuskulke@linux.microsoft.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:31 +02:00
Magnus Kulke	ca20d46fa9	target/i386/mshv: Set local interrupt controller state To set the local interrupt controller state, perform hv calls retrieving partition state from the hypervisor. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-18-magnuskulke@linux.microsoft.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:30 +02:00
Magnus Kulke	25a1d871e0	target/i386/mshv: Implement mshv_arch_put_registers() Write CPU register state to MSHV vCPUs. Various mapping functions to prepare the payload for the HV call have been implemented. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-17-magnuskulke@linux.microsoft.com [mshv.h/mshv_int.h split. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:30 +02:00
Magnus Kulke	0382c2c854	target/i386/mshv: Implement mshv_get_special_regs() Retrieve special registers (e.g. segment, control, and descriptor table registers) from MSHV vCPUs. Various helper functions to map register state representations between QEMU and MSHV are introduced. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Link: https://lore.kernel.org/r/20250916164847.77883-16-magnuskulke@linux.microsoft.com [mshv.h/mshv_int.h split. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-08 19:17:30 +02:00

2602 commits