[2] initialized 'No Reboot' bit to 1 by default. And due to quirk it happened
to work with linux iTCO_wdt driver (which clears it on module load).
However spec [1] states:
"
R/W. This bit is set when the “No Reboot” strap (SPKR pin on
ICH9) is sampled high on PWROK.
"
So it should be set only when '-global ICH9-LPC.noreboot=true' and cleared
when it's false (which should be default).
Fix it to behave according to spec and set 'No Reboot' bit only when
'-global ICH9-LPC.noreboot=true'.
1)
Intel I/O Controller Hub 9 (ICH9) Family Datasheet (rev: 004)
2)
Fixes: 920557971b (ich9: add TCO interface emulation)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250922132600.562193-1-imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Implement the PRI callbacks in vtd_iommu_ops.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-6-clement.mathieu--drif@eviden.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-5-clement.mathieu--drif@eviden.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-4-clement.mathieu--drif@eviden.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
wait_desc with SW=0,IF=0,FN=1 must not be considered as an
invalid descriptor as it is used to implement section 7.10 of
the VT-d spec.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-3-clement.mathieu--drif@eviden.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-2-clement.mathieu--drif@eviden.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With current limit set to match max spec size (2PTb),
Windows fails to parse type 17 records when DIMM size reaches 4Tb+.
Failure happens in GetPhysicallyInstalledSystemMemory() function,
and fails "Check SMBIOS System Memory Tables" SVVP test.
Though not fatal, it might cause issues for userspace apps,
something like [1].
Lets cap default DIMM size to 2Tb for now, until MS fixes it.
1) https://issues.redhat.com/browse/RHEL-81999?focusedId=27731200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-27731200
PS: It's obvious 32 int overflow math somewhere in Windows,
MS admitted that it's Windows bug and in a process of fixing it.
However it's unclear if W10 and earlier would get the fix.
So however I dislike changing defaults, we heed to work around
the issue (it looks like QEMU regression while not being it).
Hopefully 2Tb/DIMM split will last longer until VM memory size
will become large enough to cause to many type 17 records issue
again.
PPS:
Alternatively, instead of messing with defaults, we can create
a dedicated knob to ask for desired DIMM size cap explicitly
on CLI. That will let users to enable workaround when they
hit this corner case. Downside is that knob has to be propagated
up all mgmt stack, which might be not desirable.
PPPS:
Yet alternatively, users can configure initial RAM to be less
than 4Tb and all additional RAM add as DIMMs on QEMU CLI.
(however it's the job to be done by mgmt which could know
Windows version and total amount of RAM)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Fixes: 62f182c97b ("smbios: make memory device size configurable per Machine")
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901084915.2607632-1-imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We didn't make the device user creatable in the first place because we
were worried users might get confused. Rename the device to make its
nature as a test device even more explicit. While we are at it add a
Kconfig variable so it can be skipped for those that want to thin out
their build configuration even further.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20250820195632.1956795-1-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901105948.982583-1-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Starting with commit cab1398a60, SR-IOV VFs are realized as soon as
pcie_sriov_pf_init() is called. Because pcie_sriov_pf_init() must be
called before pcie_sriov_pf_init_vf_bar(), the VF BARs types won't be
known when the VF realize function calls pcie_sriov_vf_register_bar().
This breaks the memory regions of the VFs (for instance with igbvf):
$ lspci
...
Region 0: Memory at 281a00000 (64-bit, prefetchable) [virtual] [size=16K]
Region 3: Memory at 281a20000 (64-bit, prefetchable) [virtual] [size=16K]
$ info mtree
...
address-space: pci_bridge_pci_mem
0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci
0000000081a00000-0000000081a03fff (prio 1, i/o): igbvf-mmio
0000000081a20000-0000000081a23fff (prio 1, i/o): igbvf-msix
and causes MMIO accesses to fail:
Invalid write at addr 0x281A01520, size 4, region '(null)', reason: rejected
Invalid read at addr 0x281A00C40, size 4, region '(null)', reason: rejected
To fix this, VF BARs are now registered with pci_register_bar() which
has a type parameter and pcie_sriov_vf_register_bar() is removed.
Fixes: cab1398a60 ("pcie_sriov: Reuse SR-IOV VF device instances")
Signed-off-by: Damien Bergamini <damien.bergamini@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901151314.1038020-1-clement.mathieu--drif@eviden.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When using a CXL Type 3 device together with a virtio 9p device in QEMU on a
physical server, the 9p device fails to initialize properly. The kernel reports
the following error:
virtio: device uses modern interface but does not have VIRTIO_F_VERSION_1
9pnet_virtio virtio0: probe with driver 9pnet_virtio failed with error -22
Further investigation revealed that the 64-bit BAR space assigned to the 9pnet
device was overlapped by the memory window allocated for the CXL devices. As a
result, the kernel could not correctly access the BAR region, causing the
virtio device to malfunction.
An excerpt from /proc/iomem shows:
480010000-cffffffff : CXL Window 0
480010000-4bfffffff : PCI Bus 0000:00
4c0000000-4c01fffff : PCI Bus 0000:0c
4c0000000-4c01fffff : PCI Bus 0000:0d
4c0200000-cffffffff : PCI Bus 0000:00
4c0200000-4c0203fff : 0000:00:03.0
4c0200000-4c0203fff : virtio-pci-modern
To address this issue, this patch adds the reserved memory end calculation
for cxl devices to reserve sufficient address space and ensure that CXL memory
windows are allocated beyond all PCI 64-bit BARs. This prevents overlap with
64-bit BARs regions such as those used by virtio or other pcie devices,
resolving the conflict.
QEMU Build Configuration:
./configure --prefix=/home/work/qemu_master/build/ \
--target-list=x86_64-softmmu \
--enable-kvm \
--enable-virtfs
QEMU Boot Command:
sudo /home/work/qemu_master/qemu/build/qemu-system-x86_64 \
-nographic -machine q35,cxl=on -enable-kvm -m 16G -smp 8 \
-hda /home/work/gp_qemu/rootfs.img \
-virtfs local,path=/home/work/gp_qemu/share,mount_tag=host0,security_model=passthrough,id=host0 \
-kernel /home/work/linux_output/arch/x86/boot/bzImage \
--append "console=ttyS0 crashkernel=256M root=/dev/sda rootfstype=ext4 rw loglevel=8" \
-object memory-backend-ram,id=vmem0,share=on,size=4096M \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
-device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
-device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0,sn=0x123456789 \
-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
Fixes: 03b39fcf64 ("hw/cxl: Make the CXL fixed memory window setup a machine parameter")
Signed-off-by: peng guo <engguopeng@buaa.edu.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250805142300.15226-1-engguopeng@buaa.edu.cn>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is useful to be able to freeze a specific version of SeaBIOS to
prevent guest visible changes between BIOS updates. This is currently
not possible since the extension byte 2 provided by SeaBIOS does not
set the VM bit, whereas QEMU sets it unconditionally.
Allowing to clear it also seems useful if we want to hide the fact that
the guest system is running inside a virtual machine.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250724195409.43499-1-d-tatianin@yandex-team.ru>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
While the HEST layout didn't change, there are some internal
changes related to how offsets are calculated and how memory error
events are triggered.
Update specs to reflect such changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <e3e8bd92ce40d997c67ac1d4d973c0041b8f59fc.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that we have everything in place, enable using HEST GPA
instead of etc/hardware_errors GPA.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <ad77b64aa1f09141efe942539445908631423975.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Create a QMP command to be used for generic ACPI APEI hardware error
injection (HEST) via GHESv2, and add support for it for ARM guests.
Error injection uses ACPI_HEST_SRC_ID_QMP source ID to be platform
independent. This is mapped at arch virt bindings, depending on the
types supported by QEMU and by the BIOS. So, on ARM, this is supported
via ACPI_GHES_NOTIFY_GPIO notification type.
This patch was co-authored:
- original ghes logic to inject a simple ARM record by Shiju Jose;
- generic logic to handle block addresses by Jonathan Cameron;
- generic GHESv2 error inject by Mauro Carvalho Chehab;
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-authored-by: Shiju Jose <shiju.jose@huawei.com>
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <81e2118b3c8b7e5da341817f277d61251655e0db.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Adds support to ARM virtualization to allow handling
generic error ACPI Event via GED & error source device.
It is aligned with Linux Kernel patch:
https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <3237a76b1469d669436399495825348bf34122cd.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We'll be adding a new GED device for HEST GPIO notification and
increasing the number of entries at the HEST table.
Blocklist testing HEST and DSDT tables until such changes
are completed.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <7fca7eb9b801f1b196210f66538234b94bd31c23.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Adds a generic error device to handle generic hardware error
events as specified at ACPI 6.5 specification at 18.3.2.7.2:
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources
using HID PNP0C33.
The PNP0C33 device is used to report hardware errors to
the guest via ACPI APEI Generic Hardware Error Source (GHES).
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <2790f664c849d53de0ce3049fa8c7950c1de1f86.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Create a new property (x-has-hest-addr) and use it to detect if
the GHES table offsets can be calculated from the HEST address
(qemu 10.0 and upper) or via the legacy way via an offset obtained
from the hardware_errors firmware file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <c4eb3cf32a3f158ae62dac29e866ac3f373956c3.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The GHES migration logic should now support HEST table location too.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <ede7ddf4b10f34094a4327dc458d630ad319bd1c.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some error injection notify methods are async, like GPIO
notify. Add a notifier to be used when the error record is
ready to be sent to the guest OS.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <edf9c6e5b80dc57e3443893bf9e1eb25cb9d266b.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current code is actually dependent on having just one error
structure with a single source, as any change there would cause
migration issues.
As the number of sources should be arch-dependent, as it will depend on
what kind of notifications will exist, and how many errors can be
reported at the same time, change the logic to be more flexible,
allowing the number of sources to be defined when building the
HEST table by the caller.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <1698680848c11d6f26368426f1657e14faaf55c4.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are two pointers that are needed during error injection:
1. The start address of the CPER block to be stored;
2. The address of the read ack.
It is preferable to calculate them from the HEST table. This allows
checking the source ID, the size of the table and the type of the
HEST error block structures.
Yet, keep the old code, as this is needed for migration purposes
from older QEMU versions.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <d4344e8dbe66372e1e093d968eda2e8b0527ba48.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Store HEST table address at GPA, placing its the start of the table at
hest_addr_le variable.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <c29aa5e6ab9b2d93dd5328481630c3b03da86261.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a new ags flag to change the way HEST offsets are calculated.
Currently, offsets needed to store ACPI HEST offsets and read ack
are calculated based on a previous knowledge from the logic
which creates the HEST table.
Such logic is not generic, not allowing to easily add more HEST
entries nor replicates what OSPM does.
As the next patches will be adding a more generic logic, add a
new use_hest_addr, set to false, in preparation for such changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <f5de17bf04b27828e1a439ad396b4f7982eaf156.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move the check logic into a common function and simplify the
code which checks if GHES is enabled and was properly setup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <2bbb1d3eb88b0a668114adef2f1c2a94deebba0e.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The ghes_record_cper_errors() function was introduced to be used
by other types of errors, as part of the error injection
patch series. That's why it is not static.
Make it non-static again to allow its usage outside ghes.c
This reverts commit 611f3bdb20.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <14f2a888cfbf922d5f2bf94d7612114f25107d59.1758610789.git.mchehab+huawei@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When any host or guest GSO over UDP tunnel offload is enabled the
virtio net header includes the additional tunnel-related fields,
update the size accordingly.
Push the GSO over UDP tunnel offloads all the way down to the tap
device extending the newly introduced NetFeatures struct, and
eventually enable the associated features.
As per virtio specification, to convert features bit to offload bit,
map the extended features into the reserved range.
Finally, make the vhost backend aware of the exact header layout, to
copy it correctly. The tunnel-related field are present if either
the guest or the host negotiated any UDP tunnel related feature:
add them to the kernel supported features list, to allow qemu
transfer to the backend the needed information.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <093b4bc68368046bffbcab2202227632d6e4e83b.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tap devices support GSO over UDP tunnel offload. Probe for such
feature in a similar manner to other offloads.
GSO over UDP tunnel needs to be enabled in addition to a "plain"
offload (TSO or USO).
No need to check separately for the outer header checksum offload:
the kernel is going to support both of them or none.
The new features are disabled by default to avoid compat issues,
and could be enabled, after that hw_compat_10_1 will be added,
together with the related compat entries.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a987a8a7613cbf33bb2209c7c7f5889b512638a7.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the extended types and helpers to manipulate the virtio_net
features.
Note that offloads are still 64bits wide, as per specification,
and extended offloads will be mapped into such range.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <bc5afdc5c1cb1a37238dd2b36004db3d46cbf211.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Provide extended version of the features manipulation helpers,
and let the device initialization deal with the full features space,
adjusting the relevant format strings accordingly.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <69c78c432e28e146a8874b2a7d00e9cbd111b1ba.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Leverage the kernel extended features manipulation ioctls(), if
available, and fallback to old ops otherwise. Error out when setting
extended features but kernel support is not available.
Note that extended support for get/set backend features is not needed,
as the only feature that can be changed belongs to the 64 bit range.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <150daade3d59e77629276920e014ee8e5fc12121.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extend the VirtioDeviceFeatures struct with an additional u64
to track unknown features in the 64-127 bit range and decode
the full virtio features spaces for vhost and virtio devices.
Also add entries for the soon-to-be-supported virtio net GSO over
UDP features.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <e51969f94d89045b333f1bc5ef5fca9e12fc371a.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similar to virtio infra, vhost core maintains the features status
in the full extended format and allows the devices to implement
extended version of the getter/setter.
Note that 'protocol_features' are not extended: they are only
used by vhost-user, and the latter device is not going to implement
extended features soon.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a0062c3b1847fb2baedd6cd8f6ef13b051d6beb2.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extend the features configuration space to 128 bits. If the virtio
device supports any extended features, allow the common read/write
operation to access all of it, otherwise keep exposing only the
lower 64 bits.
On migration, save the 128 bit version of the features only if the
upper bits are non zero. Relay on reset to clear all the feature
space before load.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <c0b81601f65b41ca8310eba8f05e2dcf3702de89.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio specifications allows for a device features space up
to 128 bits and more. Soon we are going to use some of the 'extended'
bits features for the virtio net driver.
Add support to allow extended features negotiation on a per
devices basis. Devices willing to negotiated extended features
need to implemented a new pair of features getter/setter, the
core will conditionally use them instead of the basic one.
Note that 'bad_features' don't need to be extended, as they are
bound to the 64 bits limit.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <9bb29d70adc3f2b8c7756d4e3cd076cffee87826.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the driver uses any of the extended features (i.e. 64 or above),
store the extended features range (64-127 bits).
At load time, let legacy features initialize the full features range
and pass it to the set helper; sub-states loading will have filled-up
the extended part as needed.
This is one of the few spots that need explicitly to know and set
in stone the extended features array size; add a build bug to prevent
breaking the migration should such size change again in the future:
more serialization plumbing will be needed.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <d5d9d398675bee6c4c7d7308c5d3d5d3c6d17d87.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio specifications allows for up to 128 bits for the
device features. Soon we are going to use some of the 'extended'
bits features (bit 64 and above) for the virtio net driver.
Represent the virtio features bitmask with a fixed size array, and
introduce a few helpers to help manipulate them.
Most drivers will keep using only 64 bits features space: use union
to allow them access the lower part of the extended space without any
per driver change.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <6a9bbb5eb33830f20afbcb7e64d300af4126dd98.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Update headers to include the virtio GSO over UDP tunnel features
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <0b1f3c011f90583ab52aa4fef04df6db35cc4a69.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Such annotation is present into the kernel uAPI headers since
v6.7, and will be used soon by the vhost_type.h. Deal with it
just stripping it.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a1430f43cc954d2a931fa60581bda6d6af4bc771.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The set_offload() argument list is already pretty long and
we are going to introduce soon a bunch of additional offloads.
Replace the offload arguments with a single struct and update
all the relevant call-sites.
No functional changes intended.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a9d4dd043b8c71b791e9ff05e17ef06072d9714e.1758549625.git.pabeni@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Fix MSI table size limit
* Add riscv64 to FirmwareArchitecture
* Sync RISC-V hwprobe with Linux
* Implement MonitorDef HMP API
* Update OpenSBI to v1.7
* Fix SiFive UART character drop issue and minor refactors
* Fix RISC-V timer migration issues
* Use riscv_cpu_is_32bit() when handling SBI_DBCN reg
* Use riscv_csrr in riscv_csr_read
* Align memory allocations to 2M on RISC-V
* Do not use translator_ldl in opcode_at
* Minor fixes of RISC-V CFI
* Modify minimum VLEN rule
* Fix vslide1[up|down].vx unexpected result when XLEN=32 and SEW=64
* Fixup IOMMU PDT Nested Walk
* Fix endianness swap on compressed instructions
* Update status of IOMMU kernel support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmjfQhoACgkQr3yVEwxT
gBPnTg//eQ9GMFTLcW4kFMsVYeY8TbkmQN9Wnk+XubG92siGkzuNmfy36yo7oeib
dB6/h5JLjycjttOfgyx73/TKUucyZs+ZYkVVWWQCSU+sqPTA370MmGNM8CSmPms/
lFuNIixd+sSUDIOod9zQHzxv+f3ZN2bjEAyzJAEhSXgTO+1xnOeJHHjxB5O2Z/a1
ccd3Po1wR6nm2T4x88LcHDHj8svLsfG0G1RRkU+yeLu7J6Qpp0d/lOZI7if+AQqb
Nmz65n2uSuUEuNNQIxYaQp/nbkF3DSxi3mg3+hCQjF+hMjXL4hAhSEPril3MQjGi
802nEaqG8Qdzec+bZiKt0c3e0f4SrnpDXDnz7NrtfSO6vXAvqqZuC8kTdZy8dsPU
1D809ksZoNDIB87z89MQPsQ7k1Bs2Iq9pNpB9huD3mzY4DHqYhkzysAwc8Qhvimv
pBaeSDV66OrI/al5c0FqSN0LiLHvlRcwqiATiQwIdCV+PUe+cVPwIKq6ABQiYpVu
mvnzgEJ4r7iO92hOoAGM+eRC7krafF1/gbe3SDI3RLUTDPM6hcTRcluvBlpBdNDj
lIYXs89f0jBh0I4IRGm8ftqD9xPDP56mZVEIIjSWDRTT6mfZLxWWMmXC/OK63U7/
bpJKohFOKy8P6SSvTACcLSOQlP3r+FRrmBOXs7S24U+Hr9xUep0=
=DGkt
-----END PGP SIGNATURE-----
Merge tag 'pull-riscv-to-apply-20251003-3' of https://github.com/alistair23/qemu into staging
First RISC-V PR for 10.2
* Fix MSI table size limit
* Add riscv64 to FirmwareArchitecture
* Sync RISC-V hwprobe with Linux
* Implement MonitorDef HMP API
* Update OpenSBI to v1.7
* Fix SiFive UART character drop issue and minor refactors
* Fix RISC-V timer migration issues
* Use riscv_cpu_is_32bit() when handling SBI_DBCN reg
* Use riscv_csrr in riscv_csr_read
* Align memory allocations to 2M on RISC-V
* Do not use translator_ldl in opcode_at
* Minor fixes of RISC-V CFI
* Modify minimum VLEN rule
* Fix vslide1[up|down].vx unexpected result when XLEN=32 and SEW=64
* Fixup IOMMU PDT Nested Walk
* Fix endianness swap on compressed instructions
* Update status of IOMMU kernel support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmjfQhoACgkQr3yVEwxT
# gBPnTg//eQ9GMFTLcW4kFMsVYeY8TbkmQN9Wnk+XubG92siGkzuNmfy36yo7oeib
# dB6/h5JLjycjttOfgyx73/TKUucyZs+ZYkVVWWQCSU+sqPTA370MmGNM8CSmPms/
# lFuNIixd+sSUDIOod9zQHzxv+f3ZN2bjEAyzJAEhSXgTO+1xnOeJHHjxB5O2Z/a1
# ccd3Po1wR6nm2T4x88LcHDHj8svLsfG0G1RRkU+yeLu7J6Qpp0d/lOZI7if+AQqb
# Nmz65n2uSuUEuNNQIxYaQp/nbkF3DSxi3mg3+hCQjF+hMjXL4hAhSEPril3MQjGi
# 802nEaqG8Qdzec+bZiKt0c3e0f4SrnpDXDnz7NrtfSO6vXAvqqZuC8kTdZy8dsPU
# 1D809ksZoNDIB87z89MQPsQ7k1Bs2Iq9pNpB9huD3mzY4DHqYhkzysAwc8Qhvimv
# pBaeSDV66OrI/al5c0FqSN0LiLHvlRcwqiATiQwIdCV+PUe+cVPwIKq6ABQiYpVu
# mvnzgEJ4r7iO92hOoAGM+eRC7krafF1/gbe3SDI3RLUTDPM6hcTRcluvBlpBdNDj
# lIYXs89f0jBh0I4IRGm8ftqD9xPDP56mZVEIIjSWDRTT6mfZLxWWMmXC/OK63U7/
# bpJKohFOKy8P6SSvTACcLSOQlP3r+FRrmBOXs7S24U+Hr9xUep0=
# =DGkt
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 02 Oct 2025 08:25:14 PM PDT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20251003-3' of https://github.com/alistair23/qemu: (26 commits)
docs: riscv-iommu: Update status of kernel support
target/riscv: Fix endianness swap on compressed instructions
hw/riscv/riscv-iommu: Fixup PDT Nested Walk
target/riscv: rvv: Fix vslide1[up|down].vx unexpected result when XLEN=32 and SEW=64
target/riscv: rvv: Modify minimum VLEN according to enabled vector extensions
target/riscv: rvv: Replace checking V by checking Zve32x
target/riscv: Fix ssamoswap error handling
target/riscv: Fix SSP CSR error handling in VU/VS mode
target/riscv: Fix the mepc when sspopchk triggers the exception
target/riscv: do not use translator_ldl in opcode_at
qemu/osdep: align memory allocations to 2M on RISC-V
target/riscv: use riscv_csrr in riscv_csr_read
target/riscv/kvm: Use riscv_cpu_is_32bit() when handling SBI_DBCN reg
target/riscv: Save stimer and vstimer in CPU vmstate
hw/intc: Save timers array in RISC-V mtimer VMState
migration: Add support for a variable-length array of UINT32 pointers
hw/intc: Save time_delta in RISC-V mtimer VMState
hw/char: sifive_uart: Add newline to error message
hw/char: sifive_uart: Remove outdated comment about Tx FIFO
hw/char: sifive_uart: Avoid pushing Tx FIFO when size is zero
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The iommu Linux kernel support is now upstream. VFIO is still
downstream at this stage.
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Message-ID: <20250814001452.504510-1-joel@jms.id.au>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Three instructions were not using the endianness swap flag, which resulted in a bug on big-endian architectures.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3131
Buglink: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/2123828
Fixes: e0a3054f18 ("target/riscv: add support for Zcb extension")
Signed-off-by: Valentin Haudiquet <valentin.haudiquet@canonical.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250929115543.1648157-1-valentin.haudiquet@canonical.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Current implementation is wrong when iohgatp != bare. The RISC-V
IOMMU specification has defined that the PDT is based on GPA, not
SPA. So this patch fixes the problem, making PDT walk correctly
when the G-stage table walk is enabled.
Fixes: 0c54acb824 ("hw/riscv: add RISC-V IOMMU base emulation")
Cc: qemu-stable@nongnu.org
Cc: Sebastien Boeuf <seb@rivosinc.com>
Cc: Tomasz Jeznach <tjeznach@rivosinc.com>
Reviewed-by: Weiwei Li <liwei1518@gmail.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Tested-by: Chen Pei <cp0613@linux.alibaba.com>
Tested-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
Message-ID: <20250913041233.972870-1-guoren@kernel.org>
[ Changes by AF:
- Add braces to if statements
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When XLEN is 32 and SEW is 64, the original implementation of
vslide1up.vx and vslide1down.vx helper functions fills the 32-bit value
of rs1 into the first element of the destination vector register (rd),
which is a 64-bit element.
This commit attempted to resolve the issue by extending the rs1 value
to 64 bits during the TCG translation phase to ensure that the helper
functions won't lost the higer 32 bits.
Signed-off-by: Max Chou <max.chou@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20250124073325.2467664-1-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
According to the RISC-V unprivileged specification, the VLEN should be greater
or equal to the ELEN. This commit modifies the minimum VLEN based on the vector
extensions and introduces a check rule for VLEN and ELEN.
Extension Minimum VLEN
* V 128
* Zve64[d|f|x] 64
* Zve32[f|x] 32
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20250923090729.1887406-3-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The Zve32x extension will be applied by the V and Zve* extensions.
Therefore we can replace the original V checking with Zve32x checking for both
the V and Zve* extensions.
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20250923090729.1887406-2-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>