Currently invoking 'qemu timers' command results into: "gdb.error: There
is no member named last". Let's remove the legacy 'last' field from
QEMUClock, as it was removed in v4.2.0 by the commit 3c2d4c8aa6
("timer: last, remove last bits of last").
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20251204105019.455060-3-andrey.drobyshev@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 80c97930a93c32e2e666f5420af2d5734021a135)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This reverts commit b422a7bff6.
The reporter says "The commit breaks go; if you run go build in a loop,
it eventually hangs uninterruptible (except -9) with a couple of zombie
children left over".
Reported-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20260202091753.28459-1-pbonzini@redhat.com>
(cherry picked from commit 251a3d4ca3c961d95cd624252a178a33066455a2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
By the spec, fork() copies only the thread which executes it.
So it may happen, what while one thread is doing a fork,
another thread is holding `clone_lock` mutex
(e.g. doing a `fork()` or `exit()`).
So the child process is born with the mutex being held,
and there are nobody to release it.
As the thread executing do_syscall() is not considered running,
start_exclusive() does not protect us from the case.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3226
Signed-off-by: Aleksandr Sergeev <sergeev0xef@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20260126151612.2176451-1-sergeev0xef@gmail.com>
(cherry picked from commit d22e9aec572396836782e993cb18d598e6012688)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Whilst the spec doesn't speak to it directly my assumption is that
a request for more operations than exist should result in an invalid
input error return.
Fixes: 77a8e9fe0e ("hw/cxl/cxl-mailbox-utils: Add support for Media operations discovery commands cxl r3.2 (8.2.10.9.5.3)")
Closes: https://lore.kernel.org/qemu-devel/CAFEAcA-p5wZkNxK7wNVq_3PAzEE-muOd1Def-0O-FSpck4DrBQ@mail.gmail.com/
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260102154731.474859-3-Jonathan.Cameron@huawei.com>
(cherry picked from commit 25465c0e1fd74d2118dfec03912f2595eeb497d7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The both the size and base of a media sanitize operation are both provided
by the VM, an overflow is possible which may result in checks on valid
range passing when they should not. Close that by checking for overflow
on the addition.
Fixes: 40ab4ed107 ("hw/cxl/cxl-mailbox-utils: Media operations Sanitize and Write Zeros commands CXL r3.2(8.2.10.9.5.3)")
Closes: https://lore.kernel.org/qemu-devel/CAFEAcA8Rqop+ju0fuxN+0T57NBG+bep80z45f6pY0ci2fz_G3A@mail.gmail.com/
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260102154731.474859-2-Jonathan.Cameron@huawei.com>
(cherry picked from commit 87f8e5a71d061964c9bfa4d6e02db47f54dd61f7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fix inverted error check in virgl_cmd_resource_create_blob() that causes
the function to return error when virtio_gpu_create_mapping_iov() succeeds.
virtio_gpu_create_mapping_iov() returns 0 on success and negative values
on error. The check 'if (!ret)' incorrectly treats success (ret=0) as an
error condition, causing the function to fail when it should succeed.
Change the condition to 'if (ret != 0)' to properly detect errors.
Fixes: 7c092f17cc ("virtio-gpu: Handle resource blob commands")
Signed-off-by: Honglei Huang <honghuan@amd.com>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260113015203.3643608-2-honghuan@amd.com>
(cherry picked from commit 3560b51979577afc3ab937fd8b02597515bdfbae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
virtio_pmem_flush() treats a NULL return from virtqueue_pop() as a fatal
error and calls virtio_error(), which puts the device into NEEDS_RESET.
However, virtqueue handlers can be invoked when no element is available,
so an empty queue should be handled as a benign no-op.
With a Linux guest this avoids spurious NEEDS_RESET and the resulting
-EIO propagation (e.g. EXT4 journal abort and remount-ro).
Signed-off-by: Li Chen <me@linux.beauty>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260106083859.380338-1-me@linux.beauty>
(cherry picked from commit efd581a8cd4405ca183ecd017072b0c878802d69)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When `owner` == `mr`, `object_unparent` will crash:
object_unparent(mr) ->
object_property_del_child(mr, mr) ->
object_finalize_child_property(mr, name, mr) ->
object_unref(mr) ->
object_finalize(mr) ->
object_property_del_all(mr) ->
object_finalize_child_property(mr, name, mr) ->
object_unref(mr) ->
fail on g_assert(obj->ref > 0)
However, passing a different `owner` to `memory_region_init` does not
work. `memory_region_ref` has an optimization where it takes a ref
only on the owner. That means when flatviews are created, it does not
take a ref on the region and you can get a UAF from `flatview_destroy`
called from RCU.
The correct fix therefore is to use `NULL` as the name which will set
the `owner` but not the `parent` (which is still NULL). This allows us
to use `memory_region_ref` on itself while not having to rely on unparent
for cleanup.
Signed-off-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260103214400.71694-1-j@getutm.app>
(cherry picked from commit e27194e087aede62dbe3d2805c6f1aa30d3465df)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This backend driver is used for demonstration purposes only, unlimited
size leads QEMU OOM.
Fixes: CVE-2025-14876
Fixes: 1653a5f3fc ("cryptodev: introduce a new cryptodev backend")
Reported-by: 이재영 <nakamurajames123@gmail.com>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251221024321.143196-3-zhenwei.pi@linux.dev>
(cherry picked from commit 7b913094c703641a0442bb1d1165323a019c591c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The total lenght of request is limited by cryptodev config, verify it
to avoid unexpected request from guest.
Fixes: CVE-2025-14876
Fixes: 0e660a6f90 ("crypto: Introduce RSA algorithm")
Reported-by: 이재영 <nakamurajames123@gmail.com>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251221024321.143196-2-zhenwei.pi@linux.dev>
(cherry picked from commit 91c6438caffc880e999a7312825479685d659b44)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When migrating, dst QEMU by default has SMRAM unlocked,
and since wmask is not migrated, the migrated value of
MCH_HOST_BRIDGE_F_SMBASE in config space fall to prey of
mch_update_smbase_smram()
...
if (pd->wmask[MCH_HOST_BRIDGE_F_SMBASE] == 0xff) {
*reg = 0x00;
and is getting cleared and leads to unlocked smram
on dst even if on source it's been locked.
As Andrey has pointed out [1], we should derive wmask
from config and not other way around.
Drop offending chunk and resync wmask based on MCH_HOST_BRIDGE_F_SMBASE
register value. That would preserve the register during
migration and set smram regions into corresponding state.
What that changes is:
that it would let guest write junk values in register
(with no apparent effect) until it's stumbles upon
reserved 0x1 [|] 0x2 values, at which point it
would be only possible to lock register and trigger
switch to SMRAM blackhole in CPU AS.
While at it, fix up test by removing junk discard before negotiation hunk.
PS2:
Instead of adding a dedicated post_load handler for it,
reuse mch_update->mch_update_smbase_smram call chain
that is called on write/reset/post_load to be consistent
with how we handle mch registers.
PS3:
for prosterity here is erro message Andrey got due to this bug:
qemu: vfio_container_dma_map(0x..., 0x0, 0xa0000, 0x....) = -22 (Invalid argument)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
1) https://patchew.org/QEMU/20251203180851.6390-1-arbn@yandex-team.com/
Fixes: f404220e27 ("q35: implement 128K SMRAM at default SMBASE address")
Reported-by: Andrey Ryabinin <arbn@yandex-team.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrey Ryabinin <arbn@yandex-team.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251211165454.288476-1-imammedo@redhat.com>
(cherry picked from commit 66cf169e29b06dca104c5a229fba0da4ce33599c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
PCI_SRIOV_* are offsets into the SR-IOV capability, not into the PCI
config space. pcie_sriov_pf_exit() erroneously takes them as the latter,
which makes it read PCI_HEADER_TYPE and PCI_BIST when it tries to read
PCI_SRIOV_TOTAL_VF.
In many cases we're lucky enough that the PCI config space will be 0
there, so we just skip the whole for loop, but this isn't guaranteed.
For example, setting the multifunction bit on the PF and then doing a
'device_del' on it will get a larger number and cause a segfault.
Fix this and access the real PCI_SRIOV_* fields in the capability.
Cc: qemu-stable@nongnu.org
Fixes: 19e55471d4 ('pcie_sriov: Allow user to create SR-IOV device')
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251205145718.55136-1-kwolf@redhat.com>
(cherry picked from commit f73e5ed9bc4cfacf041323a6b40a85e6b6459b75)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Setting the sriov-pf property on devices that aren't PCI Express causes
an assertion failure:
$ qemu-system-x86_64 \
-blockdev null-co,node-name=null \
-blockdev null-co,node-name=null2 \
-device virtio-blk,drive=null,id=pf \
-device virtio-blk,sriov-pf=pf,drive=null2
qemu-system-x86_64: ../hw/pci/pcie.c:1062: void pcie_add_capability(PCIDevice *, uint16_t, uint8_t, uint16_t, uint16_t): Assertion `offset >= PCI_CONFIG_SPACE_SIZE' failed.
This is because proxy->last_pcie_cap_offset is only initialised to a
non-zero value in virtio_pci_realize() if it's a PCI Express device, and
then virtio_pci_device_plugged() still tries to use it.
To fix this, just skip the SR-IOV code for !pci_is_express(). Then the
next thing pci_qdev_realize() does is call pcie_sriov_register_device(),
which returns the appropriate error.
Cc: qemu-stable@nongnu.org
Fixes: d0c280d3fa ('pcie_sriov: Make a PCI device with user-created VF ARI-capable')
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251204172657.174391-1-kwolf@redhat.com>
(cherry picked from commit 623db856476806124e9ae45fbc39e75012261570)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In `virtio_add_resource` function, the UUID used as a key for
`g_hash_table_insert` was temporary, which could lead to
invalid lookups when accessed later. This patch ensures that
the UUID remains valid by duplicating it into a newly allocated
memory space. The value is then inserted into the hash table
with this persistent UUID key to ensure that the key stored in
the hash table remains valid as long as the hash table entry
exists.
Fixes: faefdba847 ("hw/display: introduce virtio-dmabuf")
Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Jim MacArthur <jim.macarthur@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251204162129.262745-1-dbassey@redhat.com>
(cherry picked from commit fff77cfb8413190c6362b95203ef0973c83b50d2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When stopping a vhost-vdpa device, only the first queue pair is marked as suspended,
while the remaining queues are not updated to the suspended state.
As a result, when stopping a multi-queue vhost-vdpa device,
the following error message will be printed.
qemu-system-x86_64:vhost VQ 2 ring restore failed: -1: Operation not permitted (1)
qemu-system-x86_64:vhost VQ 3 ring restore failed: -1: Operation not permitted (1)
So move v->suspended to v->shared, and then all the vhost_vdpa devices cannot
have different suspended states.
Fixes: 0bb302a996 ("vdpa: add vhost_vdpa_suspend")
Suggested-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Wafer Xie <wafer@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20251119132452.3117-1-wafer@jaguarmicro.com>
(cherry picked from commit fd3a2c601ab4a1bdb669e4c584b364e00a978702)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the previous design, the I2C model updated dma_dram_offset only when
firmware programmed the RX/TX DMA buffer address registers. The firmware
used to rewrite these registers before issuing each DMA command.
The firmware driver behavior has changed to program the DMA address
registers only once during I2C initialization. As a result, the I2C model
no longer refreshes dma_dram_offset, causing DMA to move data into an
incorrect DRAM address.
Fix this by introducing helper functions to update dma_dram_offset from
the DMA address registers, and invoke them right before handling TX/RX
DMA operations. This guarantees DMA always uses the correct buffer
address even if the registers are programmed only once.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Fixes: c400c38854 ("hw/i2c/aspeed: Introduce a new dma_dram_offset attribute in AspeedI2Cbus")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260203020855.1642884-5-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit efea7ddb4689a1ac4bce63a9ddb32887c7f3ac50)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the previous design, the I2C model would update I2CC_DMA_LEN (0x54) based on
the value of I2CM_DMA_LEN (0x1C) when the firmware set either I2CM_DMA_TX_ADDR
(0x30) or I2CM_DMA_RX_ADDR (0x34). However, this only worked correctly if the
firmware set I2CM_DMA_LEN before setting I2CM_DMA_TX_ADDR or I2CM_DMA_RX_ADDR.
If the firmware instead set I2CM_DMA_TX_ADDR or I2CM_DMA_RX_ADDR before setting
I2CM_DMA_LEN, the value written to I2CC_DMA_LEN would be incorrect.
To fix this issue, the model should be updated to set I2CC_DMA_LEN when the
firmware writes to the I2CM_DMA_LEN register, rather than when it writes to the
I2CM_DMA_RX_ADDR and I2CM_DMA_TX_ADDR registers.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Fixes: ba2cccd64e ("aspeed: i2c: Add new mode support")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260102090746.1130033-4-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 9cbd8ee7f67fceee51d3c993a282e5adc397b6b9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
EHCI3 and EHCI4 were missing entries in aspeed_soc_ast2700a1_irqmap,
so their source IRQs were never routed through the INTC OR-gates.
As a result, EHCI3/4 interrupts were not propagated to the GIC,
causing incorrect interrupt behavior for these controllers.
Add EHCI3 and EHCI4 to the IRQ map and route them to the same INTC
group as other shared peripherals, ensuring their interrupts are
properly connected to the GIC.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Fixes: ba27ba302a ("hw/arm: ast27x0: Wire up EHCI controllers")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260203020855.1642884-2-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 7d64f04863ed23f6a142fb8f47c5a470c0e081f9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>