In commits like 969e50b61a ("net: Pad short frames to minimum size
before sending from SLiRP/TAP") we switched away from requiring
network devices to handle short frames to instead having the net core
code do the padding of short frames out to the ETH_ZLEN minimum size.
We then dropped the code for handling short frames from the network
devices in a series of commits like 140eae9c8f ("hw/net: e1000:
Remove the logic of padding short frames in the receive path").
This missed one route where the device's receive code can still see a
short frame: if the device is in loopback mode and it transmits a
short frame via the qemu_receive_packet() function, this will be fed
back into its own receive code without being padded.
Add the padding logic to qemu_receive_packet().
This fixes a buffer overrun which can be triggered in the
e1000_receive_iov() logic via the loopback code path.
Other devices that use qemu_receive_packet() to implement loopback
are cadence_gem, dp8393x, lan9118, msf2-emac, pcnet, rtl8139
and sungem.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3043
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
An assertion in e1000e_write_payload_frag_to_rx_buffers() attempts to
guard against the calling code accidentally trying to write too much
data to a single RX descriptor, such that the E1000EBAState::cur_idx
indexes off the end of the EB1000BAState::written[] array.
Unfortunately it is overzealous: it asserts that cur_idx is in
range after it has been incremented. This will fire incorrectly
for the case where the guest configures four buffers and exactly
enough bytes are written to fill all four of them.
The only places where we use cur_idx and index in to the written[]
array are the functions e1000e_write_hdr_frag_to_rx_buffers() and
e1000e_write_payload_frag_to_rx_buffers(), so we can rewrite this to
assert before doing the array dereference, rather than asserting
after updating cur_idx.
Cc: qemu-stable@nongnu.org
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In e1000e_write_packet_to_guest() we attempt to ensure that we don't
write more of a packet to a descriptor than will fit in the guest
configured receive buffers. However, this code does not allow for
the "packet split" feature. When packet splitting is enabled, the
first of up to 4 buffers in the descriptor is used for the packet
header only, with the payload going into buffers 2, 3 and 4. Our
length check only checks against the total sizes of all 4 buffers,
which meant that if an incoming packet was large enough to fit in (1
+ 2 + 3 + 4) but not into (2 + 3 + 4) and packet splitting was
enabled, we would run into the assertion in
e1000e_write_hdr_frag_to_rx_buffers() that we had enough buffers for
the data:
qemu-system-i386: ../../hw/net/e1000e_core.c:1418: void e1000e_write_payload_frag_to_rx_buffers(E1000ECore *, hwaddr *, E1000EBAState *, const char *, dma_addr_t): Assertion `bastate->cur_idx < MAX_PS_BUFFERS' failed.
A malicious guest could provoke this assertion by configuring the
device into loopback mode, and then sending itself a suitably sized
packet into a suitably arrange rx descriptor.
The code also fails to deal with the possibility that the descriptor
buffers are sized such that the trailing checksum word does not fit
into the last descriptor which has actual data, which might also
trigger this assertion.
Rework the length handling to use two variables:
* desc_size is the total amount of data DMA'd to the guest
for the descriptor being processed in this iteration of the loop
* rx_desc_buf_size is the total amount of space left in it
As we copy data to the guest (packet header, payload, checksum),
update these two variables. (Previously we attempted to calculate
desc_size once at the top of the loop, but this is too difficult to
do correctly.) Then we can use the variables to ensure that we clamp
the amount of copied payload data to the remaining space in the
descriptor's buffers, even if we've used one of the buffers up in the
packet-split code, and we can tell whether we have enough space for
the full checksum word in this descriptor or whether we're going to
need to split that to the following descriptor.
I have included comments that hopefully help to make the loop
logic a little clearer.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/537
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In e1000e_write_packet_to_guest() we don't write data for RX descriptors
where the buffer address is NULL (as required by the i82574 datasheet
section 7.1.7.2). However, when we do this we still update desc_offset
by the amount of data we would have written to the RX descriptor if
it had a valid buffer pointer, resulting in our dropping that data
entirely. The data sheet is not 100% clear on the subject, but this
seems unlikely to be the correct behaviour.
Rearrange the null-descriptor logic so that we don't treat these
do-nothing descriptors as if we'd really written the data.
This both fixes a bug and also is a prerequisite to cleaning up
the size calculation logic in the next patch.
(Cc to stable largely because it will be needed for the next patch,
which fixes a more serious bug.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Makes the net_hub_port_cleanup function idempotent to avoid double
removals by guarding its QLIST_REMOVE with a flag.
When using a Xen networking device with hubport backends, e.g.:
-accel kvm,xen-version=0x40011
-netdev hubport,...
-device xen-net-device,...
the shutdown order starts with net_cleanup, which walks the list and
deletes netdevs (including hubports). Then Xen's xen_device_unrealize is
called, which eventually leads to a second net_hub_port_cleanup call,
resulting in a segfault.
Fixes: e7891c57 ("net: move backend cleanup to NIC cleanup")
Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
- stream: Fix potential crash during job completion
- aio: add the aio_add_sqe() io_uring API
- qcow2: put discards in discard queue when discard-no-unref is enabled
- qcow2, vmdk: Restrict creation with secondary file using protocol
- qemu-img rebase: Fix assertion failure due to exceeding IO_BUF_SIZE
- iotests: Run iotests with sanitizers
- iotests: Add more image formats to the thorough testing
- iotests: Improve the dry run list to speed up thorough testing
- Code cleanup
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmkTqWcRHGt3b2xmQHJl
ZGhhdC5jb20ACgkQfwmycsiPL9awPg//VqEgqYbEr3dVUvBFk8tlcewoo7KGICVk
4kddOwMJIdcsVpiLuNzqQARH2kHV93Hiv+mVt25o00PkJx565eCGTh/bBFas3UXL
JMBjgHyJutGr4cijkNrnQgqWfeTgc32xdVEWh1nZM2K7LslzC9I1PfUzfxRMYqZA
Em0KE3vwQDC7xtIyk4t451hkfcQY8fwN9bDMpD+zbzaLsYTEyOJ900En88iW7oHE
TuJhrviin11jdQCA26QVNXRaw7iIVVo8vJP1VEgbn31iY+Qpcr/HcQRs0x2gex67
OqIdh4onqkdGCFDxTGUoAH+jORXWUmk/JipIhl9pJP0ZDyAjsm97ThJ6SvctURsK
UMU0dzXEc1C5spD2CWnN0PujqHYQqYaylx7MdiCJMjaCfDB3ZeIRsTGoiLMB24P+
WBrcn2P+f03nC/sVvxRZWrpyI2kZwEh1RsO/mnLQ3apVBFeKqaFi8Ouo9oi1ZMd6
ahUw7sZSoTxmGY1FhOSRCGEh2Wjy0ZIOx9tHT1U9vig5Kf9KeE81yO8yaq2T60mq
9eaUL8rcUrKRiJw9NUkcEYmIUJrh0nUe/kK2RWmbEGMYIH7ASrGqiyUP5FxpekD+
i/uen4BeyRwe6rnPOzGolg+HMysMBr8VD/8PwJ8g88FLH1jIdTYvFUdRbrkciUlo
okC+y4+kqiU=
=SI8s
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging
Block layer patches
- stream: Fix potential crash during job completion
- aio: add the aio_add_sqe() io_uring API
- qcow2: put discards in discard queue when discard-no-unref is enabled
- qcow2, vmdk: Restrict creation with secondary file using protocol
- qemu-img rebase: Fix assertion failure due to exceeding IO_BUF_SIZE
- iotests: Run iotests with sanitizers
- iotests: Add more image formats to the thorough testing
- iotests: Improve the dry run list to speed up thorough testing
- Code cleanup
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCgAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmkTqWcRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9awPg//VqEgqYbEr3dVUvBFk8tlcewoo7KGICVk
# 4kddOwMJIdcsVpiLuNzqQARH2kHV93Hiv+mVt25o00PkJx565eCGTh/bBFas3UXL
# JMBjgHyJutGr4cijkNrnQgqWfeTgc32xdVEWh1nZM2K7LslzC9I1PfUzfxRMYqZA
# Em0KE3vwQDC7xtIyk4t451hkfcQY8fwN9bDMpD+zbzaLsYTEyOJ900En88iW7oHE
# TuJhrviin11jdQCA26QVNXRaw7iIVVo8vJP1VEgbn31iY+Qpcr/HcQRs0x2gex67
# OqIdh4onqkdGCFDxTGUoAH+jORXWUmk/JipIhl9pJP0ZDyAjsm97ThJ6SvctURsK
# UMU0dzXEc1C5spD2CWnN0PujqHYQqYaylx7MdiCJMjaCfDB3ZeIRsTGoiLMB24P+
# WBrcn2P+f03nC/sVvxRZWrpyI2kZwEh1RsO/mnLQ3apVBFeKqaFi8Ouo9oi1ZMd6
# ahUw7sZSoTxmGY1FhOSRCGEh2Wjy0ZIOx9tHT1U9vig5Kf9KeE81yO8yaq2T60mq
# 9eaUL8rcUrKRiJw9NUkcEYmIUJrh0nUe/kK2RWmbEGMYIH7ASrGqiyUP5FxpekD+
# i/uen4BeyRwe6rnPOzGolg+HMysMBr8VD/8PwJ8g88FLH1jIdTYvFUdRbrkciUlo
# okC+y4+kqiU=
# =SI8s
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 11 Nov 2025 10:23:51 PM CET
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (28 commits)
qemu-img rebase: don't exceed IO_BUF_SIZE in one operation
qcow2, vmdk: Restrict creation with secondary file using protocol
block: Allow drivers to control protocol prefix at creation
tests/qemu-iotest: Add more image formats to the thorough testing
tests/qemu-iotests: Improve the dry run list to speed up thorough testing
tests/qemu-iotests/184: Fix skip message for qemu-img without throttle
qcow2: put discards in discard queue when discard-no-unref is enabled
qcow2: rename update_refcount_discard to queue_discard
iotests: Run iotests with sanitizers
qemu-img: Fix amend option parse error handling
iotests: Test resizing file node under raw with size/offset
block: Drop detach_subchain for bdrv_replace_node
block: replace TABs with space
block/io_uring: use non-vectored read/write when possible
block/io_uring: use aio_add_sqe()
aio-posix: add aio_add_sqe() API for user-defined io_uring requests
aio-posix: add fdmon_ops->dispatch()
aio-posix: unindent fdmon_io_uring_destroy()
aio-posix: gracefully handle io_uring_queue_init() failure
aio: add errp argument to aio_context_setup()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
During a rebase operation data is copied from the backing chain into
the target image using a loop, and each iteration looks for a
contiguous region of allocated data of at most IO_BUF_SIZE (2 MB).
Once that region is found, and in order to avoid partial writes, its
boundaries are extended so they are aligned to the (sub)clusters of
the target image (see commit 12df580b).
This operation can however result in a region that exceeds the maximum
allowed IO_BUF_SIZE, crashing qemu-img.
This can be easily reproduced when the source image has a smaller
cluster size than the target image:
base <- int <- active
$ qemu-img create -f qcow2 base.qcow2 4M
$ qemu-img create -f qcow2 -F qcow2 -b base.qcow2 -o cluster_size=1M int.qcow2
$ qemu-img create -f qcow2 -F qcow2 -b int.qcow2 -o cluster_size=2M active.qcow2
$ qemu-io -c "write -P 0xff 1M 2M" int.qcow2
$ qemu-img rebase -F qcow2 -b base.qcow2 active.qcow2
qemu-img: qemu-img.c:4102: img_rebase: Assertion `written + pnum <= IO_BUF_SIZE' failed.
Aborted
Cc: qemu-stable <qemu-stable@nongnu.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3174
Fixes: 12df580b3b ("qemu-img: rebase: avoid unnecessary COW operations")
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-ID: <20251107091834.383781-1-berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Ever since CVE-2024-4467 (see commit 7ead9469 in qemu v9.1.0), we have
intentionally treated the opening of secondary files whose name is
specified in the contents of the primary file, such as a qcow2
data_file, as something that must be a local file and not a protocol
prefix (it is still possible to open a qcow2 file that wraps an NBD
data image by using QMP commands, but that is from the explicit action
of the QMP overriding any string encoded in the qcow2 file). At the
time, we did not prevent the use of protocol prefixes on the secondary
image while creating a qcow2 file, but it results in a qcow2 file that
records an empty string for the data_file, rather than the protocol
passed in during creation:
$ qemu-img create -f raw datastore.raw 2G
$ qemu-nbd -e 0 -t -f raw datastore.raw &
$ qemu-img create -f qcow2 -o data_file=nbd://localhost:10809/ \
datastore_nbd.qcow2 2G
Formatting 'datastore_nbd.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2147483648 data_file=nbd://localhost:10809/ lazy_refcounts=off refcount_bits=16
$ qemu-img info datastore_nbd.qcow2 | grep data
$ qemu-img info datastore_nbd.qcow2 | grep data
image: datastore_nbd.qcow2
data file:
data file raw: false
filename: datastore_nbd.qcow2
And since an empty string was recorded in the file, attempting to open
the image without using QMP to supply the NBD data store fails, with a
somewhat confusing error message:
$ qemu-io -f qcow2 datastore_nbd.qcow2
qemu-io: can't open device datastore_nbd.qcow2: The 'file' block driver requires a file name
Although the ability to create an image with a convenience reference
to a protocol data file is not a security hole (unlike the case with
open, the image is not untrusted if we are the ones creating it), the
above demo shows that it is still inconsistent. Thus, it makes more
sense if we also insist that image creation rejects a protocol prefix
when using the same syntax. Now, the above attempt produces:
$ qemu-img create -f qcow2 -o data_file=nbd://localhost:10809/ \
datastore_nbd.qcow2 2G
Formatting 'datastore_nbd.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2147483648 data_file=nbd://localhost:10809/ lazy_refcounts=off refcount_bits=16
qemu-img: datastore_nbd.qcow2: Could not create 'nbd://localhost:10809/': No such file or directory
with datastore_nbd.qcow2 no longer created.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20250915213919.3121401-6-eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch is pure refactoring: instead of hard-coding permission to
use a protocol prefix when creating an image, the drivers can now pass
in a parameter, comparable to what they could already do for opening a
pre-existing image. This patch is purely mechanical (all drivers pass
in true for now), but it will enable the next patch to cater to
drivers that want to differ in behavior for the primary image vs. any
secondary images that are opened at the same time as creating the
primary image.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20250915213919.3121401-5-eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that the "check" script is a little bit smarter with providing
a list of tests that are supported for an image format, we can also
add more image formats that can be used for generic block layer
testing. (Note: qcow1 and luks are not added because some tests
there currently fail, and other formats like bochs, cloop, dmg and
vvfat do not work with the generic tests and thus would only get
skipped if we'd tried to add them here)
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20251014104142.1281028-4-thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When running the tests in thorough mode, e.g. with:
make -j$(nproc) check SPEED=thorough
we currently always get a huge amount of total tests that the test
runner tries to execute (2457 in my case), but a big bunch of them are
only skipped (1099 in my case, meaning that only 1358 got executed).
This happens because we try to run the whole set of iotests for multiple
image formats while a lot of the tests can only run with one certain
format only and thus are marked as SKIP during execution. This is quite a
waste of time during each test run, and also unnecessarily blows up the
displayed list of executed tests in the console output.
Thus let's try to be a little bit smarter: If the "check" script is run
with "-n" and an image format switch (like "-qed") at the same time (which
is what we do for discovering the tests for the meson test runner already),
only report the tests that likely support the given format instead of
providing the whole list of all tests. We can determine whether a test
supports a format or not by looking at the lines in the file that contain
a "supported_fmt" or "unsupported_fmt" statement. This is only heuristics,
of course, but it is good enough for running the iotests via "make
check-block" - I double-checked that the list of executed tests does not
get changed by this patch, it's only the tests that are skipped anyway that
are now not run anymore.
This way the amount of total tests drops from 2457 to 1432 for me, and
the amount of skipped tests drops from 1099 to just 74 (meaning that we
still properly run 1432 - 74 = 1358 tests as we did before).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20251014104142.1281028-3-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If qemu-img does not support throttling, test 184 currently skips
with the message:
not suitable for this image format: raw
But that's wrong, it's not about the image format, it's about the
throttling not being available in qemu-img. Thus fix this by using
_notrun with a proper message instead.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20251014104142.1281028-2-thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When discard-no-unref is enabled, discards are not queued like it
should.
This was broken since discard-no-unref was added.
Add a helper function qcow2_discard_cluster which handles some common
checks and calls the queue_discards function if needed to add the
discard request to the queue.
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Message-ID: <20250513132628.1055549-3-jean-louis@dupond.be>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The function just queues discards, and doesn't do any refcount change.
So let's change the function name to align with its function.
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Message-ID: <20250513132628.1055549-2-jean-louis@dupond.be>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 2cc4d1c5ea ("tests/check-block: Skip iotests when sanitizers
are enabled") changed iotests to skip when sanitizers are enabled.
The rationale is that AddressSanitizer emits warnings and reports leaks,
which results in test breakage. Later, sanitizers that are enabled for
production environments (safe-stack and cfi-icall) were exempted.
However, this approach has a few problems.
- It requires rebuild to disable sanitizers if the existing build has
them enabled.
- It disables other useful non-production sanitizers.
- The exemption of safe-stack and cfi-icall is not correctly
implemented, so qemu-iotests are incorrectly enabled whenever either
safe-stack or cfi-icall is enabled *and*, even if there is another
sanitizer like AddressSanitizer.
To solve these problems, direct AddressSanitizer warnings to separate
files to avoid changing the test results, and selectively disable
leak detection at runtime instead of requiring to disable all
sanitizers at buildtime.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Message-ID: <20251023-iotests-v1-2-fab143ca4c2f@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu_opts_del(opts) dereferences opts->list, which is the old amend_opts
pointer that can be dangling after executing
qemu_opts_append(amend_opts, bs->drv->create_opts) and cause
use-after-free.
Fix the potential use-after-free by moving the qemu_opts_del() call
before the qemu_opts_append() call.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Message-ID: <20251023-iotests-v1-1-fab143ca4c2f@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds some more tests for using the 'size' and 'offset' options of
raw to the recently added resize-below-raw test.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20251028094328.17919-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Detaching filters using detach_subchain=true can cause segfaults as
described in #3149.
More specifically, this was observed when executing concurrent
block-stream and query-named-block-nodes. block-stream adds a
copy-on-read filter as the main BDS for the blockjob; that filter was
dropped with detach_subchain=true but not unref'd until the the blockjob
was free'd. Because query-named-block-nodes assumes that a filter will
always have exactly one child, it caused a segfault when it observed the
detached filter. Stacktrace:
0 bdrv_refresh_filename (bs=0x5efed72f8350)
at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:8082
1 0x00005efea73cf9dc in bdrv_block_device_info
(blk=0x0, bs=0x5efed72f8350, flat=true, errp=0x7ffeb829ebd8)
at block/qapi.c:62
2 0x00005efea7391ed3 in bdrv_named_nodes_list
(flat=<optimized out>, errp=0x7ffeb829ebd8)
at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:6275
3 0x00005efea7471993 in qmp_query_named_block_nodes
(has_flat=<optimized out>, flat=<optimized out>, errp=0x7ffeb829ebd8)
at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/blockdev.c:2834
4 qmp_marshal_query_named_block_nodes
(args=<optimized out>, ret=0x7f2b753beec0, errp=0x7f2b753beec8)
at qapi/qapi-commands-block-core.c:553
5 0x00005efea74f03a5 in do_qmp_dispatch_bh (opaque=0x7f2b753beed0)
at qapi/qmp-dispatch.c:128
6 0x00005efea75108e6 in aio_bh_poll (ctx=0x5efed6f3f430)
at util/async.c:219
7 0x00005efea74ffdb2 in aio_dispatch (ctx=0x5efed6f3f430)
at util/aio-posix.c:436
8 0x00005efea7512846 in aio_ctx_dispatch (source=<optimized out>,
callback=<optimized out>,user_data=<optimized out>)
at util/async.c:361
9 0x00007f2b77809bfb in ?? ()
from /lib/x86_64-linux-gnu/libglib-2.0.so.0
10 0x00007f2b77809e70 in g_main_context_dispatch ()
from /lib/x86_64-linux-gnu/libglib-2.0.so.0
11 0x00005efea7517228 in glib_pollfds_poll () at util/main-loop.c:287
12 os_host_main_loop_wait (timeout=0) at util/main-loop.c:310
13 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:589
14 0x00005efea7140482 in qemu_main_loop () at system/runstate.c:905
15 0x00005efea744e4e8 in qemu_default_main (opaque=opaque@entry=0x0)
at system/main.c:50
16 0x00005efea6e76319 in main
(argc=<optimized out>, argv=<optimized out>)
at system/main.c:93
As discussed in 20251024-second-fix-3149-v1-1-d997fa3d5ce2@canonical.com,
a filter should not exist without children in the first place; therefore,
drop the parameter entirely as it is only used for filters.
This is a partial revert of 3108a15cf0.
After this change, a blockdev-backup job's copy-before-write filter will
hold references to its children until the filter is unref'd. This causes
an additional flush during bdrv_close, so also update iotest 257.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3149
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20251029-third-fix-3149-v2-1-94932bb404f4@canonical.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Bring the block files in line with the QEMU coding style, with spaces
for indentation. This patch partially resolves the issue 371.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/371
Signed-off-by: Yeqi Fu <fufuyqqqqqq@gmail.com>
Message-ID: <20230325085224.23842-1-fufuyqqqqqq@gmail.com>
[thuth: Rebased the patch to the current master branch]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20251007163511.334178-1-thuth@redhat.com>
[kwolf: Fixed up vertical alignemnt]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The io_uring_prep_readv2/writev2() man pages recommend using the
non-vectored read/write operations when possible for performance
reasons.
I didn't measure a significant difference but it doesn't hurt to have
this optimization in place.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20251104022933.618123-16-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
AioContext has its own io_uring instance for file descriptor monitoring.
The disk I/O io_uring code was developed separately. Originally I
thought the characteristics of file descriptor monitoring and disk I/O
were too different, requiring separate io_uring instances.
Now it has become clear to me that it's feasible to share a single
io_uring instance for file descriptor monitoring and disk I/O. We're not
using io_uring's IOPOLL feature or anything else that would require a
separate instance.
Unify block/io_uring.c and util/fdmon-io_uring.c using the new
aio_add_sqe() API that allows user-defined io_uring sqe submission. Now
block/io_uring.c just needs to submit readv/writev/fsync and most of the
io_uring-specific logic is handled by fdmon-io_uring.c.
There are two immediate advantages:
1. Fewer system calls. There is no need to monitor the disk I/O io_uring
ring fd from the file descriptor monitoring io_uring instance. Disk
I/O completions are now picked up directly. Also, sqes are
accumulated in the sq ring until the end of the event loop iteration
and there are fewer io_uring_enter(2) syscalls.
2. Less code duplication.
Note that error_setg() messages are not supposed to end with
punctuation, so I removed a '.' for the non-io_uring build error
message.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20251104022933.618123-15-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduce the aio_add_sqe() API for submitting io_uring requests in the
current AioContext. This allows other components in QEMU, like the block
layer, to take advantage of io_uring features without creating their own
io_uring context.
This API supports nested event loops just like file descriptor
monitoring and BHs do. This comes at a complexity cost: CQE callbacks
must be placed on a list so that nested event loops can invoke pending
CQE callbacks from parent event loops. If you're wondering why
CqeHandler exists instead of just a callback function pointer, this is
why.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20251104022933.618123-14-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The ppoll and epoll file descriptor monitoring implementations rely on
the event loop's generic file descriptor, timer, and BH dispatch code to
invoke user callbacks.
The io_uring file descriptor monitoring implementation will need
io_uring-specific dispatch logic for CQE handlers for custom SQEs.
Introduce a new FDMonOps ->dispatch() callback that allows file
descriptor monitoring implementations to invoke user callbacks. The next
patch will use this new callback.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20251104022933.618123-13-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reduce the level of indentation to make further code changes easier to
read.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20251104022933.618123-12-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
io_uring may not be available at runtime due to system policies (e.g.
the io_uring_disabled sysctl) or creation could fail due to file
descriptor resource limits.
Handle failure scenarios as follows:
If another AioContext already has io_uring, then fail AioContext
creation so that the aio_add_sqe() API is available uniformly from all
QEMU threads. Otherwise fall back to epoll(7) if io_uring is
unavailable.
Notes:
- Update the comment about selecting the fastest fdmon implementation.
At this point it's not about speed anymore, it's about aio_add_sqe()
API availability.
- Uppercase the error message when converting from error_report() to
error_setg_errno() for consistency (but there are instances of
lowercase in the codebase).
- It's easier to move the #ifdefs from aio-posix.h to aio-posix.c.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20251104022933.618123-11-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When aio_context_new() -> aio_context_setup() fails at startup it
doesn't really matter whether errors are returned to the caller or the
process terminates immediately.
However, it is not acceptable to terminate when hotplugging --object
iothread at runtime. Refactor aio_context_setup() so that errors can be
propagated. The next commit will set errp when fdmon_io_uring_setup()
fails.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20251104022933.618123-10-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
g_source_destroy() only removes the GSource from the GMainContext it's
attached to, if any. It does not free it.
Use g_source_unref() instead so that the AioContext (which embeds a
GSource) is freed. There is no need to call g_source_destroy() in
aio_context_new() because the GSource isn't attached to a GMainContext
yet.
aio_ctx_finalize() expects everything to be set up already, so introduce
the new ctx->initialized boolean and do nothing when called with
!initialized. This also requires moving aio_context_setup() down after
event_notifier_init() since aio_ctx_finalize() won't release any
resources that aio_context_setup() acquired.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20251104022933.618123-9-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need for aio_context_use_g_source() now that epoll(7) and
io_uring(7) file descriptor monitoring works with the glib event loop.
AioContext doesn't need to be notified that GSource is being used.
On hosts with io_uring support this now enables fdmon-io_uring.c by
default, replacing fdmon-poll.c and fdmon-epoll.c. In other words, the
event loop will use io_uring!
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20251104022933.618123-8-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
AioContext's glib integration only supports ppoll(2) file descriptor
monitoring. epoll(7) and io_uring(7) disable themselves and switch back
to ppoll(2) when the glib event loop is used. The main loop thread
cannot use epoll(7) or io_uring(7) because it always uses the glib event
loop.
Future QEMU features may require io_uring(7). One example is uring_cmd
support in FUSE exports. Each feature could create its own io_uring(7)
context and integrate it into the event loop, but this is inefficient
due to extra syscalls. It would be more efficient to reuse the
AioContext's existing fdmon-io_uring.c io_uring(7) context because
fdmon-io_uring.c will already be active on systems where Linux io_uring
is available.
In order to keep fdmon-io_uring.c's AioContext operational even when the
glib event loop is used, extend FDMonOps with an API similar to
GSourceFuncs so that file descriptor monitoring can integrate into the
glib event loop.
A quick summary of the GSourceFuncs API:
- prepare() is called each event loop iteration before waiting for file
descriptors and timers.
- check() is called to determine whether events are ready to be
dispatched after waiting.
- dispatch() is called to process events.
More details here: https://docs.gtk.org/glib/struct.SourceFuncs.html
Move the ppoll(2)-specific code from aio-posix.c into fdmon-poll.c and
also implement epoll(7)- and io_uring(7)-specific file descriptor
monitoring code for glib event loops.
Note that it's still faster to use aio_poll() rather than the glib event
loop since glib waits for file descriptor activity with ppoll(2) and
does not support adaptive polling. But at least epoll(7) and io_uring(7)
now work in glib event loops.
Splitting this into multiple commits without temporarily breaking
AioContext proved difficult so this commit makes all the changes. The
next commit will remove the aio_context_use_g_source() API because it is
no longer needed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20251104022933.618123-7-stefanha@redhat.com>
[kwolf: Build fixes; fix AioContext.list_lock use after destroy]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
test-nested-aio-poll relies on internal details of how fdmon-poll.c
handles AioContext polling. Skip it when other fdmon implementations are
in use.
The reason why fdmon-io_uring.c behaves differently from fdmon-poll.c is
that its fdmon_ops->need_wait() function returns true when
io_uring_enter(2) must be called (e.g. to submit pending SQEs).
AioContext polling is skipped when ->need_wait() returns true, so the
test case will never enter AioContext polling mode with
fdmon-io_uring.c.
Restrict this test to fdmon-poll.c and drop the
aio_context_use_g_source() call since it's no longer necessary.
Note that this test is only built on POSIX systems so it is safe to
include "util/aio-posix.h".
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20251104022933.618123-6-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 816a430c51 ("util/aio: Defer disabling poll mode as long as
possible") kept polling enabled when the event loop timeout is 0. Since
there is no timeout the event loop will continue immediately and the
overhead of disabling and re-enabling polling can be avoided.
fdmon-io_uring.c is unable to take advantage of this optimization
because its ->need_wait() function returns true whenever there are new
io_uring SQEs to submit:
if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Polling will be disabled even when timeout == 0.
Extend the optimization to handle the case when need_wait() returns true
and timeout == 0.
Cc: Chao Gao <chao.gao@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20251104022933.618123-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
io_uring_enter(2) only returns -EINTR in some cases when interrupted by
a signal. Therefore the while loop in fdmon_io_uring_wait() is
incomplete and can lead to a spurious early return.
Handle the case when a signal interrupts io_uring_enter(2) but the
syscall returns the number of SQEs submitted (that takes priority over
-EINTR).
This patch probably makes little difference for QEMU, but the test suite
relies on the exact pattern of aio_poll() return values, so it's best to
hide this io_uring syscall interface quirk.
Here is the strace of test-aio receiving 3 SIGCONT signals after this
fix has been applied. Notice how the io_uring_enter(2) return value is 1
the first time because an SQE was submitted, but -EINTR the other times:
eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 9
io_uring_enter(7, 1, 0, 0, NULL, 8) = 1
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, 0x7ffe38a46240) = 0
io_uring_enter(7, 1, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 1
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} ---
io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = -1 EINTR (Interrupted system call)
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} ---
io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8 <unfinished ...>
<... io_uring_enter resumed>) = -1 EINTR (Interrupted system call)
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} ---
io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8 <unfinished ...>
<... io_uring_enter resumed>) = 0
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20251104022933.618123-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
io_uring_prep_timeout() stashes a pointer to the timespec struct rather
than copying its fields. That means the struct must live until after the
SQE has been submitted by io_uring_enter(2). add_timeout_sqe() violates
this constraint because the SQE is not submitted within the function.
Inline add_timeout_sqe() into fdmon_io_uring_wait() so that the struct
lives at least as long as io_uring_enter(2).
This fixes random hangs (bogus timeout values) when the kernel loads
undefined timespec struct values from userspace after the original
struct on the stack has been destroyed.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20251104022933.618123-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When an AioHandler is enqueued on ctx->submit_list for removal, the
fill_sq_ring() function will submit an io_uring POLL_REMOVE operation to
cancel the in-flight POLL_ADD operation.
There is a race when another thread enqueues an AioHandler for deletion
on ctx->submit_list when the POLL_ADD CQE has already appeared. In that
case POLL_REMOVE is unnecessary. The code already handled this, but
forgot that the AioHandler itself is still on ctx->submit_list when the
POLL_ADD CQE is being processed. It's unsafe to delete the AioHandler at
that point in time (use-after-free).
Solve this problem by keeping the AioHandler alive but setting a flag so
that it will be deleted by fill_sq_ring() when it runs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20251104022933.618123-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fix the indentation in one line, and while we're at it, use an f-string
instead of old-school formatting in another spot.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20251110104837.52077-1-thuth@redhat.com>
Drop unused imports, annotate imports that are not at the top, but done
on purpose in other locations, use f-strings where it makes sense, etc.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20251103192430.63278-1-thuth@redhat.com>
Remove unused imports, write constants with capital letters and make
sure that the code uses the right indentation / formatting.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20251030143203.297692-1-thuth@redhat.com>
- put 3rd party "import pycotap" after the standard imports
- "help" is a built-in function in Python, don't use it as a variable name
- put the doc strings in the right locations (after the "def" line)
- use isinstance() instead of checking via type()
Message-Id: <a3413bbd-e98c-4267-81c7-aa42aeda8a09@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
small fixes all over the place.
UDP tunnel and TSEG tweaks are kind of borderline,
but I feel not making the change now will just add
to compatibility headaches down the road.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmkQplIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpFDsIAMlScYTW0fugUaP4B/a8xjgRFwBSk2CoU7aE
l0k5ihyadecpnMLswkvoLfH9jl5Mu3MOZ6bpfcIHOWXMusGyiYcds6wupb8qcATP
Ud4ZjybuNrpoGUul1ECkNTE3xvUtSBOVu8z9ac4ojP+w0LVDiuWyg1bl5QiRuzEg
K87OjbdTIgCKKJi5QRw/dMJfoOofay98g0kbcuhkBiudvu3FtOpJW0g/aiY1m2sY
MXYeBZjGbYGkAOXLKRcSr3nYtZbY4sg/onJ3Xb0HPbUZfRMTm7KKApwhH9jsHmlO
VgaRGcF+dNDC7XIsaZt6k/YTsWCApYvuCcEQbjR1rW1d4ZmZU/Y=
=ocWR
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: fixes for 10.2
small fixes all over the place.
UDP tunnel and TSEG tweaks are kind of borderline,
but I feel not making the change now will just add
to compatibility headaches down the road.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmkQplIPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpFDsIAMlScYTW0fugUaP4B/a8xjgRFwBSk2CoU7aE
# l0k5ihyadecpnMLswkvoLfH9jl5Mu3MOZ6bpfcIHOWXMusGyiYcds6wupb8qcATP
# Ud4ZjybuNrpoGUul1ECkNTE3xvUtSBOVu8z9ac4ojP+w0LVDiuWyg1bl5QiRuzEg
# K87OjbdTIgCKKJi5QRw/dMJfoOofay98g0kbcuhkBiudvu3FtOpJW0g/aiY1m2sY
# MXYeBZjGbYGkAOXLKRcSr3nYtZbY4sg/onJ3Xb0HPbUZfRMTm7KKApwhH9jsHmlO
# VgaRGcF+dNDC7XIsaZt6k/YTsWCApYvuCcEQbjR1rW1d4ZmZU/Y=
# =ocWR
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 09 Nov 2025 03:33:54 PM CET
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [unknown]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
vhost-user.rst: clarify when FDs can be sent
q35: increase default tseg size
virtio-net: Advertise UDP tunnel GSO support by default
tests/qtest/bios-tables-test: Update DSDT blobs after GPEX _DSM change
hw/pci-host/gpex-acpi: Fix _DSM function 0 support return value
tests/qtest/bios-tables-test: Prepare for _DSM change in the DSDT table
vhost-user: make vhost_set_vring_file() synchronous
intel_iommu: Fix DMA failure when guest switches IOMMU domain
intel_iommu: Reset pasid cache when system level reset
intel_iommu: Handle PASID cache invalidation
vhost-user: fix shared object lookup handler logic
amd_iommu: Support 64-bit address for IOTLB lookup
amd_iommu: Fix handling of devices on buses != 0
MAINTAINERS: Update entry for AMD-Vi Emulation
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Please pull a bunch of fixes which repair issues introduced due to the previous
patch series which added LASI SCSI and LASI network card support as well as
the new 715 machines.
This includes fixes for reported coverty issues, and repairs the B160L machine
emulation.
Thanks!
Helge
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCaREQRQAKCRD3ErUQojoP
Xy+DAQDJk9BbaZA4DOIMptbGewQMJLRYESa6XClF3s0IdbORQQD8DB49ipDtQkBz
50VfT6IusGBBKMaLr/9XgKqrk2bBqgc=
=mgEV
-----END PGP SIGNATURE-----
Merge tag 'lasi-fixes-pull-request' of https://github.com/hdeller/qemu-hppa into staging
hppa lasi bugfixes pull request
Please pull a bunch of fixes which repair issues introduced due to the previous
patch series which added LASI SCSI and LASI network card support as well as
the new 715 machines.
This includes fixes for reported coverty issues, and repairs the B160L machine
emulation.
Thanks!
Helge
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCaREQRQAKCRD3ErUQojoP
# Xy+DAQDJk9BbaZA4DOIMptbGewQMJLRYESa6XClF3s0IdbORQQD8DB49ipDtQkBz
# 50VfT6IusGBBKMaLr/9XgKqrk2bBqgc=
# =mgEV
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 09 Nov 2025 11:05:57 PM CET
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'lasi-fixes-pull-request' of https://github.com/hdeller/qemu-hppa:
target/hppa: Update SeaBIOS-hppa to version 20
ncr710: Use address space of device instead of global address space
ncr710: Add missing vmstate entries
i82596: Adding proper break-statement functionality in RX functions
i82596: Remove crc_valid variable
ncr710: Drop leftover debug code
ncr710: Fix potential null pointer dereference
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the decode_group9() function, if we don't recognise the insn as
one that we should handle, we leave the 'entry' pointer unaltered.
Because the X86OpEntry struct has a union for the gen and decode
pointers, this means that the top level code will call decode.e.gen()
which tries to use the decode function pointer (still set to
decode_group9) as a gen function pointer.
This is undefined behaviour, but seems to be mostly harmless in
practice (we call decode_group9() again with bogus arguments and it
does nothing). If you have CFI enabled then it will trip the CFI
check:
../target/i386/tcg/decode-new.c.inc:2862:9: runtime error: control flow integrity check for type 'void (struct DisasContext *, struct X86DecodedInsn *)' failed during indirect function call
Set *entry to UNKNOWN_OPCODE to provoke the #UD exception, as we do
in decode_group1A() and decode_group11() for similar situations.
Thanks to the bug reporter for the clear description and analysis of
the bug and the simple reproducer.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3172
Fixes: fcd16539eb ("target/i386: convert CMPXCHG8B/CMPXCHG16B to new decoder")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20251021173152.1695997-1-peter.maydell@linaro.org>
When disabling MIPS tests on commit 1c11aa1807
("tests/functional: Mark the MIPS replay tests as flaky")
we missed the 5KEc test.
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20251104145955.84091-1-philmd@linaro.org>
re.match(r'^ *', ...) can't fail, but mypy doesn't know that and
complains:
scripts/qapi/parser.py:444: error: Item "None" of "Match[str] | None" has no attribute "end" [union-attr]
Work around by using must_match() instead.
Fixes: 8107ba47fd (qapi: Add documentation format validation)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20251105152219.311154-1-armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250925035610.80605-3-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250925035610.80605-2-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
This is SeaBIOS for the hppa architecture v20
and it contains mostly bugfixes for issues which
were introduced by adding the 715/64 machine.
Fixes include:
- Fix inventory for 715 Snake machine
- Detect if LASI LAN and SCSI exists at startup
- Allow LASI LAN on B160L if created by qemu
- Enhance error messages
Signed-off-by: Helge Deller <deller@gmx.de>