qemu-cr16

Author	SHA1	Message	Date
Cédric Le Goater	326e620fc0	Fix const qualifier build errors with recent glibc A recent change in glibc 2.42.9000 [1] changes the return type of strstr() and other string functions to be 'const char ' when the input is a 'const char '. This breaks the build in various files with errors such as : error: initialization discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers] 208 \| char pidstr = strstr(filename, "%"); \| ^~~~~~ Fix this by changing the type of the variables that store the result of these functions to 'const char '. [1] https://sourceware.org/git/?p=glibc.git;a=commit;h=cd748a63ab1a7ae846175c532a3daab341c62690 Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20251209174328.698774-1-clg@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-12-09 21:00:15 +01:00
Stefan Hajnoczi	047dabef97	block/io_uring: use aio_add_sqe() AioContext has its own io_uring instance for file descriptor monitoring. The disk I/O io_uring code was developed separately. Originally I thought the characteristics of file descriptor monitoring and disk I/O were too different, requiring separate io_uring instances. Now it has become clear to me that it's feasible to share a single io_uring instance for file descriptor monitoring and disk I/O. We're not using io_uring's IOPOLL feature or anything else that would require a separate instance. Unify block/io_uring.c and util/fdmon-io_uring.c using the new aio_add_sqe() API that allows user-defined io_uring sqe submission. Now block/io_uring.c just needs to submit readv/writev/fsync and most of the io_uring-specific logic is handled by fdmon-io_uring.c. There are two immediate advantages: 1. Fewer system calls. There is no need to monitor the disk I/O io_uring ring fd from the file descriptor monitoring io_uring instance. Disk I/O completions are now picked up directly. Also, sqes are accumulated in the sq ring until the end of the event loop iteration and there are fewer io_uring_enter(2) syscalls. 2. Less code duplication. Note that error_setg() messages are not supposed to end with punctuation, so I removed a '.' for the non-io_uring build error message. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-15-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	1eebdab3c3	aio-posix: add aio_add_sqe() API for user-defined io_uring requests Introduce the aio_add_sqe() API for submitting io_uring requests in the current AioContext. This allows other components in QEMU, like the block layer, to take advantage of io_uring features without creating their own io_uring context. This API supports nested event loops just like file descriptor monitoring and BHs do. This comes at a complexity cost: CQE callbacks must be placed on a list so that nested event loops can invoke pending CQE callbacks from parent event loops. If you're wondering why CqeHandler exists instead of just a callback function pointer, this is why. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-14-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	87e7a0f423	aio-posix: add fdmon_ops->dispatch() The ppoll and epoll file descriptor monitoring implementations rely on the event loop's generic file descriptor, timer, and BH dispatch code to invoke user callbacks. The io_uring file descriptor monitoring implementation will need io_uring-specific dispatch logic for CQE handlers for custom SQEs. Introduce a new FDMonOps ->dispatch() callback that allows file descriptor monitoring implementations to invoke user callbacks. The next patch will use this new callback. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20251104022933.618123-13-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	a63e41f2a4	aio-posix: unindent fdmon_io_uring_destroy() Reduce the level of indentation to make further code changes easier to read. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20251104022933.618123-12-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	59202c98c0	aio-posix: gracefully handle io_uring_queue_init() failure io_uring may not be available at runtime due to system policies (e.g. the io_uring_disabled sysctl) or creation could fail due to file descriptor resource limits. Handle failure scenarios as follows: If another AioContext already has io_uring, then fail AioContext creation so that the aio_add_sqe() API is available uniformly from all QEMU threads. Otherwise fall back to epoll(7) if io_uring is unavailable. Notes: - Update the comment about selecting the fastest fdmon implementation. At this point it's not about speed anymore, it's about aio_add_sqe() API availability. - Uppercase the error message when converting from error_report() to error_setg_errno() for consistency (but there are instances of lowercase in the codebase). - It's easier to move the #ifdefs from aio-posix.h to aio-posix.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251104022933.618123-11-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	421dcc8023	aio: add errp argument to aio_context_setup() When aio_context_new() -> aio_context_setup() fails at startup it doesn't really matter whether errors are returned to the caller or the process terminates immediately. However, it is not acceptable to terminate when hotplugging --object iothread at runtime. Refactor aio_context_setup() so that errors can be propagated. The next commit will set errp when fdmon_io_uring_setup() fails. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251104022933.618123-10-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	3769b9abe9	aio: free AioContext when aio_context_new() fails g_source_destroy() only removes the GSource from the GMainContext it's attached to, if any. It does not free it. Use g_source_unref() instead so that the AioContext (which embeds a GSource) is freed. There is no need to call g_source_destroy() in aio_context_new() because the GSource isn't attached to a GMainContext yet. aio_ctx_finalize() expects everything to be set up already, so introduce the new ctx->initialized boolean and do nothing when called with !initialized. This also requires moving aio_context_setup() down after event_notifier_init() since aio_ctx_finalize() won't release any resources that aio_context_setup() acquired. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-9-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	d1f42b600a	aio: remove aio_context_use_g_source() There is no need for aio_context_use_g_source() now that epoll(7) and io_uring(7) file descriptor monitoring works with the glib event loop. AioContext doesn't need to be notified that GSource is being used. On hosts with io_uring support this now enables fdmon-io_uring.c by default, replacing fdmon-poll.c and fdmon-epoll.c. In other words, the event loop will use io_uring! Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251104022933.618123-8-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	ded29e64c6	aio-posix: integrate fdmon into glib event loop AioContext's glib integration only supports ppoll(2) file descriptor monitoring. epoll(7) and io_uring(7) disable themselves and switch back to ppoll(2) when the glib event loop is used. The main loop thread cannot use epoll(7) or io_uring(7) because it always uses the glib event loop. Future QEMU features may require io_uring(7). One example is uring_cmd support in FUSE exports. Each feature could create its own io_uring(7) context and integrate it into the event loop, but this is inefficient due to extra syscalls. It would be more efficient to reuse the AioContext's existing fdmon-io_uring.c io_uring(7) context because fdmon-io_uring.c will already be active on systems where Linux io_uring is available. In order to keep fdmon-io_uring.c's AioContext operational even when the glib event loop is used, extend FDMonOps with an API similar to GSourceFuncs so that file descriptor monitoring can integrate into the glib event loop. A quick summary of the GSourceFuncs API: - prepare() is called each event loop iteration before waiting for file descriptors and timers. - check() is called to determine whether events are ready to be dispatched after waiting. - dispatch() is called to process events. More details here: https://docs.gtk.org/glib/struct.SourceFuncs.html Move the ppoll(2)-specific code from aio-posix.c into fdmon-poll.c and also implement epoll(7)- and io_uring(7)-specific file descriptor monitoring code for glib event loops. Note that it's still faster to use aio_poll() rather than the glib event loop since glib waits for file descriptor activity with ppoll(2) and does not support adaptive polling. But at least epoll(7) and io_uring(7) now work in glib event loops. Splitting this into multiple commits without temporarily breaking AioContext proved difficult so this commit makes all the changes. The next commit will remove the aio_context_use_g_source() API because it is no longer needed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-7-stefanha@redhat.com> [kwolf: Build fixes; fix AioContext.list_lock use after destroy] Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:04:53 +01:00
Stefan Hajnoczi	511c62a2c6	aio-posix: keep polling enabled with fdmon-io_uring.c Commit `816a430c51` ("util/aio: Defer disabling poll mode as long as possible") kept polling enabled when the event loop timeout is 0. Since there is no timeout the event loop will continue immediately and the overhead of disabling and re-enabling polling can be avoided. fdmon-io_uring.c is unable to take advantage of this optimization because its ->need_wait() function returns true whenever there are new io_uring SQEs to submit: if (timeout \|\| ctx->fdmon_ops->need_wait(ctx)) { ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Polling will be disabled even when timeout == 0. Extend the optimization to handle the case when need_wait() returns true and timeout == 0. Cc: Chao Gao <chao.gao@intel.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251104022933.618123-5-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 17:32:48 +01:00
Stefan Hajnoczi	5f8741fca5	aio-posix: fix spurious return from ->wait() due to signals io_uring_enter(2) only returns -EINTR in some cases when interrupted by a signal. Therefore the while loop in fdmon_io_uring_wait() is incomplete and can lead to a spurious early return. Handle the case when a signal interrupts io_uring_enter(2) but the syscall returns the number of SQEs submitted (that takes priority over -EINTR). This patch probably makes little difference for QEMU, but the test suite relies on the exact pattern of aio_poll() return values, so it's best to hide this io_uring syscall interface quirk. Here is the strace of test-aio receiving 3 SIGCONT signals after this fix has been applied. Notice how the io_uring_enter(2) return value is 1 the first time because an SQE was submitted, but -EINTR the other times: eventfd2(0, EFD_CLOEXEC\|EFD_NONBLOCK) = 9 io_uring_enter(7, 1, 0, 0, NULL, 8) = 1 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, 0x7ffe38a46240) = 0 io_uring_enter(7, 1, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 1 --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} --- io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = -1 EINTR (Interrupted system call) --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} --- io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8 <unfinished ...> <... io_uring_enter resumed>) = -1 EINTR (Interrupted system call) --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} --- io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8 <unfinished ...> <... io_uring_enter resumed>) = 0 Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20251104022933.618123-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 17:32:48 +01:00
Stefan Hajnoczi	c31a445749	aio-posix: fix fdmon-io_uring.c timeout stack variable lifetime io_uring_prep_timeout() stashes a pointer to the timespec struct rather than copying its fields. That means the struct must live until after the SQE has been submitted by io_uring_enter(2). add_timeout_sqe() violates this constraint because the SQE is not submitted within the function. Inline add_timeout_sqe() into fdmon_io_uring_wait() so that the struct lives at least as long as io_uring_enter(2). This fixes random hangs (bogus timeout values) when the kernel loads undefined timespec struct values from userspace after the original struct on the stack has been destroyed. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20251104022933.618123-3-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 17:32:48 +01:00
Stefan Hajnoczi	dbf70f0a03	aio-posix: fix race between io_uring CQE and AioHandler deletion When an AioHandler is enqueued on ctx->submit_list for removal, the fill_sq_ring() function will submit an io_uring POLL_REMOVE operation to cancel the in-flight POLL_ADD operation. There is a race when another thread enqueues an AioHandler for deletion on ctx->submit_list when the POLL_ADD CQE has already appeared. In that case POLL_REMOVE is unnecessary. The code already handled this, but forgot that the AioHandler itself is still on ctx->submit_list when the POLL_ADD CQE is being processed. It's unsafe to delete the AioHandler at that point in time (use-after-free). Solve this problem by keeping the AioHandler alive but setting a flag so that it will be deleted by fill_sq_ring() when it runs. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251104022933.618123-2-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 17:32:48 +01:00
Vladimir Sementsov-Ogievskiy	20aa05edc2	util/hexdump: fix QEMU_HEXDUMP_LINE_WIDTH logic QEMU_HEXDUMP_LINE_WIDTH calculation doesn't correspond to qemu_hexdump_line(). This leads to last line of the dump (when length is not multiply of 16) has badly aligned ASCII part. Let's calculate length the same way. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20251031190246.257153-2-vsementsov@yandex-team.ru> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-11-03 11:59:32 +01:00
Alex Bennée	91634cc331	timers: properly prefix init_clocks() Otherwise we run the risk of name clashing, for example with stm32l4x5_usart-test.c should we shuffle the includes. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20251030173302.1379174-1-alex.bennee@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-11-03 11:59:32 +01:00
Peter Xu	c89d1c879a	bql: Fix bql_locked status with condvar APIs QEMU has a per-thread "bql_locked" variable stored in TLS section, showing whether the current thread is holding the BQL lock. It's a pretty handy variable. Function-wise, QEMU have codes trying to conditionally take bql, relying on the var reflecting the locking status (e.g. BQL_LOCK_GUARD), or in a GDB debugging session, we could also look at the variable (in reality, co_tls_bql_locked), to see which thread is currently holding the bql. When using that as a debugging facility, sometimes we can observe multiple threads holding bql at the same time. It's because QEMU's condvar APIs bypassed the bql_*() API, hence they do not update bql_locked even if they have released the mutex while waiting. It can cause confusion if one does "thread apply all p co_tls_bql_locked" and see multiple threads reporting true. Fix this by moving the bql status updates into the mutex debug hooks. Now the variable should always reflect the reality. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20250904223158.1276992-1-peterx@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-11-03 11:59:32 +01:00
Akihiko Odaki	55d98e3ede	rcu: Unify force quiescent state Borrow the concept of force quiescent state from Linux to ensure readers remain fast during normal operation and to avoid stalls. Background ========== The previous implementation had four steps to begin reclamation. 1. call_rcu_thread() would wait for the first callback. 2. call_rcu_thread() would periodically poll until a decent number of callbacks piled up or it timed out. 3. synchronize_rcu() would statr a grace period (GP). 4. wait_for_readers() would wait for the GP to end. It would also trigger the force_rcu notifier to break busy loops in a read-side critical section if drain_call_rcu() had been called. Problem ======= The separation of waiting logic across these steps led to suboptimal behavior: The GP was delayed until call_rcu_thread() stops polling. force_rcu was not consistently triggered when call_rcu_thread() detected a high number of pending callbacks or a timeout. This inconsistency sometimes led to stalls, as reported in a virtio-gpu issue where memory unmapping was blocked[1]. wait_for_readers() imposed unnecessary overhead in non-urgent cases by unconditionally executing qatomic_set(&index->waiting, true) and qemu_event_reset(&rcu_gp_event), which are necessary only for expedited synchronization. Solution ======== Move the polling in call_rcu_thread() to wait_for_readers() to prevent the delay of the GP. Additionally, reorganize wait_for_readers() to distinguish between two states: Normal State: it relies exclusively on periodic polling to detect the end of the GP and maintains the read-side fast path. Force Quiescent State: Whenever expediting synchronization, it always triggers force_rcu and executes both qatomic_set(&index->waiting, true) and qemu_event_reset(&rcu_gp_event). This avoids stalls while confining the read-side overhead to this state. This unified approach, inspired by the Linux RCU, ensures consistent and efficient RCU grace period handling and confirms resolution of the virtio-gpu issue. [1] https://lore.kernel.org/qemu-devel/20251014111234.3190346-9-alex.bennee@linaro.org/ Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20251016-force-v1-1-919a82112498@rsg.ci.i.u-tokyo.ac.jp Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-28 12:39:59 +01:00
Philippe Mathieu-Daudé	2ff8c9a298	buildsys: Remove support for 32-bit PPC hosts Stop detecting 32-bit PPC host as supported. See previous commit for rationale. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> [rth: Retain _ARCH_PPC64 check in udiv_qrnnd] Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20251014173900.87497-4-philmd@linaro.org>	2025-10-16 14:58:53 -07:00
Paolo Bonzini	67913e95bf	timer: constify some functions Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:04:11 +02:00
Paolo Bonzini	5142397c79	async: access bottom half flags with qatomic_read Running test-aio-multithread under TSAN reveals data races on bh->flags. Because bottom halves may be scheduled or canceled asynchronously, without taking a lock, adjust aio_compute_bh_timeout() and aio_ctx_check() to use a relaxed read to access the flags. Use an acquire load to ensure that anything that was written prior to qemu_bh_schedule() is visible. Closes: https://gitlab.com/qemu-project/qemu/-/issues/2749 Closes: https://gitlab.com/qemu-project/qemu/-/issues/851 Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-10-14 11:03:59 +02:00
Steve Sistare	fe72a8073e	oslib: qemu_clear_cloexec Define qemu_clear_cloexec, analogous to qemu_set_cloexec. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/1759332851-370353-4-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2025-10-03 09:48:02 -04:00
Richard Henderson	1abdde1ad4	Pull request Tanish Desai and Paolo Bonzini's tracing Rust support. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmjdSSIACgkQnKSrs4Gr c8h8hwf/RXawMzImGn3I2kTOUWAQ97+yY0UgtyO010K71gypBa2EBcPIVH0ZOsy0 oT5pF2w7k0g83DXqupXiZO3yjSSmeGBXlOw8QS6D+FN0VpsdxrYJnvzVMqCckOrR 6wwM+fYYfCk/LwQFvjcMDdd6BSB/wUyMuBnh+fa8X9vxRL6CgMY7RpQd7YZ9JNtL PFQscu/K6zUARxwQ/DZTx5jYlW4rE5O4mq80CW2l1pgnyOH5vH/TySTKp0yX8eDO 5eoF7ttieOxxt6YobFak7EfWFvFuyp1j5NlWlyWKzhce1oSOAbaXnB1I61admRb3 7XrsTU0RjH6kp8ki4SZEoAh/HMw+4w== =myWt -----END PGP SIGNATURE----- Merge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging Pull request Tanish Desai and Paolo Bonzini's tracing Rust support. # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmjdSSIACgkQnKSrs4Gr # c8h8hwf/RXawMzImGn3I2kTOUWAQ97+yY0UgtyO010K71gypBa2EBcPIVH0ZOsy0 # oT5pF2w7k0g83DXqupXiZO3yjSSmeGBXlOw8QS6D+FN0VpsdxrYJnvzVMqCckOrR # 6wwM+fYYfCk/LwQFvjcMDdd6BSB/wUyMuBnh+fa8X9vxRL6CgMY7RpQd7YZ9JNtL # PFQscu/K6zUARxwQ/DZTx5jYlW4rE5O4mq80CW2l1pgnyOH5vH/TySTKp0yX8eDO # 5eoF7ttieOxxt6YobFak7EfWFvFuyp1j5NlWlyWKzhce1oSOAbaXnB1I61admRb3 # 7XrsTU0RjH6kp8ki4SZEoAh/HMw+4w== # =myWt # -----END PGP SIGNATURE----- # gpg: Signature made Wed 01 Oct 2025 08:30:42 AM PDT # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [unknown] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu: tracetool/syslog: add Rust support tracetool/ftrace: add Rust support tracetool/log: add Rust support log: change qemu_loglevel to unsigned tracetool/simple: add Rust support rust: pl011: add tracepoints rust: qdev: add minimal clock bindings rust: add trace crate tracetool: Add Rust format support tracetool/backend: remove redundant trace event checks tracetool: add CHECK_TRACE_EVENT_GET_STATE trace/ftrace: move snprintf+write from tracepoints to ftrace.c tracetool: add SPDX headers treewide: remove unnessary "coding" header tracetool: remove dead code tracetool: fix usage of try_import() Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-10-01 15:02:42 -07:00
Paolo Bonzini	75871d88d3	log: change qemu_loglevel to unsigned Bindgen makes the LOG_* constants unsigned, even if they are defined as (1 << 15): pub const LOG_TRACE: u32 = 32768; Make them unsigned in C as well through the BIT() macro, and also change the type of the variable that they are used with. Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20250929154938.594389-14-pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2025-10-01 11:22:07 -04:00
Markus Armbruster	bcb536cabe	error: Kill @error_warn We added @error_warn some two years ago in commit `3ffef1a55c` (error: add global &error_warn destination). It has multiple issues: * error.h's big comment was not updated for it. * Function contracts were not updated for it. * ERRP_GUARD() is unaware of @error_warn, and fails to mask it from error_prepend() and such. These crash on @error_warn, as pointed out by Akihiko Odaki. All fixable. However, after more than two years, we had just of 15 uses, of which the last few patches removed seven as unclean or otherwise undesirable, adding back five elsewhere. I didn't look closely enough at the remaining seven to decide whether they are desirable or not. I don't think this feature earns its keep. Drop it. Thanks-to: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <20250923091000.3180122-14-armbru@redhat.com> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>	2025-10-01 08:33:24 +02:00
Markus Armbruster	5bd58f04b8	util/oslib-win32: Do not treat null @errp as &error_warn qemu_socket_select() and its wrapper qemu_socket_unselect() treat a null @errp as &error_warn. This is wildly inappropriate. A caller passing null @errp specifies that errors are to be ignored. If warnings are wanted, the caller must pass &error_warn. Change callers to do that, and drop the inappropriate treatment of null @errp. This assumes that warnings are wanted. I'm not familiar with the calling code, so I can't say whether it will work when the socket is invalid, or WSAEventSelect() fails. If it doesn't, then this should be an error instead of a warning. Invalid socket might even be a programming error. These warnings were introduced in commit `f5fd677ae7` (win32/socket: introduce qemu_socket_select() helper). I considered reverting to silence, but Daniel Berrangé asked for the warnings to be preserved. Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20250923091000.3180122-9-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>	2025-09-30 14:43:53 +02:00
Vladimir Sementsov-Ogievskiy	34523df319	util/vhost-user-server: vu_message_read(): improve error handling 1. Drop extra error_report_err(NULL), it will just crash, if we get here. 2. Get and report error of qemu_set_blocking(), instead of aborting. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy	6f607941b1	treewide: use qemu_set_blocking instead of g_unix_set_fd_nonblocking Instead of open-coded g_unix_set_fd_nonblocking() calls, use QEMU wrapper qemu_set_blocking(). Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> [DB: fix missing closing ) in tap-bsd.c, remove now unused GError var] Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy	5d1d32ce9d	util: drop qemu_socket_set_block() Now it's unused. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy	09759245cf	util: drop qemu_socket_try_set_nonblock() Now we can use qemu_set_blocking() in these cases. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy	8cb17f9c36	util: drop qemu_socket_set_nonblock() Use common qemu_set_blocking() instead. Note that pre-patch the behavior of Win32 and Linux realizations are inconsistent: we ignore failure for Win32, and assert success for Linux. How do we convert the callers? 1. Most of callers call qemu_socket_set_nonblock() on a freshly created socket fd, in conditions when we may simply report an error. Seems correct switching to error handling both for Windows (pre-patch error is ignored) and Linux (pre-patch we assert success). Anyway, we normally don't expect errors in these cases. Still in tests let's use &error_abort for simplicity. What are exclusions? 2. hw/virtio/vhost-user.c - we are inside #ifdef CONFIG_LINUX, so no damage in switching to error handling from assertion. 3. io/channel-socket.c: here we convert both old calls to qemu_socket_set_nonblock() and qemu_socket_set_block() to one new call. Pre-patch we assert success for Linux in qemu_socket_set_nonblock(), and ignore all other errors here. So, for Windows switch is a bit dangerous: we may get new errors or crashes(when error_abort is passed) in cases where we have silently ignored the error before (was it correct in all such cases, if they were?) Still, there is no other way to stricter API than take this risk. 4. util/vhost-user-server - compiled only for Linux (see util/meson.build), so we are safe, switching from assertion to &error_abort. Note: In qga/channel-posix.c we use g_warning(), where g_printerr() would actually be a better choice. Still let's for now follow common style of qga, where g_warning() is commonly used to print such messages, and no call to g_printerr(). Converting everything to use g_printerr() should better be another series. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy	1ed8903916	treewide: handle result of qio_channel_set_blocking() Currently, we just always pass NULL as errp argument. That doesn't look good. Some realizations of interface may actually report errors. Channel-socket realization actually either ignore or crash on errors, but we are going to straighten it out to always reporting an errp in further commits. So, convert all callers to either handle the error (where environment allows) or explicitly use &error_abort. Take also a chance to change the return value to more convenient bool (keeping also in mind, that underlying realizations may return -1 on failure, not -errno). Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> [DB: fix return type mismatch in TLS/websocket channel impls for qio_channel_set_blocking] Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy	4149afca71	util: add qemu_set_blocking() function In generic code we have qio_channel_set_blocking(), which takes bool parameter, and qemu_file_set_blocking(), which as well takes bool parameter. At lower fd-layer we have a mess of functions: - enough direct calls to Unix-specific g_unix_set_fd_nonblocking() (of course, all calls are out of Windows-compatible code), which is glib specific with GError, which we can't use, and have to handle error-reporting by hand after the call. and several platform-agnostic qemu_* helpers: - qemu_socket_set_nonblock(), which asserts success for posix (still, in most cases we can handle the error in better way) and ignores error for win32 realization - qemu_socket_try_set_nonblock(), providing and error, but not errp, so we have to handle it after the call - qemu_socket_set_block(), which simply ignores an error Note, that _socket_ word in original API, which we are going to substitute was intended, because Windows support these operations only for sockets. What leads to solution of dropping it again? 1. Having a QEMU-native wrapper with errp parameter for g_unix_set_fd_nonblocking() for non-socket fds worth doing, at least to unify error handling. 2. So, if try to keep _socket_ vs _file_ words, we'll have two actually duplicated functions for Linux, which actually will be executed successfully on any (good enough) fds, and nothing prevent using them improperly except for the name. That doesn't look good. 3. Naming helped us in the world where we crash on errors or ignore them. Now, with errp parameter, callers are intended to proper error checking. And for places where we really OK with crash-on-error semantics (like tests), we have an explicit &error_abort. So, this commit starts a series, which will effectively revert commit `ff5927baa7` "util: rename qemu_*block() socket functions" (which in turn was reverting `f9e8cacc55` "oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()", so that's a long story). Now we don't simply rename, instead we provide the new API and update all the callers. This commit only introduces a new fd-layer wrapper. Next commits will replace old API calls with it, and finally remove old API. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-09-19 12:46:07 +01:00
Richard Henderson	b8eb3dd495	cpuinfo/i386: Detect GFNI as an AVX extension We won't use the SSE GFNI instructions, so delay detection until we know AVX is present. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-09-04 09:49:23 +02:00
Daniel P. Berrangé	012842c075	log: make '-msg timestamp=on' apply to all qemu_log usage Currently the tracing 'log' back emits special code to add timestamps to trace points sent via qemu_log(). This current impl is a bad design for a number of reasons. * It changes the QEMU headers, such that 'error-report.h' content is visible to all files using tracing, but only when the 'log' backend is enabled. This has led to build failure bugs as devs rarely test without the (default) 'log' backend enabled, and CI can't cover every scenario for every trace backend. * It bloats the trace points definitions which are inlined into every probe location due to repeated inlining of timestamp formatting code, adding MBs of overhead to QEMU. * The tracing subsystem should not be treated any differently from other users of qemu_log. They all would benefit from having timestamps present. * The timestamp emitted with the tracepoints is in a needlessly different format to that used by error_report() in response to '-msg timestamp=on'. This fixes all these issues simply by moving timestamp formatting into qemu_log, using the same approach as for error_report. The code before: static inline void _nocheck__trace_qcrypto_tls_creds_get_path(void * creds, const char * filename, const char * path) { if (trace_event_get_state(TRACE_QCRYPTO_TLS_CREDS_GET_PATH) && qemu_loglevel_mask(LOG_TRACE)) { if (message_with_timestamp) { struct timeval _now; gettimeofday(&_now, NULL); qemu_log("%d@%zu.%06zu:qcrypto_tls_creds_get_path " "TLS creds path creds=%p filename=%s path=%s" "\n", qemu_get_thread_id(), (size_t)_now.tv_sec, (size_t)_now.tv_usec , creds, filename, path); } else { qemu_log("qcrypto_tls_creds_get_path " "TLS creds path creds=%p filename=%s path=%s" "\n", creds, filename, path); } } } and after: static inline void _nocheck__trace_qcrypto_tls_creds_get_path(void * creds, const char * filename, const char * path) { if (trace_event_get_state(TRACE_QCRYPTO_TLS_CREDS_GET_PATH) && qemu_loglevel_mask(LOG_TRACE)) { qemu_log("qcrypto_tls_creds_get_path " "TLS creds path creds=%p filename=%s path=%s" "\n", creds, filename, path); } } The log and error messages before: $ qemu-system-x86_64 -trace qcrypto* -object tls-creds-x509,id=tls0,dir=$HOME/tls -msg timestamp=on 2986097@1753122905.917608:qcrypto_tls_creds_x509_load TLS creds x509 load creds=0x55d925bd9490 dir=/var/home/berrange/tls 2986097@1753122905.917621:qcrypto_tls_creds_get_path TLS creds path creds=0x55d925bd9490 filename=ca-cert.pem path=<none> 2025-07-21T18:35:05.917626Z qemu-system-x86_64: Unable to access credentials /var/home/berrange/tls/ca-cert.pem: No such file or directory and after: $ qemu-system-x86_64 -trace qcrypto* -object tls-creds-x509,id=tls0,dir=$HOME/tls -msg timestamp=on 2025-07-21T18:43:28.089797Z qcrypto_tls_creds_x509_load TLS creds x509 load creds=0x55bf5bf12380 dir=/var/home/berrange/tls 2025-07-21T18:43:28.089815Z qcrypto_tls_creds_get_path TLS creds path creds=0x55bf5bf12380 filename=ca-cert.pem path=<none> 2025-07-21T18:43:28.089819Z qemu-system-x86_64: Unable to access credentials /var/home/berrange/tls/ca-cert.pem: No such file or directory The binary size before: $ ls -alh qemu-system-x86_64 -rwxr-xr-x. 1 berrange berrange 87M Jul 21 19:39 qemu-system-x86_64 $ strip qemu-system-x86_64 $ ls -alh qemu-system-x86_64 -rwxr-xr-x. 1 berrange berrange 30M Jul 21 19:39 qemu-system-x86_64 and after: $ ls -alh qemu-system-x86_64 -rwxr-xr-x. 1 berrange berrange 85M Jul 21 19:41 qemu-system-x86_64 $ strip qemu-system-x86_64 $ ls -alh qemu-system-x86_64 -rwxr-xr-x. 1 berrange berrange 29M Jul 21 19:41 qemu-system-x86_64 Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-id: 20250721185452.3016488-1-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2025-07-24 10:12:21 -04:00
Paolo Bonzini	ed7f6da282	rust/qemu-api: log: implement io::Write This makes it possible to lock the log file; it also makes log_mask_ln! not allocate memory when logging a constant string. Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-07-10 18:33:51 +02:00
Sean Wei	145c3e54be	util/rcu.c: replace FSF postal address with licenses URL The LGPLv2.1 boiler-plate in util/rcu.c still contained the obsolete "51 Franklin Street" postal address. Replace it with the canonical GNU licenses URL recommended by the FSF: https://www.gnu.org/licenses/ Signed-off-by: Sean Wei <me@sean.taipei> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20250613.qemu.patch.07@sean.taipei> Signed-off-by: Thomas Huth <thuth@redhat.com>	2025-06-26 00:42:37 +02:00
Akihiko Odaki	0a765ca850	qemu-thread: Use futex if available for QemuLockCnt This unlocks the futex-based implementation of QemuLockCnt to Windows. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Link: https://lore.kernel.org/r/20250529-event-v5-6-53b285203794@daynix.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-06 14:32:55 +02:00
Akihiko Odaki	69e10db83e	qemu-thread: Use futex for QemuEvent on Windows Use the futex-based implementation of QemuEvent on Windows to remove code duplication and remove the overhead of event object construction and destruction. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20250526-event-v4-6-5b784cc8e1de@daynix.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-06 14:32:55 +02:00
Akihiko Odaki	d1895f4c17	qemu-thread: Avoid futex abstraction for non-Linux qemu-thread used to abstract pthread primitives into futex for the QemuEvent implementation of POSIX systems other than Linux. However, this abstraction has one key difference: unlike futex, pthread primitives require an explicit destruction, and it must be ordered after wait and wake operations. It would be easier to perform destruction if a wait operation ensures the corresponding wake operation finishes as POSIX semaphore does, but that requires to protect state accesses in qemu_event_set() and qemu_event_wait() with a mutex. On the other hand, real futex does not need such a protection but needs complex barrier and atomic operations to ensure ordering between the two functions. Add special implementations of qemu_event_set() and qemu_event_wait() using pthread primitives. qemu_event_wait() will ensure qemu_event_set() finishes, and these functions will avoid complex barrier and atomic operations to ensure ordering between them. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Phil Dennis-Jordan <phil@philjordan.eu> Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu> Link: https://lore.kernel.org/r/20250526-event-v4-5-5b784cc8e1de@daynix.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-06 14:32:55 +02:00
Akihiko Odaki	32da70a887	qemu-thread: Replace __linux__ with CONFIG_LINUX scripts/checkpatch.pl warns for __linux__ saying "architecture specific defines should be avoided". Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20250526-event-v4-4-5b784cc8e1de@daynix.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-06 14:32:55 +02:00
Akihiko Odaki	1bc2c49539	futex: Support Windows Windows supports futex-like APIs since Windows 8 and Windows Server 2012. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Link: https://lore.kernel.org/r/20250529-event-v5-2-53b285203794@daynix.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-06 14:32:55 +02:00
Akihiko Odaki	6e2d11bf04	futex: Check value after qemu_futex_wait() futex(2) - Linux manual page https://man7.org/linux/man-pages/man2/futex.2.html > Note that a wake-up can also be caused by common futex usage patterns > in unrelated code that happened to have previously used the futex > word's memory location (e.g., typical futex-based implementations of > Pthreads mutexes can cause this under some conditions). Therefore, > callers should always conservatively assume that a return value of 0 > can mean a spurious wake-up, and use the futex word's value (i.e., > the user-space synchronization scheme) to decide whether to continue > to block or not. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Link: https://lore.kernel.org/r/20250529-event-v5-1-53b285203794@daynix.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-06 14:32:55 +02:00
Paolo Bonzini	e8fb9c91a3	util/error: make func optional The function name is not available in Rust, so make it optional. Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-05 20:24:51 +02:00
Paolo Bonzini	230a4894f4	util/error: allow non-NUL-terminated err->src Rust makes the current file available as a statically-allocated string, but without a NUL terminator. Allow this by storing an optional maximum length in the Error. Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-05 20:24:51 +02:00
Paolo Bonzini	8714d366e7	util/error: expose Error definition to Rust code This is used to preserve the file and line in a roundtrip from C Error to Rust and back to C. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-06-05 20:24:51 +02:00
Juraj Marcin	1bd4237cb1	util/qemu-sockets: Introduce inet socket options controlling TCP keep-alive With the default TCP stack configuration, it could be even 2 hours before the connection times out due to the other side not being reachable. However, in some cases, the application needs to be aware of a connection issue much sooner. This is the case, for example, for postcopy live migration. If there is no traffic from the migration destination guest (server-side) to the migration source guest (client-side), the destination keeps waiting for pages indefinitely and does not switch to the postcopy-paused state. This can happen, for example, if the destination QEMU instance is started with the '-S' command line option and the machine is not started yet, or if the machine is idle and produces no new page faults for not-yet-migrated pages. This patch introduces new inet socket parameters that control count, idle period, and interval of TCP keep-alive packets before the connection is considered broken. These parameters are available on systems where the respective TCP socket options are defined, that includes Linux, Windows, macOS, but not OpenBSD. Additionally, macOS defines TCP_KEEPIDLE as TCP_KEEPALIVE instead, so the patch supplies its own definition. The default value for all is 0, which means the system configuration is used. Signed-off-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-05-22 11:24:41 +01:00
Juraj Marcin	316e8ee8d6	util/qemu-sockets: Refactor inet_parse() to use QemuOpts Currently, the inet address parser cannot handle multiple options where one is prefixed with the name of the other. For example, with the 'keep-alive-idle' option added, the current parser cannot parse '127.0.0.1:5000,keep-alive-idle=60,keep-alive' correctly. Instead, it fails with "error parsing 'keep-alive' flag '-idle=60,keep-alive'". To resolve these issues, this patch rewrites the inet address parsing using the QemuOpts parser, which the inet_parse_flag() function tries to mimic. This new parser supports all previously supported options and on top of that the 'numeric' flag is now also supported. The only difference is, the new parser produces an error if an unknown option is passed, instead of silently ignoring it. Signed-off-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-05-22 11:24:41 +01:00
Juraj Marcin	00064705ed	util/qemu-sockets: Add support for keep-alive flag to passive sockets Commit `aec21d3175` (qapi: Add InetSocketAddress member keep-alive) introduces the keep-alive flag, which enables the SO_KEEPALIVE socket option, but only on client-side sockets. However, this option is also useful for server-side sockets, so they can check if a client is still reachable or drop the connection otherwise. This patch enables the SO_KEEPALIVE socket option on passive server-side sockets if the keep-alive flag is enabled. This socket option is then inherited by active server-side sockets communicating with connected clients. Signed-off-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-05-22 11:24:41 +01:00
Juraj Marcin	911e0f2c6e	util/qemu-sockets: Refactor success and failure paths in inet_listen_saddr() To get a listening socket, we need to first create a socket, try binding it to a certain port, and lastly starting listening to it. Each of these operations can fail due to various reasons, one of them being that the requested address/port is already in use. In such case, the function tries the same process with a new port number. This patch refactors the port number loop, so the success path is no longer buried inside the 'if' statements in the middle of the loop. Now, the success path is not nested and ends at the end of the iteration after successful socket creation, binding, and listening. In case any of the operations fails, it either continues to the next iteration (and the next port) or jumps out of the loop to handle the error and exits the function. Signed-off-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2025-05-22 11:24:41 +01:00

1 2 3 4 5 ...

2038 commits