Commit graph

2016 commits

Author SHA1 Message Date
Richard Henderson
1abdde1ad4 Pull request
Tanish Desai and Paolo Bonzini's tracing Rust support.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmjdSSIACgkQnKSrs4Gr
 c8h8hwf/RXawMzImGn3I2kTOUWAQ97+yY0UgtyO010K71gypBa2EBcPIVH0ZOsy0
 oT5pF2w7k0g83DXqupXiZO3yjSSmeGBXlOw8QS6D+FN0VpsdxrYJnvzVMqCckOrR
 6wwM+fYYfCk/LwQFvjcMDdd6BSB/wUyMuBnh+fa8X9vxRL6CgMY7RpQd7YZ9JNtL
 PFQscu/K6zUARxwQ/DZTx5jYlW4rE5O4mq80CW2l1pgnyOH5vH/TySTKp0yX8eDO
 5eoF7ttieOxxt6YobFak7EfWFvFuyp1j5NlWlyWKzhce1oSOAbaXnB1I61admRb3
 7XrsTU0RjH6kp8ki4SZEoAh/HMw+4w==
 =myWt
 -----END PGP SIGNATURE-----

Merge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

Tanish Desai and Paolo Bonzini's tracing Rust support.

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmjdSSIACgkQnKSrs4Gr
# c8h8hwf/RXawMzImGn3I2kTOUWAQ97+yY0UgtyO010K71gypBa2EBcPIVH0ZOsy0
# oT5pF2w7k0g83DXqupXiZO3yjSSmeGBXlOw8QS6D+FN0VpsdxrYJnvzVMqCckOrR
# 6wwM+fYYfCk/LwQFvjcMDdd6BSB/wUyMuBnh+fa8X9vxRL6CgMY7RpQd7YZ9JNtL
# PFQscu/K6zUARxwQ/DZTx5jYlW4rE5O4mq80CW2l1pgnyOH5vH/TySTKp0yX8eDO
# 5eoF7ttieOxxt6YobFak7EfWFvFuyp1j5NlWlyWKzhce1oSOAbaXnB1I61admRb3
# 7XrsTU0RjH6kp8ki4SZEoAh/HMw+4w==
# =myWt
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 01 Oct 2025 08:30:42 AM PDT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [unknown]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu:
  tracetool/syslog: add Rust support
  tracetool/ftrace: add Rust support
  tracetool/log: add Rust support
  log: change qemu_loglevel to unsigned
  tracetool/simple: add Rust support
  rust: pl011: add tracepoints
  rust: qdev: add minimal clock bindings
  rust: add trace crate
  tracetool: Add Rust format support
  tracetool/backend: remove redundant trace event checks
  tracetool: add CHECK_TRACE_EVENT_GET_STATE
  trace/ftrace: move snprintf+write from tracepoints to ftrace.c
  tracetool: add SPDX headers
  treewide: remove unnessary "coding" header
  tracetool: remove dead code
  tracetool: fix usage of try_import()

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-10-01 15:02:42 -07:00
Paolo Bonzini
75871d88d3 log: change qemu_loglevel to unsigned
Bindgen makes the LOG_* constants unsigned, even if they are defined as
(1 << 15):

   pub const LOG_TRACE: u32 = 32768;

Make them unsigned in C as well through the BIT() macro, and also change
the type of the variable that they are used with.

Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20250929154938.594389-14-pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-10-01 11:22:07 -04:00
Markus Armbruster
bcb536cabe error: Kill @error_warn
We added @error_warn some two years ago in commit 3ffef1a55c (error:
add global &error_warn destination).  It has multiple issues:

* error.h's big comment was not updated for it.

* Function contracts were not updated for it.

* ERRP_GUARD() is unaware of @error_warn, and fails to mask it from
  error_prepend() and such.  These crash on @error_warn, as pointed
  out by Akihiko Odaki.

All fixable.  However, after more than two years, we had just of 15
uses, of which the last few patches removed seven as unclean or
otherwise undesirable, adding back five elsewhere.  I didn't look
closely enough at the remaining seven to decide whether they are
desirable or not.

I don't think this feature earns its keep.  Drop it.

Thanks-to: Akihiko  Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20250923091000.3180122-14-armbru@redhat.com>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
2025-10-01 08:33:24 +02:00
Markus Armbruster
5bd58f04b8 util/oslib-win32: Do not treat null @errp as &error_warn
qemu_socket_select() and its wrapper qemu_socket_unselect() treat a
null @errp as &error_warn.  This is wildly inappropriate.  A caller
passing null @errp specifies that errors are to be ignored.  If
warnings are wanted, the caller must pass &error_warn.

Change callers to do that, and drop the inappropriate treatment of
null @errp.

This assumes that warnings are wanted.  I'm not familiar with the
calling code, so I can't say whether it will work when the socket is
invalid, or WSAEventSelect() fails.  If it doesn't, then this should
be an error instead of a warning.  Invalid socket might even be a
programming error.

These warnings were introduced in commit f5fd677ae7 (win32/socket:
introduce qemu_socket_select() helper).  I considered reverting to
silence, but Daniel Berrangé asked for the warnings to be preserved.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250923091000.3180122-9-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
2025-09-30 14:43:53 +02:00
Vladimir Sementsov-Ogievskiy
34523df319 util/vhost-user-server: vu_message_read(): improve error handling
1. Drop extra error_report_err(NULL), it will just crash, if we get
here.

2. Get and report error of qemu_set_blocking(), instead of aborting.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy
6f607941b1 treewide: use qemu_set_blocking instead of g_unix_set_fd_nonblocking
Instead of open-coded g_unix_set_fd_nonblocking() calls, use
QEMU wrapper qemu_set_blocking().

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[DB: fix missing closing ) in tap-bsd.c, remove now unused GError var]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy
5d1d32ce9d util: drop qemu_socket_set_block()
Now it's unused.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy
09759245cf util: drop qemu_socket_try_set_nonblock()
Now we can use qemu_set_blocking() in these cases.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy
8cb17f9c36 util: drop qemu_socket_set_nonblock()
Use common qemu_set_blocking() instead.

Note that pre-patch the behavior of Win32 and Linux realizations
are inconsistent: we ignore failure for Win32, and assert success
for Linux.

How do we convert the callers?

1. Most of callers call qemu_socket_set_nonblock() on a
freshly created socket fd, in conditions when we may simply
report an error. Seems correct switching to error handling
both for Windows (pre-patch error is ignored) and Linux
(pre-patch we assert success). Anyway, we normally don't
expect errors in these cases.

Still in tests let's use &error_abort for simplicity.

What are exclusions?

2. hw/virtio/vhost-user.c - we are inside #ifdef CONFIG_LINUX,
so no damage in switching to error handling from assertion.

3. io/channel-socket.c: here we convert both old calls to
qemu_socket_set_nonblock() and qemu_socket_set_block() to
one new call. Pre-patch we assert success for Linux in
qemu_socket_set_nonblock(), and ignore all other errors here.
So, for Windows switch is a bit dangerous: we may get
new errors or crashes(when error_abort is passed) in
cases where we have silently ignored the error before
(was it correct in all such cases, if they were?) Still,
there is no other way to stricter API than take
this risk.

4. util/vhost-user-server - compiled only for Linux (see
util/meson.build), so we are safe, switching from assertion to
&error_abort.

Note: In qga/channel-posix.c we use g_warning(), where g_printerr()
would actually be a better choice. Still let's for now follow
common style of qga, where g_warning() is commonly used to print
such messages, and no call to g_printerr(). Converting everything
to use g_printerr() should better be another series.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy
1ed8903916 treewide: handle result of qio_channel_set_blocking()
Currently, we just always pass NULL as errp argument. That doesn't
look good.

Some realizations of interface may actually report errors.
Channel-socket realization actually either ignore or crash on
errors, but we are going to straighten it out to always reporting
an errp in further commits.

So, convert all callers to either handle the error (where environment
allows) or explicitly use &error_abort.

Take also a chance to change the return value to more convenient
bool (keeping also in mind, that underlying realizations may
return -1 on failure, not -errno).

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[DB: fix return type mismatch in TLS/websocket channel
     impls for qio_channel_set_blocking]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy
4149afca71 util: add qemu_set_blocking() function
In generic code we have qio_channel_set_blocking(), which takes
bool parameter, and qemu_file_set_blocking(), which as well takes
bool parameter.

At lower fd-layer we have a mess of functions:

- enough direct calls to Unix-specific g_unix_set_fd_nonblocking()
(of course, all calls are out of Windows-compatible code), which
is glib specific with GError, which we can't use, and have to
handle error-reporting by hand after the call.

and several platform-agnostic qemu_* helpers:

- qemu_socket_set_nonblock(), which asserts success for posix (still,
  in most cases we can handle the error in better way) and ignores
  error for win32 realization

- qemu_socket_try_set_nonblock(), providing and error, but not errp,
so we have to handle it after the call

- qemu_socket_set_block(), which simply ignores an error

Note, that *_socket_* word in original API, which we are going
to substitute was intended, because Windows support these operations
only for sockets. What leads to solution of dropping it again?

1. Having a QEMU-native wrapper with errp parameter
for g_unix_set_fd_nonblocking() for non-socket fds worth doing,
at least to unify error handling.

2. So, if try to keep _socket_ vs _file_ words, we'll have two
actually duplicated functions for Linux, which actually will
be executed successfully on any (good enough) fds, and nothing
prevent using them improperly except for the name. That doesn't
look good.

3. Naming helped us in the world where we crash on errors or
ignore them. Now, with errp parameter, callers are intended to
proper error checking. And for places where we really OK with
crash-on-error semantics (like tests), we have an explicit
&error_abort.

So, this commit starts a series, which will effectively revert
commit ff5927baa7 "util: rename qemu_*block() socket functions"
(which in turn was reverting f9e8cacc55
"oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()",
so that's a long story).
Now we don't simply rename, instead we provide the new API and
update all the callers.

This commit only introduces a new fd-layer wrapper. Next commits
will replace old API calls with it, and finally remove old API.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Richard Henderson
b8eb3dd495 cpuinfo/i386: Detect GFNI as an AVX extension
We won't use the SSE GFNI instructions, so delay
detection until we know AVX is present.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-09-04 09:49:23 +02:00
Daniel P. Berrangé
012842c075 log: make '-msg timestamp=on' apply to all qemu_log usage
Currently the tracing 'log' back emits special code to add timestamps
to trace points sent via qemu_log(). This current impl is a bad design
for a number of reasons.

 * It changes the QEMU headers, such that 'error-report.h' content
   is visible to all files using tracing, but only when the 'log'
   backend is enabled. This has led to build failure bugs as devs
   rarely test without the (default) 'log' backend enabled, and
   CI can't cover every scenario for every trace backend.

 * It bloats the trace points definitions which are inlined into
   every probe location due to repeated inlining of timestamp
   formatting code, adding MBs of overhead to QEMU.

 * The tracing subsystem should not be treated any differently
   from other users of qemu_log. They all would benefit from
   having timestamps present.

 * The timestamp emitted with the tracepoints is in a needlessly
   different format to that used by error_report() in response
   to '-msg timestamp=on'.

This fixes all these issues simply by moving timestamp formatting
into qemu_log, using the same approach as for error_report.

The code before:

  static inline void _nocheck__trace_qcrypto_tls_creds_get_path(void * creds, const char * filename, const char * path)
  {
      if (trace_event_get_state(TRACE_QCRYPTO_TLS_CREDS_GET_PATH) && qemu_loglevel_mask(LOG_TRACE)) {
          if (message_with_timestamp) {
              struct timeval _now;
              gettimeofday(&_now, NULL);
              qemu_log("%d@%zu.%06zu:qcrypto_tls_creds_get_path " "TLS creds path creds=%p filename=%s path=%s" "\n",
                       qemu_get_thread_id(),
                       (size_t)_now.tv_sec, (size_t)_now.tv_usec
                       , creds, filename, path);
          } else {
              qemu_log("qcrypto_tls_creds_get_path " "TLS creds path creds=%p filename=%s path=%s" "\n", creds, filename, path);
          }
      }
  }

and after:

  static inline void _nocheck__trace_qcrypto_tls_creds_get_path(void * creds, const char * filename, const char * path)
  {
      if (trace_event_get_state(TRACE_QCRYPTO_TLS_CREDS_GET_PATH) && qemu_loglevel_mask(LOG_TRACE)) {
          qemu_log("qcrypto_tls_creds_get_path " "TLS creds path creds=%p filename=%s path=%s" "\n", creds, filename, path);
      }
  }

The log and error messages before:

  $ qemu-system-x86_64 -trace qcrypto* -object tls-creds-x509,id=tls0,dir=$HOME/tls -msg timestamp=on
  2986097@1753122905.917608:qcrypto_tls_creds_x509_load TLS creds x509 load creds=0x55d925bd9490 dir=/var/home/berrange/tls
  2986097@1753122905.917621:qcrypto_tls_creds_get_path TLS creds path creds=0x55d925bd9490 filename=ca-cert.pem path=<none>
  2025-07-21T18:35:05.917626Z qemu-system-x86_64: Unable to access credentials /var/home/berrange/tls/ca-cert.pem: No such file or directory

and after:

  $ qemu-system-x86_64 -trace qcrypto* -object tls-creds-x509,id=tls0,dir=$HOME/tls -msg timestamp=on
  2025-07-21T18:43:28.089797Z qcrypto_tls_creds_x509_load TLS creds x509 load creds=0x55bf5bf12380 dir=/var/home/berrange/tls
  2025-07-21T18:43:28.089815Z qcrypto_tls_creds_get_path TLS creds path creds=0x55bf5bf12380 filename=ca-cert.pem path=<none>
  2025-07-21T18:43:28.089819Z qemu-system-x86_64: Unable to access credentials /var/home/berrange/tls/ca-cert.pem: No such file or directory

The binary size before:

  $ ls -alh qemu-system-x86_64
  -rwxr-xr-x. 1 berrange berrange 87M Jul 21 19:39 qemu-system-x86_64
  $ strip qemu-system-x86_64
  $ ls -alh qemu-system-x86_64
  -rwxr-xr-x. 1 berrange berrange 30M Jul 21 19:39 qemu-system-x86_64

and after:

  $ ls -alh qemu-system-x86_64
  -rwxr-xr-x. 1 berrange berrange 85M Jul 21 19:41 qemu-system-x86_64
  $ strip qemu-system-x86_64
  $ ls -alh qemu-system-x86_64
  -rwxr-xr-x. 1 berrange berrange 29M Jul 21 19:41 qemu-system-x86_64

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20250721185452.3016488-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-07-24 10:12:21 -04:00
Paolo Bonzini
ed7f6da282 rust/qemu-api: log: implement io::Write
This makes it possible to lock the log file; it also makes log_mask_ln!
not allocate memory when logging a constant string.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-07-10 18:33:51 +02:00
Sean Wei
145c3e54be util/rcu.c: replace FSF postal address with licenses URL
The LGPLv2.1 boiler-plate in util/rcu.c still contained
the obsolete "51 Franklin Street" postal address.

Replace it with the canonical GNU licenses URL recommended by the FSF:
https://www.gnu.org/licenses/

Signed-off-by: Sean Wei <me@sean.taipei>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250613.qemu.patch.07@sean.taipei>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-06-26 00:42:37 +02:00
Akihiko Odaki
0a765ca850 qemu-thread: Use futex if available for QemuLockCnt
This unlocks the futex-based implementation of QemuLockCnt to Windows.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20250529-event-v5-6-53b285203794@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-06 14:32:55 +02:00
Akihiko Odaki
69e10db83e qemu-thread: Use futex for QemuEvent on Windows
Use the futex-based implementation of QemuEvent on Windows to
remove code duplication and remove the overhead of event object
construction and destruction.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20250526-event-v4-6-5b784cc8e1de@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-06 14:32:55 +02:00
Akihiko Odaki
d1895f4c17 qemu-thread: Avoid futex abstraction for non-Linux
qemu-thread used to abstract pthread primitives into futex for the
QemuEvent implementation of POSIX systems other than Linux. However,
this abstraction has one key difference: unlike futex, pthread
primitives require an explicit destruction, and it must be ordered after
wait and wake operations.

It would be easier to perform destruction if a wait operation ensures
the corresponding wake operation finishes as POSIX semaphore does, but
that requires to protect state accesses in qemu_event_set() and
qemu_event_wait() with a mutex. On the other hand, real futex does not
need such a protection but needs complex barrier and atomic operations
to ensure ordering between the two functions.

Add special implementations of qemu_event_set() and qemu_event_wait()
using pthread primitives. qemu_event_wait() will ensure qemu_event_set()
finishes, and these functions will avoid complex barrier and atomic
operations to ensure ordering between them.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Link: https://lore.kernel.org/r/20250526-event-v4-5-5b784cc8e1de@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-06 14:32:55 +02:00
Akihiko Odaki
32da70a887 qemu-thread: Replace __linux__ with CONFIG_LINUX
scripts/checkpatch.pl warns for __linux__ saying "architecture specific
defines should be avoided".

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20250526-event-v4-4-5b784cc8e1de@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-06 14:32:55 +02:00
Akihiko Odaki
1bc2c49539 futex: Support Windows
Windows supports futex-like APIs since Windows 8 and Windows Server
2012.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20250529-event-v5-2-53b285203794@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-06 14:32:55 +02:00
Akihiko Odaki
6e2d11bf04 futex: Check value after qemu_futex_wait()
futex(2) - Linux manual page
https://man7.org/linux/man-pages/man2/futex.2.html
> Note that a wake-up can also be caused by common futex usage patterns
> in unrelated code that happened to have previously used the futex
> word's memory location (e.g., typical futex-based implementations of
> Pthreads mutexes can cause this under some conditions).  Therefore,
> callers should always conservatively assume that a return value of 0
> can mean a spurious wake-up, and use the futex word's value (i.e.,
> the user-space synchronization scheme) to decide whether to continue
> to block or not.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20250529-event-v5-1-53b285203794@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-06 14:32:55 +02:00
Paolo Bonzini
e8fb9c91a3 util/error: make func optional
The function name is not available in Rust, so make it optional.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-05 20:24:51 +02:00
Paolo Bonzini
230a4894f4 util/error: allow non-NUL-terminated err->src
Rust makes the current file available as a statically-allocated string,
but without a NUL terminator.  Allow this by storing an optional maximum
length in the Error.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-05 20:24:51 +02:00
Paolo Bonzini
8714d366e7 util/error: expose Error definition to Rust code
This is used to preserve the file and line in a roundtrip from
C Error to Rust and back to C.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-05 20:24:51 +02:00
Juraj Marcin
1bd4237cb1 util/qemu-sockets: Introduce inet socket options controlling TCP keep-alive
With the default TCP stack configuration, it could be even 2 hours
before the connection times out due to the other side not being
reachable. However, in some cases, the application needs to be aware of
a connection issue much sooner.

This is the case, for example, for postcopy live migration. If there is
no traffic from the migration destination guest (server-side) to the
migration source guest (client-side), the destination keeps waiting for
pages indefinitely and does not switch to the postcopy-paused state.
This can happen, for example, if the destination QEMU instance is
started with the '-S' command line option and the machine is not started
yet, or if the machine is idle and produces no new page faults for
not-yet-migrated pages.

This patch introduces new inet socket parameters that control count,
idle period, and interval of TCP keep-alive packets before the
connection is considered broken. These parameters are available on
systems where the respective TCP socket options are defined, that
includes Linux, Windows, macOS, but not OpenBSD. Additionally, macOS
defines TCP_KEEPIDLE as TCP_KEEPALIVE instead, so the patch supplies its
own definition.

The default value for all is 0, which means the system configuration is
used.

Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-05-22 11:24:41 +01:00
Juraj Marcin
316e8ee8d6 util/qemu-sockets: Refactor inet_parse() to use QemuOpts
Currently, the inet address parser cannot handle multiple options where
one is prefixed with the name of the other. For example, with the
'keep-alive-idle' option added, the current parser cannot parse
'127.0.0.1:5000,keep-alive-idle=60,keep-alive' correctly. Instead, it
fails with "error parsing 'keep-alive' flag '-idle=60,keep-alive'".

To resolve these issues, this patch rewrites the inet address parsing
using the QemuOpts parser, which the inet_parse_flag() function tries to
mimic. This new parser supports all previously supported options and on
top of that the 'numeric' flag is now also supported. The only
difference is, the new parser produces an error if an unknown option is
passed, instead of silently ignoring it.

Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-05-22 11:24:41 +01:00
Juraj Marcin
00064705ed util/qemu-sockets: Add support for keep-alive flag to passive sockets
Commit aec21d3175 (qapi: Add InetSocketAddress member keep-alive)
introduces the keep-alive flag, which enables the SO_KEEPALIVE socket
option, but only on client-side sockets. However, this option is also
useful for server-side sockets, so they can check if a client is still
reachable or drop the connection otherwise.

This patch enables the SO_KEEPALIVE socket option on passive server-side
sockets if the keep-alive flag is enabled. This socket option is then
inherited by active server-side sockets communicating with connected
clients.

Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-05-22 11:24:41 +01:00
Juraj Marcin
911e0f2c6e util/qemu-sockets: Refactor success and failure paths in inet_listen_saddr()
To get a listening socket, we need to first create a socket, try binding
it to a certain port, and lastly starting listening to it. Each of these
operations can fail due to various reasons, one of them being that the
requested address/port is already in use. In such case, the function
tries the same process with a new port number.

This patch refactors the port number loop, so the success path is no
longer buried inside the 'if' statements in the middle of the loop. Now,
the success path is not nested and ends at the end of the iteration
after successful socket creation, binding, and listening. In case any of
the operations fails, it either continues to the next iteration (and the
next port) or jumps out of the loop to handle the error and exits the
function.

Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-05-22 11:24:41 +01:00
Juraj Marcin
b8b5278aca util/qemu-sockets: Refactor setting client sockopts into a separate function
This is done in preparation for enabling the SO_KEEPALIVE support for
server sockets and adding settings for more TCP keep-alive socket
options.

Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-05-22 11:24:41 +01:00
Stefan Hajnoczi
6b18f6e342 Pull request
Farhan Ali's s390x host PCI support for the block/nvme.c driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmgcviUACgkQnKSrs4Gr
 c8hRswgAupxH5Zhx50F7GzwZyu9TCF2sphEPd2VuFVxze8Sg6mXnJq5BFTjv9IuC
 0trPppfDyKFKujDk+FA3pl9bT45btm0xctNbFYNRS3HXrVUyMQLy73MlFF2twa5g
 U3uiX2d7DAYOdi5O1Cn3bhlByDh4qSko7YyUDFKio+WU57cdJxEd+pUqwyVXrU3E
 AMC2ZmJdKFGGC+tWxBIAuWNc5apq9yzbiywR8z62/Z2IC+Bym0RpvCbdklqcZb8O
 tpGxDKN8bY6s+hy1NZmA8eBA/iCiu6SUFmNpoe2vSwCFEk9R3gi+UNcuTVt3FaWO
 lgzoZSOelmI3JkF0UBqvKsPXt3fdJw==
 =KII7
 -----END PGP SIGNATURE-----

Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

Farhan Ali's s390x host PCI support for the block/nvme.c driver.

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmgcviUACgkQnKSrs4Gr
# c8hRswgAupxH5Zhx50F7GzwZyu9TCF2sphEPd2VuFVxze8Sg6mXnJq5BFTjv9IuC
# 0trPppfDyKFKujDk+FA3pl9bT45btm0xctNbFYNRS3HXrVUyMQLy73MlFF2twa5g
# U3uiX2d7DAYOdi5O1Cn3bhlByDh4qSko7YyUDFKio+WU57cdJxEd+pUqwyVXrU3E
# AMC2ZmJdKFGGC+tWxBIAuWNc5apq9yzbiywR8z62/Z2IC+Bym0RpvCbdklqcZb8O
# tpGxDKN8bY6s+hy1NZmA8eBA/iCiu6SUFmNpoe2vSwCFEk9R3gi+UNcuTVt3FaWO
# lgzoZSOelmI3JkF0UBqvKsPXt3fdJw==
# =KII7
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 08 May 2025 10:22:29 EDT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [ultimate]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [ultimate]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
  block/nvme: Use host PCI MMIO API
  include: Add a header to define host PCI MMIO functions
  util: Add functions for s390x mmio read/write

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-05-09 12:04:19 -04:00
Farhan Ali
b17d69a1da util: Add functions for s390x mmio read/write
Starting with z15 (or newer) we can execute mmio
instructions from userspace. On older platforms
where we don't have these instructions available
we can fallback to using system calls to access
the PCI mapped resources.

This patch adds helper functions for mmio reads
and writes for s390x.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-id: 20250430185012.2303-2-alifm@linux.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-05-08 10:03:07 -04:00
Kohei Tokunaga
5b78d120ff util: Add coroutine backend for emscripten
Emscripten does not support couroutine methods currently used by QEMU but
provides a coroutine implementation called "fiber". This commit introduces a
coroutine backend using fiber. Note that fiber does not support submitting
coroutines to other threads.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/006b683fd578ed6303a2dc8679094da9a7e6dfb4.1745820062.git.ktokunaga.mail@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-05-06 16:02:04 +02:00
Kohei Tokunaga
8e72b0eb18 util: exclude mmap-alloc.c from compilation target on Emscripten
Emscripten does not support partial unmapping of mmapped memory
regions[1]. This limitation prevents correct implementation of qemu_ram_mmap
and qemu_ram_munmap, which rely on partial unmap behavior.

As a workaround, this commit excludes mmap-alloc.c from the Emscripten
build. Instead, for Emscripten build, this modifies qemu_anon_ram_alloc to
use qemu_memalign in place of qemu_ram_mmap, and disable memory backends
that rely on mmap, such as memory-backend-file and memory-backend-shm.

[1] d4a74336f2/system/lib/libc/emscripten_mmap.c (L61)

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Link: https://lore.kernel.org/r/76834f933ee4f14eeb5289d21c59d306886e58e9.1745820062.git.ktokunaga.mail@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-05-06 16:02:04 +02:00
Kohei Tokunaga
4c7c051719 util/cacheflush.c: Update cache flushing mechanism for Emscripten
Although __builtin___clear_cache is used to flush the instruction cache for
a specified memory region, this operation doesn't apply to wasm, as its
memory isn't executable. Moreover, Emscripten does not support this builtin
and fails to compile it with the following error.

> fatal error: error in backend: llvm.clear_cache is not supported on wasm

To resolve this, this commit removes the call to __builtin___clear_cache for
Emscripten build.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/2926a798fa52a3a5b11c3df4edd1643d2b7cdcb9.1745820062.git.ktokunaga.mail@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-05-06 16:02:04 +02:00
Philippe Mathieu-Daudé
2cd09e47aa qom: Make InterfaceInfo[] uses const
Mechanical change using:

  $ sed -i -E 's/\(InterfaceInfo.?\[/\(const InterfaceInfo\[/g' \
              $(git grep -lE '\(InterfaceInfo.?\[\]\)')

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250424194905.82506-7-philmd@linaro.org>
2025-04-25 17:00:41 +02:00
Philippe Mathieu-Daudé
12d1a768bd qom: Have class_init() take a const data argument
Mechanical change using gsed, then style manually adapted
to pass checkpatch.pl script.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250424194905.82506-4-philmd@linaro.org>
2025-04-25 17:00:41 +02:00
Stefan Hajnoczi
019fbfa4bc Miscellaneous patches for 2025-04-24
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmgJ7dYSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTiZIP/1PFAg/s3SoiLQwH/ZrjyUkm1kiKnjOH
 CC5Stw6I9tuYnDAhASAdSymofLv0NNydNe5ai6ZZAWRyRYjIcfNigKAGK4Di+Uhe
 nYxT0Yk8hNGwMhl6NnBp4mmCUNCwcbjT9uXdiYQxFYO/qqYR1388xJjeN3c362l3
 AaLrE5bX5sqa6TAkTeRPjeIqxlyGT7jnCrN7I1hMhDvbc3ITF3AMfYFMjnmAQgr+
 mTWGS1QogqqkloODbR1DKD1CAWOlpK+0HibhNF+lz71P0HlwVvy+HPXso505Wf0B
 dMwlSrZ1DnqNVF/y5IhMEMslahKajbjbFVhBjmrGl/8T821etCxxgB20c0vyFRy8
 qTyJGwBZaEo0VWr70unSmq45TRoeQvdHAw/e+GtilR0ci80q2ly4gbObnw7L8le+
 gqZo4IWmrwp2sbPepE57sYKQpEndwbRayf/kcFd0LPPpeINu9ZooXkYX0pOo6Cdg
 vDKMaEB1/fmPhjSlknxkKN9LZdR+nDw8162S1CKsUdWanAOjmP8haN19aoHhIekZ
 q+r2qUq/U827yNy9/qbInmsoFYDz9s6sAOE63jibd5rZZ9Anei6NOSgLzA4CqCR1
 +d0+TXp19gP9mLMFs7/ZclwkXCz47OQYhXYphjI3wM9x+xbdRcI4n+DOH5u5coKx
 AsA6+2n0GF4Y
 =GaoH
 -----END PGP SIGNATURE-----

Merge tag 'pull-misc-2025-04-24' of https://repo.or.cz/qemu/armbru into staging

Miscellaneous patches for 2025-04-24

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmgJ7dYSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTiZIP/1PFAg/s3SoiLQwH/ZrjyUkm1kiKnjOH
# CC5Stw6I9tuYnDAhASAdSymofLv0NNydNe5ai6ZZAWRyRYjIcfNigKAGK4Di+Uhe
# nYxT0Yk8hNGwMhl6NnBp4mmCUNCwcbjT9uXdiYQxFYO/qqYR1388xJjeN3c362l3
# AaLrE5bX5sqa6TAkTeRPjeIqxlyGT7jnCrN7I1hMhDvbc3ITF3AMfYFMjnmAQgr+
# mTWGS1QogqqkloODbR1DKD1CAWOlpK+0HibhNF+lz71P0HlwVvy+HPXso505Wf0B
# dMwlSrZ1DnqNVF/y5IhMEMslahKajbjbFVhBjmrGl/8T821etCxxgB20c0vyFRy8
# qTyJGwBZaEo0VWr70unSmq45TRoeQvdHAw/e+GtilR0ci80q2ly4gbObnw7L8le+
# gqZo4IWmrwp2sbPepE57sYKQpEndwbRayf/kcFd0LPPpeINu9ZooXkYX0pOo6Cdg
# vDKMaEB1/fmPhjSlknxkKN9LZdR+nDw8162S1CKsUdWanAOjmP8haN19aoHhIekZ
# q+r2qUq/U827yNy9/qbInmsoFYDz9s6sAOE63jibd5rZZ9Anei6NOSgLzA4CqCR1
# +d0+TXp19gP9mLMFs7/ZclwkXCz47OQYhXYphjI3wM9x+xbdRcI4n+DOH5u5coKx
# AsA6+2n0GF4Y
# =GaoH
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 24 Apr 2025 03:52:54 EDT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-misc-2025-04-24' of https://repo.or.cz/qemu/armbru:
  cleanup: Drop pointless label at end of function
  cleanup: Drop pointless return at end of function
  cleanup: Re-run return_directly.cocci

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-04-24 13:44:57 -04:00
Markus Armbruster
8a2b516ba2 cleanup: Drop pointless return at end of function
A few functions now end with a label.  The next commit will clean them
up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250407082643.2310002-3-armbru@redhat.com>
[Straightforward conflict with commit 988ad4cceb (hw/loongarch/virt:
Fix cpuslot::cpu set at last in virt_cpu_plug()) resolved]
2025-04-24 09:33:42 +02:00
Richard Henderson
161f5bc8e9 include/exec: Split out icount.h
Split icount stuff from system/cpu-timers.h.
There are 17 files which only require icount.h, 7 that only
require cpu-timers.h, and 7 that require both.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-04-23 14:08:44 -07:00
Richard Henderson
8be545ba5a include/system: Move exec/memory.h to system/memory.h
Convert the existing includes with

  sed -i ,exec/memory.h,system/memory.h,g

Move the include within cpu-all.h into a !CONFIG_USER_ONLY block.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-04-23 14:08:21 -07:00
Joe Komlodi
e6c38d2ab5 util/cacheflush: Make first DSB unconditional on aarch64
On ARM hosts with CTR_EL0.DIC and CTR_EL0.IDC set, this would only cause
an ISB to be executed during cache maintenance, which could lead to QEMU
executing TBs containing garbage instructions.

This seems to be because the ISB finishes executing instructions and
flushes the pipeline, but the ISB doesn't guarantee that writes from the
executed instructions are committed. If a small enough TB is created, it's
possible that the writes setting up the TB aren't committed by the time the
TB is executed.

This function is intended to be a port of the gcc implementation
(85b46d0795/libgcc/config/aarch64/sync-cache.c (L67))
which makes the first DSB unconditional, so we can fix the synchronization
issue by doing that as well.

Cc: qemu-stable@nongnu.org
Fixes: 664a79735e ("util: Specialize flush_idcache_range for aarch64")
Signed-off-by: Joe Komlodi <komlodi@google.com>
Message-id: 20250310203622.1827940-2-komlodi@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2025-03-14 12:54:33 +00:00
Kevin Wolf
f76d3bee75 aio-posix: Adjust polling time also for new handlers
aio_dispatch_handler() adds handlers to ctx->poll_aio_handlers if
polling should be enabled. If we call adjust_polling_time() for all
polling handlers before this, new polling handlers are still left at
poll->ns = 0 and polling is only actually enabled after the next event.
Move the adjust_polling_time() call after aio_dispatch_handler().

This fixes test-nested-aio-poll, which expects that polling becomes
effective the first time around.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311141912.135657-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Kevin Wolf
ee416407b3 aio-posix: Separate AioPolledEvent per AioHandler
Adaptive polling has a big problem: It doesn't consider that an event
loop can wait for many different events that may have very different
typical latencies.

For example, think of a guest that tends to send a new I/O request soon
after the previous I/O request completes, but the storage on the host is
rather slow. In this case, getting the new request from guest quickly
means that polling is enabled, but the next thing is performing the I/O
request on the backend, which is slow and disables polling again for the
next guest request. This means that in such a scenario, polling could
help for every other event, but is only ever enabled when it can't
succeed.

In order to fix this, keep a separate AioPolledEvent for each
AioHandler. We will then know that the backend file descriptor always
has a high latency and isn't worth polling for, but we also know that
the guest is always fast and we should poll for it. This solves at least
half of the problem, we can now keep polling for those cases where it
makes sense and get the improved performance from it.

Since the event loop doesn't know which event will be next, we still do
some unnecessary polling while we're waiting for the slow disk. I made
some attempts to be more clever than just randomly growing and shrinking
the polling time, and even to let callers be explicit about when they
expect a new event, but so far this hasn't resulted in improved
performance or even caused performance regressions. For now, let's just
fix the part that is easy enough to fix, we can revisit the rest later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250307221634.71951-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Kevin Wolf
cf2e226fc6 aio-posix: Factor out adjust_polling_time()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250307221634.71951-5-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Kevin Wolf
518db1013c aio: Create AioPolledEvent
As a preparation for having multiple adaptive polling states per
AioContext, move the 'ns' field into a separate struct.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250307221634.71951-4-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Akihiko Odaki
9dc64bd5a4 util/iov: Do not assert offset is in iov
iov_from_buf(), iov_to_buf(), iov_memset(), and iov_copy() asserts
that the given offset fits in the iov while tolerating the specified
number of bytes to operate with to be greater than the size of iov.
This is inconsistent so remove the assertions.

Asserting the offset fits in the iov makes sense if it is expected that
there are other operations that process the content before the offset
and the content is processed in order. Under this expectation, the
offset should point to the end of bytes that are previously processed
and fit in the iov. However, this expectation depends on the details of
the caller, and did not hold true at least one case and required code to
check iov_size(), which is added with commit 83ddb3dbba
("hw/net/net_tx_pkt: Fix overrun in update_sctp_checksum()").

Adding such a check is inefficient and error-prone. These functions
already tolerate the specified number of bytes to operate with to be
greater than the size of iov to avoid such checks so remove the
assertions to tolerate invalid offset as well. They return the number of
bytes they operated with so their callers can still check the returned
value to ensure there are sufficient space at the given offset.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-03-10 17:07:16 +08:00
Peter Maydell
02ae315467 util/qemu-timer.c: Don't warp timer from timerlist_rearm()
Currently we call icount_start_warp_timer() from timerlist_rearm().
This produces incorrect behaviour, because timerlist_rearm() is
called, for instance, when a timer callback modifies its timer.  We
cannot decide here to warp the timer forwards to the next timer
deadline merely because all_cpu_threads_idle() is true, because the
timer callback we were called from (or some other callback later in
the list of callbacks being invoked) may be about to raise a CPU
interrupt and move a CPU from idle to ready.

The only valid place to choose to warp the timer forward is from the
main loop, when we know we have no outstanding IO or timer callbacks
that might be about to wake up a CPU.

For Arm guests, this bug was mostly latent until the refactoring
commit f6fc36deef ("target/arm/helper: Implement
CNTHCTL_EL2.CNT[VP]MASK"), which exposed it because it refactored a
timer callback so that it happened to call timer_mod() first and
raise the interrupt second, when it had previously raised the
interrupt first and called timer_mod() afterwards.

This call seems to have originally derived from the
pre-record-and-replay icount code, which (as of e.g.  commit
db1a49726c in 2010) in this location did a call to
qemu_notify_event(), necessary to get the icount code in the vCPU
round-robin thread to stop and recalculate the icount deadline when a
timer was reprogrammed from the IO thread.  In current QEMU,
everything is done on the vCPU thread when we are in icount mode, so
there's no need to try to notify another thread here.

I suspect that the other reason why this call was doing icount timer
warping is that it pre-dates commit efab87cf79 from 2015, which
added a call to icount_start_warp_timer() to main_loop_wait().  Once
the call in timerlist_rearm() has been removed, if the timer
callbacks don't cause any CPU to be woken up then we will end up
calling icount_start_warp_timer() from main_loop_wait() when the rr
main loop code calls rr_wait_io_event().

Remove the incorrect call from timerlist_rearm().

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2703
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20250210135804.3526943-1-peter.maydell@linaro.org
2025-03-07 10:36:14 +00:00
Stefan Hajnoczi
98c7362b1e Generic CPUs / accelerators patch queue
- Merge "qemu/clang-tsa.h" within "qemu/compiler.h"
 - Various cleanups around accelerators initialization code
   (better user/system split)
 - Various trivial cleanups in accel/tcg/,
   Guard few TCG calls with tcg_enabled()
 - Explicit disassemble_info endianness
 - Improve dual-endianness support for MicroBlaze
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmfJw08ACgkQ4+MsLN6t
 wN70whAAtfcdWtqseFfb6fvDtjflgxN51Ui0iaOECXUA18USKriGy34eBcMYMiM2
 +eKgU7+jI6JGE4+burcgWUsPpFFF951/A8+lyIbFgO5yToTDmC+qNe4XfmMAIyXq
 uf9Obr2c0Xk9luh4odb+jPAQodw/7G1fKgcCVIJNDCl/xEcPhS9eNpTaHwcVnkWI
 K6KrxWXOsqG6+evJBPWYoXtOOyt0+JcwAsJoGhprwtGm3P9+jSVXsgeGsJVyZcna
 f32JtjWL754O8XeMkOn4x6rt58VrCIMKI9xT7keDyuhTCq0Zki9RO2nMU2dSw5mN
 AfL9hxqUy0Nijnyslg3ugujDfTePsNyLdwwH7n0mnoD72ELi6WnhDsmOThuEB3Rd
 4/kdwTJfA/rlWk/GF1tbKW7AvQZokRARtzmL3V0HmGJu57lX+2JuszEdYBkqDEP7
 GH1I10B2yANUm+C9y3X8qWOU7Ws433ebJeJoZuyfnbZ9Me+UfRmql/oS+V8ata2i
 fArEItpldUFrWRyYLkTbXrh2dgyV9yJTEir/lzOzeAZZzyabTbjf2z9qnh976GGO
 1QnDy5QA4f54kDBUZe7JK26TZsHPch7cgqXW6f8tRlJF7A9hxGK8d2TUV/lC3/vx
 LUOlWNu03PhiruYmZEcWOsY3Jt9jRCF6lIryrnaJsqnVOVmMUMM=
 =3TRh
 -----END PGP SIGNATURE-----

Merge tag 'accel-cpus-20250306' of https://github.com/philmd/qemu into staging

Generic CPUs / accelerators patch queue

- Merge "qemu/clang-tsa.h" within "qemu/compiler.h"
- Various cleanups around accelerators initialization code
  (better user/system split)
- Various trivial cleanups in accel/tcg/,
  Guard few TCG calls with tcg_enabled()
- Explicit disassemble_info endianness
- Improve dual-endianness support for MicroBlaze

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmfJw08ACgkQ4+MsLN6t
# wN70whAAtfcdWtqseFfb6fvDtjflgxN51Ui0iaOECXUA18USKriGy34eBcMYMiM2
# +eKgU7+jI6JGE4+burcgWUsPpFFF951/A8+lyIbFgO5yToTDmC+qNe4XfmMAIyXq
# uf9Obr2c0Xk9luh4odb+jPAQodw/7G1fKgcCVIJNDCl/xEcPhS9eNpTaHwcVnkWI
# K6KrxWXOsqG6+evJBPWYoXtOOyt0+JcwAsJoGhprwtGm3P9+jSVXsgeGsJVyZcna
# f32JtjWL754O8XeMkOn4x6rt58VrCIMKI9xT7keDyuhTCq0Zki9RO2nMU2dSw5mN
# AfL9hxqUy0Nijnyslg3ugujDfTePsNyLdwwH7n0mnoD72ELi6WnhDsmOThuEB3Rd
# 4/kdwTJfA/rlWk/GF1tbKW7AvQZokRARtzmL3V0HmGJu57lX+2JuszEdYBkqDEP7
# GH1I10B2yANUm+C9y3X8qWOU7Ws433ebJeJoZuyfnbZ9Me+UfRmql/oS+V8ata2i
# fArEItpldUFrWRyYLkTbXrh2dgyV9yJTEir/lzOzeAZZzyabTbjf2z9qnh976GGO
# 1QnDy5QA4f54kDBUZe7JK26TZsHPch7cgqXW6f8tRlJF7A9hxGK8d2TUV/lC3/vx
# LUOlWNu03PhiruYmZEcWOsY3Jt9jRCF6lIryrnaJsqnVOVmMUMM=
# =3TRh
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 06 Mar 2025 23:46:23 HKT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'accel-cpus-20250306' of https://github.com/philmd/qemu: (54 commits)
  include: Poison TARGET_PHYS_ADDR_SPACE_BITS definition
  system: Open-code qemu_init_arch_modules() using target_name()
  target/i386: Mark WHPX APIC region as little-endian
  target/alpha: Do not mix exception flags and FPCR bits
  target/riscv: Convert misa_mxl_max using GLib macros
  target/riscv: Declare RISCVCPUClass::misa_mxl_max as RISCVMXL
  target/xtensa: Finalize config in xtensa_register_core()
  target/sparc: Constify SPARCCPUClass::cpu_def
  target/i386: Constify X86CPUModel uses
  disas: Remove target_words_bigendian() call in initialize_debug_target()
  target/xtensa: Set disassemble_info::endian value in disas_set_info()
  target/sh4: Set disassemble_info::endian value in disas_set_info()
  target/riscv: Set disassemble_info::endian value in disas_set_info()
  target/ppc: Set disassemble_info::endian value in disas_set_info()
  target/mips: Set disassemble_info::endian value in disas_set_info()
  target/microblaze: Set disassemble_info::endian value in disas_set_info
  target/arm: Set disassemble_info::endian value in disas_set_info()
  target: Set disassemble_info::endian value for big-endian targets
  target: Set disassemble_info::endian value for little-endian targets
  target/mips: Fix possible MSA int overflow
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-07 07:39:49 +08:00
Philippe Mathieu-Daudé
82c4d8a3b4 qemu/compiler: Absorb 'clang-tsa.h'
We already have "qemu/compiler.h" for compiler-specific arrangements,
automatically included by "qemu/osdep.h" for each source file. No
need to explicitly include a header for a Clang particularity.

Suggested-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250117170201.91182-1-philmd@linaro.org>
2025-03-06 14:21:25 +01:00
Maciej S. Szmigiero
b5aa74968b thread-pool: Implement generic (non-AIO) pool support
Migration code wants to manage device data sending threads in one place.

QEMU has an existing thread pool implementation, however it is limited
to queuing AIO operations only and essentially has a 1:1 mapping between
the current AioContext and the AIO ThreadPool in use.

Implement generic (non-AIO) ThreadPool by essentially wrapping Glib's
GThreadPool.

This brings a few new operations on a pool:
* thread_pool_wait() operation waits until all the submitted work requests
have finished.

* thread_pool_set_max_threads() explicitly sets the maximum thread count
in the pool.

* thread_pool_adjust_max_threads_to_work() adjusts the maximum thread count
in the pool to equal the number of still waiting in queue or unfinished work.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/b1efaebdbea7cb7068b8fb74148777012383e12b.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-03-06 06:47:33 +01:00