Commit graph

2578 commits

Author SHA1 Message Date
Marco Cavenati
04a191cb36 migration: add FEATURE_SEEKABLE to QIOChannelBlock
Enable the use of the mapped-ram migration feature with savevm/loadvm
snapshots by adding the QIO_CHANNEL_FEATURE_SEEKABLE feature to
QIOChannelBlock. Implement io_preadv and io_pwritev methods to provide
positioned I/O capabilities that don't modify the channel's position
pointer.

Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
Link: https://lore.kernel.org/r/20251010115954.1995298-2-Marco.Cavenati@eurecom.fr
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-11-03 16:04:09 -05:00
Marco Cavenati
e5423828d6 migration/ram: fix docs of ram_handle_zero
Remove outdated 'ch' parameter from the function documentation.

Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20251001161823.2032399-3-Marco.Cavenati@eurecom.fr
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-11-03 16:04:09 -05:00
Fabiano Rosas
112b55f0b0 migration/savevm: Add a compatibility check for capabilities
It has always been possible to enable arbitrary migration capabilities
and attempt to take a snapshot of the VM with the savevm/loadvm
commands as well as their QMP counterparts
snapshot-save/snapshot-load.

Most migration capabilities are not meant to be used with snapshots
and there's a risk of crashing QEMU or producing incorrect
behavior. Ideally, every migration capability would either be
implemented for savevm or explicitly rejected.

Add a compatibility check routine and reject the snapshot command if
an incompatible capability is enabled. For now only act on the the two
that actually cause a crash: multifd and mapped-ram.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2881
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20251007184213.5990-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-11-03 16:04:09 -05:00
Philippe Mathieu-Daudé
4db362f68c system/physmem: Extract API out of 'system/ram_addr.h' header
Very few files use the Physical Memory API. Declare its
methods in their own header: "system/physmem.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20251001175448.18933-19-philmd@linaro.org>
2025-10-07 05:03:56 +02:00
Philippe Mathieu-Daudé
aa60bdb700 system/physmem: Drop 'cpu_' prefix in Physical Memory API
The functions related to the Physical Memory API declared
in "system/ram_addr.h" do not operate on vCPU. Remove the
'cpu_' prefix.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20251001175448.18933-18-philmd@linaro.org>
2025-10-07 05:03:56 +02:00
Philippe Mathieu-Daudé
8bf3a88308 system/physmem: Reduce cpu_physical_memory_sync_dirty_bitmap() scope
cpu_physical_memory_sync_dirty_bitmap() is now only called within
system/physmem.c, by ramblock_sync_dirty_bitmap(). Reduce its scope
by making it internal to this file. Since it doesn't involve any CPU,
remove the 'cpu_' prefix.
Remove the now unneeded "qemu/rcu.h" and "system/memory.h" headers.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20251001175448.18933-17-philmd@linaro.org>
2025-10-07 05:03:56 +02:00
Philippe Mathieu-Daudé
34f9b0ad08 system/ramblock: Move ram_block_is_pmem() declaration
Move ramblock_is_pmem() along with the RAM Block API
exposed by the "system/ramblock.h" header. Rename as
ram_block_is_pmem() to keep API prefix consistency.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Peter Xu <peterx@redhat.com>
Message-Id: <20251002032812.26069-3-philmd@linaro.org>
2025-10-07 03:37:03 +02:00
Steve Sistare
a3eae205c6 migration: cpr-exec mode
Add the cpr-exec migration mode.  Usage:
  qemu-system-$arch -machine aux-ram-share=on ...
  migrate_set_parameter mode cpr-exec
  migrate_set_parameter cpr-exec-command \
    <arg1> <arg2> ... -incoming <uri-1> \
  migrate -d <uri-1>

The migrate command stops the VM, saves state to uri-1,
directly exec's a new version of QEMU on the same host,
replacing the original process while retaining its PID, and
loads state from uri-1.  Guest RAM is preserved in place,
albeit with new virtual addresses.

The new QEMU process is started by exec'ing the command
specified by the @cpr-exec-command parameter.  The first word of
the command is the binary, and the remaining words are its
arguments.  The command may be a direct invocation of new QEMU,
or may be a non-QEMU command that exec's the new QEMU binary.

This mode creates a second migration channel that is not visible
to the user.  At the start of migration, old QEMU saves CPR state
to the second channel, and at the end of migration, it tells the
main loop to call cpr_exec.  New QEMU loads CPR state early, before
objects are created.

Because old QEMU terminates when new QEMU starts, one cannot
stream data between the two, so uri-1 must be a type,
such as a file, that accepts all data before old QEMU exits.
Otherwise, old QEMU may quietly block writing to the channel.

Memory-backend objects must have the share=on attribute, but
memory-backend-epc is not supported.  The VM must be started with
the '-machine aux-ram-share=on' option, which allows anonymous
memory to be transferred in place to the new process.  The memfds
are kept open across exec by clearing the close-on-exec flag, their
values are saved in CPR state, and they are mmap'd in new QEMU.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Steve Sistare
efc6587313 migration: cpr-exec save and load
To preserve CPR state across exec, create a QEMUFile based on a memfd, and
keep the memfd open across exec.  Save the value of the memfd in an
environment variable so post-exec QEMU can find it.

These new functions are called in a subsequent patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1759332851-370353-6-git-send-email-steven.sistare@oracle.com
[peterx: fix build for Windows]
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Steve Sistare
f57ff59f1e migration: cpr-exec-command parameter
Create the cpr-exec-command migration parameter, defined as a list of
strings.  It will be used for cpr-exec migration mode in a subsequent
patch, and contains forward references to cpr-exec mode in the qapi
doc.

No functional change, except that cpr-exec-command is shown by the
'info migrate' command.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Steve Sistare
a9f9eee58b migration: add cpr_walk_fd
Add a helper to walk all CPR fd's and run a callback for each.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Steve Sistare
dc79c7d5e1 migration: multi-mode notifier
Allow a notifier to be added for multiple migration modes.
To allow a notifier to appear on multiple per-node lists, use
a generic list type.  We can no longer use NotifierWithReturnList,
because it shoe horns the notifier onto a single list.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1759332851-370353-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Daniel P. Berrangé
a5bc1ccca9 migration: simplify error reporting after channel read
The code handling the return value of qio_channel_read proceses
len == 0 (EOF) separately from len < 1  (error), but in both
cases ends up calling qemu_file_set_error_obj() with -EIO as the
errno. This logic can be merged into one codepath to simplify it.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Link: https://lore.kernel.org/r/20250801170212.54409-2-berrange@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Juraj Marcin
725a9e5f78 migration: Fix state transition in postcopy_start() error handling
Commit 4881411136 ("migration: Always set DEVICE state") introduced
DEVICE state to postcopy, which moved the actual state transition that
leads to POSTCOPY_ACTIVE.

However, the error handling part of the postcopy_start() function still
expects the state POSTCOPY_ACTIVE, but depending on where an error
happens, now the state can be either ACTIVE, DEVICE or CANCELLING, but
never POSTCOPY_ACTIVE, as this transition now happens just before a
successful return from the function.

Instead, accept any state except CANCELLING when transitioning to FAILED
state.

Cc: qemu-stable@nongnu.org
Fixes: 4881411136 ("migration: Always set DEVICE state")
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250826115145.871272-1-jmarcin@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Peter Xu
82f038d596 migration/multifd/tls: Cleanup BYE message processing on sender side
This patch is a trivial cleanup to the BYE messages on the multifd sender
side.  It could also be a fix, but since we do not have a solid clue,
taking this as a cleanup only.

One trivial concern is, migration_tls_channel_end() might be unsafe to be
invoked in the migration thread if migration is not successful, because
when failed / cancelled we do not know whether the multifd sender threads
can be writting to the channels, while GnuTLS library (when it's a TLS
channel) logically doesn't support concurrent writes.

When at it, cleanup on a few things.  What changed:

  - Introduce a helper to do graceful shutdowns with rich comment, hiding
    the details

  - Only send bye() iff migration succeeded, skip if it failed / cancelled

  - Detect TLS channel using channel type rather than thread created flags

  - Move the loop into the existing one that will close the channels, but
    do graceful shutdowns before channel shutdowns

  - local_err seems to have been leaked if set, fix it along the way

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250925201601.290546-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Bin Guo
2aae717122 migration: HMP: Adjust the order of output fields
Adjust the positions of 'tls-authz' and 'max-postcopy-bandwidth' in
the fields output by the 'info migrate_parameters' command so that
related fields are next to each other.

For clarity only, no functional changes.

Sample output after this commit:
(qemu) info migrate_parameters
...
max-cpu-throttle: 99
tls-creds: ''
tls-hostname: ''
tls-authz: ''
max-bandwidth: 134217728 bytes/second
avail-switchover-bandwidth: 0 bytes/second
max-postcopy-bandwidth: 0 bytes/second
downtime-limit: 300 ms
...

Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250929021213.28369-1-guobin@linux.alibaba.com
[peterx: move postcopy-bw before avail-switchover-bw]
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Peter Xu
dc487044d5 migration: Make migration_has_failed() work even for CANCELLING
No issue I hit, the change is only from code observation when I am looking
at a TLS premature termination issue.

We set CANCELLED very late, it means migration_has_failed() may not work
correctly if it's invoked before updating CANCELLING to CANCELLED.

Allow that state will make migration_has_failed() working as expected even
if it's invoked slightly earlier.

One current user is the multifd code for the TLS graceful termination,
where it's before updating to CANCELLED.

Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250918203937.200833-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
40de712a89 migration: Add error-parameterized function variants in VMSD struct
- We need to have good error reporting in the callbacks in
  VMStateDescription struct. Specifically pre_save, pre_load
  and post_load callbacks.
- It is not possible to change these functions everywhere in one
  patch, therefore, we introduce a duplicate set of callbacks
  with Error object passed to them.
- So, in this commit, we implement 'errp' variants of these callbacks,
  introducing an explicit Error object parameter.
- This is a functional step towards transitioning the entire codebase
  to the new error-parameterized functions.
- Deliberately called in mutual exclusion from their counterparts,
  to prevent conflicts during the transition.
- New impls should preferentally use 'errp' variants of
  these methods, and existing impls incrementally converted.
  The variants without 'errp' are intended to be removed
  once all usage is converted.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-26-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
6f9fc6f501 migration: Remove error variant of vmstate_save_state() function
This commit removes the redundant vmstate_save_state_with_err()
function.

Previously, commit 969298f9d7 introduced vmstate_save_state_with_err()
to handle error propagation, while vmstate_save_state() existed for
non-error scenarios.
This is because there were code paths where vmstate_save_state_v()
(called internally by vmstate_save_state) did not explicitly set
errors on failure.

This change unifies error handling by
 - updating vmstate_save_state() to accept an Error **errp argument.
 - vmstate_save_state_v() ensures errors are set directly within the errp
   object, eliminating the need for two separate functions.

All calls to vmstate_save_state_with_err() are replaced with
vmstate_save_state(). This simplifies the API and improves code
maintainability.

vmstate_save_state() that only calls vmstate_save_state_v(),
by inference, also has errors set in errp in case of failure.
The errors are reported using error_report_err().
If we want the function to exit on error, then &error_fatal is
passed.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-24-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
94272d9b45 migration: Capture error in postcopy_ram_listen_thread()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
postcopy_ram_listen_thread() calls qemu_loadvm_state_main()
to load the vm, and in case of a failure, it should set the error
in the migration object.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-23-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
aa77746602 migration: push Error **errp into loadvm_postcopy_handle_switchover_start()
This is an incremental step in converting vmstate loading code to report
error via Error objects instead of directly printing it to console/monitor.
It is ensured that loadvm_postcopy_handle_switchover_start() must report
an error in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-22-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
d865e4aabd migration: push Error **errp into loadvm_process_enable_colo()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_process_enable_colo() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-21-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
d9d7c8d813 migration: Return -1 on memory allocation failure in ram.c
The function colo_init_ram_cache() currently returns -errno if
qemu_anon_ram_alloc() fails. However, the subsequent cleanup loop that
calls qemu_anon_ram_free() could potentially alter the value of errno.
This would cause the function to return a value that does not accurately
represent the original allocation failure.

This commit changes the return value to -1 on memory allocation failure.
This ensures that the return value is consistent and is not affected by
any errno changes that may occur during the free process.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-20-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
97c2fad858 migration: push Error **errp into loadvm_handle_recv_bitmap()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_handle_recv_bitmap() must report an error
in errp, in case of failure.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-19-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
fd487d7fc7 migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_postcopy_ram_handle_discard() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-18-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
ff6ae44e00 migration: push Error **errp into loadvm_postcopy_handle_run()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_postcopy_handle_run() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-17-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
208e522766 migration: push Error **errp into loadvm_postcopy_handle_listen()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_postcopy_handle_listen() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-16-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
34a94fe28d migration: push Error **errp into loadvm_postcopy_handle_advise()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_postcopy_handle_advise() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-15-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
44cdbaa98e migration: push Error **errp into ram_postcopy_incoming_init()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that ram_postcopy_incoming_init() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-14-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
3cd7a260e5 migration: make loadvm_postcopy_handle_resume() void
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.

Use warn_report() instead of error_report(); it ensures that
a resume command received while the migration is not
in postcopy recover state is not fatal. It only informs that
the command received is unusual, and therefore we should not set
errp with the error string.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-13-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
3f9d6e77b0 migration: Update qemu_file_get_return_path() docs and remove dead checks
The documentation of qemu_file_get_return_path() states that it can
return NULL on failure. However, a review of the current implementation
reveals that it is guaranteed that it will always succeed and will never
return NULL.

As a result, the NULL checks post calling the function become redundant.
This commit updates the documentation for the function and removes all
NULL checks throughout the migration code.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-12-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
8b6dad124b migration: push Error **errp into qemu_loadvm_section_part_end()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that qemu_loadvm_section_part_end() must report an error
in errp, in case of failure.
This patch also removes the setting of errp when errp is NULL in the
out section as it is no longer required in the series.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-11-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
4493752653 migration: push Error **errp into qemu_loadvm_section_start_full()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that qemu_loadvm_section_start_full() must report an error
in errp, in case of failure.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-10-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
deecb4e713 migration: push Error **errp into qemu_loadvm_state_main()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that qemu_loadvm_state_main() must report an error
in errp, in case of failure.

Set errp explicitly if it is NULL in case of failure in the out
section. This will be removed in the subsequent patch when all of
the calls are converted to passing errp.

The error message in the default case of qemu_loadvm_state_main()
has the word "savevm". This is removed because it can confuse the
user while reading destination side error logs.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-9-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
84279d5dc1 migration: push Error **errp into qemu_load_device_state()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that qemu_load_device_state() must report an error
in errp, in case of failure.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-8-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
9535435795 migration: push Error **errp into qemu_loadvm_state()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that qemu_loadvm_state() must report an error
in errp, in case of failure.

When postcopy live migration runs, the device states are loaded by
both the qemu coroutine process_incoming_migration_co() and the
postcopy_ram_listen_thread(). Therefore, it is important that the
coroutine also reports the error in case of failure, with
error_report_err(). Otherwise, the source qemu will not display
any errors before going into the postcopy pause state.

Suggested-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-7-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
829f0d238d migration: push Error **errp into loadvm_handle_cmd_packaged()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_handle_cmd_packaged() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-6-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
5fb0cfbb35 migration: push Error **errp into loadvm_process_command()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_process_command() must report an error
in errp, in case of failure.

The errors are temporarily reported using error_report_err().
This is removed in the subsequent patches in this series
when we are actually able to propagate the error to the calling
function.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-5-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
e97da6b583 migration: push Error **errp into vmstate_load()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that vmstate_load() must report an error
in errp, in case of failure.

The errors are temporarily reported using error_report_err().
This is removed in the subsequent patches in this series
when we are actually able to propagate the error to the calling
function.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-4-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:01 -04:00
Arun Menon
71cf63a471 migration: push Error **errp into qemu_loadvm_state_header()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that qemu_loadvm_state_header() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-3-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:01 -04:00
Arun Menon
c632ffbd74 migration: push Error **errp into vmstate_load_state()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that vmstate_load_state() must report an error
in errp, in case of failure.

The errors are temporarily reported using error_report_err().
This is removed in the subsequent patches in this series,
when we are actually able to propagate the error to the calling
function using errp. Whereas, if we want the function to exit on
error, then error_fatal is passed.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-2-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:01 -04:00
Arun Menon
73b42fc58d migration: push Error **errp into vmstate_subsection_load()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that vmstate_subsection_load() must report an error
in errp, in case of failure.

The errors are temporarily reported using error_report_err().
This is removed in the subsequent patches in this series,
when we are actually able to propagate the error to the calling
function using errp.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-1-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:01 -04:00
Markus Armbruster
897071bb27 migration/cpr: Clean up error reporting in cpr_resave_fd()
qapi/error.h advises:

 * Please don't error_setg(&error_fatal, ...), use error_report() and
 * exit(), because that's more obvious.

Do that, and replace exit() by g_assert_not_reached(), since this is
actually a programming error.

Cc: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Message-ID: <20250923091000.3180122-5-armbru@redhat.com>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
2025-09-30 14:43:53 +02:00
Vladimir Sementsov-Ogievskiy
fe6a74f365 migration: qemu_file_set_blocking(): add errp parameter
qemu_file_set_blocking() is a wrapper on qio_channel_set_blocking(),
so let's passthrough the errp.

Note the migration should not be using &error_abort in these calls,
however, this is done to expedite the API conversion.

The original code would have eventually ended up calling either
qemu_socket_set_nonblock which would asset on Linux, or
g_unix_set_fd_nonblocking which would propagate errors. We never
saw asserts in practice, and conceptually they should not happen,
but ideally this code will be later adapted to remove use of
&error_abort.

Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:07 +01:00
Vladimir Sementsov-Ogievskiy
7bc2cbe330 migration/qemu-file: don't make incoming fds blocking again
In migration we want to pass fd "as is", not changing its
blocking status.

The only current user of these fds is CPR state (through VMSTATE_FD),
which of-course doesn't want to modify fds on target when source is
still running and use these fds.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2025-09-19 12:46:06 +01:00
Pierrick Bouvier
01b6fb3705 migration/vfio: compile only once
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250730220435.1139101-3-pierrick.bouvier@linaro.org>
[PMD: Cover vfio-stub.c in MAINTAINERS]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-09-02 17:57:05 +02:00
Pierrick Bouvier
7ef54b7bf3 migration: compile migration/ram.c once
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250730220435.1139101-2-pierrick.bouvier@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-09-02 17:57:05 +02:00
Pierrick Bouvier
b496a392fe migration: rename target.c to vfio.c
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250725201729.17100-3-pierrick.bouvier@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-07-29 13:56:39 +02:00
Daniel P. Berrangé
eb3618e9e2 migration: activate TLS thread safety workaround
When either the postcopy or return path capabilities are
enabled, the migration code will use the primary channel
for bidirectional I/O.

If either of those capabilities are enabled, the migration
code needs to mark the channel as expecting concurrent I/O
in order to activate the thread safety workarounds for
GNUTLS bug 1717

Closes: https://gitlab.com/qemu-project/qemu/-/issues/1937
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/qemu-devel/20250718150514.2635338-4-berrange@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-07-22 19:39:30 -03:00
Daniel P. Berrangé
eaec556bc8 migration: show error message when postcopy fails
The 'info migrate' command only shows the error message when the
migration state is 'failed'. When postcopy is used, however,
the 'postcopy-paused' state is used instead of 'failed', so we
must show the error message there too.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/qemu-devel/20250721133913.2914669-1-berrange@redhat.com
[line break to satisfy checkpatch]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-07-22 19:39:29 -03:00