Peter Xu 8597af7615 migration/block: Rewrite disk activation
This patch proposes a flag to maintain disk activation status globally.  It
mostly rewrites disk activation mgmt for QEMU, including COLO and QMP
command xen_save_devices_state.

Background
==========

We have two problems with disk activation: one already resolved, one not.

Problem 1: disk activation recovery (for switchover interruptions)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When migration is cancelled or fails during switchover, especially after
the disks have been inactivated, QEMU needs to remember to reactivate the
disks before the VM starts.

This used to be done separately in two paths: one in qmp_migrate_cancel(),
the other in the failure path of migration_completion().

It was fixed in different commits, all over the place in QEMU.  These are
the relevant changes I saw; I'm not sure if the list is complete:

 - In 2016, commit fe904ea824 ("migration: regain control of images when
   migration fails to complete")

 - In 2017, commit 1d2acc3162 ("migration: re-active images while migration
   been canceled after inactive them")

 - In 2023, commit 6dab4c93ec ("migration: Attempt disk reactivation in
   more failure scenarios")

Now that we have a slightly better picture, maybe we can unify the
reactivation in a single path.

One side benefit of doing so is that we can move the disk operation outside
the QMP command "migrate_cancel".  It's possible that in the future we may
want to make "migrate_cancel" OOB-compatible, which requires that the
command not need the BQL in the first place.  This patch already achieves
that and makes the migrate_cancel command lightweight.

Problem 2: disk invalidation on top of invalidated disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is an unresolved bug in current QEMU; see the link in "Resolves:" at
the end.  It turns out that besides the src switchover phase (problem 1
above), QEMU also needs to remember the block activation status on the
destination.

Consider two consecutive migrations in a row, where the VM is paused the
whole time.  In that scenario, the disks are still not activated even after
the 1st migration has completed.  When the 2nd round starts, if QEMU doesn't
know the status of the disks, it has to try to inactivate the disks again.

The issue here is that the block layer API bdrv_inactivate_all() will crash
QEMU if invoked on already inactive disks, as happens for the 2nd
migration.  For details, see the bug link at the end.
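
The failure mode and the guard that avoids it can be sketched in a few lines
of C.  This is only an illustration under stated assumptions: the mock block
layer and every identifier below (mock_inactivate_all, sketch_block_active,
sketch_block_inactivate) are hypothetical stand-ins, not QEMU's real
functions, and the assert merely imitates the reported crash.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the block layer's view of the disks. */
static bool mock_disks_active = true;
static int mock_inactivate_calls;

/* Imitates the reported behaviour: inactivating inactive disks is fatal. */
static void mock_inactivate_all(void)
{
    assert(mock_disks_active);
    mock_disks_active = false;
    mock_inactivate_calls++;
}

/* Hypothetical status flag: true while this QEMU owns the disks. */
static bool sketch_block_active = true;

/* Only touch the block layer when the flag says the disks are active. */
static void sketch_block_inactivate(void)
{
    if (!sketch_block_active) {
        return; /* already inactive, so a 2nd migration is a no-op here */
    }
    mock_inactivate_all();
    sketch_block_active = false;
}
```

Calling sketch_block_inactivate() twice in a row reaches the mock block
layer only once, which is the property the 2nd migration needs.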

Implementation
==============

This patch proposes to maintain the disk activation status with a global
flag, so that we know:

  - If we inactivated the disks for migration, but the migration then got
    cancelled or failed, QEMU will know it should reactivate the disks.

  - On the incoming side, if the disks were never activated but then
    another migration is triggered, QEMU should be able to tell that
    inactivation is not needed for the 2nd migration.

We used to have disk_inactive, but it only solves the 1st issue, not the
2nd.  Also, it's maintained in completely separate paths, so it's extremely
hard to follow how the flag changes, how long the flag stays valid, and
when we will reactivate the disks.

Convert the existing disk_inactive flag into that global flag (also invert
its naming), and maintain the disk activation status for the whole
lifecycle of QEMU.  That includes the incoming QEMU.

Put both error cases of source migration (failure, cancellation) together
into migration_iteration_finish(), which is invoked in either scenario.  In
that respect QEMU should behave the same as before.  However, with the disk
activation status maintained globally, we not only clean up quite a few
temporary paths that tried to maintain the disk activation status (e.g. in
postcopy code), but also fix the crash for problem 2 in one shot.
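
As a rough sketch of what a single unified cleanup path could look like,
the following C fragment routes both error outcomes through one function.
The enum, the counter and all sketch_* names are hypothetical and chosen
for illustration; they are not the identifiers used in
migration_iteration_finish().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of final migration states. */
typedef enum {
    SKETCH_COMPLETED,
    SKETCH_FAILED,
    SKETCH_CANCELLED,
} SketchStatus;

/* Hypothetical source-side flag: true while the source owns the disks. */
static bool sketch_src_active = true;
static int sketch_reactivations;

/* Switchover hands the disks over, so the source gives them up. */
static void sketch_src_inactivate(void)
{
    sketch_src_active = false;
}

/* Reactivate only if the disks were actually inactivated before. */
static void sketch_src_activate(void)
{
    if (sketch_src_active) {
        return;
    }
    sketch_src_active = true;
    sketch_reactivations++;
}

/* One place that handles every way the migration can end. */
static void sketch_iteration_finish(SketchStatus status)
{
    switch (status) {
    case SKETCH_COMPLETED:
        break; /* the destination owns the disks now */
    case SKETCH_FAILED:
    case SKETCH_CANCELLED:
        sketch_src_activate(); /* same recovery for both error cases */
        break;
    }
}
```

Because the recovery is keyed on the flag rather than on the caller, a
cancel that arrives before the disks were inactivated does nothing, while a
cancel or failure after inactivation restores ownership exactly once.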

For a freshly started QEMU, the flag is initialized to TRUE, showing that
QEMU owns the disks by default.

For an incoming migrated QEMU, the flag is initialized to FALSE once and
for all, showing that the dest QEMU doesn't own the disks until switchover.
That is guaranteed by the "once" variable.
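
The initialization rules can be illustrated with a minimal C sketch.  It
assumes a static "once" guard inside a hypothetical incoming-setup hook;
the names sketch_disks_owned, sketch_incoming_setup and
sketch_switchover_activate are invented for this example and may not match
the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global flag; a freshly started QEMU owns its disks. */
static bool sketch_disks_owned = true;

/*
 * Hypothetical hook run when an incoming migration is set up.  The static
 * "once" guard lets only the first call clear the flag, so later calls
 * cannot overwrite a status that has since changed.
 */
static void sketch_incoming_setup(void)
{
    static bool once = true;

    if (once) {
        sketch_disks_owned = false;
        once = false;
    }
}

/* Hypothetical switchover hook: the destination takes ownership. */
static void sketch_switchover_activate(void)
{
    sketch_disks_owned = true;
}
```

After the first sketch_incoming_setup() the flag is FALSE; once
sketch_switchover_activate() has run, a repeated setup call leaves the flag
at TRUE.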

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2395
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-7-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-09 17:38:57 -03:00
.github/workflows github: fix config mistake preventing repo lockdown commenting 2022-04-26 16:12:26 +01:00
.gitlab/issue_templates .gitlab/issue_templates: Move suggestions into comments 2022-12-15 15:19:24 +01:00
.gitlab-ci.d rust: ci: add job that runs Rust tools 2024-12-10 18:49:22 +01:00
accel accel/tcg: Move gen_intermediate_code to TCGCPUOps.translate_core 2024-12-24 08:32:15 -08:00
audio include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
authz error: Drop superfluous #include "qapi/qmp/qerror.h" 2023-02-23 13:56:14 +01:00
backends Accel & Exec patch queue 2024-12-21 11:07:00 -05:00
block Revert "vvfat: fix ubsan issue in create_long_filename" 2024-12-31 18:20:41 +03:00
bsd-user user: Move various declarations out of 'exec/exec-all.h' 2024-12-20 17:44:57 +01:00
chardev include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
common-user common-user/host/ppc: Implement safe-syscall.inc.S 2023-01-23 14:39:48 -10:00
configs target/i386: Reset TSCs of parked vCPUs too on VM reset 2024-12-19 19:36:38 +01:00
contrib contrib/plugins/bbv.c: Start bb index from 1 2024-12-28 14:42:53 +03:00
crypto include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
disas disas/riscv: enable disassembly for compressed sspush/sspopchk 2024-10-31 13:51:24 +10:00
docs docs/about/deprecated: Remove paragraph about initial deprecation in 2.10 2025-01-07 15:01:52 +01:00
dump include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
ebpf ebpf: improve trace event coverage to all key operations 2024-10-28 14:37:25 +08:00
fpu softfloat: Add float_muladd_suppress_add_product_zero 2024-12-24 08:32:15 -08:00
fsdev * pc: Add a description for the i8042 property 2024-10-04 19:28:37 +01:00
gdb-xml target/i386/gdbstub: Expose orig_ax 2024-10-13 10:05:51 -07:00
gdbstub include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
host/include target/i386/hvf: fix handling of XSAVE-related CPUID bits 2024-10-31 18:28:32 +01:00
hw Xen emulation fixes 2025-01-09 08:39:32 -05:00
include migration/block: Rewrite disk activation 2025-01-09 17:38:57 -03:00
io qapi/crypto: Rename QCryptoHashAlgorithm to *Algo, and drop prefix 2024-09-10 14:02:16 +02:00
libdecnumber libdecnumber/dpd/decimal64: Fix compiler warning from Clang 15 2022-11-11 09:13:52 +01:00
linux-headers linux-headers: Update to Linux 6.13-rc1 2024-12-11 09:18:38 +01:00
linux-user accel/tcg: Include missing 'exec/translation-block.h' header 2024-12-20 17:44:57 +01:00
migration migration/block: Rewrite disk activation 2025-01-09 17:38:57 -03:00
monitor migration/block: Rewrite disk activation 2025-01-09 17:38:57 -03:00
nbd include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
net net/vmnet: Pad short Ethernet frames 2024-12-31 21:21:34 +01:00
pc-bios pc-bios: add missing riscv64 descriptor 2024-12-16 07:31:28 +01:00
plugins accel/tcg: Include missing 'exec/translation-block.h' header 2024-12-20 17:44:57 +01:00
po po: update Italian translation 2024-08-13 19:01:42 +02:00
python python: silence pylint raising-non-exception error 2024-11-25 11:03:14 +01:00
qapi qapi/qom: Change Since entry for AcpiGenericPortProperties to 9.2 2024-11-26 17:18:06 -05:00
qga qga: implement a 'guest-get-load' command 2025-01-06 12:48:46 +02:00
qobject qobject: remove return after g_assert_not_reached() 2024-09-24 13:53:35 +02:00
qom qom: Create system containers explicitly 2024-12-20 17:44:56 +01:00
replay include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
roms roms: re-add edk2-basetools target 2024-12-16 07:31:28 +01:00
rust Accel & Exec patch queue 2024-12-21 11:07:00 -05:00
scripts qemu-ga: Optimize freeze-hook script logic of logging error 2025-01-06 12:57:13 +02:00
scsi configure, meson: rename targetos to host_os 2023-12-31 09:11:29 +01:00
semihosting semihosting: Restrict to TCG 2024-07-22 09:38:16 +01:00
stats include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
storage-daemon include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
stubs Accel & Exec patch queue 2024-12-21 11:07:00 -05:00
subprojects rust: add meson_version to all subprojects 2024-11-07 16:54:02 +01:00
system Remove the deprecated "-runas" command line option 2025-01-07 15:00:57 +01:00
target target/loongarch: Only support 64bit pte width 2025-01-09 14:13:17 +08:00
tcg tcg/optimize: Move fold_cmp_vec, fold_cmpsel_vec into alphabetic sort 2024-12-24 08:32:15 -08:00
tests tests/functional/test_x86_64_hotplug_cpu: Fix race condition during unplug 2025-01-07 15:02:46 +01:00
tools qemu-vmsr-helper: implement --verbose/-v 2024-07-31 13:15:06 +02:00
trace trace: Don't include trace-root.h in control.c or control-target.c 2024-11-19 14:14:13 +00:00
ui ui & main loop: Redesign of system-specific main thread event handling 2024-12-31 21:21:34 +01:00
util util/qemu-timer: fix indentation 2024-12-20 17:44:57 +01:00
.dir-locals.el
.editorconfig .editorconfig: update the automatic mode setting for Emacs 2021-03-10 15:34:11 +00:00
.exrc
.gdbinit
.git-blame-ignore-revs metadata: add .git-blame-ignore-revs 2023-04-04 15:56:44 +01:00
.gitattributes rust: patch bilge-impl to allow compilation with 1.63.0 2024-11-05 14:18:16 +01:00
.gitignore configure: rename --enable-pypi to --enable-download, control subprojects too 2023-06-06 16:30:01 +02:00
.gitlab-ci.yml docs: Document GitLab custom CI/CD variables 2021-07-29 07:56:01 +02:00
.gitmodules meson: subprojects: replace berkeley-{soft,test}float-3 with wraps 2023-06-06 16:30:01 +02:00
.gitpublish
.mailmap MAINTAINERS: update email address for Leif Lindholm 2024-12-11 15:31:09 +00:00
.patchew.yml scripts/checkpatch: roll diff tweaking into checkpatch itself 2021-06-25 10:08:33 +01:00
.readthedocs.yml readthodocs: fully specify a build environment 2024-01-12 13:23:48 +00:00
.travis.yml Revert "Remove the unused sh4eb target" 2024-11-04 14:16:11 +01:00
block.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
blockdev-nbd.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
blockdev.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
blockjob.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
configure configure: Use -ef to compare paths 2024-11-18 13:44:54 +01:00
COPYING
COPYING.LIB COPYING.LIB: Synchronize the LGPL 2.1 with the version from gnu.org 2019-01-30 11:01:22 +01:00
cpu-common.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
cpu-target.c Accel & Exec patch queue 2024-12-21 11:07:00 -05:00
event-loop-base.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
gitdm.config contrib/gitdm: add group map for AMD 2023-03-22 15:08:26 +00:00
hmp-commands-info.hx hmp-commands-info.hx: Add missing info command for stats subcommand 2024-06-30 19:51:44 +03:00
hmp-commands.hx hmp/migration: Fix "migrate" command's documentation 2024-05-08 09:22:37 -03:00
iothread.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
job-qmp.c qapi job: Elide redundant has_FOO in generated C 2022-12-14 20:04:47 +01:00
job.c block: remove AioContext locking 2023-12-21 22:49:27 +01:00
Kconfig build-sys: Add rust feature option 2024-10-07 16:41:58 +02:00
Kconfig.host hw/core: Add Enclave Image Format (EIF) related helpers 2024-10-31 18:28:32 +01:00
LICENSE tcg/LICENSE: Remove out of date claim about TCG subdirectory licensing 2019-11-11 15:11:21 +01:00
MAINTAINERS MAINTAINERS: Add myself as maintainer for apple-gfx, reviewer for HVF 2024-12-31 21:21:34 +01:00
Makefile contrib/plugins: remove Makefile for contrib/plugins 2024-11-05 09:13:51 +00:00
meson.build qga: implement a 'guest-get-load' command 2025-01-06 12:48:46 +02:00
meson_options.txt * rust: cleanups 2024-11-06 21:27:47 +00:00
module-common.c
os-posix.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
os-win32.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
page-target.c exec: Expose 'target_page.h' API to user emulation 2024-04-26 15:28:11 +02:00
page-vary-common.c Remove qemu-common.h include from most units 2022-04-06 14:31:55 +02:00
page-vary-target.c exec: Rename target specific page-vary.c -> page-vary-target.c 2023-10-04 11:03:54 -07:00
pythondeps.toml Require meson version 1.5.0 2024-10-07 16:41:57 +02:00
qemu-bridge-helper.c qemu-bridge-helper: relocate path to default ACL 2020-09-30 19:11:36 +02:00
qemu-edid.c qemu-edid: Restrict input parameter -d to avoid division by zero 2022-10-12 13:38:15 +02:00
qemu-img-cmds.hx docs/devel/docs: Document .hx file syntax 2024-01-15 17:12:22 +00:00
qemu-img.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
qemu-io-cmds.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
qemu-io.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
qemu-keymap.c qemu-keymap: Release local allocation references 2024-10-03 17:26:05 +03:00
qemu-nbd.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
qemu-options.hx Remove the deprecated "-runas" command line option 2025-01-07 15:00:57 +01:00
qemu.nsi license: Simplify GPL-2.0-or-later license descriptions 2024-09-20 10:11:59 +03:00
qemu.sasl sasl: remove comment about obsolete kerberos versions 2021-06-14 13:28:50 +01:00
README.rst README.rst: add the missing punctuations 2024-07-17 14:04:15 +03:00
replication.c replication: move include out of root directory 2021-05-26 14:49:46 +02:00
trace-events system/dma-helpers.c: Move trace events to system/trace-events 2024-11-19 14:14:13 +00:00
VERSION Open 10.0 development tree 2024-12-10 17:41:17 +00:00
version.rc configure: remove CONFIG_FILEVERSION and CONFIG_PRODUCTVERSION 2021-01-02 21:03:37 +01:00

===========
QEMU README
===========

QEMU is a generic and open source machine & userspace emulator and
virtualizer.

QEMU is capable of emulating a complete machine in software without any
need for hardware virtualization support. By using dynamic translation,
it achieves very good performance. QEMU can also integrate with the Xen
and KVM hypervisors to provide emulated hardware while allowing the
hypervisor to manage the CPU. With hypervisor support, QEMU can achieve
near native performance for CPUs. When QEMU emulates CPUs directly it is
capable of running operating systems made for one machine (e.g. an ARMv7
board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux
and BSD kernel interfaces. This allows binaries compiled against one
architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a
different architecture ABI (e.g. the Linux x86_64 ABI). This does not
involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly
by users wishing to have full control over its behaviour and settings.
It also aims to facilitate integration into higher level management
layers, by providing a stable command line interface and monitor API.
It is commonly invoked indirectly via the libvirt library when using
open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License,
version 2. For full licensing details, consult the LICENSE file.


Documentation
=============

Documentation can be found hosted online at
`<https://www.qemu.org/documentation/>`_. The documentation for the
current development version that is available at
`<https://www.qemu.org/docs/master/>`_ is generated from the ``docs/``
folder in the source tree, and is built by `Sphinx
<https://www.sphinx-doc.org/en/master/>`_.


Building
========

QEMU is multi-platform software intended to be buildable on all modern
Linux platforms, macOS, Win32 (via the Mingw64 toolchain) and a variety
of other UNIX targets. The simple steps to build QEMU are:


.. code-block:: shell

  mkdir build
  cd build
  ../configure
  make

Additional information can also be found online via the QEMU website:

* `<https://wiki.qemu.org/Hosts/Linux>`_
* `<https://wiki.qemu.org/Hosts/Mac>`_
* `<https://wiki.qemu.org/Hosts/W32>`_


Submitting patches
==================

The QEMU source code is maintained under the Git version control system.

.. code-block:: shell

   git clone https://gitlab.com/qemu-project/qemu.git

When submitting patches, one common approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
qemu-devel@nongnu.org mailing list. All patches submitted must contain
a 'Signed-off-by' line from the author. Patches should follow the
guidelines set out in the `style section
<https://www.qemu.org/docs/master/devel/style.html>`_ of
the Developers Guide.

Additional information on submitting patches can be found online via
the QEMU website:

* `<https://wiki.qemu.org/Contribute/SubmitAPatch>`_
* `<https://wiki.qemu.org/Contribute/TrivialPatches>`_

The QEMU website is also maintained under source control.

.. code-block:: shell

  git clone https://gitlab.com/qemu-project/qemu-web.git

* `<https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/>`_

A 'git-publish' utility was created to make the above process less
cumbersome, and is highly recommended for making regular contributions,
or even just for sending consecutive patch series revisions. It also
requires a working 'git send-email' setup, and by default doesn't
automate everything, so you may want to go through the above steps
manually once.

For installation instructions, please go to:

*  `<https://github.com/stefanha/git-publish>`_

The workflow with 'git-publish' is:

.. code-block:: shell

  $ git checkout master -b my-feature
  $ # work on new commits, add your 'Signed-off-by' lines to each
  $ git publish

Your patch series will be sent and tagged as my-feature-v1 if you need to refer
back to it in the future.

Sending v2:

.. code-block:: shell

  $ git checkout my-feature # same topic branch
  $ # making changes to the commits (using 'git rebase', for example)
  $ git publish

Your patch series will be sent with a 'v2' tag in the subject and the git
tip will be tagged as my-feature-v2.

Bug reporting
=============

The QEMU project uses GitLab issues to track bugs. Bugs
found when running code built from QEMU git or upstream released sources
should be reported via:

* `<https://gitlab.com/qemu-project/qemu/-/issues>`_

If using QEMU via an operating system vendor pre-built binary package, it
is preferable to report bugs to the vendor's own bug tracker first. If
the bug is also known to affect latest upstream code, it can also be
reported via GitLab.

For additional information on bug reporting consult:

* `<https://wiki.qemu.org/Contribute/ReportABug>`_


ChangeLog
=========

For version history and release notes, please visit
`<https://wiki.qemu.org/ChangeLog/>`_ or look at the git history for
more detailed information.


Contact
=======

The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC:

* `<mailto:qemu-devel@nongnu.org>`_
* `<https://lists.nongnu.org/mailman/listinfo/qemu-devel>`_
* #qemu on irc.oftc.net

Information on additional methods of contacting the community can be
found online via the QEMU website:

* `<https://wiki.qemu.org/Contribute/StartHere>`_