
Stable 2.1 #20

Closed
wants to merge 153 commits into from

Conversation


@hkscy hkscy commented Mar 12, 2015

No description provided.

stefanhaRH and others added 30 commits August 26, 2014 16:46
The "hotplugged" device property was not reported before commit
f4eb32b ("qmp: show QOM properties in
device-list-properties").  Fix this difference.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 4115dd6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Update -device FOO,help to include QOM properties in addition to qdev
properties.  Devices are gradually adding more QOM properties that are
not reflected as qdev properties.

It is important to report all device properties since management tools
like libvirt use this information (and device-list-properties QMP) to
detect the presence of QEMU features.

This patch reuses the device-list-properties QMP machinery to avoid code
duplication.
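
As a hedged illustration (the device name here is arbitrary), both the command-line help and the QMP command report the same property list after this change:

  $ qemu-system-x86_64 -device virtio-blk-pci,help
  { "execute": "device-list-properties",
    "arguments": { "typename": "virtio-blk-pci" } }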

Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
(cherry picked from commit ef52358)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When new MSI-X vectors are enabled we need to disable MSI-X and
re-enable it with the correct number of vectors.  That means we need
to reprogram the eventfd triggers for each vector.  Prior to f4d45d4
vector->use tracked whether a vector was masked or unmasked and we
could always pick the KVM path when available for unmasked vectors.
Now vfio doesn't track mask state itself, and vector->use and virq
remain configured even for masked vectors.  Therefore we need to ask
the MSI-X code whether a vector is masked in order to select the
correct signaling path.  As noted in the comment, MSI relies on
hardware to handle masking.
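
A minimal sketch of the idea, assuming the msix_is_masked() helper from
hw/pci/msix.h (the surrounding vfio variable names are illustrative):

    /* pick the signaling path per vector when re-enabling MSI-X */
    for (i = 0; i < vdev->nr_vectors; i++) {
        if (msix_is_masked(&vdev->pdev, i)) {
            continue;           /* masked: keep the QEMU userspace path */
        }
        /* unmasked: (re)wire this vector's eventfd to the KVM irqfd path */
    }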

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # QEMU 2.1
(cherry picked from commit c048be5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Some non-linux systems, for example a system with a
FreeBSD kernel and glibc, may declare struct mmsghdr
(in glibc) but may not have the linux-specific header
file linux/ip.h.  The actual implementation in qemu
includes this linux-specific header file unconditionally,
so compilation fails if it is not present.  Include
this header in the configure test too.
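
A minimal sketch of what such a probe compiles (the real configure check
may differ in detail; the point is that both declarations are needed
together):

    #define _GNU_SOURCE
    #include <sys/socket.h>     /* struct mmsghdr (glibc) */
    #include <linux/ip.h>       /* linux-specific, missing on non-linux hosts */

    int main(void)
    {
        struct mmsghdr msg;
        return sizeof(msg) ? 0 : 1;
    }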

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bff6cb7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 58ac321 introduced a check in IDE DMA processing which
constrains all requests to the drive size.  However, some valid
requests (like TRIM) do not fit within this constraint and fail
in 2.1.  So check the range only for reads and writes.
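
A sketch of the intended shape of the check, assuming the
ide_sect_range_ok() helper introduced by 58ac321 (the surrounding names
are illustrative):

    if ((s->dma_cmd == IDE_DMA_READ || s->dma_cmd == IDE_DMA_WRITE) &&
        !ide_sect_range_ok(s, sector_num, n)) {
        ide_dma_error(s);       /* out-of-range read/write */
        return;
    }
    /* TRIM and other commands are not bounded by the drive size here */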

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d66168e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 41d2f71)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If the user specifies a node number for a pc-dimm device that exceeds the
available NUMA nodes in the emulated system, the device will report an
invalid _PXM to OSPM. Fix this by checking the node property value.
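
A minimal sketch of the check, assuming it sits in the pc-dimm realize
path and that nb_numa_nodes holds the configured node count (names are
illustrative):

    if (dimm->node >= nb_numa_nodes) {
        error_setg(errp, "'node' property value %" PRIu32
                   " exceeds the number of NUMA nodes (%d)",
                   dimm->node, nb_numa_nodes);
        return;
    }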

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cfe0ffd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
- int should be printed using %d
- print actual wrong value for property

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 988eba0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
…m with 0x

The error messages before and after patch are:

before:
qemu-system-x86_64: total memory for NUMA nodes (134217728) should equal RAM size (20000000)

after:
qemu-system-x86_64: total memory for NUMA nodes (0x8000000) should equal RAM size (0x20000000)

Cc: qemu-stable@nongnu.org
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit c68233a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
RSDP should be aligned at a 16-byte boundary.
This currently happens only by chance; fix up the ACPI build
to make it robust.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit d67aadc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The function virtio_blk_handle_request may free the memory pointed to by
req, so do not access members of req after calling it.
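
In practice this means the VQ handler loop must not dereference req once
it has been handed off; a sketch using the function signatures of that
era:

    MultiReqBuffer mrb = {};
    VirtIOBlockReq *req;

    while ((req = virtio_blk_get_request(s)) != NULL) {
        virtio_blk_handle_request(req, &mrb);   /* may free req */
        /* do not dereference req (e.g. req->next) after this call */
    }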

Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1bdb176)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When we take an exception resulting from a BRK instruction,
the architecture requires that the "preferred return address"
reported to the exception handler is the address of the BRK
itself, not the following instruction (like undefined
insns, and in contrast with SVC, HVC and SMC). Follow this,
rather than incorrectly reporting the address of the following
insn.

(We do get this correct for the A32/T32 BKPT insns.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 229a138)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The function IDs for PSCI v0.1 are exported by KVM and defined as
KVM_PSCI_FN_<something>.  To build using these defines in non-KVM code,
QEMU defines these IDs locally and checks their correctness against the
KVM headers when those are available.

However, the naming scheme used by QEMU (almost) clashes with the PSCI
v0.2 definitions from Linux.  To avoid unfortunate naming when we
introduce local PSCI v0.2 defines, rename the current local defines with
QEMU_ prepended and clearly identify the PSCI version as v0.1 in the
defines.
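
A sketch of the resulting defines with the build-time cross-check against
the KVM headers (the numeric base assumed here is the one KVM uses for its
PSCI v0.1 interface):

    #define QEMU_PSCI_0_1_FN_BASE    0x95c1ba5e
    #define QEMU_PSCI_0_1_FN(n)      (QEMU_PSCI_0_1_FN_BASE + (n))
    #define QEMU_PSCI_0_1_FN_CPU_ON  QEMU_PSCI_0_1_FN(2)

    #ifdef CONFIG_KVM
    QEMU_BUILD_BUG_ON(QEMU_PSCI_0_1_FN_CPU_ON != KVM_PSCI_FN_CPU_ON);
    #endif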

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a65c9c1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The current code supplies the PSCI v0.1 function IDs in the DT even when
KVM uses PSCI v0.2.

This will break guest kernels that only support PSCI v0.1 as they will
use the IDs provided in the DT.  Guest kernels with PSCI v0.2 support
are not affected by this patch, because they ignore the function IDs in
the device tree and rely on the architecture definition.

Define QEMU versions of the constants and check that they correspond to
the Linux defines on Linux build hosts.  After this patch, both guest
kernels with PSCI v0.1 support and guest kernels with PSCI v0.2 should
work.

Tested on TC2 for 32-bit and APM Mustang for 64-bit (aarch64 guest
only).  Both cases tested with 3.14 and linus/master and verified I
could bring up 2 cpus with both guest kernels.  Also tested 32-bit with
a 3.14 host kernel with only PSCI v0.1 and both guests booted here as
well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 863714b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
bs->total_sectors is not yet updated at this point, resulting
in memory corruption if the volume has grown and data is written
to the newly available areas.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d832fb4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The following O_DIRECT read from a <512 byte file fails:

  $ truncate -s 320 test.img
  $ qemu-io -n -c 'read -P 0 0 512' test.img
  qemu-io: can't open device test.img: Could not read image for determining its format: Invalid argument

Note that qemu-io completes successfully without the -n (O_DIRECT)
option.

This patch fixes qemu-iotests ./check -nocache -vmdk 059.

Cc: qemu-stable@nongnu.org
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 61ed73c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8d9eb33)

Conflicts:
	tests/qemu-iotests/group

*fix up context mismatches due to lack of 099 and 103 tests

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Otherwise error_callback_bh will access the already released acb.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cbf95a0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When running VMware ESXi under qemu-kvm, the guest discards frames
that are too short. Short ARP requests will be dropped, which prevents
guests on the same bridge as VMware ESXi from communicating. This patch
simply adds the padding on the network device itself.
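
A sketch of the padding, as typically done in QEMU NIC receive paths (the
60-byte minimum is the Ethernet minimum frame size without FCS):

    uint8_t min_buf[60];

    if (size < sizeof(min_buf)) {
        memcpy(min_buf, buf, size);
        memset(&min_buf[size], 0, sizeof(min_buf) - size);
        buf = min_buf;
        size = sizeof(min_buf);
    }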

Signed-off-by: Ben Draper <ben@xrsa.net>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 40a87c6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When memory is allocated on a wrong node, MPOL_MF_STRICT
doesn't move it - it just fails the allocation.
A simple way to reproduce the failure is with the mlock=on
realtime feature.

The code comment actually says: "ensure policy won't be ignored"
so setting MPOL_MF_MOVE seems like a better way to do this.
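
A sketch of the resulting mbind(2) call (the surrounding host-memory-backend
variables are illustrative):

    if (mbind(ptr, size, mode, nodemask, maxnode,
              MPOL_MF_STRICT | MPOL_MF_MOVE)) {
        error_setg_errno(errp, errno,
                         "cannot bind memory to host NUMA nodes");
    }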

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 288d332)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Prevent out-of-bounds array access on
acpi_pcihp_pci_status.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
(cherry picked from commit fa365d7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit 868270f
    acpi-build: tweak acpi migration limits
broke kernel loading with -kernel/-initrd: it doubled
the size of ACPI tables but did not reserve
enough memory.

As a result, issues on boot and halt are observed.

Fix this up by doubling reserved memory for new machine types.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 927766c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
EventNotifier is implemented using an eventfd or pipe.  It therefore
consumes file descriptors, which can be limited by rlimits and should
therefore be used sparingly.

Switch from EventNotifier to QEMUBH in thread-pool.c.  Originally
EventNotifier was used because qemu_bh_schedule() was not thread-safe
yet.
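
A sketch of the switch (names follow thread-pool.c; treat this as
illustrative):

    /* before: an EventNotifier plus aio_set_event_notifier(...) */
    pool->completion_bh = aio_bh_new(pool->ctx, thread_pool_completion_bh, pool);

    /* worker threads then wake the main loop with: */
    qemu_bh_schedule(pool->completion_bh);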

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c2e50e3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The thread pool has a race condition if two elements complete before
thread_pool_completion_bh() runs:

  If element A's callback waits for element B using aio_poll() it will
  deadlock since pool->completion_bh is not marked scheduled when the
  nested aio_poll() runs.

Fix this by marking the BH scheduled while thread_pool_completion_bh()
is executing.  This way any nested aio_poll() loops will enter
thread_pool_completion_bh() and complete the remaining elements.
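
The fix is essentially a one-liner inside the completion BH; a sketch of
the relevant spot, just before a completed element's callback is invoked:

    /* keep the BH marked scheduled so a nested aio_poll() from the
     * element's callback re-enters thread_pool_completion_bh() */
    qemu_bh_schedule(pool->completion_bh);
    elem->common.cb(elem->common.opaque, elem->ret);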

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3c80ca1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c3 has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.

Unfortunately, the IOMMU address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices have been non-functional on sPAPR since then. :(

This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
   - since each PHB instantiates its own IOMMU address space, we
     can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
   - no real need to keep the MSI window setup in a separate function,
     the spapr_pci_msi_init() code moves to spapr_phb_realize().

2) kill the global MSI window as it is not needed in the end
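
A rough sketch of item 1) as it might look in spapr_phb_realize() (the
msiwindow/iommu_root field names here are hypothetical; SPAPR_PCI_MSI_WINDOW
is the fixed address mentioned above):

    /* hypothetical field names; one MSI window per PHB IOMMU address space */
    memory_region_init_io(&sphb->msiwindow, OBJECT(sphb), &spapr_msi_ops,
                          spapr, "msi", getpagesize());
    memory_region_add_subregion(&sphb->iommu_root, SPAPR_PCI_MSI_WINDOW,
                                &sphb->msiwindow);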

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 8c46f7e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
QOM backends can refer to chardevs, but not vice versa.  So
process -chardev and -fsdev options before -object.

This fixes the rng-egd backend to virtio-rng.
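
A hedged example of a configuration that depends on this processing order
(paths and IDs are arbitrary); with the fix, the rng-egd object resolves
its chardev even if -chardev appears later on the command line:

  $ qemu-system-x86_64 \
      -object rng-egd,id=rng0,chardev=chr0 \
      -chardev socket,id=chr0,path=/tmp/egd.sock \
      -device virtio-rng-pci,rng=rng0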

Reported-by: Amos Kong <akong@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7b71758)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit e8f6d00 ("target-i386: raise
page fault for reserved physical address bits") added a check that the
NX bit is not set on PAE PDPEs, but it also added it to rsvd_mask for
the rest of the function. This caused any PDEs or PTEs with NX set to be
erroneously rejected, making PAE guests with NX support unusable.

Signed-off-by: William Grant <wgrant@ubuntu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1844e68)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We currently define the number of variable range MTRR registers as 8
in the CPUX86State structure and vmstate, but use MSR_MTRRcap_VCNT
(also 8) to report to guests the number available.  Change this to
use MSR_MTRRcap_VCNT consistently.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d8b5c67)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The MTRR state in KVM currently runs completely independent of the
QEMU state in CPUX86State.mtrr_*.  This means that on migration, the
target loses MTRR state from the source.  Generally that's ok though
because KVM ignores it and maps everything as write-back anyway.  The
exception to this rule is when we have an assigned device and an IOMMU
that doesn't promote NoSnoop transactions from that device to be cache
coherent.  In that case KVM trusts the guest mapping of memory as
configured in the MTRR.

This patch updates kvm_get|put_msrs() so that we retrieve the actual
vCPU MTRR settings and therefore keep CPUX86State synchronized for
migration.  kvm_put_msrs() is also used on vCPU reset and therefore
allows future modifications of MTRR state at reset to be realized.

Note that the entries array used by both functions was already
slightly undersized for holding every possible MSR, so this patch
increases it beyond the 28 new entries necessary for MTRR state.
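
A sketch of the kvm_put_msrs() side, assuming the MSR_MTRRphysBase/Mask
macros, the env->mtrr_var array, and the kvm_msr_entry_set() helper that
was the local idiom at the time:

    for (i = 0; i < MSR_MTRRcap_VCNT; i++) {
        kvm_msr_entry_set(&msrs[n++], MSR_MTRRphysBase(i), env->mtrr_var[i].base);
        kvm_msr_entry_set(&msrs[n++], MSR_MTRRphysMask(i), env->mtrr_var[i].mask);
    }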

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d1ae67f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The SDM specifies (June 2014 Vol3 11.11.5):

    On a hardware reset, the P6 and more recent processors clear the
    valid flags in variable-range MTRRs and clear the E flag in the
    IA32_MTRR_DEF_TYPE MSR to disable all MTRRs. All other bits in the
    MTRRs are undefined.

We currently do none of that, so whatever MTRR settings you had prior
to reset are what you have after reset.  Usually this doesn't matter
because KVM often ignores the guest mappings and uses write-back
anyway.  However, if you have an assigned device and an IOMMU that
allows NoSnoop for that device, KVM defers to the guest memory
mappings which are now stale after reset.  The result is that OVMF
rebooting on such a configuration takes a full minute to LZMA
decompress the firmware volume, a process that is nearly instant on
the initial boot.
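
A minimal sketch of the reset handling in x86_cpu_reset(), following the
SDM wording quoted above:

    /* disable all MTRRs: clear IA32_MTRR_DEF_TYPE (including the E flag)
     * and the variable-range registers */
    env->mtrr_deftype = 0;
    memset(env->mtrr_var, 0, sizeof(env->mtrr_var));
    memset(env->mtrr_fixed, 0, sizeof(env->mtrr_fixed));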

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9db2efd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
pranith pushed a commit to pranith/qemu that referenced this pull request Oct 8, 2018
Both virtio-blk and virtio-scsi use virtio_queue_empty() as the
loop condition in VQ handlers (virtio_blk_handle_vq,
virtio_scsi_handle_cmd_vq). When a device is marked broken in
virtqueue_pop, for example if a vIOMMU address translation failed, we
want to break out of the loop.

This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64
kernel with ATS enabled:

  $ qemu-system-x86_64 \
    ... \
    -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
    -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0

The dead loop happens immediately when the kernel boots and initializes
the device, where virtio_scsi_data_plane_handle_cmd will not return:

    > ...
    > qemu#13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
    > qemu#14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
    > qemu#15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
    > qemu#16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
    > qemu#17 0x00005586607885da in run_poll_handlers_once
    > qemu#18 0x000055866078880e in try_poll_mode
    > qemu#19 0x00005586607888eb in aio_poll
    > qemu#20 0x0000558660784561 in aio_wait_bh_oneshot
    > qemu#21 0x00005586602b9582 in virtio_scsi_dataplane_stop
    > qemu#22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
    > qemu#23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
    > qemu#24 0x00005586605ab808 in virtio_pci_common_write
    > qemu#25 0x0000558660242396 in memory_region_write_accessor
    > qemu#26 0x00005586602425ab in access_with_adjusted_size
    > qemu#27 0x0000558660245281 in memory_region_dispatch_write
    > qemu#28 0x00005586601e008e in flatview_write_continue
    > qemu#29 0x00005586601e01d8 in flatview_write
    > qemu#30 0x00005586601e04de in address_space_write
    > qemu#31 0x00005586601e052f in address_space_rw
    > qemu#32 0x00005586602607f2 in kvm_cpu_exec
    > qemu#33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
    > qemu#34 0x000055866078bde7 in qemu_thread_start
    > qemu#35 0x00007f5784906594 in start_thread
    > qemu#36 0x00007f5784639e6f in clone

With this patch, virtio_queue_empty will now return 1 as soon as the
vdev is marked as broken, after a "virtio: zero sized buffers are not
allowed" error.

To be consistent, update virtio_queue_empty_rcu as well.
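
The added check itself is tiny; a sketch of what sits at the top of
virtio_queue_empty() (and its _rcu variant), assuming the broken flag on
VirtIODevice:

    if (unlikely(vq->vdev->broken)) {
        return 1;   /* terminates the virtio-blk/virtio-scsi handler loops */
    }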

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180910145616.8598-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
deltova pushed a commit to EPITA-GISTRE/qemu that referenced this pull request Oct 8, 2018
In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.

This change avoids a number of random crashes when the Main Thread and
an IO Thread collide processing completions for the same AioContext.
This is an example of such crash:

 - The IO Thread is trying to acquire the AioContext at aio_co_enter,
   which evidences that it didn't lock it before:

Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
 #0  0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 qemu#1  0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
 qemu#2  0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
    at ../nptl/pthread_mutex_lock.c:114
 qemu#3  0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
 qemu#4  0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
 qemu#5  0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
 qemu#6  0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 qemu#8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 qemu#9  0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
 qemu#10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
    at util/aio-posix.c:693
 qemu#11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
 qemu#12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
 qemu#13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 - The Main Thread is also processing completions from the same
   AioContext, and crashes due to failed assertion at util/iov.c:78:

Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
 #0  0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 qemu#1  0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
 qemu#2  0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
    at assert.c:92
 qemu#3  0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
 qemu#4  0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
 qemu#5  0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
 qemu#6  0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 qemu#8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 qemu#9  0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
 qemu#10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
 qemu#11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
 qemu#12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
    at util/aio-posix.c:604
 qemu#13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
 qemu#14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
 qemu#15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
 qemu#16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
 qemu#17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
 qemu#18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
 qemu#19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
    at block/commit.c:92
 qemu#20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
 qemu#21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
 qemu#22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
 qemu#23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
 qemu#24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
 qemu#25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
 qemu#26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
 qemu#27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
 qemu#28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
 qemu#29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
 qemu#30 0x00005631fbd81412 in main_loop () at vl.c:1866
 qemu#31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4647

 - A closer examination shows that s->io_q.in_flight appears to have
   gone backwards:

(gdb) frame 7
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
222	            qemu_laio_process_completion(laiocb);
(gdb) p s
$2 = (LinuxAioState *) 0x7fdfc0297670
(gdb) p *s
$3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0,
    in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0,
      sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241}
(gdb) p/x s->io_q.in_flight
$4 = 0xfffffff0
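
The change acquires the AioContext so it also covers
qemu_laio_process_completions(); a sketch assuming the names used in
block/linux-aio.c:

    void qemu_laio_process_completions_and_submit(LinuxAioState *s)
    {
        aio_context_acquire(s->aio_context);
        qemu_laio_process_completions(s);

        if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
            ioq_submit(s);
        }
        aio_context_release(s->aio_context);
    }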

Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu-deploy pushed a commit that referenced this pull request Dec 3, 2018
Let's start from the beginning:

Commit b9e413d (in 2.9)
"block: explicitly acquire aiocontext in aio callbacks that need it"
added pairs of aio_context_acquire/release to mirror_write_complete and
mirror_read_complete, when they were aio callbacks for blk_aio_* calls.

Then, commit 2e1990b (in 3.0) "block/mirror: Convert to coroutines"
dropped these blk_aio_* calls, so mirror_write_complete and
mirror_read_complete are no longer callbacks and don't need the
additional AioContext acquisition. Furthermore, mirror_read_complete
calls blk_co_pwritev inside this pair of aio_context_acquire/release,
which leads to the following dead-lock with mirror:

 (gdb) info thr
   Id   Target Id         Frame
   3    Thread (LWP 145412) "qemu-system-x86" syscall ()
   2    Thread (LWP 145416) "qemu-system-x86" __lll_lock_wait ()
 * 1    Thread (LWP 145411) "qemu-system-x86" __lll_lock_wait ()

 (gdb) bt
 #0  __lll_lock_wait ()
 #1  _L_lock_812 ()
 #2  __GI___pthread_mutex_lock
 #3  qemu_mutex_lock_impl (mutex=0x561032dce420 <qemu_global_mutex>,
     file=0x5610327d8654 "util/main-loop.c", line=236) at
     util/qemu-thread-posix.c:66
 #4  qemu_mutex_lock_iothread_impl
 #5  os_host_main_loop_wait (timeout=480116000) at util/main-loop.c:236
 #6  main_loop_wait (nonblocking=0) at util/main-loop.c:497
 #7  main_loop () at vl.c:1892
 #8  main

Printing the contents of qemu_global_mutex, I see that "__owner = 145416",
so thr1 is the main loop, and it now wants the BQL, which is owned by thr2.

 (gdb) thr 2
 (gdb) bt
 #0  __lll_lock_wait ()
 #1  _L_lock_870 ()
 #2  __GI___pthread_mutex_lock
 #3  qemu_mutex_lock_impl (mutex=0x561034d25dc0, ...
 #4  aio_context_acquire (ctx=0x561034d25d60)
 #5  dma_blk_cb
 #6  dma_blk_io
 #7  dma_blk_read
 #8  ide_dma_cb
 #9  bmdma_cmd_writeb
 #10 bmdma_write
 #11 memory_region_write_accessor
 #12 access_with_adjusted_size
 #15 flatview_write
 #16 address_space_write
 #17 address_space_rw
 #18 kvm_handle_io
 #19 kvm_cpu_exec
 #20 qemu_kvm_cpu_thread_fn
 #21 qemu_thread_start
 #22 start_thread
 #23 clone ()

Printing the mutex in frame 2, I see "__owner = 145411", so thr2 wants the
aio context mutex, which is owned by thr1. A classic dead-lock.

Then, let's check that the aio context is held by the mirror coroutine: just
print the coroutine stack of the first tracked request in the mirror job target:

 (gdb) [...]
 (gdb) qemu coroutine 0x561035dd0860
 #0  qemu_coroutine_switch
 #1  qemu_coroutine_yield
 #2  qemu_co_mutex_lock_slowpath
 #3  qemu_co_mutex_lock
 #4  qcow2_co_pwritev
 #5  bdrv_driver_pwritev
 #6  bdrv_aligned_pwritev
 #7  bdrv_co_pwritev
 #8  blk_co_pwritev
 #9  mirror_read_complete () at block/mirror.c:232
 #10 mirror_co_read () at block/mirror.c:370
 #11 coroutine_trampoline
 #12 __start_context

Yes, it is mirror_read_complete calling blk_co_pwritev after acquiring
the aio context.
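
The fix is to drop the stale acquire/release pair now that
mirror_read_complete() runs in coroutine context; a sketch of the
resulting shape (names as in block/mirror.c, simplified):

    static void coroutine_fn mirror_read_complete(MirrorOp *op, int ret)
    {
        MirrorBlockJob *s = op->s;

        if (ret < 0) {
            /* error handling ... */
            return;
        }
        /* no aio_context_acquire()/release() around the write any more */
        ret = blk_co_pwritev(s->target, op->offset,
                             op->qiov.size, &op->qiov, 0);
        mirror_write_complete(op, ret);
    }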

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
yashkmankad pushed a commit to yashkmankad/qemu that referenced this pull request Dec 5, 2018
RH-Author: Kevin Wolf <kwolf@redhat.com>
Message-id: <20181010202213.7372-14-kwolf@redhat.com>
Patchwork-id: 82603
O-Subject: [RHEL-8 qemu-kvm PATCH 23/44] block/linux-aio: acquire AioContext before qemu_laio_process_completions
Bugzilla: 1637976
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: John Snow <jsnow@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>

From: Sergio Lopez <slp@redhat.com>

In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.

This change avoids a number of random crashes when the Main Thread and
an IO Thread collide processing completions for the same AioContext.
This is an example of such crash:

 - The IO Thread is trying to acquire the AioContext at aio_co_enter,
   which evidences that it didn't lock it before:

Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
 #0  0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 qemu#1  0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
 qemu#2  0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
    at ../nptl/pthread_mutex_lock.c:114
 qemu#3  0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
 qemu#4  0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
 qemu#5  0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
 qemu#6  0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 qemu#8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 qemu#9  0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
 qemu#10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
    at util/aio-posix.c:693
 qemu#11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
 qemu#12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
 qemu#13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 - The Main Thread is also processing completions from the same
   AioContext, and crashes due to failed assertion at util/iov.c:78:

Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
 #0  0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 qemu#1  0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
 qemu#2  0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
    at assert.c:92
 qemu#3  0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
 qemu#4  0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
 qemu#5  0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
 qemu#6  0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 qemu#8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 qemu#9  0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
 qemu#10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
 qemu#11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
 qemu#12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
    at util/aio-posix.c:604
 qemu#13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
 qemu#14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
 qemu#15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
 qemu#16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
 qemu#17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
 qemu#18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
 qemu#19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
    at block/commit.c:92
 qemu#20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
 qemu#21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
 qemu#22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
 qemu#23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
 qemu#24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
 qemu#25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
 qemu#26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
 qemu#27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
 qemu#28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
 qemu#29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
 qemu#30 0x00005631fbd81412 in main_loop () at vl.c:1866
 qemu#31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4647

 - A closer examination shows that s->io_q.in_flight appears to have
   gone backwards:

(gdb) frame 7
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
222	            qemu_laio_process_completion(laiocb);
(gdb) p s
$2 = (LinuxAioState *) 0x7fdfc0297670
(gdb) p *s
$3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0,
    in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0,
      sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241}
(gdb) p/x s->io_q.in_flight
$4 = 0xfffffff0

Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e091f0e)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
qemu-deploy pushed a commit that referenced this pull request Dec 13, 2018
When a monitor is connected to a Spice chardev, the monitor cleanup
can dead-lock:

 #0  0x00007f43446637fd in __lll_lock_wait () at /lib64/libpthread.so.0
 #1  0x00007f434465ccf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
 #2  0x0000556dd79f22ba in qemu_mutex_lock_impl (mutex=0x556dd81c9220 <monitor_lock>, file=0x556dd7ae3648 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
 #3  0x0000556dd7431bd5 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x556dd9abc850, errp=0x7fffb7bbddd8) at /home/elmarco/src/qq/monitor.c:645
 #4  0x0000556dd79d476b in qapi_event_send_spice_disconnected (server=0x556dd98ee760, client=0x556ddaaa8560, errp=0x556dd82180d0 <error_abort>) at qapi/qapi-events-ui.c:149
 #5  0x0000556dd7870fc1 in channel_event (event=3, info=0x556ddad1b590) at /home/elmarco/src/qq/ui/spice-core.c:235
 #6  0x00007f434560a6bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x556ddad1b590) at reds.c:316
 #7  0x00007f43455f393b in main_dispatcher_self_handle_channel_event (info=0x556ddad1b590, event=3, self=0x556dd9a7d8c0) at main-dispatcher.c:197
 #8  0x00007f43455f393b in main_dispatcher_channel_event (self=0x556dd9a7d8c0, event=event@entry=3, info=0x556ddad1b590) at main-dispatcher.c:197
 #9  0x00007f4345612833 in red_stream_push_channel_event (s=s@entry=0x556ddae2ef40, event=event@entry=3) at red-stream.c:414
 #10 0x00007f434561286b in red_stream_free (s=0x556ddae2ef40) at red-stream.c:388
 #11 0x00007f43455f9ddc in red_channel_client_finalize (object=0x556dd9bb21a0) at red-channel-client.c:347
 #12 0x00007f434b5f9fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
 #13 0x00007f43455fc212 in red_channel_client_push (rcc=0x556dd9bb21a0) at red-channel-client.c:1341
 #14 0x0000556dd76081ba in spice_port_set_fe_open (chr=0x556dd9925e20, fe_open=0) at /home/elmarco/src/qq/chardev/spice.c:241
 #15 0x0000556dd796d74a in qemu_chr_fe_set_open (be=0x556dd9a37c00, fe_open=0) at /home/elmarco/src/qq/chardev/char-fe.c:340
 #16 0x0000556dd796d4d9 in qemu_chr_fe_set_handlers (b=0x556dd9a37c00, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=true) at /home/elmarco/src/qq/chardev/char-fe.c:280
 #17 0x0000556dd796d359 in qemu_chr_fe_deinit (b=0x556dd9a37c00, del=false) at /home/elmarco/src/qq/chardev/char-fe.c:233
 #18 0x0000556dd7432240 in monitor_data_destroy (mon=0x556dd9a37c00) at /home/elmarco/src/qq/monitor.c:786
 #19 0x0000556dd743b968 in monitor_cleanup () at /home/elmarco/src/qq/monitor.c:4683
 #20 0x0000556dd75ce776 in main (argc=3, argv=0x7fffb7bbe458, envp=0x7fffb7bbe478) at /home/elmarco/src/qq/vl.c:4660

This happens because the spice code tries to emit a "disconnected" signal
on the monitors. Fix this dead-lock by releasing the monitor lock for
flush/destroy.

monitor_lock protects mon_list, monitor_qapi_event_state and
monitor_destroyed. monitor_flush() and monitor_data_destroy() don't
access any of those variables.

monitor_cleanup()'s loop is safe because it uses
QTAILQ_FOREACH_SAFE(), and no further monitor can be added after
calling monitor_cleanup() thanks to monitor_destroyed check in
monitor_list_append().
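
A sketch of the reworked loop in monitor_cleanup(), dropping monitor_lock
around the flush/destroy work:

    Monitor *mon, *next;

    qemu_mutex_lock(&monitor_lock);
    QTAILQ_FOREACH_SAFE(mon, &mon_list, entry, next) {
        QTAILQ_REMOVE(&mon_list, mon, entry);
        /* release the lock: flush/destroy may emit QAPI events that
         * re-enter monitor_qapi_event_queue() and take monitor_lock */
        qemu_mutex_unlock(&monitor_lock);
        monitor_flush(mon);
        monitor_data_destroy(mon);
        g_free(mon);
        qemu_mutex_lock(&monitor_lock);
    }
    qemu_mutex_unlock(&monitor_lock);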

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20181205203737.9011-8-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
qemu-deploy pushed a commit that referenced this pull request Apr 12, 2019
Spotted by ASAN, during make check...

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67
    #3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24
    #4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144
    #5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506
    #6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569
    #7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306
    #8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357
    #9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561
    #10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592
    #11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257
    #12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105
    #13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323
    #14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373
    #15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124
    #16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337
    #17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
    #18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
    #19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    #20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    #21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-4-marcandre.lureau@redhat.com>
[Screwed up in commit b273145]
Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit cb9ec42)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qemu-deploy pushed a commit that referenced this pull request Apr 12, 2019
Let's start from the beginning:

Commit b9e413d (in 2.9)
"block: explicitly acquire aiocontext in aio callbacks that need it"
added pairs of aio_context_acquire/release to mirror_write_complete and
mirror_read_complete, when they were aio callbacks for blk_aio_* calls.

Then, commit 2e1990b (in 3.0) "block/mirror: Convert to coroutines"
dropped these blk_aio_* calls, so mirror_write_complete and
mirror_read_complete are no longer callbacks and don't need the
additional AioContext acquisition. Furthermore, mirror_read_complete
calls blk_co_pwritev inside this pair of aio_context_acquire/release,
which leads to the following dead-lock with mirror:

 (gdb) info thr
   Id   Target Id         Frame
   3    Thread (LWP 145412) "qemu-system-x86" syscall ()
   2    Thread (LWP 145416) "qemu-system-x86" __lll_lock_wait ()
 * 1    Thread (LWP 145411) "qemu-system-x86" __lll_lock_wait ()

 (gdb) bt
 #0  __lll_lock_wait ()
 #1  _L_lock_812 ()
 #2  __GI___pthread_mutex_lock
 #3  qemu_mutex_lock_impl (mutex=0x561032dce420 <qemu_global_mutex>,
     file=0x5610327d8654 "util/main-loop.c", line=236) at
     util/qemu-thread-posix.c:66
 #4  qemu_mutex_lock_iothread_impl
 #5  os_host_main_loop_wait (timeout=480116000) at util/main-loop.c:236
 #6  main_loop_wait (nonblocking=0) at util/main-loop.c:497
 #7  main_loop () at vl.c:1892
 #8  main

Printing the contents of qemu_global_mutex, I see that "__owner = 145416",
so thr1 is the main loop, and it now wants the BQL, which is owned by thr2.

 (gdb) thr 2
 (gdb) bt
 #0  __lll_lock_wait ()
 #1  _L_lock_870 ()
 #2  __GI___pthread_mutex_lock
 #3  qemu_mutex_lock_impl (mutex=0x561034d25dc0, ...
 #4  aio_context_acquire (ctx=0x561034d25d60)
 #5  dma_blk_cb
 #6  dma_blk_io
 #7  dma_blk_read
 #8  ide_dma_cb
 #9  bmdma_cmd_writeb
 #10 bmdma_write
 #11 memory_region_write_accessor
 #12 access_with_adjusted_size
 #15 flatview_write
 #16 address_space_write
 #17 address_space_rw
 #18 kvm_handle_io
 #19 kvm_cpu_exec
 #20 qemu_kvm_cpu_thread_fn
 #21 qemu_thread_start
 #22 start_thread
 #23 clone ()

Printing the mutex in frame 2, I see "__owner = 145411", so thr2 wants the
aio context mutex, which is owned by thr1. A classic dead-lock.

Then, let's check that the aio context is held by the mirror coroutine: just
print the coroutine stack of the first tracked request in the mirror job target:

 (gdb) [...]
 (gdb) qemu coroutine 0x561035dd0860
 #0  qemu_coroutine_switch
 #1  qemu_coroutine_yield
 #2  qemu_co_mutex_lock_slowpath
 #3  qemu_co_mutex_lock
 #4  qcow2_co_pwritev
 #5  bdrv_driver_pwritev
 #6  bdrv_aligned_pwritev
 #7  bdrv_co_pwritev
 #8  blk_co_pwritev
 #9  mirror_read_complete () at block/mirror.c:232
 #10 mirror_co_read () at block/mirror.c:370
 #11 coroutine_trampoline
 #12 __start_context

Yes, it is mirror_read_complete calling blk_co_pwritev after acquiring
the aio context.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d12ade5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qemu-deploy pushed a commit that referenced this pull request Apr 30, 2019
Even for block nodes with bs->drv == NULL, we can't just ignore a
bdrv_set_aio_context() call. Leaving the node in its old context can
mean that it's still in an iothread context in bdrv_close_all() during
shutdown, resulting in an attempted unlock of the AioContext lock which
we don't hold.

This is an example stack trace of a related crash:

 #0  0x00007ffff59da57f in raise () at /lib64/libc.so.6
 #1  0x00007ffff59c4895 in abort () at /lib64/libc.so.6
 #2  0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
 #3  0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
 #4  0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
 #5  0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
         at block/io.c:833
 #6  0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
 #7  0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
 #8  0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
 #9  0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776

The reproducer I used is a qcow2 image on gluster volume, where the
virtual disk size (4 GB) is larger than the gluster volume size (64M),
so we can easily trigger an ENOSPC. This backend is assigned to a
virtio-blk device using an iothread, and then from the guest a
'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
because of an I/O error. qemu_gluster_co_flush_to_disk() sets
bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
block nodes stay in the iothread AioContext. A 'quit' monitor command
issued from this paused state crashes the process.
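
A sketch of the direction of the fix: keep walking the node and its
children even when bs->drv is NULL, and only skip the driver callbacks
(names follow block.c; simplified):

    void bdrv_detach_aio_context(BlockDriverState *bs)
    {
        BdrvChild *child;

        if (bs->drv && bs->drv->bdrv_detach_aio_context) {
            bs->drv->bdrv_detach_aio_context(bs);
        }
        QLIST_FOREACH(child, &bs->children, next) {
            bdrv_detach_aio_context(child->bs);
        }

        bs->aio_context = NULL;
    }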

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
qemu-deploy pushed a commit that referenced this pull request Aug 2, 2019
Even for block nodes with bs->drv == NULL, we can't just ignore a
bdrv_set_aio_context() call. Leaving the node in its old context can
mean that it's still in an iothread context in bdrv_close_all() during
shutdown, resulting in an attempted unlock of the AioContext lock which
we don't hold.

This is an example stack trace of a related crash:

 #0  0x00007ffff59da57f in raise () at /lib64/libc.so.6
 #1  0x00007ffff59c4895 in abort () at /lib64/libc.so.6
 #2  0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
 #3  0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
 #4  0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
 #5  0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
         at block/io.c:833
 #6  0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
 #7  0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
 #8  0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
 #9  0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776

The reproducer I used is a qcow2 image on gluster volume, where the
virtual disk size (4 GB) is larger than the gluster volume size (64M),
so we can easily trigger an ENOSPC. This backend is assigned to a
virtio-blk device using an iothread, and then from the guest a
'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
because of an I/O error. qemu_gluster_co_flush_to_disk() sets
bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
block nodes stay in the iothread AioContext. A 'quit' monitor command
issued from this paused state crashes the process.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
(cherry picked from commit 1bffe1a)
*drop context dependency on e64f25f
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qemu-deploy pushed a commit that referenced this pull request Oct 17, 2019
Even for block nodes with bs->drv == NULL, we can't just ignore a
bdrv_set_aio_context() call. Leaving the node in its old context can
mean that it's still in an iothread context in bdrv_close_all() during
shutdown, resulting in an attempted unlock of the AioContext lock which
we don't hold.

This is an example stack trace of a related crash:

 #0  0x00007ffff59da57f in raise () at /lib64/libc.so.6
 #1  0x00007ffff59c4895 in abort () at /lib64/libc.so.6
 #2  0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
 #3  0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
 #4  0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
 #5  0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
         at block/io.c:833
 #6  0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
 #7  0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
 #8  0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
 #9  0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776

The reproducer I used is a qcow2 image on a gluster volume, where the
virtual disk size (4 GB) is larger than the gluster volume size (64M),
so we can easily trigger an ENOSPC. This backend is assigned to a
virtio-blk device using an iothread, and then from the guest a
'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
because of an I/O error. qemu_gluster_co_flush_to_disk() sets
bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
block nodes stay in the iothread AioContext. A 'quit' monitor command
issued from this paused state crashes the process.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
(cherry picked from commit 1bffe1a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
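
The shape of the fix described above is to keep updating the node's AioContext even when bs->drv is NULL, skipping only the driver callbacks. A minimal sketch of that idea, assuming QEMU's block-layer types (BlockDriverState with drv and aio_context fields, and the bdrv_attach/detach_aio_context driver hooks); the sketch_* names are hypothetical and this is not the upstream patch:

    /* Illustrative sketch only: move the node between AioContexts
     * unconditionally, and skip only the driver callbacks when bs->drv has
     * been cleared, so shutdown finds the node in the main context and does
     * not try to unlock a lock it never took. */
    static void sketch_detach_aio_context(BlockDriverState *bs)
    {
        if (bs->drv && bs->drv->bdrv_detach_aio_context) {
            bs->drv->bdrv_detach_aio_context(bs);
        }
        bs->aio_context = NULL;          /* always leave the old context */
    }

    static void sketch_attach_aio_context(BlockDriverState *bs,
                                          AioContext *new_context)
    {
        bs->aio_context = new_context;   /* always record the new context */
        if (bs->drv && bs->drv->bdrv_attach_aio_context) {
            bs->drv->bdrv_attach_aio_context(bs, new_context);
        }
    }
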
slp added a commit to slp/qemu that referenced this pull request Jan 8, 2020
All paths that lead to bdrv_backup_top_drop(), except for the call
from backup_clean(), imply that the BDS AioContext has already been
acquired, so doing it there too can potentially lead to QEMU hanging
on AIO_WAIT_WHILE().

An easy way to trigger this situation is to issue a two-action
transaction with a proper and a bogus blockdev-backup, so that the
second one triggers a rollback. This produces a hang with a stack
trace like this one:

 #0  0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>,
     timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
 qemu#1  0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
     at /usr/include/bits/poll2.h:77
 qemu#2  0x000055e743386e09 in qemu_poll_ns
     (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336
 qemu#3  0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true)
     at util/aio-posix.c:669
 qemu#4  0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878
 qemu#5  0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017
 qemu#6  0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262
 qemu#7  0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644
 qemu#8  0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273
 qemu#9  0x000055e74331461f in backup_job_create
     (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478
 qemu#10 0x000055e74315bc52 in do_backup_common
     (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at blockdev.c:3580
 qemu#11 0x000055e74315c37c in do_blockdev_backup
     (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492
 qemu#12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018)
     at blockdev.c:1885
 qemu#13 0x000055e743160152 in qmp_transaction
     (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340
 qemu#14 0x000055e743287ff5 in qmp_marshal_transaction
     (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8)
     at qapi/qapi-commands-transaction.c:44
 qemu#15 0x000055e74333de6c in do_qmp_dispatch
     (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 qemu#16 0x000055e74333de6c in qmp_dispatch
     (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
     at qapi/qmp-dispatch.c:175
 qemu#17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>)
     at monitor/qmp.c:145
 qemu#18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
 qemu#19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117
 qemu#20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117
 qemu#21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459
 qemu#22 0x000055e743385742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 qemu#23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176
 qemu#24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829
 qemu#25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219
 qemu#26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 qemu#27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 qemu#28 0x000055e74316a3c1 in main_loop () at vl.c:1828
 qemu#29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by not acquiring the AioContext there, and ensuring all paths
leading to it have it already acquired (backup_clean()).

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111
Signed-off-by: Sergio Lopez <slp@redhat.com>
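
In other words, the locking convention moves to the callers: the drop helper runs with the BDS AioContext already held, and only the one path that did not hold it acquires it. A rough sketch of that convention, using the real aio_context_acquire()/aio_context_release() and bdrv_get_aio_context() APIs but with hypothetical sketch_* function names; illustrative only, not the upstream patch:

    /* Sketch only: the helper assumes the caller already holds the BDS
     * AioContext. Acquiring it again here can make a nested
     * AIO_WAIT_WHILE() hang, because it only drops one level of the lock
     * while it polls. */
    static void sketch_backup_top_drop(BlockDriverState *bs)
    {
        bdrv_drained_begin(bs);
        /* ... detach the filter node and its children ... */
        bdrv_drained_end(bs);
        bdrv_unref(bs);
    }

    /* The one caller that does not already hold the context takes it: */
    static void sketch_backup_clean(BlockDriverState *backup_top)
    {
        AioContext *ctx = bdrv_get_aio_context(backup_top);

        aio_context_acquire(ctx);
        sketch_backup_top_drop(backup_top);
        aio_context_release(ctx);
    }
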
slp added a commit to slp/qemu that referenced this pull request Jan 8, 2020
Dirty bitmap addition and removal functions are not acquiring the BDS
AioContext, while they may call code that expects it to be
acquired.

This may trigger a crash with a stack trace like this one:

 #0  0x00007f0ef146370f in __GI_raise (sig=sig@entry=6)
     at ../sysdeps/unix/sysv/linux/raise.c:50
 qemu#1  0x00007f0ef144db25 in __GI_abort () at abort.c:79
 qemu#2  0x0000565022294dce in error_exit
     (err=<optimized out>, msg=msg@entry=0x56502243a730 <__func__.16350> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
 qemu#3  0x00005650222950ba in qemu_mutex_unlock_impl
     (mutex=mutex@entry=0x5650244b0240, file=file@entry=0x565022439adf "util/async.c", line=line@entry=526) at util/qemu-thread-posix.c:108
 qemu#4  0x0000565022290029 in aio_context_release
     (ctx=ctx@entry=0x5650244b01e0) at util/async.c:526
 qemu#5  0x000056502221cd08 in bdrv_can_store_new_dirty_bitmap
     (bs=bs@entry=0x5650244dc820, name=name@entry=0x56502481d360 "bitmap1", granularity=granularity@entry=65536, errp=errp@entry=0x7fff22831718)
     at block/dirty-bitmap.c:542
 qemu#6  0x000056502206ae53 in qmp_block_dirty_bitmap_add
     (errp=0x7fff22831718, disabled=false, has_disabled=<optimized out>, persistent=<optimized out>, has_persistent=true, granularity=65536, has_granularity=<optimized out>, name=0x56502481d360 "bitmap1", node=<optimized out>) at blockdev.c:2894
 qemu#7  0x000056502206ae53 in qmp_block_dirty_bitmap_add
     (node=<optimized out>, name=0x56502481d360 "bitmap1", has_granularity=<optimized out>, granularity=<optimized out>, has_persistent=true, persistent=<optimized out>, has_disabled=false, disabled=false, errp=0x7fff22831718) at blockdev.c:2856
 qemu#8  0x00005650221847a3 in qmp_marshal_block_dirty_bitmap_add
     (args=<optimized out>, ret=<optimized out>, errp=0x7fff22831798)
     at qapi/qapi-commands-block-core.c:651
 qemu#9  0x0000565022247e6c in do_qmp_dispatch
     (errp=0x7fff22831790, allow_oob=<optimized out>, request=<optimized out>, cmds=0x565022b32d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 qemu#10 0x0000565022247e6c in qmp_dispatch
     (cmds=0x565022b32d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
 qemu#11 0x0000565022166061 in monitor_qmp_dispatch
     (mon=0x56502450faa0, req=<optimized out>) at monitor/qmp.c:145
 qemu#12 0x00005650221666fa in monitor_qmp_bh_dispatcher
     (data=<optimized out>) at monitor/qmp.c:234
 qemu#13 0x000056502228f866 in aio_bh_call (bh=0x56502440eae0)
     at util/async.c:117
 qemu#14 0x000056502228f866 in aio_bh_poll (ctx=ctx@entry=0x56502440d7a0)
     at util/async.c:117
 qemu#15 0x0000565022292c54 in aio_dispatch (ctx=0x56502440d7a0)
     at util/aio-posix.c:459
 qemu#16 0x000056502228f742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 qemu#17 0x00007f0ef5ce667d in g_main_dispatch (context=0x56502449aa40)
     at gmain.c:3176
 qemu#18 0x00007f0ef5ce667d in g_main_context_dispatch
     (context=context@entry=0x56502449aa40) at gmain.c:3829
 qemu#19 0x0000565022291d08 in glib_pollfds_poll () at util/main-loop.c:219
 qemu#20 0x0000565022291d08 in os_host_main_loop_wait
     (timeout=<optimized out>) at util/main-loop.c:242
 qemu#21 0x0000565022291d08 in main_loop_wait (nonblocking=<optimized out>)
     at util/main-loop.c:518
 qemu#22 0x00005650220743c1 in main_loop () at vl.c:1828
 qemu#23 0x0000565021f20a72 in main
     (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by acquiring the AioContext at qmp_block_dirty_bitmap_add()
and on the corresponding bitmap removal path.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782175
Signed-off-by: Sergio Lopez <slp@redhat.com>
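
A rough sketch of the acquire/release pattern for such a QMP handler, using real block-layer calls (bdrv_lookup_bs(), bdrv_get_aio_context(), aio_context_acquire()/release()) but a hypothetical sketch_* name; illustrative only, not the upstream patch:

    /* Sketch only: take the node's AioContext before touching its dirty
     * bitmaps, so callees that expect the context to be held (and may
     * temporarily release it) find it in the expected state. */
    static void sketch_bitmap_add(const char *node, const char *name,
                                  Error **errp)
    {
        BlockDriverState *bs = bdrv_lookup_bs(node, node, errp);
        AioContext *aio_context;

        if (!bs) {
            return;
        }

        aio_context = bdrv_get_aio_context(bs);
        aio_context_acquire(aio_context);

        /* ... bdrv_can_store_new_dirty_bitmap() / bdrv_create_dirty_bitmap()
         *     run here while the context is held ... */

        aio_context_release(aio_context);
    }
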
slp added a commit to slp/qemu that referenced this pull request Jan 8, 2020
external_snapshot_abort() calls bdrv_set_backing_hd(), which
returns state->old_bs to the main AioContext, as it's intended to be
used when the BDS is going to be released. As that's not the case when
aborting an external snapshot, return it to the AioContext it was in
before the call.

This issue can be triggered by issuing a transaction with two actions,
a proper blockdev-snapshot-sync and a bogus one, so that the second
one triggers a transaction abort. This results in a crash with a stack
trace like this one:

 #0  0x00007fa1048b28df in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 qemu#1  0x00007fa10489ccf5 in __GI_abort () at abort.c:79
 qemu#2  0x00007fa10489cbc9 in __assert_fail_base
     (fmt=0x7fa104a03300 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=0x557224014d30 "block.c", line=2240, function=<optimized out>) at assert.c:92
 qemu#3  0x00007fa1048aae96 in __GI___assert_fail
     (assertion=assertion@entry=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=file@entry=0x557224014d30 "block.c", line=line@entry=2240, function=function@entry=0x5572240b5d60 <__PRETTY_FUNCTION__.31620> "bdrv_replace_child_noperm") at assert.c:101
 qemu#4  0x0000557223e631f8 in bdrv_replace_child_noperm (child=0x557225b9c980, new_bs=new_bs@entry=0x557225c42e40) at block.c:2240
 qemu#5  0x0000557223e68be7 in bdrv_replace_node (from=0x557226951a60, to=0x557225c42e40, errp=0x5572247d6138 <error_abort>) at block.c:4196
 qemu#6  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1731
 qemu#7  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1717
 qemu#8  0x0000557223d09013 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x557225cc7d70, errp=errp@entry=0x7ffe704c0c98) at blockdev.c:2360
 qemu#9  0x0000557223e32085 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffe704c0d08) at qapi/qapi-commands-transaction.c:44
 qemu#10 0x0000557223ee798c in do_qmp_dispatch (errp=0x7ffe704c0d00, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5572247d3cc0 <qmp_commands>) at qapi/qmp-dispatch.c:132
 qemu#11 0x0000557223ee798c in qmp_dispatch (cmds=0x5572247d3cc0 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
 qemu#12 0x0000557223e06141 in monitor_qmp_dispatch (mon=0x557225c69ff0, req=<optimized out>) at monitor/qmp.c:120
 qemu#13 0x0000557223e0678a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:209
 qemu#14 0x0000557223f2f366 in aio_bh_call (bh=0x557225b9dc60) at util/async.c:117
 qemu#15 0x0000557223f2f366 in aio_bh_poll (ctx=ctx@entry=0x557225b9c840) at util/async.c:117
 qemu#16 0x0000557223f32754 in aio_dispatch (ctx=0x557225b9c840) at util/aio-posix.c:459
 qemu#17 0x0000557223f2f242 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 qemu#18 0x00007fa10913467d in g_main_dispatch (context=0x557225c28e80) at gmain.c:3176
 qemu#19 0x00007fa10913467d in g_main_context_dispatch (context=context@entry=0x557225c28e80) at gmain.c:3829
 qemu#20 0x0000557223f31808 in glib_pollfds_poll () at util/main-loop.c:219
 qemu#21 0x0000557223f31808 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 qemu#22 0x0000557223f31808 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 qemu#23 0x0000557223d13201 in main_loop () at vl.c:1828
 qemu#24 0x0000557223bbfb82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1779036
Signed-off-by: Sergio Lopez <slp@redhat.com>
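
A minimal sketch of that idea inside the abort handler, assuming the block APIs of that period (bdrv_get_aio_context(), bdrv_set_backing_hd(), bdrv_try_set_aio_context()); the state->old_bs/new_bs names mirror the transaction state mentioned above, and this is not the exact upstream patch:

    /* Sketch only: remember the context old_bs was in, let
     * bdrv_set_backing_hd() do its work (which may pull old_bs into the
     * main AioContext), then move it back where it came from, since the
     * node is not being released on this abort path. */
    AioContext *old_context = bdrv_get_aio_context(state->old_bs);

    bdrv_set_backing_hd(state->new_bs, state->old_bs, &error_abort);

    if (bdrv_get_aio_context(state->old_bs) != old_context) {
        bdrv_try_set_aio_context(state->old_bs, old_context, NULL);
    }
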
qemu-deploy pushed a commit that referenced this pull request Jan 13, 2020
Start a VM with libvirt; while the guest OS is running, enter the
poweroff command using the xhci keyboard. ASAN then shows the following
memory leak stack:

Direct leak of 80 byte(s) in 5 object(s) allocated from:
    #0 0xfffd1e6431cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb)
    #1 0xfffd1e107163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163)
    #2 0xaaad39051367 in qemu_sglist_init /qemu/dma-helpers.c:43
    #3 0xaaad3947c407 in pci_dma_sglist_init /qemu/include/hw/pci/pci.h:842
    #4 0xaaad3947c407 in xhci_xfer_create_sgl /qemu/hw/usb/hcd-xhci.c:1446
    #5 0xaaad3947c407 in xhci_setup_packet /qemu/hw/usb/hcd-xhci.c:1618
    #6 0xaaad3948625f in xhci_submit /qemu/hw/usb/hcd-xhci.c:1827
    #7 0xaaad3948625f in xhci_fire_transfer /qemu/hw/usb/hcd-xhci.c:1839
    #8 0xaaad3948625f in xhci_kick_epctx /qemu/hw/usb/hcd-xhci.c:1991
    #9 0xaaad3948f537 in xhci_doorbell_write /qemu/hw/usb/hcd-xhci.c:3158
    #10 0xaaad38bcbfc7 in memory_region_write_accessor /qemu/memory.c:483
    #11 0xaaad38bc654f in access_with_adjusted_size /qemu/memory.c:544
    #12 0xaaad38bd1877 in memory_region_dispatch_write /qemu/memory.c:1482
    #13 0xaaad38b1c77f in flatview_write_continue /qemu/exec.c:3167
    #14 0xaaad38b1ca83 in flatview_write /qemu/exec.c:3207
    #15 0xaaad38b268db in address_space_write /qemu/exec.c:3297
    #16 0xaaad38bf909b in kvm_cpu_exec /qemu/accel/kvm/kvm-all.c:2383
    #17 0xaaad38bb063f in qemu_kvm_cpu_thread_fn /qemu/cpus.c:1246
    #18 0xaaad39821c93 in qemu_thread_start /qemu/util/qemu-thread-posix.c:519
    #19 0xfffd1c8378bb  (/lib64/libpthread.so.0+0x78bb)
    #20 0xfffd1c77616b  (/lib64/libc.so.6+0xd616b)

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Message-id: 20200110105855.81144-1-kuhn.chenqun@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
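
The leak is the classic init-without-destroy pairing: qemu_sglist_init() (frame #2 above) allocates the scatter/gather entry array with g_malloc(), so every path out of the transfer, including error/early-return paths, needs a matching qemu_sglist_destroy(). A hedged sketch of that pairing; nr_entries, xhci->as and setup_failed are placeholders, not the actual hcd-xhci.c code:

    QEMUSGList sgl;

    qemu_sglist_init(&sgl, DEVICE(xhci), nr_entries, xhci->as);
    if (setup_failed) {
        qemu_sglist_destroy(&sgl);   /* frees the g_malloc'd entry array */
        return;
    }
    /* ... on success, the transfer completion path is responsible for the
     *     matching qemu_sglist_destroy() once the packet is finished ... */
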
crobinso pushed a commit to crobinso/qemu that referenced this pull request Jan 28, 2020
All paths that lead to bdrv_backup_top_drop(), except for the call
from backup_clean(), imply that the BDS AioContext has already been
acquired, so doing it there too can potentially lead to QEMU hanging
on AIO_WAIT_WHILE().

An easy way to trigger this situation is to issue a two-action
transaction with a proper and a bogus blockdev-backup, so that the
second one triggers a rollback. This produces a hang with a stack
trace like this one:

 #0  0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>,
     timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
 qemu#1  0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
     at /usr/include/bits/poll2.h:77
 qemu#2  0x000055e743386e09 in qemu_poll_ns
     (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336
 qemu#3  0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true)
     at util/aio-posix.c:669
 qemu#4  0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878
 qemu#5  0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017
 qemu#6  0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262
 qemu#7  0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644
 qemu#8  0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273
 qemu#9  0x000055e74331461f in backup_job_create
     (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478
 qemu#10 0x000055e74315bc52 in do_backup_common
     (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at blockdev.c:3580
 qemu#11 0x000055e74315c37c in do_blockdev_backup
     (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492
 qemu#12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018)
     at blockdev.c:1885
 qemu#13 0x000055e743160152 in qmp_transaction
     (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340
 qemu#14 0x000055e743287ff5 in qmp_marshal_transaction
     (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8)
     at qapi/qapi-commands-transaction.c:44
 qemu#15 0x000055e74333de6c in do_qmp_dispatch
     (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 qemu#16 0x000055e74333de6c in qmp_dispatch
     (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
     at qapi/qmp-dispatch.c:175
 qemu#17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>)
     at monitor/qmp.c:145
 qemu#18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
 qemu#19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117
 qemu#20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117
 qemu#21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459
 qemu#22 0x000055e743385742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 qemu#23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176
 qemu#24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829
 qemu#25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219
 qemu#26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 qemu#27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 qemu#28 0x000055e74316a3c1 in main_loop () at vl.c:1828
 qemu#29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by not acquiring the AioContext there, and ensuring all paths
leading to it have it already acquired (backup_clean()).

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
crobinso pushed a commit to crobinso/qemu that referenced this pull request Jan 28, 2020
Dirty bitmap addition and removal functions are not acquiring the BDS
AioContext, while they may call code that expects it to be
acquired.

This may trigger a crash with a stack trace like this one:

 #0  0x00007f0ef146370f in __GI_raise (sig=sig@entry=6)
     at ../sysdeps/unix/sysv/linux/raise.c:50
 qemu#1  0x00007f0ef144db25 in __GI_abort () at abort.c:79
 qemu#2  0x0000565022294dce in error_exit
     (err=<optimized out>, msg=msg@entry=0x56502243a730 <__func__.16350> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
 qemu#3  0x00005650222950ba in qemu_mutex_unlock_impl
     (mutex=mutex@entry=0x5650244b0240, file=file@entry=0x565022439adf "util/async.c", line=line@entry=526) at util/qemu-thread-posix.c:108
 qemu#4  0x0000565022290029 in aio_context_release
     (ctx=ctx@entry=0x5650244b01e0) at util/async.c:526
 qemu#5  0x000056502221cd08 in bdrv_can_store_new_dirty_bitmap
     (bs=bs@entry=0x5650244dc820, name=name@entry=0x56502481d360 "bitmap1", granularity=granularity@entry=65536, errp=errp@entry=0x7fff22831718)
     at block/dirty-bitmap.c:542
 qemu#6  0x000056502206ae53 in qmp_block_dirty_bitmap_add
     (errp=0x7fff22831718, disabled=false, has_disabled=<optimized out>, persistent=<optimized out>, has_persistent=true, granularity=65536, has_granularity=<optimized out>, name=0x56502481d360 "bitmap1", node=<optimized out>) at blockdev.c:2894
 qemu#7  0x000056502206ae53 in qmp_block_dirty_bitmap_add
     (node=<optimized out>, name=0x56502481d360 "bitmap1", has_granularity=<optimized out>, granularity=<optimized out>, has_persistent=true, persistent=<optimized out>, has_disabled=false, disabled=false, errp=0x7fff22831718) at blockdev.c:2856
 qemu#8  0x00005650221847a3 in qmp_marshal_block_dirty_bitmap_add
     (args=<optimized out>, ret=<optimized out>, errp=0x7fff22831798)
     at qapi/qapi-commands-block-core.c:651
 qemu#9  0x0000565022247e6c in do_qmp_dispatch
     (errp=0x7fff22831790, allow_oob=<optimized out>, request=<optimized out>, cmds=0x565022b32d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 qemu#10 0x0000565022247e6c in qmp_dispatch
     (cmds=0x565022b32d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
 qemu#11 0x0000565022166061 in monitor_qmp_dispatch
     (mon=0x56502450faa0, req=<optimized out>) at monitor/qmp.c:145
 qemu#12 0x00005650221666fa in monitor_qmp_bh_dispatcher
     (data=<optimized out>) at monitor/qmp.c:234
 qemu#13 0x000056502228f866 in aio_bh_call (bh=0x56502440eae0)
     at util/async.c:117
 qemu#14 0x000056502228f866 in aio_bh_poll (ctx=ctx@entry=0x56502440d7a0)
     at util/async.c:117
 qemu#15 0x0000565022292c54 in aio_dispatch (ctx=0x56502440d7a0)
     at util/aio-posix.c:459
 qemu#16 0x000056502228f742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 qemu#17 0x00007f0ef5ce667d in g_main_dispatch (context=0x56502449aa40)
     at gmain.c:3176
 qemu#18 0x00007f0ef5ce667d in g_main_context_dispatch
     (context=context@entry=0x56502449aa40) at gmain.c:3829
 qemu#19 0x0000565022291d08 in glib_pollfds_poll () at util/main-loop.c:219
 qemu#20 0x0000565022291d08 in os_host_main_loop_wait
     (timeout=<optimized out>) at util/main-loop.c:242
 qemu#21 0x0000565022291d08 in main_loop_wait (nonblocking=<optimized out>)
     at util/main-loop.c:518
 qemu#22 0x00005650220743c1 in main_loop () at vl.c:1828
 qemu#23 0x0000565021f20a72 in main
     (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by acquiring the AioContext at qmp_block_dirty_bitmap_add()
and on the corresponding bitmap removal path.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782175
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
crobinso pushed a commit to crobinso/qemu that referenced this pull request Jan 28, 2020
external_snapshot_abort() calls bdrv_set_backing_hd(), which
returns state->old_bs to the main AioContext, as it's intended to be
used when the BDS is going to be released. As that's not the case when
aborting an external snapshot, return it to the AioContext it was in
before the call.

This issue can be triggered by issuing a transaction with two actions,
a proper blockdev-snapshot-sync and a bogus one, so that the second
one triggers a transaction abort. This results in a crash with a stack
trace like this one:

 #0  0x00007fa1048b28df in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 qemu#1  0x00007fa10489ccf5 in __GI_abort () at abort.c:79
 qemu#2  0x00007fa10489cbc9 in __assert_fail_base
     (fmt=0x7fa104a03300 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=0x557224014d30 "block.c", line=2240, function=<optimized out>) at assert.c:92
 qemu#3  0x00007fa1048aae96 in __GI___assert_fail
     (assertion=assertion@entry=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=file@entry=0x557224014d30 "block.c", line=line@entry=2240, function=function@entry=0x5572240b5d60 <__PRETTY_FUNCTION__.31620> "bdrv_replace_child_noperm") at assert.c:101
 qemu#4  0x0000557223e631f8 in bdrv_replace_child_noperm (child=0x557225b9c980, new_bs=new_bs@entry=0x557225c42e40) at block.c:2240
 qemu#5  0x0000557223e68be7 in bdrv_replace_node (from=0x557226951a60, to=0x557225c42e40, errp=0x5572247d6138 <error_abort>) at block.c:4196
 qemu#6  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1731
 qemu#7  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1717
 qemu#8  0x0000557223d09013 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x557225cc7d70, errp=errp@entry=0x7ffe704c0c98) at blockdev.c:2360
 qemu#9  0x0000557223e32085 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffe704c0d08) at qapi/qapi-commands-transaction.c:44
 qemu#10 0x0000557223ee798c in do_qmp_dispatch (errp=0x7ffe704c0d00, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5572247d3cc0 <qmp_commands>) at qapi/qmp-dispatch.c:132
 qemu#11 0x0000557223ee798c in qmp_dispatch (cmds=0x5572247d3cc0 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
 qemu#12 0x0000557223e06141 in monitor_qmp_dispatch (mon=0x557225c69ff0, req=<optimized out>) at monitor/qmp.c:120
 qemu#13 0x0000557223e0678a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:209
 qemu#14 0x0000557223f2f366 in aio_bh_call (bh=0x557225b9dc60) at util/async.c:117
 qemu#15 0x0000557223f2f366 in aio_bh_poll (ctx=ctx@entry=0x557225b9c840) at util/async.c:117
 qemu#16 0x0000557223f32754 in aio_dispatch (ctx=0x557225b9c840) at util/aio-posix.c:459
 qemu#17 0x0000557223f2f242 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 qemu#18 0x00007fa10913467d in g_main_dispatch (context=0x557225c28e80) at gmain.c:3176
 qemu#19 0x00007fa10913467d in g_main_context_dispatch (context=context@entry=0x557225c28e80) at gmain.c:3829
 qemu#20 0x0000557223f31808 in glib_pollfds_poll () at util/main-loop.c:219
 qemu#21 0x0000557223f31808 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 qemu#22 0x0000557223f31808 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 qemu#23 0x0000557223d13201 in main_loop () at vl.c:1828
 qemu#24 0x0000557223bbfb82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1779036
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
rth7680 pushed a commit to rth7680/qemu that referenced this pull request Feb 18, 2020
Coverity reports:

  *** CID 1419387:  Memory - illegal accesses  (OVERRUN)
  /hw/hppa/dino.c: 267 in dino_chip_read_with_attrs()
  261             val = s->ilr & s->imr & s->icr;
  262             break;
  263         case DINO_TOC_ADDR:
  264             val = s->toc_addr;
  265             break;
  266         case DINO_GMASK ... DINO_TLTIM:
  >>>     CID 1419387:  Memory - illegal accesses  (OVERRUN)
  >>>     Overrunning array "s->reg800" of 12 4-byte elements at element index 12 (byte offset 48) using index "(addr - 2048UL) / 4UL" (which evaluates to 12).
  267             val = s->reg800[(addr - DINO_GMASK) / 4];
  268             if (addr == DINO_PAMR) {
  269                 val &= ~0x01;  /* LSB is hardwired to 0 */
  270             }
  271             if (addr == DINO_MLTIM) {
  272                 val &= ~0x07;  /* 3 LSB are hardwired to 0 */

  *** CID 1419393:  Memory - corruptions  (OVERRUN)
  /hw/hppa/dino.c: 363 in dino_chip_write_with_attrs()
  357             /* These registers are read-only.  */
  358             break;
  359
  360         case DINO_GMASK ... DINO_TLTIM:
  361             i = (addr - DINO_GMASK) / 4;
  362             val &= reg800_keep_bits[i];
  >>>     CID 1419393:  Memory - corruptions  (OVERRUN)
  >>>     Overrunning array "s->reg800" of 12 4-byte elements at element index 12 (byte offset 48) using index "i" (which evaluates to 12).
  363             s->reg800[i] = val;
  364             break;
  365
  366         default:
  367             /* Controlled by dino_chip_mem_valid above.  */
  368             g_assert_not_reached();

  *** CID 1419394:  Memory - illegal accesses  (OVERRUN)
  /hw/hppa/dino.c: 362 in dino_chip_write_with_attrs()
  356         case DINO_IRR1:
  357             /* These registers are read-only.  */
  358             break;
  359
  360         case DINO_GMASK ... DINO_TLTIM:
  361             i = (addr - DINO_GMASK) / 4;
  >>>     CID 1419394:  Memory - illegal accesses  (OVERRUN)
  >>>     Overrunning array "reg800_keep_bits" of 12 4-byte elements at element index 12 (byte offset 48) using index "i" (which evaluates to 12).
  362             val &= reg800_keep_bits[i];
  363             s->reg800[i] = val;
  364             break;
  365
  366         default:
  367             /* Controlled by dino_chip_mem_valid above.  */

Indeed the array should contain 13 entries; the undocumented
register 0x82c is missing. Fix this by increasing the array size
and adding the missing register.

CID 1419387 can be verified with:

  $ echo x 0xfff80830 | hppa-softmmu/qemu-system-hppa -S -monitor stdio -display none
  QEMU 4.2.50 monitor - type 'help' for more information
  (qemu) x 0xfff80830
  qemu/hw/hppa/dino.c:267:15: runtime error: index 12 out of bounds for type 'uint32_t [12]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/phil/source/qemu/hw/hppa/dino.c:267:15 in
  00000000fff80830: 0x00000000

and CID 1419393/1419394 with:

  $ echo writeb 0xfff80830 0x69 \
    | hppa-softmmu/qemu-system-hppa -S -accel qtest -qtest stdio -display none
  [I 1581634452.654113] OPENED
  [R +4.105415] writeb 0xfff80830 0x69
  qemu/hw/hppa/dino.c:362:16: runtime error: index 12 out of bounds for type 'const uint32_t [12]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior qemu/hw/hppa/dino.c:362:16 in
  =================================================================
  ==29607==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5577dae32f30 at pc 0x5577d93f2463 bp 0x7ffd97ea11b0 sp 0x7ffd97ea11a8
  READ of size 4 at 0x5577dae32f30 thread T0
      #0 0x5577d93f2462 in dino_chip_write_with_attrs qemu/hw/hppa/dino.c:362:16
      #1 0x5577d9025664 in memory_region_write_with_attrs_accessor qemu/memory.c:503:12
      #2 0x5577d9024920 in access_with_adjusted_size qemu/memory.c:539:18
      #3 0x5577d9023608 in memory_region_dispatch_write qemu/memory.c:1482:13
      qemu#4 0x5577d8e3177a in flatview_write_continue qemu/exec.c:3166:23
      qemu#5 0x5577d8e20357 in flatview_write qemu/exec.c:3206:14
      qemu#6 0x5577d8e1fef4 in address_space_write qemu/exec.c:3296:18
      qemu#7 0x5577d8e20693 in address_space_rw qemu/exec.c:3306:16
      qemu#8 0x5577d9011595 in qtest_process_command qemu/qtest.c:432:13
      qemu#9 0x5577d900d19f in qtest_process_inbuf qemu/qtest.c:705:9
      qemu#10 0x5577d900ca22 in qtest_read qemu/qtest.c:717:5
      qemu#11 0x5577da8c4254 in qemu_chr_be_write_impl qemu/chardev/char.c:183:9
      qemu#12 0x5577da8c430c in qemu_chr_be_write qemu/chardev/char.c:195:9
      qemu#13 0x5577da8cf587 in fd_chr_read qemu/chardev/char-fd.c:68:9
      qemu#14 0x5577da9836cd in qio_channel_fd_source_dispatch qemu/io/channel-watch.c:84:12
      qemu#15 0x7faf44509ecc in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4fecc)
      qemu#16 0x5577dab75f96 in glib_pollfds_poll qemu/util/main-loop.c:219:9
      qemu#17 0x5577dab74797 in os_host_main_loop_wait qemu/util/main-loop.c:242:5
      qemu#18 0x5577dab7435a in main_loop_wait qemu/util/main-loop.c:518:11
      qemu#19 0x5577d9514eb3 in main_loop qemu/vl.c:1682:9
      qemu#20 0x5577d950699d in main qemu/vl.c:4450:5
      qemu#21 0x7faf41a87f42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
      qemu#22 0x5577d8cd4d4d in _start (qemu/build/sanitizer/hppa-softmmu/qemu-system-hppa+0x1256d4d)

  0x5577dae32f30 is located 0 bytes to the right of global variable 'reg800_keep_bits' defined in 'qemu/hw/hppa/dino.c:87:23' (0x5577dae32f00) of size 48
  SUMMARY: AddressSanitizer: global-buffer-overflow qemu/hw/hppa/dino.c:362:16 in dino_chip_write_with_attrs
  Shadow bytes around the buggy address:
    0x0aaf7b5be590: 00 f9 f9 f9 f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9
    0x0aaf7b5be5a0: 07 f9 f9 f9 f9 f9 f9 f9 07 f9 f9 f9 f9 f9 f9 f9
    0x0aaf7b5be5b0: 07 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
    0x0aaf7b5be5c0: 00 00 00 02 f9 f9 f9 f9 00 00 00 00 00 00 00 00
    0x0aaf7b5be5d0: 00 00 00 00 00 00 00 00 00 00 00 03 f9 f9 f9 f9
  =>0x0aaf7b5be5e0: 00 00 00 00 00 00[f9]f9 f9 f9 f9 f9 00 00 00 00
    0x0aaf7b5be5f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0aaf7b5be600: 00 00 01 f9 f9 f9 f9 f9 00 00 00 00 07 f9 f9 f9
    0x0aaf7b5be610: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
    0x0aaf7b5be620: 00 00 00 05 f9 f9 f9 f9 00 00 00 00 07 f9 f9 f9
    0x0aaf7b5be630: f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9 07 f9 f9 f9
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:       fa
    Freed heap region:       fd
    Stack left redzone:      f1
    Stack mid redzone:       f2
    Stack right redzone:     f3
    Stack after return:      f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:       f6
    Poisoned by user:        f7
    Container overflow:      fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
    Shadow gap:              cc
  ==29607==ABORTING

Fixes: Coverity CID 1419387 / 1419393 / 1419394 (commit 1809259)
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200218063355.18577-3-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
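
The underlying off-by-one is easy to reproduce outside QEMU: a case range that spans 13 registers indexing an array declared with 12 entries. Below is a self-contained illustration (not the dino.c code; the offsets just mirror the ones in the report above) where the array length is derived from the register range, so the two cannot drift apart:

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    #define REG_FIRST 0x800                /* mirrors DINO_GMASK            */
    #define REG_LAST  0x830                /* mirrors DINO_TLTIM (13th reg) */
    #define REG_COUNT (((REG_LAST - REG_FIRST) / 4) + 1)   /* 13, not 12    */

    static uint32_t reg800[REG_COUNT];

    static uint32_t reg_read(uint32_t addr)
    {
        unsigned i = (addr - REG_FIRST) / 4;

        assert(i < REG_COUNT);             /* index 12 is now in bounds */
        return reg800[i];
    }

    int main(void)
    {
        printf("0x%08x\n", reg_read(REG_LAST));   /* no overrun */
        return 0;
    }
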
qemu-deploy pushed a commit that referenced this pull request Feb 27, 2020
The virtqueue code sets up MemoryRegionCaches to access the virtqueue
guest RAM data structures.  The code currently assumes that
VRingMemoryRegionCaches is initialized before device emulation code
accesses the virtqueue.  An assertion will fail in
vring_get_region_caches() when this is not true.  Device fuzzing found a
case where this assumption is false (see below).

Virtqueue guest RAM addresses can also be changed from a vCPU thread
while an IOThread is accessing the virtqueue.  This breaks the same
assumption but this time the caches could become invalid partway through
the virtqueue code.  The code fetches the caches RCU pointer multiple
times so we will need to validate the pointer every time it is fetched.

Add checks each time we call vring_get_region_caches() and treat invalid
caches as a nop: memory stores are ignored and memory reads return 0.

The fuzz test failure is as follows:

  $ qemu -M pc -device virtio-blk-pci,id=drv0,drive=drive0,addr=4.0 \
         -drive if=none,id=drive0,file=null-co://,format=raw,auto-read-only=off \
         -drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw \
         -display none \
         -qtest stdio
  endianness
  outl 0xcf8 0x80002020
  outl 0xcfc 0xe0000000
  outl 0xcf8 0x80002004
  outw 0xcfc 0x7
  write 0xe0000000 0x24 0x00ffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab5cffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab0000000001
  inb 0x4
  writew 0xe000001c 0x1
  write 0xe0000014 0x1 0x0d

The following error message is produced:

  qemu-system-x86_64: /home/stefanha/qemu/hw/virtio/virtio.c:286: vring_get_region_caches: Assertion `caches != NULL' failed.

The backtrace looks like this:

  #0  0x00007ffff5520625 in raise () at /lib64/libc.so.6
  #1  0x00007ffff55098d9 in abort () at /lib64/libc.so.6
  #2  0x00007ffff55097a9 in _nl_load_domain.cold () at /lib64/libc.so.6
  #3  0x00007ffff5518a66 in annobin_assert.c_end () at /lib64/libc.so.6
  #4  0x00005555559073da in vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:286
  #5  vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:283
  #6  0x000055555590818d in vring_used_flags_set_bit (mask=1, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
  #7  virtio_queue_split_set_notification (enable=0, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
  #8  virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:451
  #9  0x0000555555908512 in virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:444
  #10 0x00005555558c697a in virtio_blk_handle_vq (s=0x5555575c57e0, vq=0x5555575ceea0) at qemu/hw/block/virtio-blk.c:775
  #11 0x0000555555907836 in virtio_queue_notify_aio_vq (vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:2244
  #12 0x0000555555cb5dd7 in aio_dispatch_handlers (ctx=ctx@entry=0x55555671a420) at util/aio-posix.c:429
  #13 0x0000555555cb67a8 in aio_dispatch (ctx=0x55555671a420) at util/aio-posix.c:460
  #14 0x0000555555cb307e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
  #15 0x00007ffff7bbc510 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
  #16 0x0000555555cb5848 in glib_pollfds_poll () at util/main-loop.c:219
  #17 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
  #18 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
  #19 0x00005555559b20c9 in main_loop () at vl.c:1683
  #20 0x0000555555838115 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4441

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Cc: Michael Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200207104619.164892-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
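
A rough sketch of the "invalid caches are a nop" rule for a pair of accessors, assuming vring_get_region_caches() now returns NULL instead of asserting; the sketch_* names are hypothetical and this is not the exact upstream change:

    /* Sketch only: every accessor re-fetches the RCU-protected caches
     * pointer and bails out quietly when it is NULL. */
    static inline uint16_t sketch_vring_avail_idx(VirtQueue *vq)
    {
        VRingMemoryRegionCaches *caches = vring_get_region_caches(vq);

        if (!caches) {
            return 0;                          /* reads return 0 */
        }
        return virtio_lduw_phys_cached(vq->vdev, &caches->avail,
                                       offsetof(VRingAvail, idx));
    }

    static inline void sketch_vring_used_idx_set(VirtQueue *vq, uint16_t val)
    {
        VRingMemoryRegionCaches *caches = vring_get_region_caches(vq);

        if (!caches) {
            return;                            /* stores are ignored */
        }
        virtio_stw_phys_cached(vq->vdev, &caches->used,
                               offsetof(VRingUsed, idx), val);
    }
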
qemu-deploy pushed a commit that referenced this pull request Mar 24, 2020
A use-after-free is possible: bdrv_unref_child() leaves bs->backing
freed but not NULL. bdrv_attach_child() may produce a nested polling
loop due to drain, and then an access of the freed pointer is possible.

I've produced the following crash on iotest 30 with modified code. It
does not reproduce on master, but still seems possible:

    #0  __strcmp_avx2 () at /lib64/libc.so.6
    #1  bdrv_backing_overridden (bs=0x55c9d3cc2060) at block.c:6350
    #2  bdrv_refresh_filename (bs=0x55c9d3cc2060) at block.c:6404
    #3  bdrv_backing_attach (c=0x55c9d48e5520) at block.c:1063
    #4  bdrv_replace_child_noperm
        (child=child@entry=0x55c9d48e5520,
        new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2290
    #5  bdrv_replace_child
        (child=child@entry=0x55c9d48e5520,
        new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2320
    #6  bdrv_root_attach_child
        (child_bs=child_bs@entry=0x55c9d3cc2060,
        child_name=child_name@entry=0x55c9d241d478 "backing",
        child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
        ctx=<optimized out>, perm=<optimized out>, shared_perm=21,
        opaque=0x55c9d3c5a3d0, errp=0x7ffd117108e0) at block.c:2424
    #7  bdrv_attach_child
        (parent_bs=parent_bs@entry=0x55c9d3c5a3d0,
        child_bs=child_bs@entry=0x55c9d3cc2060,
        child_name=child_name@entry=0x55c9d241d478 "backing",
        child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
        errp=errp@entry=0x7ffd117108e0) at block.c:5876
    #8  in bdrv_set_backing_hd
        (bs=bs@entry=0x55c9d3c5a3d0,
        backing_hd=backing_hd@entry=0x55c9d3cc2060,
        errp=errp@entry=0x7ffd117108e0)
        at block.c:2576
    #9  stream_prepare (job=0x55c9d49d84a0) at block/stream.c:150
    #10 job_prepare (job=0x55c9d49d84a0) at job.c:761
    #11 job_txn_apply (txn=<optimized out>, fn=<optimized out>) at
        job.c:145
    #12 job_do_finalize (job=0x55c9d49d84a0) at job.c:778
    #13 job_completed_txn_success (job=0x55c9d49d84a0) at job.c:832
    #14 job_completed (job=0x55c9d49d84a0) at job.c:845
    #15 job_completed (job=0x55c9d49d84a0) at job.c:836
    #16 job_exit (opaque=0x55c9d49d84a0) at job.c:864
    #17 aio_bh_call (bh=0x55c9d471a160) at util/async.c:117
    #18 aio_bh_poll (ctx=ctx@entry=0x55c9d3c46720) at util/async.c:117
    #19 aio_poll (ctx=ctx@entry=0x55c9d3c46720,
        blocking=blocking@entry=true)
        at util/aio-posix.c:728
    #20 bdrv_parent_drained_begin_single (poll=true, c=0x55c9d3d558f0)
        at block/io.c:121
    #21 bdrv_parent_drained_begin_single (c=c@entry=0x55c9d3d558f0,
        poll=poll@entry=true)
        at block/io.c:114
    #22 bdrv_replace_child_noperm
        (child=child@entry=0x55c9d3d558f0,
        new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2258
    #23 bdrv_replace_child
        (child=child@entry=0x55c9d3d558f0,
        new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2320
    #24 bdrv_root_attach_child
        (child_bs=child_bs@entry=0x55c9d3d27300,
        child_name=child_name@entry=0x55c9d241d478 "backing",
        child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
        ctx=<optimized out>, perm=<optimized out>, shared_perm=21,
        opaque=0x55c9d3cc2060, errp=0x7ffd11710c60) at block.c:2424
    #25 bdrv_attach_child
        (parent_bs=parent_bs@entry=0x55c9d3cc2060,
        child_bs=child_bs@entry=0x55c9d3d27300,
        child_name=child_name@entry=0x55c9d241d478 "backing",
        child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
        errp=errp@entry=0x7ffd11710c60) at block.c:5876
    #26 bdrv_set_backing_hd
        (bs=bs@entry=0x55c9d3cc2060,
        backing_hd=backing_hd@entry=0x55c9d3d27300,
        errp=errp@entry=0x7ffd11710c60)
        at block.c:2576
    #27 stream_prepare (job=0x55c9d495ead0) at block/stream.c:150
    ...

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200316060631.30052-2-vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
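
The hazard itself is independent of the block layer: dropping the last reference frees the child but leaves the parent's pointer dangling, and any re-entrant code that runs before the pointer is reassigned can dereference freed memory. A self-contained illustration of the freed-but-not-NULL pattern and the obvious mitigation (not QEMU code):

    #include <stdio.h>
    #include <stdlib.h>

    struct node { struct node *backing; };

    static void unref_child(struct node **childp)
    {
        free(*childp);
        *childp = NULL;    /* never leave a freed pointer behind */
    }

    int main(void)
    {
        struct node *bs = calloc(1, sizeof(*bs));

        bs->backing = calloc(1, sizeof(*bs->backing));
        unref_child(&bs->backing);

        if (bs->backing == NULL) {
            printf("backing cleared, no dangling pointer\n");
        }
        free(bs);
        return 0;
    }
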

repo-lockdown bot commented Apr 8, 2020

Thank you for your interest in the QEMU project.

This repository is a read-only mirror of the project's master
repositories hosted on https://git.qemu.org/git/qemu.git.
The project does not process merge requests filed on GitHub.

QEMU welcomes contributions of code (either fixing bugs or adding new
functionality). However, we get a lot of patches, and so we have some
guidelines about contributing on the project website:
https://www.qemu.org/contribute/

repo-lockdown bot locked and limited conversation to collaborators Apr 8, 2020
forky2 pushed a commit to MissingFuzz/qemu that referenced this pull request Mar 2, 2022