blocked for more than 120 seconds on 0.6.5.2 in KVM VM #4065

Closed
poiuty opened this issue Dec 4, 2015 · 4 comments

@poiuty

poiuty commented Dec 4, 2015

Debian 8, 3.16.0-4-amd64, qemu-kvm, zfs v0.6.5.2-2, VM in zvol with lz4
zfs => zvol kvm vm => lvm => ext4

Today the VM froze again. I took a screenshot over VNC and checked the logs on both the node and the VM: no call trace.
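For anyone checking their own logs for the same symptom, here is a minimal sketch of a filter for the kernel's hung-task reports. The function name `hung_task_lines` is hypothetical. The pattern matches the "blocked for more than N seconds" message from the issue title. The threshold comes from the `kernel.hung_task_timeout_secs` sysctl.

```shell
#!/bin/sh
# Sketch: filter hung-task reports out of a kernel log read from stdin.
# hung_task_lines is a hypothetical helper name, not part of any tool.
hung_task_lines() {
    grep -E 'blocked for more than [0-9]+ seconds'
}

# Example with sample input; on a real system, pipe in `dmesg` or a log
# file such as /var/log/kern.log instead.
printf '%s\n' \
    'INFO: task jbd2/dm-0-8:123 blocked for more than 120 seconds.' \
    'unrelated log line' | hung_task_lines

# Show the current hung-task threshold if the kernel exposes it.
if [ -r /proc/sys/kernel/hung_task_timeout_secs ]; then
    cat /proc/sys/kernel/hung_task_timeout_secs
fi
```

Running the filter inside the guest and on the host separately helps show which side is actually blocked.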

Previously (about 3 months ago) I used zfs 0.6.4 and did not have this problem.
So I think the problem may be in zfs, in virtio, or in the VM's kernel.

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/dev/zvol/ssd/kvm640'/>
      <target dev='hda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>

For now I am trying a VM kernel update (3.2 to 3.16).
Any ideas?

See also #3955

@kernelOfTruth
Contributor

@poiuty please update to 0.6.5.3

Bug Fixes

    Fix CPU hotplug zfsonlinux/spl#482
    Disable dynamic taskqs by default to avoid deadlock zfsonlinux/spl#484
    Don't import all visible pools in zfs-import init script zfsonlinux/zfs#3777
    Fix use-after-free in vdev_disk_physio_completion zfsonlinux/zfs#3920
    Fix avl_is_empty(&dn->dn_dbufs) assertion zfsonlinux/zfs#3865

or disable dynamic taskqs manually, as you already mentioned in #3955 (comment)
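A minimal sketch of making that setting persistent, assuming the standard `modprobe.d` mechanism. The helper name `spl_dynamic_taskq_option` and the file name `spl.conf` are assumptions for illustration. The parameter name is the one exposed under `/sys/module/spl/parameters/`.

```shell
#!/bin/sh
# Sketch: emit a modprobe.d line that sets spl_taskq_thread_dynamic when
# the spl module is loaded. The helper name is hypothetical.
spl_dynamic_taskq_option() {
    # Standard modprobe.d syntax: options <module> <param>=<value>
    printf 'options spl spl_taskq_thread_dynamic=%s\n' "$1"
}

# Print the line. To apply it, write it as root into a file under
# /etc/modprobe.d/ (for example spl.conf, an assumed name) and reload
# the module or reboot.
spl_dynamic_taskq_option 0

# Show the value currently in effect, if the module is loaded.
param=/sys/module/spl/parameters/spl_taskq_thread_dynamic
if [ -r "$param" ]; then
    cat "$param"
fi
```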

@poiuty
Author

poiuty commented Dec 4, 2015

@kernelOfTruth - that does not help for me (it is already set to 0)

cat /sys/module/spl/parameters/spl_taskq_thread_dynamic 
0

And I can't update to 0.6.5.3:
no update is offered for Debian 8 (apt-get update && apt-get upgrade)
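To confirm which version is actually loaded, as opposed to which package is installed, the modules export their version under `/sys/module`. A small sketch follows. The `module_version` helper and its optional root-prefix argument are hypothetical, added only so the function can be exercised without ZFS installed.

```shell
#!/bin/sh
# Sketch: print the version a loaded kernel module reports via sysfs.
# $1: module name
# $2: optional root prefix (hypothetical, only useful for testing)
module_version() {
    f="${2:-}/sys/module/$1/version"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown (module not loaded or no version exported)"
    fi
}

module_version zfs
module_version spl
```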

@ryao
Contributor

ryao commented Dec 4, 2015

@poiuty I suspect that this could be a bug in QEMU's virtio code or the guest's virtio code, unless there is a hang on the host. 0.6.5.y changed how zvol processing is done to return faster by calling directly into the DMU instead of passing the work to a worker thread, which reduces latencies. It is conceivable that the virtio code cannot reliably handle the faster return. Another possibility is that there is a bug in Linux 3.16 that causes it to sometimes forget to acknowledge IOs when the attempt to send them comes back with completion immediately rather than later. I will try to look into this some more, time permitting.

In the meantime, I suggest switching QEMU from virtio to AHCI. If my suspicion is correct, the problem is in the virtio code and will go away by doing this. AHCI is fairly efficient, so I would not expect you to lose much performance by switching to it. There is an example of how to do this with the raw QEMU command here:

https://wiki.gentoo.org/wiki/QEMU/Options#Hard_drive

You likely can translate that into libvirt's XML format, although I am not familiar enough to provide an example.
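As a rough sketch of the raw QEMU side, the AHCI wiring comes down to three options: an `ahci` controller, a `-drive` with `if=none`, and an `ide-hd` device attached to the controller's first port. The helper name `ahci_disk_args` and the ids `ahci0` and `disk0` are placeholders chosen for illustration. `cache=none,aio=native` mirrors the `cache='none' io='native'` settings in the libvirt XML from the report.

```shell
#!/bin/sh
# Sketch: print the QEMU options that attach a raw disk via AHCI.
# $1: path to the backing block device or image file
# The ids ahci0 and disk0 are arbitrary placeholders.
ahci_disk_args() {
    printf '%s\n' \
        "-device ahci,id=ahci0" \
        "-drive file=$1,if=none,id=disk0,format=raw,cache=none,aio=native" \
        "-device ide-hd,bus=ahci0.0,drive=disk0"
}

# Example: options for the zvol from the original report.
ahci_disk_args /dev/zvol/ssd/kvm640
```

These options would be appended to an existing `qemu-system-x86_64` invocation in place of the virtio disk options.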

@poiuty
Author

poiuty commented Dec 4, 2015

@ryao, thanks, I moved a test VM to AHCI.
For the other VMs I will wait for the next hang. Also, the host node shows no hang or freeze.

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/zvol/ssd/kvm703'/>
      <target dev='vda' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>

# ps aux | grep kvm703
qemu-system-x86_64 -enable-kvm -name kvm703 -S -machine pc-i440fx-2.1,accel=kvm,usb=off -cpu host -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 8c1fc777-8de1-4984-b826-045e64dc2e5e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/kvm703.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device ahci,id=ahci0,bus=pci.0,addr=0x3 -drive file=/dev/zvol/ssd/kvm703,if=none,id=drive-sata0-0-0,format=raw,cache=none,aio=native -device ide-hd,bus=ahci0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:fd:6f:8e,bus=pci.0,addr=0x5 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 0.0.0.0:0,password -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on

The AHCI-related part of that command line:

-device ahci,id=ahci0,bus=pci.0,addr=0x3 -drive file=/dev/zvol/ssd/kvm703,if=none,id=drive-sata0-0-0,format=raw,cache=none,aio=native -device ide-hd,bus=ahci0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=1
