txg_sync, zfs blocked for more than 120s on debian jessie/zfs 0.6.5.2-2 #3903
I see possibly related traces from ZoL 0.6.5.2 on up-to-date Ubuntu 14.04 during a scrub with minor other load:
Dynamic taskqs have been implicated in these types of hangs. To that end, they've been disabled by default in the recently tagged 0.6.5.3 version of SPL.

@dasjoe In the meantime, if you can reproduce this fairly reliably, could you try setting

My current theory is that a deadlock condition similar to the one that precipitated openzfs/spl@076821e can occur when dynamic taskqs are enabled. These and related hangs appear to be waiting for a zio which never completes, and I think it never completes because it is never launched due to the dynamic taskq. I'll also note that even though Illumos does support dynamic taskqs, it does not use them for zfs taskqs.

I'm going to try to add some kstats to allow visibility into the taskqs, which ought to help find the root cause of these problems. According to my theory, we'll see a taskq with some pending sequential tasks on which the queue's currently running task depends for its exit condition.

Another interesting experiment would be to disable dynamic taskq behavior for `taskq_dispatch_ent()`, which is used for the zio operations.
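To make the suspected failure mode concrete, here is a minimal Python sketch of the general pattern: a running task waits on work that is queued behind it, and no worker is free to run that work. It is only an analogy and not SPL code. The function `run` and the task names `parent` and `child` are hypothetical, and a bounded `ThreadPoolExecutor` stands in for a taskq that fails to spawn an extra thread. A timeout is used so the example reports the stall instead of hanging.

```python
import concurrent.futures as cf


def run(workers, timeout=0.5):
    """Return True if the parent task saw its child finish, False if it stalled."""
    pool = cf.ThreadPoolExecutor(max_workers=workers)

    def child():
        # Stands in for the I/O that the parent is waiting on.
        return "done"

    def parent():
        # The parent queues the child on the SAME pool, then blocks on it.
        fut = pool.submit(child)
        try:
            fut.result(timeout=timeout)
            return True
        except cf.TimeoutError:
            # No free worker picked up the child while the parent was waiting.
            return False

    try:
        return pool.submit(parent).result()
    finally:
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print("1 worker :", "completed" if run(1) else "stalled")
    print("2 workers:", "completed" if run(2) else "stalled")
```

With one worker the parent occupies the only thread, so the child stays pending until the parent gives up. With two workers the child runs right away. If the theory above is right, the kstats would show the same shape: pending entries on a taskq whose running task is waiting for them.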
Hi,
I sometimes get this error message in dmesg:
Iostat shows the pool SSDs are not overloaded, there is enough free space (66% utilized). The CPU is also not very loaded.
Versions:
Debian 8.2
zfs 0.6.5.2-2
spl 0.6.5-1