ztest: zthr cancel/resume race in spa_export_common() #7744
Comments
Fixed by #8229
I seem to be hitting this, or something similar again while running zloop on master. Let me know if you need more information. These are some of the stacks I saw:
Finished diagnosing the issue. It is not exactly the same: the situation is more involved and concerns how async threads overall (not just zthrs) interact with the rest of the spa_export() code path. For future reference, I've filed this bug: #9015
System information
Describe the problem you're observing
When running ztest the ASSERT(!t->zthr_cancel); in zthr_resume() may be hit. This is caused by two threads concurrently executing ztest_spa_create_destroy(). The first one cancels then resumes the zthr due to the non-zero reference count caused by the other thread attempting the same operation. Stack traces below.
Describe how to reproduce the problem
Locally run zloop.sh and eventually you'll hit this case. Increasing the frequency of the ztest_spa_create_destroy() tests may make it more likely. This is most commonly observed by buildbot.
Include any warning/errors/backtraces from the system logs
Full logs:
http://build.zfsonlinux.org/artifacts/branch/master/zfs-0.7.0-1480-ge106a7b/CentOS-7-x86_64/ztest/ztest-20180725T173335.tar.xz
Relevant stacks.
@sdimitro when you have a chance would you please look at this. There are lots of ways to handle this but I thought you might have a preferred fix.
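For anyone reading along, the race described above can be replayed in miniature. This is only a rough, single-threaded sketch and not the actual zthr code: toy_zthr_t, toy_cancel_begin(), toy_cancel_finish(), toy_resume() and toy_replay() are hypothetical names made up for illustration. It also assumes that a cancel sets the flag before waiting and clears it afterwards, which the issue text does not spell out. The only thing modelled is the flag checked by ASSERT(!t->zthr_cancel).

```c
#include <stdbool.h>

/* Toy stand-in for a zthr; this is NOT the OpenZFS structure layout. */
typedef struct toy_zthr {
	bool cancel_pending;	/* models the flag checked by the ASSERT */
} toy_zthr_t;

/*
 * First half of a cancel: raise the flag. In this sketch we assume the
 * caller would then sleep until the worker exits, so other threads can
 * run while the flag is still set.
 */
static void
toy_cancel_begin(toy_zthr_t *t)
{
	t->cancel_pending = true;
}

/* Second half of a cancel: the worker is gone, so clear the flag. */
static void
toy_cancel_finish(toy_zthr_t *t)
{
	t->cancel_pending = false;
}

/*
 * Returns false where the real code would trip
 * ASSERT(!t->zthr_cancel), true otherwise.
 */
static bool
toy_resume(const toy_zthr_t *t)
{
	return (!t->cancel_pending);
}

/*
 * Replays one interleaving by hand. Thread 1 does a complete cancel.
 * If second_cancel_in_flight is set, thread 2 has begun its own cancel
 * and not yet finished it when thread 1 backs out and resumes.
 */
bool
toy_replay(bool second_cancel_in_flight)
{
	toy_zthr_t t = { .cancel_pending = false };

	/* Thread 1: cancel runs to completion. */
	toy_cancel_begin(&t);
	toy_cancel_finish(&t);

	/* Thread 2: same operation, still in the middle of its cancel. */
	if (second_cancel_in_flight)
		toy_cancel_begin(&t);

	/* Thread 1: export failed on the reference count, so resume. */
	return (toy_resume(&t));
}
```

In this model toy_replay(false) returns true (the resume is fine), while toy_replay(true) returns false, which corresponds to the assertion firing. It says nothing about which fix is preferable.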