random call trace on heavy load #3148
Comments
That's usually just information about a stuck process. This can happen when your disks are too busy, preventing the named processes from being switched out by the scheduler within the hung-task timeout. If the boxes stay unresponsive and have to be rebooted to be usable again, you're seeing a bug; if the boxes become responsive again by themselves, you are getting harmless warnings. |
Looks like it might be related to #3142 and the pull request #3132, and openzfs/spl#435 on SPL. |
@dasjoe You are right. Sometimes these call traces are harmless. From time to time the "hang" lasts several seconds until the box responds again. Some of these "hangs" have more impact: after the call trace, some ZFS processes sometimes run at 100% CPU, mostly arc_adapt and/or zfs_input_taskq. The box still responds, but at extremely low performance. This persists until reboot. |
@kernelOfTruth I do not believe that is his issue at all. The issue @tobias-k is experiencing is a bit different. I appear to have reproduced something similar running 0.6.3.1 with kernel 3.14.4, under a heavy workload of SMB traffic with constant ZFS send and receive running. Below is the output of my stack traces. Please note I applied #3348 to this system and it did not fully solve the issue, although it did stop subsequent ZFS commands from hanging while these initial hangs occurred. Apr 30 11:45:10 ip-172-26-0-157 kernel: INFO: task spl_system_task:2153 blocked for more than 120 seconds. |
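For anyone collecting these warnings across several boxes, here is a minimal sketch that pulls the task name, PID and timeout out of log lines shaped like the one quoted above. The regular expression is modeled only on that single quoted line, and the function names (`find_hung_tasks`, `count_by_task`) are hypothetical helpers, not part of ZFS or SPL.

```python
import re
from collections import Counter

# Pattern modeled on the hung-task warning quoted in this thread (an assumption:
# other kernels or log formats may word the message differently).
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return a list of (task_name, pid, seconds) for each hung-task warning."""
    hits = []
    for line in lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


def count_by_task(lines):
    """Count how often each task name shows up in hung-task warnings."""
    return Counter(name for name, _pid, _secs in find_hung_tasks(lines))


if __name__ == "__main__":
    sample = [
        "Apr 30 11:45:10 ip-172-26-0-157 kernel: INFO: task spl_system_task:2153 "
        "blocked for more than 120 seconds.",
    ]
    for name, pid, secs in find_hung_tasks(sample):
        print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

Feeding it the lines of a saved kernel log and looking at `count_by_task` may help show whether the same tasks keep getting stuck or whether the hangs really are spread over different tasks.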
@behlendorf Houston, we've got a (stack) problem: #675 (?)
|
Also related? https://www.illumos.org/issues/4820
|
Referencing: #2873 (issue, kernel error on 3.16.7 during zfs send #2873) a168788 (commit, Reduce stack for traverse_visitbp() recursion) #3348 (pull, Fix misuse of input argument in traverse_visitbp #3348) |
@kernelOfTruth I wondered about a stack issue, since we've had that issue down this call path before. But I didn't see any conclusive evidence of that in the stack trace. One fairly easy way to confirm this would be to jump forward to a 3.15 or newer kernel, where the default kernel stack size was doubled. If this is a stack issue, that will resolve it. |
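As a quick way to see which boxes are already on a kernel at or past that threshold, a small sketch follows. It takes the 3.15 cutoff from the comment above at face value; distribution kernels may backport or change stack settings, so treat the result as a hint only. The helper names are hypothetical.

```python
import platform
import re


def kernel_version_tuple(release):
    """Extract (major, minor) from a release string such as '3.13.0-24-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release string: {release!r}")
    return int(match.group(1)), int(match.group(2))


def at_or_past_stack_threshold(release, threshold=(3, 15)):
    """True if the release is at least the threshold named in the thread (3.15)."""
    return kernel_version_tuple(release) >= threshold


if __name__ == "__main__":
    release = platform.release()  # release string of the running kernel
    try:
        verdict = at_or_past_stack_threshold(release)
        print(f"{release}: at or past 3.15 -> {verdict}")
    except ValueError as err:
        print(err)
```

With the kernels mentioned in this thread, 3.13.0-24-generic and 3.14.4 fall below the threshold, while 3.16.7 is above it.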
Closing, this is believed to have been resolved. |
Random call traces under heavy load on different boxes, in different tasks.
3.13.0-24-generic #47-Ubuntu
ZFS v0.6.3-5~trusty from the Ubuntu PPA repo
SPL: Loaded module v0.6.3-3~trusty
ZFS: Loaded module v0.6.3-5~trusty, ZFS pool version 5000, ZFS filesystem version 5
Box 1
Box 6
Box 7