INFO: possible recursive locking detected, arc_reclaim #3701
Comments
ZoL is not yet compatible with
@dweeezil this reminds me, do you have any related patches which need to be merged to ensure lock profiling works properly? As long as they're non-disruptive, it would be nice to get them in the tree.
@behlendorf It's something I'm definitely interested in, but I don't use lockdep much in my current large-scale testing due to the huge overhead it creates, so this has percolated down my to-do list.
@dweeezil OK, I just suspected you might already have patches for this since I know you've been working in this area. No problem.
referencing: openzfs/spl#480 add spin_lock_irqsave_nolockdep and mutex type MUTEX_NOLOCKDEP
@behlendorf and @dweeezil Sorry to hijack this thread, but I had related general questions regarding addressing ZFS/NFS issues that are very painful to those of us running ZOL on dedicated production NFS servers.
I ask this because, although I really like the stability and reliability of ZoL, the NFS issues are painful in a production environment. I wanted to get a sense of how long we need to hang on for these NFS issues to be completely addressed, and also to find out how each of you handles this in your own production environments. Thanks!
Definitely near the top, this is one of the more common configurations for people using ZoL. In fact, the recently tagged 0.6.5.4 release should address the majority of the ZoL+NFS issues. What would be very helpful to us is if you could provide a list of the NFS related issues you're still seeing with this release. https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.6.5.4
Absolutely, we depend on it in production 24/7. However, our specific user workloads and hardware configuration don't seem to trigger the reported issues as easily.
Along with NFS we depend heavily on Lustre.
Thanks for the quick reply. I will update to version 0.6.5.4 and see if that resolves the issue with "ARC_RECLAIM" using 99% CPU; before that we had the "ARC_ADAPT" issue, which seems related. I was starting to lose faith and was considering the commercial illumos-based option (Nexenta), but I am not familiar enough with it to know whether it has the same issues with NFS. Now that I know a bit more about the commitment, I will hang on. I would prefer to stay on ZoL using CentOS. PS: would you be able to share your current production configuration and how you use NFS? Thanks again!
@xflou Although this is completely unrelated to the original issue: as I mentioned recently in another issue, certain metadata-heavy workloads can easily cause the arc adapt thread to spin trying to free up metadata. I don't have a good handle on the exact causes yet but any
@dweeezil Once I get 0.6.5.4 installed, I'll keep an eye on things and open a separate case. When you say enough RAM, are you also referring to the ZFS settings for arc_min and arc_max? I have 256G of RAM on the system, and during heavy loads I typically see about 126G being used. Should my ARC settings be higher than they currently are? Please see below:
options zfs zfs_arc_min=10737418240
Thanks for the help!!
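For reference, the zfs_arc_min value quoted above is 10 GiB expressed in bytes, which is how the ZFS module options take sizes. A minimal sketch of that conversion follows. The helper names are hypothetical and not part of any ZFS tooling, and the 128 GiB zfs_arc_max figure in the usage lines is purely an illustrative assumption, not a tuning recommendation from this thread.

```python
GIB = 1024 ** 3  # bytes per GiB


def gib_to_bytes(gib):
    """Convert a whole number of GiB to bytes."""
    return gib * GIB


def arc_options_line(min_gib, max_gib=None):
    """Build a modprobe-style options line for the zfs module.

    Hypothetical helper for illustration only; it just formats text.
    """
    parts = [f"zfs_arc_min={gib_to_bytes(min_gib)}"]
    if max_gib is not None:
        parts.append(f"zfs_arc_max={gib_to_bytes(max_gib)}")
    return "options zfs " + " ".join(parts)


# Reproduces the line quoted in the comment above (10 GiB minimum).
print(arc_options_line(10))

# Illustrative only: the same line with an assumed 128 GiB maximum.
print(arc_options_line(10, 128))
```

Working in GiB and converting at the end makes it easier to compare the setting against the 256G of installed RAM than reading the raw byte count.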
@xflou could you comment on whether the latest 0.6.5.4+ releases have improved your NFS/ZFS issues?
Closing the original reported issue as a duplicate of #3912.
kernel was loaded with threadirqs
and built with
and a special variant of BFS (cpu scheduler), VRQ 0.5