Intermittent stalling of ZFS pool #3979
Comments
Ouch :( @gvdijnsen could you please post the output of /proc/spl/kstat/zfs/arcstats when it's working properly and when those stalls & memory collapses happen? Could you please attempt to set zfs_arc_min to a value, say, 24 or 32 GB? I doubt that this would help and believe that it is related to NFS, but there have been some issues with stalls and ARC collapsing in the recent past. Make sure to set those options in /etc/modprobe.d/zfs.conf (or similar), since setting them via /sys/ oftentimes doesn't seem to stick or even work.
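A minimal sketch of how that is usually done, assuming the 32 GB figure suggested above (34359738368 bytes):

```sh
# Persistent setting, picked up when the zfs module is loaded (e.g. after a reboot);
# 34359738368 bytes = 32 GiB, the upper of the two values suggested above
echo "options zfs zfs_arc_min=34359738368" >> /etc/modprobe.d/zfs.conf

# Runtime attempt (as noted above, this does not always stick)
echo 34359738368 > /sys/module/zfs/parameters/zfs_arc_min
```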
Thank you for extending a helping hand here! I found your answer just after I had rebooted from another crash. One thing I forgot to mention is that at a reboot when this behaviour is displayed, the shutdown sequence hangs with the message "umount2: Device or resource busy" and it needs a reset. Because this is a production system I cannot reboot at will, so I put the suggested change in /etc/modprobe.d/zfs.conf so it will be active at the next reboot. I also set them through 'echo 34359738368 > /sys/module/zfs/parameters/zfs_arc_min' so that it may perhaps be active this time around. An immediate effect seems to be that the memory consumption (at least for now, uptime 1:50) no longer seems to go past 50GB, which is a change from recent behaviour at least. I will keep you updated.
Ok, it crashed again, so I have some more information for you. Here are the arcstats as I dumped them after an uptime of around 15 hrs; everything was working smoothly at that time:
Around one day later, it crashed again, displaying the same behavior I have mentioned before. This is the arcstats dump:
The dmesg errors are different though, so I have included them here:
Hopefully you can make something of this!
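For anyone following along, a hedged sketch of how such before/after dumps are typically captured for comparison (not necessarily how the ones above were made):

```sh
# Snapshot ARC statistics and kernel messages once while the pool is healthy
# and again during a stall, so the two states can be compared side by side
cat /proc/spl/kstat/zfs/arcstats > arcstats-$(date +%Y%m%d-%H%M).txt
dmesg > dmesg-$(date +%Y%m%d-%H%M).txt
```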
@gvdijnsen What exactly is happening during these stalls? Do they correlate with pre-set cron jobs? Any special memory-hungry programs running? rsync? zfs send (snapshots)? Also please specify "crash": what still works?
It looks like either stack issues or heavy-load-related problems (the error message appears different on non-preemptible kernels), at least now (#3148). Previously it looked like it was related to NFS; you don't use NFSv4, so that can be ruled out? (https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1315955)
I'd say to disable the dynamic taskqs, but that's already disabled with 0.6.5.3 (Disable dynamic taskqs by default to avoid deadlock, openzfs/spl#484).
A specific reason you set spl_kmem_cache_slab_limit to 0?
CC'ing @dweeezil and @behlendorf in case they've seen similar behavior.
@kernelOfTruth During the stalls, nothing out of the ordinary is happening; that is to say, nothing I know of that might cause this specifically. However, the system is in heavy use by potentially over 130 users and processes. This is a heavily loaded production system: 10,000 IOPS on average with a throughput of 100-300 MB/s, and peaks of well over 800 MB/s at busy moments. The users are often developers checking stuff in or out with git or other tools, and the system is also used in automated release builds that may load it quite heavily. Most access happens through NFS and it is near impossible to track down everything that is happening on the system all the time. However, I can help you with some of the questions:
Were you able to glean anything from the arcstats I sent you?
FYI: we have had several of these stalls since my last reports. I found out that if we stop nfs-kernel-server, wait a few minutes for all threads to exit, and then start nfs-kernel-server again, the system is restored to working order. This at least indicates that it is related to NFS. It does not always work out though; the nfs threads sometimes refuse to die and then we need a reboot.
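A rough sketch of that workaround, under the assumption that the stock Ubuntu 14.04 init script names apply:

```sh
# Stop the NFS server, wait for the kernel nfsd threads to exit, then start it again
service nfs-kernel-server stop
while pgrep -x nfsd > /dev/null; do
    sleep 10    # the nfsd threads sometimes refuse to die; in that case only a reboot helps
done
service nfs-kernel-server start
```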
Following up because I noticed my name mentioned in this issue: I've not really had the time to follow it too closely, but it sounds like increasing zfs_arc_min should help here.
@dweeezil thanks for your input!
Alright, so if I understood it correctly, ZFS needs to have a certain minimum for ARC, otherwise everything inevitably stalls and comes to a grinding halt, since all data structures, etc. are on ZFS and everything else in turn depends on that. Interesting... I remember having read that it's partially similar on one of the other architectures where OpenZFS runs? But on Linux the situation is aggravated by the page cache, slab and other mechanisms, so it's kind of hard to strike a (perfect) balance?
I read in some of the issues that setting zfs_arc_min and zfs_arc_max to the same value fixed a problem, can't recall right now which particular one.
@gvdijnsen one other shot you could attempt would be setting the min and max values for ARC, and the min and max values for meta(data), to the same value (but of course keeping the ratio of ARC data to metadata) and see whether that stabilizes things; perhaps set the ARC max a little lower than before to leave enough RAM for the non-filesystem processes.
Hi all,
@dweeezil I will try your suggestion about setting the ARC min and max value to the same value. Can't try everything at once, as much as I would like ;-). Also, I do not quite understand your remark about "keeping the ratio of ARC, data to metadata". Could you elaborate on that please? Anyway, the following zfs options will now be active on the next reboot:
@gvdijnsen I meant that you shouldn't make zfs_arc_meta* too large compared to zfs_arc_min and zfs_arc_max; I believe I made that error once at the beginning 😏 *fingers crossed* let's see how that works out :)
@dweeezil @kernelOfTruth From my understanding, it works like this: arc_min <= arc_meta_min <= arc_meta_limit <= arc_max. So if arc_min = arc_max, the only way to satisfy this is to set them all to the same value. I am probably misunderstanding how this works. Please advise: what would you personally choose for all these values on a 130GB RAM ZFS/NFS/Samba server?
@gvdijnsen that's also how I understand it. The recommended default is around 1/2 of RAM for ARC and 1/4 of RAM for metadata; so going with that, I'd go a slight bit below half of RAM for ARC and raise the size of metadata for ARC in relation to that (meaning a little more than 25%, since metadata appears to be properly used; there's also the l2arc with a not-so-high hit rate, but that might improve with longer runtime).
(The topmost values are the suggested values according to my understanding and the threads I have read so far; if there are additional changes to be made, please go ahead 😄)
Suggested (static): arc_meta_min 17270630208
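Purely as an illustration of that half-of-RAM / quarter-of-RAM rule of thumb, hypothetical static values for a ~128 GB machine might look like the sketch below; the numbers are my own picks, not a recommendation from the thread:

```sh
# Hypothetical static tuning for a ~128 GiB box: ARC pinned slightly below half
# of RAM, metadata limit a little above a quarter of RAM
cat >> /etc/modprobe.d/zfs.conf <<'EOF'
options zfs zfs_arc_min=64424509440
options zfs zfs_arc_max=64424509440
options zfs zfs_arc_meta_limit=38654705664
EOF
# 64424509440 bytes = 60 GiB, 38654705664 bytes = 36 GiB
```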
I am sorry for the silence from my side, but there is a reason for that: over the last 2 weeks we did not have a single crash. Curiously, I have not yet implemented the change suggested by @dweeezil and @kernelOfTruth. There are two possible reasons I can see:
On one hand I am glad that the system is stable now, but I am suspicious since we had periods with weeks of stability before and this perceived stability may well be caused by the 20% drop in IOPS we are seeing. I am waiting for a good opportunity to reboot the system and run with the new ARC settings... |
@gvdijnsen glad it's gotten more stable! In relation to that, raising the nr_requests setting could also be worth trying.
Hi, just to let you know that it stalled again just before the weekend. I brought it back up with the recommended settings @kernelOfTruth provided (not the nr_requests yet; that is our next step if this turns out not to work). Now running 5 days steady...
Stalled after 11 days. I am at a loss what to do... I have now started to raise some values like /proc/sys/net/core/rmem_default because they were kinda low, but I am not sure what else to do... The nr_requests is a per-disk setting. Any advice on a good value for that? The default seems to be 128...
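For what it's worth, a sketch of how such a default is usually raised and persisted; the value here is purely illustrative, not something recommended in this thread:

```sh
# Illustrative only: raise the default socket receive buffer size at runtime...
sysctl -w net.core.rmem_default=1048576
# ...and persist it across reboots
echo "net.core.rmem_default = 1048576" >> /etc/sysctl.conf
```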
@gvdijnsen sorry for the long delay, I've been moving to a different country and slightly lost track of some things.
nr_requests should be a pretty high value when running with RAID arrays or several drives, so an nr_requests of 1024 would be a good start; note however that higher values might lead to slightly higher latency.
Was there any change on your side? There were meanwhile several locking issues addressed in December which made it into master (I was also affected by some of them); seemingly only the locking inversion fix made it into the 0.6.5.4 release.
Your issue could be related to #4106 (ZFS 0.6.5.3 servers hang trying to get mutexes), since your dmesg messages remind me of similar output in that issue. Also referencing: #4166 (live-lock in arc_reclaim, blocking any pool IO).
Tl;dr: applying the patches, and to ZFS #4124 (Fix z_hold_mtx deadlock), then setting zfs_object_mutex_size to 4096, should clearly help with your issue.
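A sketch of applying that nr_requests value; the sd* glob assumes the pool members show up as plain SCSI/SATA disks:

```sh
# Set nr_requests to 1024 for every sd* block device (runtime only; needs to be
# re-applied after a reboot, e.g. from rc.local or a udev rule)
for q in /sys/block/sd*/queue/nr_requests; do
    echo 1024 > "$q"
done
```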
On our side nothing much has changed. We stall and reboot every once in a while (about twice a week) and try to keep the load as low as possible until we have a solution. Meanwhile, to my dismay, we are looking at an exit scenario to fall back to ext4 :-(. I have looked at the issues you mentioned, and I am quite positive that these look very much like our problem, so the fixes for them should be able to help us out. However, they are not in the current stable release (0.6.5.4) and I am not bold enough to roll my own and use it on our production system, so we will have to wait until 0.6.5.5 comes along. I could try to run it on our test system, but that one doesn't crash because we cannot load it in the same manner as we do in production. Hopefully 0.6.5.5 will come along soon. For now, I have set nr_requests for all disks to 1024 in production and will keep you updated on the results!
@gvdijnsen sorry I'm a little late to this issue, but if you could post all the stack traces from a system when this happens I could tell you for sure whether the #4106 patches will resolve your issue. These changes have just been merged to master, so they may make it into 0.6.5.5.
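In case it helps anyone else hitting this, a hedged sketch of one common way to capture blocked-task stacks (it assumes sysrq is enabled, and is not necessarily how the traces below were gathered):

```sh
# Dump the stacks of blocked (D-state) tasks into the kernel log, then save it;
# requires sysrq to be enabled (sysctl kernel.sysrq=1)
echo w > /proc/sysrq-trigger    # 'w' dumps blocked tasks, 't' dumps all tasks
dmesg > stacks-$(date +%Y%m%d-%H%M).txt
```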
@behlendorf Thanks for looking into this, even if it is late in the game. I am grateful for all the help I can get. These are the stack traces from a recent stall just before reboot: Hope this is the same problem that is solved with the patch!!
Actually it seems that the stack trace is too long to submit here, but I am hoping you can work with this. Let's hope you can verify that our problem is solved with this patch...
Yes, it looks like your issue should already be addressed in the master branch. And we'll cherry pick those fixes back for the next point release.
@behlendorf Is there a place where I can find a probable release date for 0.6.5.5? I am kind of anxious to fix this on my production server...
@gvdijnsen if you don't mind rolling your own version I'd suggest pulling the latest code from the spl/zfs-0.6.5-release branches. The fixes you need are applied there and that's where we'll tag 0.6.5.5. https://github.com/zfsonlinux/spl/tree/spl-0.6.5-release
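If you do go that route, a rough sketch of the usual build sequence; the branch names and the autogen/configure steps are my assumptions from the project layout, so double-check against the official build documentation:

```sh
# Build and install SPL first, then ZFS, from the 0.6.5-release branches
git clone -b spl-0.6.5-release https://github.com/zfsonlinux/spl.git
(cd spl && ./autogen.sh && ./configure && make && sudo make install)

git clone -b zfs-0.6.5-release https://github.com/zfsonlinux/zfs.git
(cd zfs && ./autogen.sh && ./configure && make && sudo make install)
```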
We have had intermittent crashes of our file server, initially spaced weeks apart, but with the recent updates this has become more frequent. Yesterday it crashed 3 times in 24 hours. Some observations:
I have tried several memory/spl tweaks (which mostly come from the issues here in GitHub), disabled snapshotting, disabled replication, disabled scrubbing, and upgraded everything to the latest version available. It is probably load related, because we have a slave here that mirrors the master with exactly the same setup and it has never failed.
This is a large system with roughly 80TB of data available, of which 30TB is in use. It has 128G RAM. Special boot options (most of these ended up there in the past because of other crashes or optimizations):
zfs_arc_max=68719476736
zfs_arc_meta_limit=17179869184
zfs_prefetch_disable=1
spl_kmem_cache_slab_limit=0
zfs_vdev_cache_size=1310720
zfs_vdev_cache_max=131072
zfs_vdev_cache_bshift=17
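For reference, these options would typically be set persistently along the following lines (a sketch; note that spl_kmem_cache_slab_limit belongs to the spl module rather than zfs):

```sh
# /etc/modprobe.d/zfs.conf (sketch)
options spl spl_kmem_cache_slab_limit=0
options zfs zfs_arc_max=68719476736 zfs_arc_meta_limit=17179869184
options zfs zfs_prefetch_disable=1
options zfs zfs_vdev_cache_size=1310720 zfs_vdev_cache_max=131072 zfs_vdev_cache_bshift=17
```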
Ubuntu 14.04.3 LTS
Kernel: 3.13.0-66-generic
zfs-dkms / zfsutils: 0.6.5.3-1~trusty
Any help would be appreciated!