deadlock because of zpl_evict_inode requiring more memory #2570

Closed
avg-I opened this issue Aug 5, 2014 · 5 comments

@avg-I
Contributor

avg-I commented Aug 5, 2014

One of our test loads can easily put a system in a deadlocked state where a machine can be pinged but pretty much nothing else works. It seems that this is caused by the following chain of calls:
any call to allocate pages while there is a page shortage -> try_to_free_pages -> super_cache_scan -> prune_icache_sb -> destroy_inode -> zpl_inode_destroy -> iput -> zpl_evict_inode -> zfs_rmnode -> arc_read -> spl_kmem_cache_alloc -> alloc_pages_current.
All active threads seem to be in the above state.
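
To illustrate why this chain cannot make progress, here is a toy Python model of the cycle. It is not kernel code: the class, its method names, and the page accounting are all hypothetical, and a real thread would block in the allocator rather than return False. It only shows that reclaim cannot free anything when the eviction path itself needs a page.

```python
class ToyMemory:
    """Toy model of the allocate -> reclaim -> evict -> allocate cycle."""

    def __init__(self, free_pages):
        self.free_pages = free_pages
        self.in_reclaim = False
        self.trace = []

    def alloc_page(self, caller):
        self.trace.append(f"alloc_page({caller})")
        if self.free_pages > 0:
            self.free_pages -= 1
            return True
        if self.in_reclaim:
            # We are already reclaiming and the eviction path wants a page:
            # nothing can be freed, so the model reports a stall.
            self.trace.append("stalled")
            return False
        # Page shortage: try to reclaim, then retry the allocation once.
        return self.reclaim() and self.alloc_page(caller)

    def reclaim(self):
        self.in_reclaim = True
        try:
            self.trace.append("reclaim")
            return self.evict_inode()
        finally:
            self.in_reclaim = False

    def evict_inode(self):
        self.trace.append("evict_inode")
        # Assumption of the model: eviction must read metadata first,
        # and that read needs a page of its own.
        if not self.alloc_page("evict_inode"):
            return False
        # Eviction finished: give back the scratch page plus the inode's page.
        self.free_pages += 2
        return True


if __name__ == "__main__":
    mem = ToyMemory(free_pages=0)
    print(mem.alloc_page("user"))
    print(" -> ".join(mem.trace))
```

With zero free pages the trace ends in "stalled", which corresponds to every active thread sitting in alloc_pages_current underneath zpl_evict_inode.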

@avg-I
Contributor Author

avg-I commented Aug 5, 2014

Output of foreach active bt: https://gist.github.com/avg-I/10f4384486eecf70c903

@behlendorf behlendorf added this to the 0.7.0 milestone Aug 6, 2014
@behlendorf behlendorf added the Bug label Aug 6, 2014
@behlendorf
Contributor

Setting the following module options may help with the memory pressure and prevent the issue. They're planned to become the defaults in the next tag, so it would be nice to know whether they help your workload.

options spl spl_kmem_cache_slab_limit=16384
options spl spl_kmem_cache_reclaim=0
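
For anyone scripting this, a minimal Python sketch that renders the two lines above and reports the values the loaded module is currently using. The helper names are hypothetical. It assumes the options end up in a modprobe.d file such as /etc/modprobe.d/spl.conf and that the running values are visible under /sys/module/spl/parameters/, the usual sysfs location for module parameters; adjust both for your distribution.

```python
from pathlib import Path

# The two options suggested above.
SPL_OPTIONS = {
    "spl_kmem_cache_slab_limit": 16384,
    "spl_kmem_cache_reclaim": 0,
}


def render_modprobe_conf(options, module="spl"):
    """Return modprobe.d lines of the form 'options <module> <name>=<value>'."""
    return "".join(
        f"options {module} {name}={value}\n" for name, value in options.items()
    )


def read_live_value(name, module="spl", sysfs_root="/sys/module"):
    """Return the parameter value of the loaded module, or None if unreadable."""
    path = Path(sysfs_root) / module / "parameters" / name
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, parameter absent, or no permission.
        return None


if __name__ == "__main__":
    # Print the config text; redirect it into the modprobe.d file as root.
    print(render_modprobe_conf(SPL_OPTIONS), end="")
    for name in SPL_OPTIONS:
        print(f"{name} currently: {read_live_value(name)}")
```

Options in modprobe.d only take effect the next time the spl module is loaded, so compare the reported values after a reload or reboot.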

Additionally, the patches proposed in #2573 might help by making the last iput in the destroy path asynchronous. They're designed to resolve an entirely different issue but you still might see some benefit.

@ioquatix

I've seen this happen recently.

@behlendorf
Contributor

@avg-I, has this been resolved in newer versions of ZoL, or is it still an issue?

@behlendorf
Contributor

Closing as stale. There have been several recent fixes to master which should further improve things.
