[ARM] Kernel NULL pointer dereference in arc_shrink #3517
@MrStaticVoid Likely triggered in some way by the recent ARC changes. Could you please run gdb on your kernel module and post the output of
@MrStaticVoid OK, thanks. That's where I was originally looking. I have a feeling it is caused by the large block support committed in f1512ee. It requires many more different block sizes to be supported. Assuming you're not using large blocks, it would be interesting to try:
and see what happens.
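To give a rough sense of what "many more different block sizes" means, here is a minimal sketch. It is not the change suggested above. It assumes a 512-byte minimum block (shift 9), a 128 KiB maximum before large block support (shift 17), and a 16 MiB maximum after it (shift 24). The macro and function names are made up for this illustration, and it only counts how many 512-byte-granular size slots fit under each maximum.

```c
/* Assumed values for illustration; names are hypothetical, not ZFS identifiers. */
#define MIN_BLOCK_SHIFT      9   /* 512 B minimum block size */
#define OLD_MAX_BLOCK_SHIFT 17   /* 128 KiB maximum before large blocks */
#define NEW_MAX_BLOCK_SHIFT 24   /* 16 MiB maximum with large blocks */

/* Number of 512-byte-granular block-size slots up to 2^max_shift bytes. */
static unsigned long size_slots(unsigned max_shift)
{
    return (1UL << max_shift) >> MIN_BLOCK_SHIFT;
}
```

Under these assumptions, size_slots(OLD_MAX_BLOCK_SHIFT) is 256 and size_slots(NEW_MAX_BLOCK_SHIFT) is 32768, so any table indexed by block size grows by a factor of 128.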
@MrStaticVoid I realized that you'll probably want to either add
for consistency.
I recompiled with
And I am not able to trigger the issue either with the
@MrStaticVoid The large block support apparently introduced an incompatibility for 32-bit systems. Someone will need to audit all the relevant code for overflows, array sizes, etc. in the context of a 32-bit system. I'm not likely to have the time to do so in the near future.
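As an illustration of the kind of thing such an audit would look for, here is a minimal sketch of size arithmetic that is harmless with 64-bit sizes but wraps at 32 bits. This is not the bug in this issue, whose exact cause is not identified in this thread. The helper name mul_size32 is hypothetical, and uint32_t stands in for a 32-bit size_t.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Multiply count * size the way a platform with 32-bit sizes would.
 * Returns false if the product does not fit in 32 bits, so the caller
 * can reject the request instead of using a wrapped value.
 */
static bool mul_size32(uint32_t count, uint32_t size, uint32_t *out)
{
    if (size != 0 && count > UINT32_MAX / size)
        return false;          /* product would wrap */
    *out = count * size;
    return true;
}
```

For example, 256 buffers of 128 KiB fit comfortably, while 256 buffers of 16 MiB come to exactly 4 GiB, which does not fit in 32 bits, so the helper reports failure.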
@MrStaticVoid @dweeezil nice job quickly running this down to the large block patches. This is exactly the reason why the plan for the next tag involves getting both the large block and ABD, #3441, patches merged. The ABD patches should resolve the additional stress on the virtual memory subsystem caused by supporting large blocks. In fact, it should allow us to properly support all 32-bit arches. @MrStaticVoid it would be great if you could help us shake out and test the ABD patches on ARM. They should be rebased again fairly soon and I'll be going through them carefully myself, reviewing and testing the code.
I may be hitting the same problem on my ARM board (BananaPi): I just upgraded today from 0.6.3 to 0.6.5, this is the error triggered by
Fortunately applying the proposed patch to That being said, I have a couple of other ARM boards (RPi, Odroid) lying around and would like to help test things, but I may need some guidance (I don't even know the meaning of ABD at this point).
Can I safely close out this issue? Are things now stable with the latest master source?
I will give it a try.
I am unable to trigger this bug anymore and my BeagleBone Black has been rock solid.
Like in #3516, I am playing around with ZFS on a BeagleBone Black board. Using ZFS/SPL 0.6.4, things were mostly stable; using ZFS/SPL HEAD, the arc_shrink function dies fairly consistently. The easiest way to trigger it is by doing echo 3 > /proc/sys/vm/drop_caches, which always produces:
The same bug can be triggered a different way by using very memory intensive apps, like git, which causes the kernel to error out like:
Though I should note, the system has plenty of free memory when these bugs occur (100-200 MB free).
As I said, neither of these bugs can be triggered with ZFS 0.6.4. When I get a chance, I will try to do a git bisect and figure out where exactly the bug was introduced unless someone already knows.
The environment is the same as described in #3516.