Large Setxattr causes /dev/fuse to close on OSX #42
Comments
Ok, also, copying from the command line gives me the following:
Maybe the issue is that this file has a really large com.apple.ResourceFork xattr. Are there buffer limits somewhere? |
What you're seeing from Serve is osxfuse slamming the door shut on you. That's just collateral damage from whatever actually went wrong. Why that happens is a tougher question. The I'm pretty badly in the middle of a move, but I promise to look at this more closely in a few weeks. If you can post a small reproducing example, that will make my job much easier. You should be able to eliminate the actual |
@tv42 Thanks. Best of luck on the move. I'm narrowing things down now. I'm pretty sure it's something to do with large xattrs. I'm going to change the size of this attr and see if it works when it's smaller. I'll let you know, thanks! |
@tv42 Ok, I tracked it down. Looks like the issue is the size of https://github.com/bazillion/fuse/blob/master/fuse.go#L291 Here it doesn't check the bounds: https://github.com/bazillion/fuse/blob/master/fuse.go#L393 Any ideas on the best way to fix this? Thanks |
@ryanstout I'm not sure what you mean by that. |
@tv42 So I might be off, but it looks like syscall.Read isn't checking the size of m.buf; it's just copying in. (Buffer overrun?) |
I checked, increasing the buffer made it work for me (for that file) |
This might be related to #12, where I had problems with the buffer receiving data from the kernel when increasing iosize. I can imagine that cp tries to put a chunk into the ResourceFork xattr that's larger than buffer. |
@meeee #12 was caused by me dropping the @ryanstout I am quite convinced your doubts about Read vs bufSize are a red herring, and you saw it work because of confirmation bias. Your underlying bug sounds more dependent on some sort of timing issue, probably with the dentry cache timing out at just the wrong time. Right now, I have no guess as to the actual cause, but the cp error message about extended attributes sounds totally plausible. To make progress, I feel we need to isolate the problem into something smaller and more easily reproducible. Is your code open source? |
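On the "buffer overrun?" question above, a minimal sketch may help. Go's read calls hand the kernel the length of the slice, so a single read cannot store more than len(buf) bytes, whatever the other side sent. The sketch uses a plain pipe and os.File.Read rather than /dev/fuse; readBounded is a hypothetical helper, not something from the fuse package.

```go
package main

import (
	"fmt"
	"os"
)

// readBounded writes payload into a pipe and performs a single Read
// through a buffer of bufSize bytes. It returns how many bytes that
// one Read stored. (Hypothetical helper for illustration only.)
func readBounded(payload []byte, bufSize int) (int, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return 0, err
	}
	defer r.Close()
	defer w.Close()

	// The payload is small enough to fit in the pipe buffer, so this
	// write does not block.
	if _, err := w.Write(payload); err != nil {
		return 0, err
	}

	buf := make([]byte, bufSize)
	// Read is limited to len(buf); it never writes past the slice.
	return r.Read(buf)
}

func main() {
	n, err := readBounded([]byte("0123456789"), 4)
	fmt.Println(n, err)
}
```

So a request larger than the buffer shows up as an error or a short read, not as memory corruption.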
Hmm large xattrs don't currently have tests, and there's no way for the OS to chunk the setxattr into smaller operations (apart from that abandoned remnant of resource fork, that has a Position).. it might actually be that, in order to support larger xattrs, we need to bump up |
Once I push the xattr value above 128kB (maxWrite): Linux: the syscallx.Setxattr client call sees E2BIG error. OS X:
and a hang. Lovely. What an amazing job of error handling osxfuse is doing there. |
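Since large xattrs have no tests yet, here is a rough sketch of how the threshold could be probed. It takes the setter as a function so it can be tried without a mount; firstFailingSize, fakeSetxattr and errTooBig are hypothetical names, and the fake setter only imitates the Linux behaviour described above (an error once the value exceeds a limit). On a real mount you would pass a closure around an actual setxattr call instead.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooBig stands in for E2BIG in this sketch.
var errTooBig = errors.New("argument list too long")

// firstFailingSize calls set with a zero-filled value of each size in
// sizes, in order, and returns the first size that fails along with
// its error. It returns -1 and nil if every size succeeds.
func firstFailingSize(set func(value []byte) error, sizes []int) (int, error) {
	for _, n := range sizes {
		if err := set(make([]byte, n)); err != nil {
			return n, err
		}
	}
	return -1, nil
}

// fakeSetxattr imitates a filesystem that rejects values above limit.
// It is an assumption for illustration, not real FUSE behaviour.
func fakeSetxattr(limit int) func([]byte) error {
	return func(value []byte) error {
		if len(value) > limit {
			return errTooBig
		}
		return nil
	}
}

func main() {
	sizes := []int{64 * 1024, 128 * 1024, 128*1024 + 1, 1 << 20}
	n, err := firstFailingSize(fakeSetxattr(128*1024), sizes)
	fmt.Println(n, err)
}
```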
@tv42 basically it looks like it's just osxfuse not respecting the
Reading https://github.com/osxfuse/kext/blob/a8a109b0081f9b227e5fd2f551604a22d7afef95/fuse_vnops.c#L3273 tells me osxfuse setxattr will return E2BIG only if the xattr value is >16MB (FUSE_DEFAULT_USERKERNEL_BUFSIZE, apparently adjustable globally with a sysctl; not per FUSE mount, I think). This test also seems to fail to account for the rest of the message, testing just the attribute size; that's probably an osxfuse bug. I don't quite understand what the
Next up, there's a hard limit that prevents attrsize > FUSE_REASONABLE_XATTRSIZE (= FUSE_MIN_USERKERNEL_BUFSIZE = 128kB). That min is the minimum value for the sysctl mentioned above. Once again, this seems to fail to account for the rest of the message, probably another osxfuse bug.
I don't know much about darwin, but it seems like the actual killing of /dev/fuse happens in
So yeah. OSXFUSE does not respect MaxWrite / iosize for setxattr, and fails to handle the error properly (E2BIG).
A workaround would be to bump MaxWrite to 128kB (FUSE_REASONABLE_XATTRSIZE above), but I'm hesitant to do that lightly because the serving loop already creates quite a lot of garbage; I feel like it needs a buffer reuse mechanism before a buffer that large is reasonable. Increasing it by less than that would just make this crash rarer.
Sorry about trying to pin this on something else; you did figure out the right cause. I just didn't want to believe you, because I didn't want to believe in osxfuse failing this badly. In the Linux code, the "request won't fit -> make it crap out" logic is universal and trivially simple: https://github.com/torvalds/linux/blob/3a2f22b7d0cc64482a91529e23c2570aa0602fa6/fs/fuse/dev.c#L1237
I'm bummed about osxfuse. |
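To illustrate the "fails to account for the rest of the message" point, here is a small sketch comparing a value-only check with a whole-request check. The header sizes are assumptions for illustration (a 40-byte common request header and an 8-byte fixed setxattr part; osxfuse's fixed part may well differ), and the function names are hypothetical.

```go
package main

import "fmt"

const (
	// Assumed sizes, for illustration only.
	inHeaderSize   = 40 // common FUSE request header
	setxattrInSize = 8  // fixed setxattr fields (size + flags)
)

// setxattrRequestSize estimates the full request size: header, fixed
// setxattr fields, NUL-terminated attribute name, then the value.
func setxattrRequestSize(name string, valueLen int) int {
	return inHeaderSize + setxattrInSize + len(name) + 1 + valueLen
}

// valueFits is the incomplete check: it looks only at the value size.
func valueFits(valueLen, bufSize int) bool {
	return valueLen <= bufSize
}

// requestFits checks the whole message against the read buffer.
func requestFits(name string, valueLen, bufSize int) bool {
	return setxattrRequestSize(name, valueLen) <= bufSize
}

func main() {
	const bufSize = 128 * 1024
	name := "com.apple.ResourceFork"

	// A value of exactly bufSize passes the value-only check but
	// cannot fit once the headers and name are counted.
	fmt.Println(valueFits(bufSize, bufSize))
	fmt.Println(requestFits(name, bufSize, bufSize))
}
```

Under these assumptions, the largest value that fits is bufSize minus the header, fixed-field and name overhead, which is why checking only the attribute size lets oversized requests through.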
So given that maxWrite is already 128kB, I can get a Going above that, a ~1.2MB xattr value does cause the same old /dev/fuse close. So the Going above 16MB does cause Even dropping the global tunable buffersize from 16MB to 128kB with Can you figure out what the xattr value you had actually was? Is it >128kB but <16MB? As a workaround, bumping maxWrite to the full 16MB (only on Darwin) will cause a lot more GC churn, but at least it'll be harder to make it fail. |
@tv42 Thanks for looking into it. I'm going to hit you up on IM here in a few. The file I was testing has 883k in an xattr. I tried bumping up the buffer to 16MB, but it makes things really slow. Is it possible to reuse the buffer, or would that cause race conditions? Thanks |
Like would it be possible to have a buffer pool? That might actually make things faster since you wouldn't need to be mallocing so often (which I assume is happening now). |
Yes, I want to use sync.Pool for it, but that means I need to first audit the code to see whether returning the buffer to the pool is safe in |
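For reference, a minimal sketch of what a sync.Pool-backed buffer pool can look like. This is not the library's code; bufSize, getBuf, putBuf and handle are hypothetical names, and the 128kB size is just the figure discussed above.

```go
package main

import (
	"fmt"
	"sync"
)

// bufSize is an assumed request buffer size for this sketch.
const bufSize = 128 * 1024

// bufPool hands out fixed-size buffers. Storing *[]byte rather than
// []byte avoids an extra allocation on every Put.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, bufSize)
		return &b
	},
}

// getBuf takes a buffer from the pool, allocating one if it is empty.
func getBuf() *[]byte {
	return bufPool.Get().(*[]byte)
}

// putBuf returns a buffer to the pool. The caller must not keep any
// slice that points into the buffer after this call.
func putBuf(b *[]byte) {
	bufPool.Put(b)
}

// handle copies payload into a pooled buffer, standing in for reading
// one request, and reports how many bytes were stored.
func handle(payload []byte) int {
	bp := getBuf()
	defer putBuf(bp)
	return copy(*bp, payload)
}

func main() {
	fmt.Println(handle([]byte("hello")))
}
```

The race-condition worry is real: the buffer may only go back to the pool once nothing still refers to data inside it, which is the audit mentioned above.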
OSX is kept at 16MB to avoid triggering a bug with large Setxattr calls: #42 While this was already using sync.Pool and in theory reusing the buffers, real world isn't as merciful. Apparently 2*16MB plus some waste from other parts of the system was enough to keep triggering GC runs all too often. Decreasing the buffer size helps keep us below that threshold. New buffer size is the largest Write payload size observed on Linux, and the most common size for write-heavy workloads. Improves performance and decreases memory use on Linux. Thanks to Aaron Jacobs <jacobsa@google.com> and Damien Tournoud <damien@platform.sh>.
For posterity: the OSXFUSE large setxattr bug is reported as macfuse/macfuse#293 |
I'm not really sure what to make of this; any help would be much appreciated. On OSX, I have one file that, when I copy it in, causes fs.Serve to return. I tracked this down to here: https://github.com/bazillion/fuse/blob/master/fuse.go#L403 Basically, the syscall.Read returns n == 0 and an error with "operation not supported by device" as the .Error().
Here's what I get if I enable fuse.debug: Sorry, there's a little bit of my debugging logs in there. I'm not really sure what to make of this.
The file has an xattr of com.apple.ResourceFork. I know this is weird, but if I delete that xattr, then it works. Any ideas? Thanks