-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Don't stackoverflow on deeply nested expressions #4876
Conversation
37a8f0d
to
88a3799
Compare
57bb371
to
324c53e
Compare
Added back the second, original stack overflow check since the one in parser did not trigger the second test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question about how stacker interacts with cgo. The code looks fine and the depth limits seem appropriate, but I'm worried that dynamic increasing of the stack may not be well documented when happening within a cgo context which is where this code ends up being used.
libflux/flux-core/src/parser/mod.rs
Outdated
None | ||
} else { | ||
// Numbers are arbitrary and were just picked from the stack docs | ||
Some(stacker::maybe_grow(32 * 1024, 1024 * 1024, || f(self))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know if this stacker thing is safe regarding cgo? It seems like this is intended to grow the stack, but that stack was allocated by cgo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah damn, I just added it as an additional safety. Maybe I should just remove it and accept the risk of stack overflow and if we see another crash due to it we just continue to lower the depth limit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really like to learn if we can expand the stack. Please don't throw out this code, but maybe we should separate it and come up with a plan to understand how this interacts with cgo?
cgo does some strange things when it comes to the stack and I'm not sure I understand it in detail enough to know how it interacts with this. In summary, Go runs with pretty small stacks because the go runtime has its own way of growing the stack interactively. Since C code generally doesn't, the Go code pre-allocates a larger stack before calling into cgo code. The Rust code runs in this context.
It might be safe to use this because the stack should mostly look exactly like a C program would see it, but I just don't know enough about it.
I also think that cgo calls run in a different pthread from the goroutine that called it. So I think there's pthreads that are "cgo" pthreads (with a C stack) and goroutines run in their own pthreads. The goroutines are managed by the Go runtime to run within one of the pthreads (goroutines do not spawn new pthreads). The cgo code runs in its own pthread so when you make a cgo call you're either using one of the existing pthreads for cgo or it autospins up a new one. It then runs the code in the pthread and waits for the response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stacker
seems to work as it should when I run the crashing script through go run ./cmd/flux
. Googling a bit I can't find anything that would say otherwise. Looking at the docs for stacker https://docs.rs/stacker/latest/stacker/ it seems to allocate a new stacck and run the closure there, returning to the orignal stack once the closure returns.
So I think it should be fine, I can't find any reason that it shouldn't work (and I wouldn't expect the stacker
path to really execute since we also have the depth to guard against these kinds of expressions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that seems relatively safe. I don't think we need to do any substantial testing if it works by extending the stack with an extra thread.
Seems to be some issues with wasm might just disable it on wasm, though it seems like stacker should work with wasm |
b703ce4
to
24f77a3
Compare
Just in case another environment has a smaller stack size than my local environment we also grow the stack automatically. We still have the depth limit to prevent the stack from growing unbounded
Since this test has nested binary expressions it does not overflow the stack or hit the depth checking path in the parser so we also needed the second check.
24f77a3
to
4a0580c
Compare
Added stacker just in case another environment has a smaller stack size than my local environment we also grow
the stack automatically. We still have the depth limit to prevent the stack from growing unbounded.
The depth limit was chosen such that the added test did not stack overflow prior to adding stacker. This limit should be a decent approximation that any previous script works and is under the limit or if it is above the limit it would have stack overflowed.
Closes #4712