-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
effects: taint :noub for memory operations #52186
Conversation
Just a minor correction that this would not be correct. It would be safe, but it would permit OOB accesses for the Vector without accurate boundschecking |
Thanks @aviatesk ! Here's a small MWE whose segfault appears to be cleaned up by this change: julia> errs1 = rand(500)
julia> partialsortperm(errs1, 1:2; rev=true)[<tab> On my machine, this segfaults on master but is resolved by this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach LGTM
4d5cf33
to
1a8e3f2
Compare
Currently our effects system does not track `:noub` about memory operations. While this sounds unsafe, it's actually not problematic in most contexts. We restrict concrete-evaluation for anything involing memory operations anyway, as mutable values aren't propagated as compile-time constants (except `Symbol`s). However, it turns out that this is actually causing very dangerous situations in REPL completion context, where `REPLInterpreter` enables aggressive concrete-evaluation that propagates mutable values as constants and ignores the `:consisten`-cy requirement that is usually tainted by memory operations. This commit addresses this issue by tainting `:noub` for memory operations. This commit ends up being somewhat extensive because it would cause significant regression in REPL completions if we make memory ops naively taint `:noub`. Complicating this further is the Base's uses of `boundscheck=false` for peak performance of core memory operations, where bounds checking is conducted separately before performing actual memory operations, e.g.: > essentials.jl ```julia function getindex(A::Array, i::Int) @BoundsCheck ult_int(bitcast(UInt, sub_int(i, 1)), bitcast(UInt, length(A))) || throw_boundserror(A, (i,)) memoryrefget(memoryref(getfield(A, :ref), i, false), :not_atomic, false) end ``` Here `boundscheck=false` should generally taint `:noub` immediately, but in this kind of case `:noub` can actually be `NOUB_IF_NOINBOUNDS` since it has `@boundscheck` that asserts safety of the memory ops. Note that we could employ a similar technique to `getfield` if we change the above implementation to something like: ```julia function getindex(A::Array, i::Int) memoryrefget(memoryref(getfield(A, :ref), i, @BoundsCheck), :not_atomic, @BoundsCheck) end ``` although this would end up being slower (check the difference at the code LLVM emits on each implementation if interested). To this end, this commit also introduces new `:noub_if_noinbounds` setting to `@assume_effects` and use it within the base implementation. Now `UInt8` is not enough to represent `EffectsOverride` information, the bit is enlarged to `UInt16`. While this approach might not be ideal, I don't think there is a better way to do this. An alternative I tried before making this commit was to introduce new semantics for `Expr(:boundscheck, ...)` where `Expr(:boundscheck, nothing)` will be constant folded to `false` during codegen while during inference time it does not taint `:noub`. I felt this further complicates the already complex complext `:boundscheck` situation, so I decided to go with the approach to expand the settings set of `@assume_effects`.
1a8e3f2
to
e552d83
Compare
Currently our effects system does not track
:noub
about memory operations. While this sounds unsafe, it's actually not problematic in most contexts. We restrict concrete-evaluation for anything involing memory operations anyway, as mutable values aren't propagated as compile-time constants (exceptSymbol
s). However, it turns out that this is actually causing very dangerous situations in REPL completion context, whereREPLInterpreter
enables aggressive concrete-evaluation that propagates mutable values as constants and ignores the:consisten
-cy requirement that is usually tainted by memory operations.This commit addresses this issue by tainting
:noub
for memory operations. This commit ends up being somewhat extensive because it would cause significant regression in REPL completions if we make memory ops naively taint:noub
. Complicating this further is the Base's uses ofboundscheck=false
for peak performance of core memory operations, where bounds checking is conducted separately before performing actual memory operations, e.g.:Here
boundscheck=false
should generally taint:noub
immediately, but in this kind of case:noub
can actually beNOUB_IF_NOINBOUNDS
since it has@boundscheck
that asserts safety of the memory ops. Note that we could employ a similar technique togetfield
if we change the above implementation to something like:although this would end up being slower (check the difference at the code LLVM emits on each implementation if interested).
To this end, this commit also introduces new
:noub_if_noinbounds
setting to@assume_effects
and use it within the base implementation. NowUInt8
is not enough to representEffectsOverride
information, the bit is enlarged toUInt16
. While this approach might not be ideal, I don't think there is a better way to do this. An alternative I tried before making this commit was to introduce new semantics forExpr(:boundscheck, ...)
whereExpr(:boundscheck, nothing)
will be constant folded tofalse
during codegen while during inference time it does not taint:noub
. I felt this further complicates the already complex complext:boundscheck
situation, so I decided to go with the approach to expand the settings set of@assume_effects
.