effects: taint :noub for memory operations #52186

aviatesk · 2023-11-16T10:49:41Z

Currently our effects system does not track :noub about memory operations. While this sounds unsafe, it's actually not problematic in most contexts. We restrict concrete-evaluation for anything involing memory operations anyway, as mutable values aren't propagated as compile-time constants (except Symbols). However, it turns out that this is actually causing very dangerous situations in REPL completion context, where REPLInterpreter enables aggressive concrete-evaluation that propagates mutable values as constants and ignores the :consisten-cy requirement that is usually tainted by memory operations.

This commit addresses this issue by tainting :noub for memory operations. This commit ends up being somewhat extensive because it would cause significant regression in REPL completions if we make memory ops naively taint :noub. Complicating this further is the Base's uses of boundscheck=false for peak performance of core memory operations, where bounds checking is conducted separately before performing actual memory operations, e.g.:

essentials.jl

function getindex(A::Array, i::Int)
    @boundscheck ult_int(bitcast(UInt, sub_int(i, 1)), bitcast(UInt, length(A))) || throw_boundserror(A, (i,))
    memoryrefget(memoryref(getfield(A, :ref), i, false), :not_atomic, false)
end

Here boundscheck=false should generally taint :noub immediately, but in this kind of case :noub can actually be NOUB_IF_NOINBOUNDS since it has @boundscheck that asserts safety of the memory ops. Note that we could employ a similar technique to getfield if we change the above implementation to something like:

function getindex(A::Array, i::Int)
    memoryrefget(memoryref(getfield(A, :ref), i, @_boundscheck), :not_atomic, @_boundscheck)
end

although this would end up being slower (check the difference at the code LLVM emits on each implementation if interested).

To this end, this commit also introduces new :noub_if_noinbounds setting to @assume_effects and use it within the base implementation. Now UInt8 is not enough to represent EffectsOverride information, the bit is enlarged to UInt16. While this approach might not be ideal, I don't think there is a better way to do this. An alternative I tried before making this commit was to introduce new semantics for Expr(:boundscheck, ...) where Expr(:boundscheck, nothing) will be constant folded to false during codegen while during inference time it does not taint :noub. I felt this further complicates the already complex complext :boundscheck situation, so I decided to go with the approach to expand the settings set of @assume_effects.

vtjnash · 2023-11-16T12:46:13Z

Note that we could employ a similar technique to getfield if we change the above implementation to something like:

Just a minor correction that this would not be correct. It would be safe, but it would permit OOB accesses for the Vector without accurate boundschecking

topolarity · 2023-11-16T18:35:27Z

Thanks @aviatesk !

Here's a small MWE whose segfault appears to be cleaned up by this change:

julia> errs1 = rand(500)
julia> partialsortperm(errs1, 1:2; rev=true)[<tab>

On my machine, this segfaults on master but is resolved by this PR.

vtjnash

SGTM

Keno

Approach LGTM

@BoundsCheck

Currently our effects system does not track `:noub` about memory operations. While this sounds unsafe, it's actually not problematic in most contexts. We restrict concrete-evaluation for anything involing memory operations anyway, as mutable values aren't propagated as compile-time constants (except `Symbol`s). However, it turns out that this is actually causing very dangerous situations in REPL completion context, where `REPLInterpreter` enables aggressive concrete-evaluation that propagates mutable values as constants and ignores the `:consisten`-cy requirement that is usually tainted by memory operations. This commit addresses this issue by tainting `:noub` for memory operations. This commit ends up being somewhat extensive because it would cause significant regression in REPL completions if we make memory ops naively taint `:noub`. Complicating this further is the Base's uses of `boundscheck=false` for peak performance of core memory operations, where bounds checking is conducted separately before performing actual memory operations, e.g.: > essentials.jl ```julia function getindex(A::Array, i::Int) @BoundsCheck ult_int(bitcast(UInt, sub_int(i, 1)), bitcast(UInt, length(A))) || throw_boundserror(A, (i,)) memoryrefget(memoryref(getfield(A, :ref), i, false), :not_atomic, false) end ``` Here `boundscheck=false` should generally taint `:noub` immediately, but in this kind of case `:noub` can actually be `NOUB_IF_NOINBOUNDS` since it has `@boundscheck` that asserts safety of the memory ops. Note that we could employ a similar technique to `getfield` if we change the above implementation to something like: ```julia function getindex(A::Array, i::Int) memoryrefget(memoryref(getfield(A, :ref), i, @BoundsCheck), :not_atomic, @BoundsCheck) end ``` although this would end up being slower (check the difference at the code LLVM emits on each implementation if interested). To this end, this commit also introduces new `:noub_if_noinbounds` setting to `@assume_effects` and use it within the base implementation. Now `UInt8` is not enough to represent `EffectsOverride` information, the bit is enlarged to `UInt16`. While this approach might not be ideal, I don't think there is a better way to do this. An alternative I tried before making this commit was to introduce new semantics for `Expr(:boundscheck, ...)` where `Expr(:boundscheck, nothing)` will be constant folded to `false` during codegen while during inference time it does not taint `:noub`. I felt this further complicates the already complex complext `:boundscheck` situation, so I decided to go with the approach to expand the settings set of `@assume_effects`.

aviatesk requested review from vtjnash and Keno November 16, 2023 10:49

Base automatically changed from avi/memoryop-tfunc-robustness to master November 16, 2023 16:51

vtjnash reviewed Nov 16, 2023

View reviewed changes

Keno approved these changes Nov 17, 2023

View reviewed changes

aviatesk force-pushed the avi/memoryop-noub branch from 4d5cf33 to 1a8e3f2 Compare November 17, 2023 06:29

aviatesk force-pushed the avi/memoryop-noub branch from 1a8e3f2 to e552d83 Compare November 17, 2023 07:15

aviatesk merged commit 1134315 into master Nov 17, 2023

aviatesk deleted the avi/memoryop-noub branch November 17, 2023 16:21

topolarity mentioned this pull request Jan 6, 2024

REPL becomes unresponsive during completion inference #52763

Closed

aravindh-krishnamoorthy mentioned this pull request Jan 17, 2024

Implement circshift(::Tuple, ::Integer) #52438

Merged

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

effects: taint :noub for memory operations #52186

effects: taint :noub for memory operations #52186

aviatesk commented Nov 16, 2023 •

edited

Loading

vtjnash commented Nov 16, 2023

topolarity commented Nov 16, 2023

vtjnash left a comment

Keno left a comment

effects: taint :noub for memory operations #52186

effects: taint :noub for memory operations #52186

Conversation

aviatesk commented Nov 16, 2023 • edited Loading

vtjnash commented Nov 16, 2023

topolarity commented Nov 16, 2023

vtjnash left a comment

Choose a reason for hiding this comment

Keno left a comment

Choose a reason for hiding this comment

aviatesk commented Nov 16, 2023 •

edited

Loading