-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly guard UpsilonNode unboxed store #51853
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm. Seems like a couple extraneous return statements added to codegen? But otherwise a good cleanup and fix.
src/codegen.cpp
Outdated
@@ -2320,6 +2321,9 @@ static jl_cgval_t convert_julia_type(jl_codectx_t &ctx, const jl_cgval_t &v, jl_ | |||
if (v.TIndex && !jl_is_pointerfree(typ)) { | |||
// discovered that this union-split type must actually be isboxed | |||
if (v.Vboxed) { | |||
if (skip) { | |||
*skip = ctx.builder.CreateNot(emit_exactly_isa(ctx, v, (jl_datatype_t*)typ)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't Vboxed potentially nullptr at runtime here? Maybe test that tindex==0x80 instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is. However, tindex==0x80
check isn't sufficient here either. Needs an extra case in emit_exactly_isa to check for null (in this case only).
The underlying inference bug is still present right? |
why is this on the backport list? we're pretty sure what this was a big introduced a few weeks ago |
This appears to resolve the more standalone MRE (#51852 (comment)) but not the original one in the issue: function f(;kwargs...)
try
kwargs = rand((values(kwargs), (progress=1.0,)))
catch
end
GC.gc()
return kwargs
end
f(; progress="0.5")
f(; progress="0.5")
f(; progress="0.5")
f(; progress="0.5")
f(; progress="0.5") This segfaults for me very reliably locally, with the same signature. |
Indeed, although it takes more than one execution for me, so ... progress? I'll take a look. |
since it calls |
Fair observation. |
Updated MRE:
|
I managed to fix the MRE, but I found a test case (after fixing a related inference bug that was hiding it) for @vtjnash's comment above, so I'll need to fix that tomorrow before we can merge this. |
I'm gonna remove the backport label here. The bug is technically present on 1.10, but we don't have the inference precision to actually see it and I think there is some risk in destabilizing things, esp, since I included that inference precision fix (so it wouldn't backport cleanly anyway). |
I thought about it and it was easier than I thought, so I just did it now. |
What is the inference fix here for? Is it a required part of the change? |
The change is in the details of effect convergence. |
It's a fix to the logic of the previous phic narrowing PR. It's not required, but without it's not possible to write a test for one of the branches of the codegen fix because it's hidden by the inference issue, so I fixed that too. |
CI seems very unhappy about this currently. |
@nanosoldier |
The package evaluation job you requested has completed - possible new issues were detected. |
Another day, another MWE. It looks like the FLoops.jl failure is relevant. Reduced to: julia> foo() = try (a = a, b = 2) catch end
foo (generic function with 1 method)
julia> foo()
ERROR: UndefVarError: `a` not defined
Stacktrace:
[1] foo()
@ Main ./REPL[1]:1
[2] top-level scope
@ REPL[2]:1 Seems like we're incorrectly inferring that the catch block is unreachable. We then illegally optimize away the try-catch: julia> @code_typed optimize=false foo()
CodeInfo(
1 ─ %1 = $(Expr(:enter, #4))
2 ─ %2 = (:a, :b)::Core.Const((:a, :b))
│ %3 = Core.apply_type(Core.NamedTuple, %2)::Core.Const(NamedTuple{(:a, :b)})
│ %4 = Core.tuple(Main.a, 2)::Core.PartialStruct(Tuple{Any, Int64}, Any[Any, Core.Const(2)])
│ %5 = (%3)(%4)::NamedTuple{(:a, :b), <:Tuple{Any, Int64}}
└── $(Expr(:leave, :(%1)))
3 ─ return %5
4 ┄ Core.Const(:($(Expr(:leave, :(%1)))))::Union{}
│ Core.Const(:($(Expr(:pop_exception, :(%1)))))::Union{}
└── Core.Const(:(return nothing))::Union{}
) => NamedTuple{(:a, :b), <:Tuple{Any, Int64}}
julia> @code_typed optimize=true foo()
CodeInfo(
1 ─ nothing::Nothing
│ %2 = Main.a::Any
│ %3 = Core.tuple(%2, 2)::Tuple{Any, Int64}
│ %4 = Core.typeof(%3)::Type{<:Tuple{Any, Int64}}
│ %5 = Core.apply_type(Core.NamedTuple, (:a, :b), %4)::Type{NamedTuple{(:a, :b), var"#s179"}} where var"#s179"<:Tuple{Any, Int64}
│ %6 = %new(%5, %2, 2)::NamedTuple{(:a, :b), <:Tuple{Any, Int64}}
└── return %6
) => NamedTuple{(:a, :b), <:Tuple{Any, Int64}} |
Ok, that should a quick fix. Looks like we're forgetting that the GlobalRef is not nothrow. |
Introduce UNION_BOX_MARKER, to make it easier to grep for all the places where this is being looked at and distinguish them from other uses of 0x80 as a constant.
if isdefined_globalref(g) && isconst(g) | ||
return Const(ccall(:jl_get_globalref_value, Any, (Any,), g)) | ||
end | ||
ty = ccall(:jl_get_binding_type, Any, (Any, Any), g.mod, g.name) | ||
ty === nothing && return Any | ||
return ty | ||
end | ||
abstract_eval_global(M::Module, s::Symbol) = abstract_eval_globalref(GlobalRef(M, s)) | ||
abstract_eval_global(M::Module, s::Symbol) = abstract_eval_globalref_type(GlobalRef(M, s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
abstract_eval_global(M::Module, s::Symbol) = abstract_eval_globalref_type(GlobalRef(M, s)) | |
abstract_eval_global_type(M::Module, s::Symbol) = abstract_eval_globalref_type(GlobalRef(M, s)) |
to reflect the fact that it returns type only? I don't think this definition is being used anywhere though.
In #51852, we are coercing a boxed `Union{@NamedTuple{progress::String}, @NamedTuple{progress::Float64}}` to `@NamedTuple{progress::String}` via convert_julia_type. This results in a jl_cgval_t that has a Vboxed that points to a boxed `@NamedTuple{progress::Float64}` but with a `@NamedTuple{progress::String}` type tag that the up upsilonnode code then tries to unbox into the typed PhiC slot. This ends up treating the Float64 as a pointer and crashing in GC. Avoid this by adding a runtime check that the converted value is actually compatible (we already had this kind of check for the non-boxed cases) and then making the unboxing runtime-conditional on the type of the union matching the type of the phic. Fixes #51852
Value *skip = NULL; | ||
rval_info = convert_julia_type(ctx, rval_info, slot_type, &skip); | ||
if (!allow_mismatch && skip) { | ||
CreateTrap(ctx.builder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this (incorrectly) trap unconditionally trap in the new (v.Vboxed && (v.isboxed || mustbox_union))
case?
This branch seems to be dead at least when compiling the sysimage + stdlibs, but perhaps it'd be a bug if it were exercised?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you try to come up with a testcase (or if you can't, you can try pkgevaling this with a compile-time error to find if there's one in the wild).
Currently [1] it is illegal [2] in IRCode to have a GlobalRef in value position that could potentially throw. This is because in IRCode, we want to assign flags to every statement and if there are multiple things with effects in a statement, we lose precision in tracking which they apply to. However, we currently do allow this in `CodeInfo`. Now that we're starting to make more use of flags in inference also, this is becoming annoying (as it did for IRCode), so I would like to do this transformation earlier. This is an attempt to do this during lowering. It is not entirely clear that this is precisely the correct place for it. We could alternatively consider doing it during the global resolve pass in method.c, but that currently does not renumber SSAValues, so doing it during the renumbering inside lowering may be easier. N.B.: This is against #51853, because this needs some of the inference precision improvements in that PR to avoid regressing the try/catch elision tests (which before that PR, we were incorrectly computing effects for statement-position GlobalRefs). [1] 39c278b [2] https://github.com/JuliaLang/julia/blob/2f63cc99fb134fb4adb7f11ba86a4e2ab5adcd48/base/compiler/ssair/verify.jl#L54-L58 --------- Co-authored-by: Jeff Bezanson <jeff.bezanson@gmail.com> Co-authored-by: Oscar Smith <oscardssmith@gmail.com>
In #51852, we are coercing a boxed
Union{@NamedTuple{progress::String}, @NamedTuple{progress::Float64}}
to
@NamedTuple{progress::String}
via convert_julia_type. This results in a jl_cgval_t that hasa Vboxed that points to a boxed
@NamedTuple{progress::Float64}
but with a@NamedTuple{progress::String}
type tag that the up upsilonnode code then tries to unbox into the typed PhiC slot. This ends up treating
the Float64 as a pointer and crashing in GC. Avoid this by adding a runtime check that the converted value
is actually compatible (we already had this kind of check for the non-boxed cases) and then making the
unboxing runtime-conditional on the type of the union matching the type of the phic.