regression bug in Base.tuple_type_head (1.6.0-RC1 vs 1.5.3) #39988

Closed
rryi opened this issue Mar 11, 2021 · 15 comments · Fixed by #40067
Labels: regression (Regression in behavior compared to a previous version)

@rryi

rryi commented Mar 11, 2021

I am working on package BitStructs and encounter a strange error running my benchmarks with julia 1.6.0-RC1.

Environment:

System Ryzen 1700, 16 GB RAM, Windows 10 Pro, version 20H2
Julia 64 bit 1.6.0-rc1 (2021-02-06)
Package https://github.com/rryi/BitStructs.jl, commit 50ea2e46b77d098ba61ab65d15e31aa289848e24

Steps to reproduce:

  • use a Win10 machine
  • install julia 1.6.0 RC1 64 bit
  • download/clone package BitStructs.jl to a local directory
  • enter that directory in a CMD shell
  • start julia
  • activate package dir
  • add BenchmarkTools (I did it some days ago with v0.5.0. Retry with version 0.6.0 had the same result)
  • exit julia
  • execute in CMD shell: julia test\crash.jl

My log for that, showing the crash under 1.6.0RC1

Microsoft Windows [Version 10.0.19042.867]
(c) 2020 Microsoft Corporation. Alle Rechte vorbehalten.

C:\Users\RR\julia\BitStructs>julia test\crash.jl
  Activating environment at `C:\Users\RR\julia\BitStructs\Project.toml`
S(S_RUNNING, BIGMINUS, Float16(-1.0), true, false, false, true, 0, 'a', 'c', 0x0001, 0x0002, 3, 4)BitStruct{NamedTuple{(:status, :strange, :sign, :flag1, :flag2, :flag3, :flag4, :bit1, :ac, :lc, :id1, :id2, :delta1, :delta2), Tuple{ProcStatus, Strange, Sign, Bool, Bool, Bool, Bool, BInt{1}, AsciiChar, Latin1Char, BUInt{9}, BUInt{12}, BInt{9}, BInt{9}}}} 0x0200c00802c78531
  status::ProcStatus = S_RUNNING
  strange::Strange = BIGMINUS
  sign::Sign = -1.0
  flag1::Bool = true
  flag2::Bool = false
  flag3::Bool = false
  flag4::Bool = true
  bit1::BInt{1} = 0
  ac::AsciiChar = 'a'
  lc::Latin1Char = 'c'
  id1::BUInt{9} = 0x0000000000000001
  id2::BUInt{12} = 0x0000000000000002
  delta1::BInt{9} = 3
  delta2::BInt{9} = 4
end
struct/BitStruct simple field access ENTER:
  2.700 ns (0 allocations: 0 bytes)
  2.700 ns (0 allocations: 0 bytes)
struct/BitStruct field access in @noinline function ENTER:
  2.600 ns (0 allocations: 0 bytes)
  9.500 ns (0 allocations: 0 bytes)
set 2 fields on struct then BitStruct ENTER:
  2.600 ns (0 allocations: 0 bytes)
  358.571 ns (3 allocations: 48 bytes)
set 2 fields on struct then BitStruct (large struct) ENTER:
  2.600 ns (0 allocations: 0 bytes)
the statements executed by set2fields run, if executed directly
BitStruct{NamedTuple{(:status, :strange, :sign, :flag1, :flag2, :flag3, :flag4, :bit1, :ac, :lc, :id1, :id2, :delta1, :delta2), Tuple{ProcStatus, Strange, Sign, Bool, Bool, Bool, Bool, BInt{1}, AsciiChar, Latin1Char, BUInt{9}, BUInt{12}, BInt{9}, BInt{9}}}} 0x0200c00804c78511
  status::ProcStatus = S_RUNNING
  strange::Strange = BIGMINUS
  sign::Sign = -1.0
  flag1::Bool = false
  flag2::Bool = false
  flag3::Bool = false
  flag4::Bool = true
  bit1::BInt{1} = 0
  ac::AsciiChar = 'a'
  lc::Latin1Char = 'c'
  id1::BUInt{9} = 0x0000000000000002
  id2::BUInt{12} = 0x0000000000000002
  delta1::BInt{9} = 3
  delta2::BInt{9} = 4
end
Calling set2fields(bs) causes a crash

Unreachable reached at 00000000610aa1b6

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ILLEGAL_INSTRUCTION at 0x610aa1b6 -- set2fields at C:\Users\RR\julia\BitStructs\test\benchmark.jl:293
in expression starting at C:\Users\RR\julia\BitStructs\test\benchmark.jl:316
set2fields at C:\Users\RR\julia\BitStructs\test\benchmark.jl:293
unknown function (ip: 00000000610aa1d9)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1703 [inlined]
do_call at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:115
eval_value at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:204
eval_stmt_value at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:155 [inlined]
eval_body at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:575
jl_interpret_toplevel_thunk at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:669
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:879
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:827
jl_toplevel_eval at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:888 [inlined]
jl_toplevel_eval_in at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:931
eval at .\boot.jl:360 [inlined]
include_string at .\loading.jl:1090
_include at .\loading.jl:1144
include at .\client.jl:444
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1703 [inlined]
do_call at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:115
eval_value at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:204
eval_stmt_value at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:155 [inlined]
eval_body at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:575
jl_interpret_toplevel_thunk at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:669
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:879
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:827
jl_toplevel_eval at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:888 [inlined]
jl_toplevel_eval_in at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:931
eval at .\boot.jl:360 [inlined]
include_string at .\loading.jl:1090
_include at .\loading.jl:1144
include at .\Base.jl:386
exec_options at .\client.jl:285
_start at .\client.jl:485
jfptr__start_33914.clone_1 at C:\RRtool\julia-1.6.0-RC1\lib\julia\sys.dll (unknown line)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1703 [inlined]
true_main at /cygdrive/c/buildbot/worker/package_win64/build/src\jlapi.c:557
repl_entrypoint at /cygdrive/c/buildbot/worker/package_win64/build/src\jlapi.c:699
mainCRTStartup at /cygdrive/c/buildbot/worker/package_win64/build/cli\loader_exe.c:51
BaseThreadInitThunk at C:\WINDOWS\System32\KERNEL32.DLL (unknown line)
RtlUserThreadStart at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
Allocations: 19362754 (Pool: 19359441; Big: 3313); GC: 70

C:\Users\RR\julia\BitStructs>

here the run with julia 1.5.3, no crash:

C:\Users\RR\julia\BitStructs>C:\RRtool\julia-1.5.3\bin\julia.exe test\crash.jl
 Activating environment at `C:\Users\RR\julia\BitStructs\Project.toml`
S(S_RUNNING, BIGMINUS, Float16(-1.0), true, false, false, true, 0, 'a', 'c', 0x0001, 0x0002, 3, 4)BitStruct{NamedTuple{(:status, :strange, :sign, :flag1, :flag2, :flag3, :flag4, :bit1, :ac, :lc, :id1, :id2, :delta1, :delta2),Tuple{ProcStatus,Strange,Sign,Bool,Bool,Bool,Bool,BInt{1},AsciiChar,Latin1Char,BUInt{9},BUInt{12},BInt{9},BInt{9}}}} 0x0200c00802c78531
  status::ProcStatus = S_RUNNING
  strange::Strange = BIGMINUS
  sign::Sign = -1.0
  flag1::Bool = true
  flag2::Bool = false
  flag3::Bool = false
  flag4::Bool = true
  bit1::BInt{1} = 0
  ac::AsciiChar = 'a'
  lc::Latin1Char = 'c'
  id1::BUInt{9} = 0x0000000000000001
  id2::BUInt{12} = 0x0000000000000002
  delta1::BInt{9} = 3
  delta2::BInt{9} = 4
end
struct/BitStruct simple field access ENTER:
  2.699 ns (0 allocations: 0 bytes)
  397.995 ns (1 allocation: 32 bytes)
struct/BitStruct field access in @noinline function ENTER:
  2.399 ns (0 allocations: 0 bytes)
  364.423 ns (1 allocation: 32 bytes)
set 2 fields on struct then BitStruct ENTER:
  2.399 ns (0 allocations: 0 bytes)
  8.733 µs (5 allocations: 96 bytes)
set 2 fields on struct then BitStruct (large struct) ENTER:
  2.399 ns (0 allocations: 0 bytes)
the statements executed by set2fields run, if executed directly
BitStruct{NamedTuple{(:status, :strange, :sign, :flag1, :flag2, :flag3, :flag4, :bit1, :ac, :lc, :id1, :id2, :delta1, :delta2),Tuple{ProcStatus,Strange,Sign,Bool,Bool,Bool,Bool,BInt{1},AsciiChar,Latin1Char,BUInt{9},BUInt{12},BInt{9},BInt{9}}}} 0x0200c00804c78511
  status::ProcStatus = S_RUNNING
  strange::Strange = BIGMINUS
  sign::Sign = -1.0
  flag1::Bool = false
  flag2::Bool = false
  flag3::Bool = false
  flag4::Bool = true
  bit1::BInt{1} = 0
  ac::AsciiChar = 'a'
  lc::Latin1Char = 'c'
  id1::BUInt{9} = 0x0000000000000002
  id2::BUInt{12} = 0x0000000000000002
  delta1::BInt{9} = 3
  delta2::BInt{9} = 4
end
Calling set2fields(bs) causes a crash
b1 set field in loop by /= ENTER:
  2.122 µs (0 allocations: 0 bytes)
b2 set field in loop by set(..) ENTER:
  2.133 µs (0 allocations: 0 bytes)
struct/BitStruct: access 4 fields in a loop ENTER:
  103.389 ns (0 allocations: 0 bytes)
  66.400 µs (70 allocations: 1.09 KiB)
struct/BitStruct: write 4 fields in a loop ENTER:
  173.947 ns (0 allocations: 0 bytes)
struct/BitStruct: access 4 fields in a loop, large struct ENTER:
  104.979 ns (0 allocations: 0 bytes)
  105.200 µs (0 allocations: 0 bytes)
struct/BitStruct: write 4 fields in a loop, large struct ENTER:
  175.520 ns (0 allocations: 0 bytes)
struct/BitStruct: access 4 fields, direct code ENTER:
  1.799 ns (0 allocations: 0 bytes)
  1.070 µs (1 allocation: 32 bytes)
function parameter benchmark: struct/BitStruct/parameterlist ENTER:
  1.590 µs (17 allocations: 1.38 KiB)
  1.610 µs (18 allocations: 1.39 KiB)

C:\Users\RR\julia\BitStructs>
@rryi
Author

rryi commented Mar 11, 2021

Tracking down the error

The crash happens when calling the method set2fields(bs), and also on several other statements (commented out in benchmark.jl with an annotation that they crash).

I extracted the statements that are executed in set2fields(bs) and ran them directly: no crash. This is included in the current version of benchmark.jl.

I used the VScode debugger and found an annoying exception. Steps to reproduce:

  • start VScode with julia extension
  • open folder of BitStructs (and activate julia environment there)
  • comment out line 316 in benchmark.jl
  • run benchmark.jl (should show no errors)
  • enter in VScode REPL: @enter set2fields(bs)
  • step through until you arrive here:

(Screenshot: debugger error in Visual Studio Code, 2021-03-11 23:12:35)

I do not understand that error message. To me, it looks like a syntactically correct call with a type as parameter.
I suppose it is closely related to the crash when running without debugger.

@rryi
Author

rryi commented Mar 11, 2021

Some words on my usage of tuple_type_head

I was looking for a way to persuade the julia compiler to do (better) constant propagation. I want my method _fielddescr to "compile away": it is a pure function (as long as the functions it calls are also pure and not redefined), and it has only type parameters. My expectation was that the compiler would replace its body by a constant expression. By "pure" I do not mean Base.@pure: I tried that, and the discussion about it led to a PR that is currently in progress. See here

My attempt to use tuple_type_head is based on an idea of tim.holy recently published here:

[quote="tim.holy, post:53, topic:55278"]
You could try doing the recursion in the type domain, Tuple{:a, :b, :c} using Base.tuple_type_head and Base.tuple_type_tail .
[/quote]

My first attempt using julia 1.5.3 did not succeed, but with julia 1.6.0RC1, it worked as expected (until the crash). The @generated alternative turned out to be invalid, due to a "new method not available in old world" problem. The code is still there (@generated function __fielddescr) and is even faster, but it fails if new types are introduced and referenced in __fielddescr.

To me, it was really surprising that the julia compiler does better with a recursive formulation than an iteration-based one - being an old school programmer who learned ages ago "for performance, try to replace recursion by iteration". Congratulations to the julia core team for its brilliant work!

tuple_type_head is marked deprecated now. Any idea what I could try to replace tuple_type_head and tuple_type_tail in my context?
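
One possible direction, sketched here under assumptions rather than taken from BitStructs: the same head/tail recursion can be written over ordinary value tuples with first and Base.tail, which avoids the tuple_type_* helpers. The names fieldpos, names and widths are hypothetical, and whether the compiler folds such a call to a constant in a given context would still have to be checked (for example with @code_llvm).

```julia
# Hypothetical layout: field names and their bit widths as plain tuples.
names  = (:flag1, :flag2, :id1, :id2)
widths = (1, 1, 9, 12)

# Base case: the tuples are exhausted, so the name was not found.
fieldpos(s::Symbol, ::Tuple{}, ::Tuple{}, shift::Int) =
    throw(ArgumentError("no field named $s"))

# Recursive case: compare the head, otherwise recurse on the tails
# while accumulating the bit offset of the fields skipped so far.
function fieldpos(s::Symbol, names::Tuple, widths::Tuple, shift::Int)
    if first(names) === s
        return (shift, first(widths))
    end
    return fieldpos(s, Base.tail(names), Base.tail(widths), shift + first(widths))
end

println(fieldpos(:id1, names, widths, 0))  # (2, 9)
```

The result (shift 2, width 9) matches the shift and width reported for :id1 elsewhere in this thread, but the function itself is only an illustration of the recursion pattern.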

@JeffBezanson JeffBezanson added the regression Regression in behavior compared to a previous version label Mar 12, 2021
@JeffBezanson
Member

Thanks for the detailed report. I can reproduce this, but it's not immediately obvious what's going on. Could you try to make a self-contained minimal reproducer? That often helps accelerate the process while I continue to try to dig into it.

@rryi
Author

rryi commented Mar 12, 2021

minimal reproduction

I tried to isolate the crash.

please clone https://github.com/rryi/BitStructs.jl, commit 82d8a0a

The most simplified version to reproduce the crash is in crashminimal.jl
Just run crashminimal.jl instead of crash.jl, all other steps as in my 1st post.

results of attempts to further reduce crash producer

I tried to simplify even more, but all steps resulted in code which did not crash any more. I think these steps are of interest in your analysis, because they give hints on the critical construct in the code. I documented them in crash2.jl

Removing one of the two expressions in set2fields4 also removes the crash. See set2fields5 and set2fields5. My next try was to substitute the syntactic sugar bs /= :id1, bs.id2 by its definition bs = BitStruct.set(bs, :id1, bs.id2). That became very interesting, and is what crash2.jl now runs. Here is a log:

BitStruct{NamedTuple{(:flag1, :flag2, :id1, :id2), Tuple{Bool, Bool, BUInt{9}, BUInt{12}}}}
2     
UInt64
ERROR: LoadError: type UnionAll has no field set
Stacktrace:
 [1] getproperty(x::Type, f::Symbol)
   @ Base .\Base.jl:28
 [2] set2fields8(bs::BitStruct{NamedTuple{(:flag1, :flag2, :id1, :id2), Tuple{Bool, Bool, BUInt{9}, BUInt{12}}}})
   @ Main ~\julia\BitStructs\test\crash2.jl:66
 [3] top-level scope
   @ ~\julia\BitStructs\test\crash2.jl:95
in expression starting at c:\Users\RR\julia\BitStructs\test\crash2.jl:95

The code producing it is a call of set2fields8:

function set2fields8(bs::T) where T <: BitStruct
    println(typeof(bs))
    println(bs.id2)
    println(typeof(bs.id2))
    bs = BitStruct.set(bs, :id1, bs.id2) # this line causes the error 
    bs /= :flag1, bs.flag2
end

My analysis on crash2.jl error

The expression bs.id2 inside of BitStruct.set(bs, :id1, bs.id2) was compiled to a call of getproperty defined in Base.jl.
It fails, because typeof(bs) is no struct but a primitive type.

My expectation is that bs.id2 gets compiled to a call of
function Base.getproperty(x::BitStruct{T},s::Symbol) where T<:NamedTuple
in bitstruct.jl, line 129

typeof(bs) is reported in stack trace, I double checked with the println statement inside set2fields8.
Both report that it is a BitStruct{NamedTuple{...}}, which should match the type signature of my custom getproperty.

To check that, I put bs.id2 in a println call just above the statement on error, there it is compiled as expected, the correct value is printed.

A vague guess: type inference problem in combination with syntactical transformation from bs.id2 to getproperty(bs,:id2)? BitStruct.set has the parameter signature (simplified) (BitStruct,Any). Is that Any promoted as inferred type of bs to getproperty(bs,:id2) when compiling the actual parameter bs.id2 in BitStruct.set(bs, :id1, bs.id2) ?

@vtjnash
Member

vtjnash commented Mar 12, 2021

BitStruct appears to be a type, which doesn't have the field set (as the error message says), and it would be type-piracy to add one. Did you mean the package name BitStructs there?
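
To illustrate the point with a self-contained sketch (Box is a hypothetical parametric type standing in for BitStruct, not code from the package): writing Box.set is property access on the type object itself, not a lookup in a module, so it fails no matter which getproperty methods exist for instances.

```julia
# Hypothetical parametric type, standing in for BitStruct.
struct Box{T}
    x::T
end

# Without its parameter, Box is a UnionAll type object.
println(Box isa UnionAll)   # true

# `Box.set` calls getproperty on the type object, which has no field `set`.
threw = try
    Box.set
    false
catch
    true
end
println(threw)              # true
```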

@rryi
Author

rryi commented Mar 12, 2021

You are right, it was a typo. I will continue with drilldown.

@rryi
Author

rryi commented Mar 12, 2021

I have corrected the typo in set2fields8, and I could replace the 2nd assignment by manually inlined code.
A new "minimal crash" version is in https://github.com/rryi/BitStructs.jl, commit 43cb309

Please execute `julia.exe test\crash.jl` to reproduce the crash.

Have a look at test\nocrash.jl as a documentation of experiments to narrow down error preconditions.

function set2fields8(bs::T) where T <: BitStruct
    bs = BitStructs.set(bs, :id1, bs.id2)
    bs = BitStructs.set(bs, :flag1, bs.flag2) # crash
end

is the best I could do to isolate the bug. Removing the 1st line in it removes the crash. Replacing the 2nd line with manually inlined code of the call of BitStructs.set also removes the crash. This is coded in the file test\nocrash.jl and verified by running it.

Crash log:

C:\Users\RR\julia\BitStructs>C:\RRtool\julia-1.6.0-rc1\bin\julia.exe test\crash.jl
  Activating environment at `C:\Users\RR\julia\BitStructs\Project.toml`

Unreachable reached at 0000000060ff45a4

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ILLEGAL_INSTRUCTION at 0x60ff45a4 -- set2fields8 at C:\Users\RR\julia\BitStructs\test\crash.jl:14
in expression starting at C:\Users\RR\julia\BitStructs\test\crash.jl:18
set2fields8 at C:\Users\RR\julia\BitStructs\test\crash.jl:14
unknown function (ip: 0000000060ff45c9)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1703 [inlined]
do_call at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:115
eval_value at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:204
eval_stmt_value at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:155 [inlined]
eval_body at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:575
jl_interpret_toplevel_thunk at /cygdrive/c/buildbot/worker/package_win64/build/src\interpreter.c:669
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:879
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:827
jl_toplevel_eval at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:888 [inlined]
jl_toplevel_eval_in at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:931
eval at .\boot.jl:360 [inlined]
include_string at .\loading.jl:1090
_include at .\loading.jl:1144
include at .\Base.jl:386
exec_options at .\client.jl:285
_start at .\client.jl:485
jfptr__start_33914.clone_1 at C:\RRtool\julia-1.6.0-rc1\lib\julia\sys.dll (unknown line)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1703 [inlined]
true_main at /cygdrive/c/buildbot/worker/package_win64/build/src\jlapi.c:557
repl_entrypoint at /cygdrive/c/buildbot/worker/package_win64/build/src\jlapi.c:699
mainCRTStartup at /cygdrive/c/buildbot/worker/package_win64/build/cli\loader_exe.c:51
BaseThreadInitThunk at C:\WINDOWS\System32\KERNEL32.DLL (unknown line)
RtlUserThreadStart at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
Allocations: 710918 (Pool: 710752; Big: 166); GC: 1

C:\Users\RR\julia\BitStructs>

Hope that helps.

@JeffBezanson
Member

Thank you, this should be very helpful.

@rryi
Author

rryi commented Mar 12, 2021

I had another debugger session. After executing nocrash.jl, I entered `julia> @enter set2fields8(bs)` in the VScode REPL.

I stepped into set2fields8, tried to step over the first line in the method, and got
(Screenshot: TypeError shown in the debugger)

tuple_type_head is called with a tuple type, but it is not a subtype of NTuple{N,DataType}, it is a subtype of NTuple{N,Symbol}. Is there a change in Release 1.6.0 which requires a stricter parameter definition for tuple_type_head, which is not reflected in the function signature? tuple_type_head is not exported and not documented in the API doc, so changing its behavior is formally a nonbreaking change in a minor release.

Function definition of tuple_type_head has this signature, in 1.6.0 as well as in 1.5.3:

function tuple_type_head(T::Type) end

The function name suggests more narrow:

function tuple_type_head(t::Type{T}) where T <: Tuple end

This is what I had in mind, and is fulfilled in all my calls. And in 1.5.3, it is required for calling fieldtype:

tuple_type_head(T::Type) = (@_pure_meta; fieldtype(T::Type{<:Tuple}, 1))

In 1.6.0, implementation is relaxed a lot:

tuple_type_head(T::Type) = fieldtype(T, 1)

The error message indicates it is required something like that:

function tuple_type_head(t::Type{T}) where T <: NTuple{N,DataType} where N end

This signature is violated by my call, and I suppose there are more such cases. At least one is documented here, it was my inspiration and template. My impression from the formulation there was using the tuple_type_* functions is a common practice in julia itself and in external packages.

Here the debugger view just before the error is shown:
(Screenshot: debugger view in Visual Studio Code, 2021-03-12 22:01:26)
I tried to step in further, to see what happens in fieldtype(T,1). It does not work. The next step is the TypeError shown in my 1st screenshot. Why can I not step into fieldtype? Is fieldtype an intrinsic function?

Next, very important question: why is the TypeError exception shown only in the debugger and not propagated upwards in the chain of stack frames of active calls? I have no catch anywhere in my code. An exception thrown from any code I call, should stop the program and show up in REPL/console. There is nothing. Is it an indication that runtime stack is already corrupt? That would explain the crash.

Last question: why does my program work correctly (as far as I can see) in so many calls of tuple_type_head with the signature `tuple_type_head(t::T) where T <: NTuple{N,Symbol} where N`?

@JeffBezanson
Member

Symbols as type parameters of Tuple are a bit of a strange case --- the parameters of Tuple{...} are supposed to be the types of elements, but a symbol is not a type (i.e. there is no value x such that typeof(x) isa Symbol holds). So for example Tuple{:a} is not a subtype of Tuple{Symbol}; :a is an instance of Symbol, not a subtype of it.

I'm not sure where the TypeError is coming from; possibly the debugger itself. Will look into it.
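
The distinction between an instance and a subtype described above can be checked directly in the REPL. This is a small sketch using only Base operations; the variable name T is arbitrary.

```julia
T = Tuple{:a}   # a tuple type whose parameter is a value, not a type

println(:a isa Symbol)            # true:  :a is an instance of Symbol
println(T <: Tuple{Symbol})       # false: :a is not a subtype of Symbol
println(T.parameters[1] === :a)   # true:  the parameter is stored as-is
```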

@vtjnash
Member

vtjnash commented Mar 12, 2021

I think it is a missing case for Tuple/NamedTuple from this recent PR:

commit 8c11d3c
Author: Keno Fischer keno@juliacomputing.com
Date: Mon Oct 19 18:38:47 2020 -0400

tfuncs: Be more robust in the face of uninhabited types (#37945)
Fixes #37943

julia> struct A{T}
         b::T
       end

julia> A{:a}
A{:a}

julia> fieldtype(ans, 1)
Union{}

julia> fieldtype(Tuple{:a}, 1)
:a

@rryi
Author

rryi commented Mar 13, 2021

@JeffBezanson

Symbols as type parameters of Tuple are a bit of a strange case

Maybe this is the root of the problem. Looks like I did abuse tuple type mechanics.

Tuple{:a,:b,:c} is meant as "pure meta", nothing to be instantiated. It's just a vehicle to lift a tuple of symbols onto the type parameter level, so that it can be used for dispatch. It's not my idea; I just adopted it without thinking much about it.

@vtjnash

julia> fieldtype(Tuple{:a}, 1)
:a

fieldtype and, respectively, tuple_type_head extract a parameter of a tuple type. They don't care about the type of the parameter, as long as it is isbits. Some more examples:

julia> for i in 1:4 println(fieldtype(Tuple{true,2,3.0,(1,2,3,4)},i)) end
true
2
3.0
(1, 2, 3, 4)

I assume it is an undocumented feature. It just works ... well, most of the time, if the crash is related to it.
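
For comparison, a sketch of a different vehicle for the same goal, lifting a tuple of symbols to the type level for dispatch: a Val parameter. This is not what BitStructs does; count_names and first_name are hypothetical helpers that exist only for this illustration.

```julia
# Hypothetical helpers: the tuple of symbols travels as the Val type parameter,
# so it is available at dispatch time without building a Tuple{...} type
# whose parameters are values.
count_names(::Val{names}) where {names} = length(names)
first_name(::Val{names}) where {names} = first(names)

v = Val((:a, :b, :c))
println(count_names(v))   # 3
println(first_name(v))    # a
```

Whether this composes with the recursion used in _fielddescr, and whether it constant-folds as well, is not something this sketch establishes.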

@rryi
Author

rryi commented Mar 13, 2021

@JeffBezanson

I'm not sure where the TypeError is coming from; possibly the debugger itself.

Hmm ... very good point. Using the debugger is my standard approach to inspect what happens near an unexpected error, to get close to the exact location of a strange interruption of normal program flow. The crash falls into that category: the julia stack trace ends with my method, which looks pretty harmless at first glance. Within the debugger, I found that TypeError. It suggested to me that it might be the cause of the problems.

Using the debugger might be completely inadequate in this case. I was stepping into code which I expect and intend that it is never executed at runtime in compiled code. Have a look at this:

@inline function _fielddescr(::Type{BitStruct{T}},::Val{s}) where {T<:NamedTuple,s}
    _fielddescr(Tuple{T.parameters[1]...}, T.parameters[2],Val(s),0)
end

The whole construction is an attempt to persuade the compiler to completely compile it away, doing recursive constant propagation. It's a constant expression. The call inside is designed as a pure function. In a former version I even tried to tell the compiler the hard way, using Base.@pure. Whether that was a correct use was called into question; see the discussion here, which led to PR 39954.

In julia 1.6.0, compiler agrees and does constant propagation. Proof in REPL after running nocrash.jl:

julia> BitStructs._fielddescr(BS,Val(:id1))
(BUInt{9}, 2, 9)

julia> @code_llvm(BitStructs._fielddescr(BS,Val(:id1)))
;  @ C:\Users\RR\julia\BitStructs\src\bitstruct.jl:113 within `_fielddescr'
; Function Attrs: uwtable
define void @julia__fielddescr_3767({ {}*, i64, i64 }* noalias nocapture sret %0, {}** noalias nocapture %1) #0 {
top:
;  @ C:\Users\RR\julia\BitStructs\src\bitstruct.jl:114 within `_fielddescr'
  store {}* inttoptr (i64 460809232 to {}*), {}** %1, align 8
  %.repack = getelementptr inbounds { {}*, i64, i64 }, { {}*, i64, i64 }* %0, i64 0, i32 0
  store {}* inttoptr (i64 460809232 to {}*), {}** %.repack, align 8
  %.repack1 = getelementptr inbounds { {}*, i64, i64 }, { {}*, i64, i64 }* %0, i64 0, i32 1
  %2 = bitcast i64* %.repack1 to <2 x i64>*
  store <2 x i64> <i64 2, i64 9>, <2 x i64>* %2, align 8
  ret void
}

julia> @code_native(BitStructs._fielddescr(BS,Val(:id1)))
        .text
; ┌ @ bitstruct.jl:113 within `_fielddescr'
        pushq   %rbp
        movq    %rsp, %rbp
; │ @ bitstruct.jl:114 within `_fielddescr'
        movq    $460809232, (%rdx)              # imm = 0x1B776410
        movabsq $.rodata.cst16, %rdx
        movq    %rcx, %rax
        vmovaps (%rdx), %xmm0
        movq    $460809232, (%rax)              # imm = 0x1B776410
        vmovups %xmm0, 8(%rax)
        popq    %rbp
        retq
        nopw    (%rax,%rax)
; └

My impression from benchmarking BitStructs is, that constant propagation works in simple call scenarios, but not in more complex ones. This was under investigation, but literally interrupted by the crash.

Crash caused by too much stress for the compiler with constant propagation, producing corrupted code?

@rryi
Author

rryi commented Mar 13, 2021

I tried after running nocrash.jl

@code_native set2fields8(bs)

listing (120 lines) ends with this:

; │ @ nocrash.jl:66 within `set2fields8'
        movl    $294688752, %ecx                # imm = 0x119097F0
        movq    %rsi, %rdx
        movl    $2, %r8d
        movq    %rdi, 32(%rsp)
        movq    %rdi, 88(%rsp)
        movq    $494170488, 40(%rsp)            # imm = 0x1D747178
        callq   *%rbx
        movl    $423862888, %ecx                # imm = 0x1943A268
        movq    %rsi, %rdx
        movl    $3, %r8d
        movq    %rdi, 32(%rsp)
        movq    $494170448, 40(%rsp)            # imm = 0x1D747150
        movq    %rax, 80(%rsp)
        movq    %rax, 48(%rsp)
        callq   *%rbx
        ud2
        nopw    %cs:(%rax,%rax)
; └

Cited from Wikipedia: "UD2 Generates an invalid opcode exception. This instruction is provided for software testing to explicitly generate an invalid opcode. The opcode for this instruction is reserved for this purpose."

A quick scan shows: all methods set2field* marked as crashing generate code ending with ud2, all marked as not crashing do not have ud2 in their @code_native output.

A quick web check verifies ud2 is generated by LLVM in special situations. Maybe this not so old post is interesting. cited from it:

[UD2] can be generated by Clang/LLVM (and gcc) when compiling with flags like -fsanitize=undefined -fsanitize-trap=all.
...
The UD2 instruction is invoked if the double is outside of the bounds of a signed int64.

@code_llvm set2fields8(bs) ends with

;  @ c:\Users\RR\julia\BitStructs\test\nocrash.jl:66 within `set2fields8'
  store {}* %30, {}** %.sub, align 8
  store {}* inttoptr (i64 494170488 to {}*), {}** %26, align 8
  %31 = call nonnull {}* @jl_apply_generic({}* inttoptr (i64 294688752 to {}*), {}** nonnull %.sub, i32 2)
  store {}* %31, {}** %25, align 16
  store {}* %30, {}** %.sub, align 8
  store {}* inttoptr (i64 494170448 to {}*), {}** %26, align 8
  store {}* %31, {}** %27, align 8
  %32 = call nonnull {}* @jl_apply_generic({}* inttoptr (i64 423862888 to {}*), {}** nonnull %.sub, i32 3)
  call void @llvm.trap()
  unreachable
}

I assume 'call void @llvm.trap()' is compiled to UD2 in machine code. Does the julia compiler emit such traps as a "last resort"?

@vtjnash
Member

vtjnash commented Mar 13, 2021

Yes, it is optimized assuming that the return value is Union{}, and thus is not prepared to handle a Symbol. See the previous PR fix and issue report
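
What an inferred return type of Union{} means can be seen on a deliberately simple case. This is a sketch, not the code path from this issue: always_throws and caller are hypothetical functions, and Base.return_types is an unexported reflection helper. Here the inference result is correct because the callee really never returns; in the issue, inference concluded Union{} for a call that did return a Symbol, so execution ran into the trap placed after the call.

```julia
# Hypothetical function that never returns normally.
always_throws(x) = error("boom")

function caller(x)
    always_throws(x)
    return 1   # inference may treat everything after the call as unreachable
end

# Both are inferred to return Union{}, i.e. "no value ever comes back".
println(only(Base.return_types(always_throws, (Int,))))  # Union{}
println(only(Base.return_types(caller, (Int,))))         # Union{}
```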
