Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new macro @outline, and use it in @assert. #57122

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

NHDaly
Copy link
Member

@NHDaly NHDaly commented Jan 21, 2025

Adds a new macro @outline, and uses it in @assert.

This implements the simplest suggestion in #29688.
It would still be excellent if the compiler would automatically perform this optimization for any throw() statement, but it sounds like introducing automatic outlining for such cases would be a difficult implementation task in the Compiler.

And in either case, even if we had such an optimization, there may be cases where a human will want to perform this optimization themselves for a number of reasons, including code-size reduction, improving type stability, outlining rare branches, etc, and having this in a macro is a nice way to automate it, and make this optimization more accessible. 😊


Example macro usage:

  @boundscheck if !(i > 1 && i <= length(x))
      @outline throw(BoundsError(x, i))
  end

This commit applies this new macro @outline to @assert, producing essentially:

# @assert x == y
x == y ? nothing : @outline(throw(AssertionError("x == y"))

or in a more complete example:

julia> @macroexpand @assert x != x "x == x: $x"
:(if x != x
      nothing
  else
      #= REPL[3]:36 =#
      var"#17#outline"(x) = begin
              $(Expr(:meta, :noinline))
              #= REPL[3]:36 =#
              (throw)((AssertionError)(((Main).Base.inferencebarrier((Main).Base.string))("x == x: $(x)")))
          end
      #= REPL[3]:38 =#
      var"#17#outline"(x)
  end)

This can improve performance for fast code that uses assertions. For example, take this definition of getindex for WithoutMissingVector:

julia/base/sort.jl

Lines 597 to 601 in 5058dba

Base.@propagate_inbounds function Base.getindex(v::WithoutMissingVector, i::Integer)
out = v.data[i]
@assert !(out isa Missing)
out::eltype(v)
end

Before:

julia> @btime Base.Sort.WithoutMissingVector($(Any[1]))[$1]
  3.041 ns (0 allocations: 0 bytes)
1

After:

julia> @btime Base.Sort.WithoutMissingVector($(Any[1]))[$1]
  2.250 ns (0 allocations: 0 bytes)
1

The number of instructions in that function according to @code_native (on an aarch64 M2 MacBook) reduced from ~90 to ~40.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this file was the right place to put the @outline tests. It seems decent, but a bit oxymoronic. :D Please let me know if you have any better suggestions!

NHDaly and others added 3 commits January 21, 2025 18:06
Macro usage:
```julia
  @BoundsCheck i > 1 && i <= len || @outline throw(BoundsError(x, i))
```

This commit applies the above to Assertions, e.g.:
```julia
julia> @macroexpand @Assert x != x "x == x: $x"
:(if x != x
      nothing
  else
      #= REPL[3]:36 =#
      var"#17#outline"(x) = begin
              $(Expr(:meta, :noinline))
              #= REPL[3]:36 =#
              (throw)((AssertionError)(((Main).Base.inferencebarrier((Main).Base.string))("x == x: $(x)")))
          end
      #= REPL[3]:38 =#
      var"#17#outline"(x)
  end)
```

This can improve performance for fast code that uses assertions, e.g.:

Before:
```julia
julia> @Btime Base.Sort.WithoutMissingVector($(Any[1]))[$1]
  3.041 ns (0 allocations: 0 bytes)
1
```
After:
```julia
julia> @Btime Base.Sort.WithoutMissingVector($(Any[1]))[$1]
  2.250 ns (0 allocations: 0 bytes)
1
```

The number of instructions in that function according to `@code_native`
(on an aarch64 M2 MacBook) reduced from ~90 to ~40.
Co-authored-by: adienes <51664769+adienes@users.noreply.github.com>
end
_free_vars(s::Symbol) = [s]
_free_vars(_) = []
_free_vars(e::Expr) = isempty(e.args) ? [] : unique!(mapreduce(_free_vars, vcat, e.args))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think unique! is not defined yet so this fails to compile for me

although, what happens if you use outline as a zero-arg closure and don't pass the vars in explicitly at all? like

macro outline2(expr)
    local fname = gensym(:outlined_expr)
    quote
        @noinline $(fname)() = $(esc(expr))
        $(fname)()
    end
end

comparing the @code_native of

foo1() = @outline rand() > 0.5 || throw(AssertionError("abc"))
foo2() = @outline2 rand() > 0.5 || throw(AssertionError("abc"))

it seems a lot simpler, but I'm not an expert at these things so maybe I'm missing something

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One difference is in how they handle type instability:

julia> function bar()
           x = rand(Bool) ? 7 : 7.0
           @outline x*x+x*x+x*x*x+x/x-x+x
       end
bar (generic function with 1 method)

julia> function bar2()
           x = rand(Bool) ? 7 : 7.0
           @outline2 x*x+x*x+x*x*x+x/x-x+x
       end
bar2 (generic function with 1 method)

julia> @b bar
6.373 ns

julia> @b bar2
75.544 ns (2.40 allocs: 38.346 bytes)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the issue with 0-arg closures is that they can box their arguments in a bunch of caess. I was hoping to avoid that by taking advantage of macro-magic. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the bug report though - shouldn't be too hard to fix. I didn't notice working locally with Revise.

@KristofferC
Copy link
Member

This implements #21925 I think.

Copy link
Member

@LilithHafner LilithHafner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_free_vars extracts all vars, not just free vars. This is an issue when the outlined body defines bound variables:

julia> function bar(n)
           x = rand(Bool) ? 7 : 7.0
           @outline sum(x^2 for _ in 1:n)
       end
ERROR: syntax: all-underscore identifiers are write-only and their values cannot be used in expressions around REPL[1]:6
Stacktrace:
 [1] top-level scope
   @ REPL[13]:1

julia> function bar(n)
           x = rand(Bool) ? 7 : 7.0
           @outline sum(x^2 for this_name_not_defined_in_main in 1:n)
       end
bar (generic function with 1 method)

julia> bar(4)
ERROR: UndefVarError: `this_name_not_defined_in_main` not defined in `Main`
Suggestion: check for spelling errors or missing imports.
Stacktrace:
 [1] macro expansion
   @ ./REPL[1]:6 [inlined]
 [2] bar(n::Int64)
   @ Main ./REPL[40]:3
 [3] top-level scope
   @ REPL[41]:1

Also tagging triage to discuss a significant new feature.

@LilithHafner LilithHafner added the triage This should be discussed on a triage call label Jan 22, 2025
Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>
@@ -235,14 +235,14 @@ macro assert(ex, msgs...)
# message is an expression needing evaluating
# N.B. To reduce the risk of invalidation caused by the complex callstack involved
# with `string`, use `inferencebarrier` here to hide this `string` from the compiler.
msg = :(Main.Base.inferencebarrier(Main.Base.string)($(esc(msg))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this esc needed?

@JeffBezanson
Copy link
Member

  • Computing free variables manually from a macro is going to be very difficult and I would not attempt it.
  • Making a new function for every occurrence of assert seems pretty inefficient. We could use a single noinline function instead I believe?

@LilithHafner
Copy link
Member

Triage likes the idea; nice feature to have if we can get a good implementation.

@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label Jan 30, 2025
@NHDaly
Copy link
Member Author

NHDaly commented Jan 31, 2025

Thanks all!

  • Computing free variables manually from a macro is going to be very difficult and I would not attempt it.

Ah, yeah interesting. Thanks for the explanation of the issue here, @LilithHafner and @JeffBezanson. It makes sense to me now.

I originally considered a 0-arg closure, like @adienes suggested, but as discussed above, julia closures currently have a bunch of perf landmines that makes this not a great option.

I'm not sure if this is something that can be resolved alone by a macro without perf impact, then. Maybe this would need compiler support, to be able to identify truly free variables?

EDIT: @LilithHafner 🤔 is this not something that is statically determinable from the syntax? I.e. is this not something that JuliaSyntax could answer for me? For example in your two examples, it's clear that those identifiers are introducing a name, not referencing one.
EDIT 2: Hrm too bad, i just did some digging and it doesn't look like JuliaSyntax.jl does keep track of this kind of thing. I guess that is delegated to lowering. But still, this does seem like something we should be able to statically interrogate from the Compiler?

@NHDaly
Copy link
Member Author

NHDaly commented Jan 31, 2025

  • Making a new function for every occurrence of assert seems pretty inefficient. We could use a single noinline function instead I believe?

@JeffBezanson i'm not 100% sure, but i don't think so: The goal of this change is to (as @StefanKarpinski suggested in #29688) outline not just the actual throw statement itself, but to also outline all the logic used to generate the exception message.

As a simple example, if the user writes:

@assert x > 0 && x < len(foo) && foo[x] < 0 "Either $x is out of bounds or $(foo)[$x] = $(foo[x]) is not negative"

this generates:

if x > 0 && x < len(foo) && foo[x] < 0
    throw(AssertionError("Either $x is out of bounds or $(foo)[$x] = $(foo[x]) is not negative"))
end

and the function body is going to blow up with all the code to pretty-print x, foo, and foo[x].

So we are already going to compile all this code anyway, but currently we're going to compile it into the callsite.
What we want to do is outline the generated code from that compilation, but I don't think we are changing the total amount of code being compiled. So I think it doesn't seems particularly inefficient to generate a new function for this outlining? And I don't think it's possible to move the code into a single noinline function? But maybe i'm not understanding the crux of your suggestion! :)
Does this make sense? Or can you elaborate if I'm misunderstanding?

@JeffBezanson
Copy link
Member

OK, I was thinking the goal was just to outline the allocation and throw. It's different if the assertion has a complex message expression, which not every assertion does.

But I still think it's too much to make a function for every single assertion. Outlining this code is a micro-optimization that is not always needed. Doing it where needed is fine. But scaling this over every assertion expression can indeed put a lot of strain on the compiler and just feels very bulky for something so simple.

I think a good compromise is to evaluate the message expression (which is often a constant!) in-line and call a @noinline assert_fail(msg) = throw(AssertionError(msg)) if that is helpful.

@oscardssmith
Copy link
Member

What we would ideally do is move all the code on the error path to a part of the assembly such that it didn't get branch predicted which should be the best of both worlds. Doing so, of course would require teaching LLVM a lot more about Julia exceptions.

@NHDaly
Copy link
Member Author

NHDaly commented Feb 4, 2025

Oscar that would definitely help, although in practice I think some of the perf issues also come from just having a bloated LLVM IR from the error message string construction and the throw, rather than having a nice tight few-instructions function, which can get inlined, and is friendlier on instruction caches, etc.

For example, checked_add is very sensitive to this outlining.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants