Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline StepRange construction #49270

Merged
merged 5 commits into from
May 4, 2023
Merged

Inline StepRange construction #49270

merged 5 commits into from
May 4, 2023

Conversation

Zentrik
Copy link
Member

@Zentrik Zentrik commented Apr 5, 2023

This can improve performance when iterating over a range with a step as currently steprange_last is not inlined.

using BenchmarkTools

x = rand(Float32, 80)
a = rand(round(Int, length(x) / 2):length(x), 10^6)

function test(x, a)
    c = zero(Float32)

    for j in a
        for i in 1:8:j
            c += x[i]
        end
    end

    return c
end

On 1bf65b9 (similar results on 1.8.5),

julia> @benchmark test($x, $a)
BenchmarkTools.Trial: 380 samples with 1 evaluation.
 Range (min  max):  12.714 ms   20.363 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     13.091 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   13.141 ms ± 470.076 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

        ▄▄▁  ▁▂█▃▂▅▃▆▇▁▂ ▂                                      
  ▃▁▁▃▃▆████▇██████████████▇▆▄▅▄▃▃▃▃▃▁▃▃▁▁▃▁▁▁▁▁▃▁▁▁▁▁▁▁▃▁▃▃▁▃ ▄
  12.7 ms         Histogram: frequency by time         14.1 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

With this change

julia> @benchmark test($x, $a)
BenchmarkTools.Trial: 669 samples with 1 evaluation.
 Range (min  max):  7.376 ms   8.252 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     7.459 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.465 ms ± 57.103 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

              ▂▆▂▆▇█▆█▆█▂▇▆▄▇▄▆█▄ ▂▂▂▁▄                       
  ▃▃▂▂▄▄▁▄▆▆████████████████████████████▄▅▄▄▄▄▅▄▄▃▃▄▂▄▂▃▄▂▁▂ ▅
  7.38 ms        Histogram: frequency by time        7.58 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

While this is a contrived example, if using SIMD.jl you want to loop over 1:8:length(x) and if that's in an inner loop that can't be inlined or if the length is changing this will improve performance.

See https://discourse.julialang.org/t/iterating-over-range-is-slower-than-while-loop/96991 for investigation.

@Zentrik Zentrik changed the title Inline StepRange constructors Inline StepRange construction Apr 6, 2023
@gbaraldi gbaraldi requested a review from oscardssmith April 6, 2023 12:39
Zentrik added 3 commits April 6, 2023 13:46
This can improve performance when iterating over a range with a step as currently steprange_last is not inlined. About a 12ns improvement
I don't think this should slow anything down as at worst you're making a call which is what you'd have to do without inlining. Also, it should be unlikely that branch is taken.
This should allow for better automatic inlining.
@KristofferC
Copy link
Member

Some more conversation in #26770.

@inkydragon inkydragon added performance Must go faster ranges Everything AbstractRange labels Apr 8, 2023
@KristofferC KristofferC merged commit 2e80c0d into JuliaLang:master May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster ranges Everything AbstractRange
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants