optimize get_ith #286
Nightly failures unrelated |
We just hit a Vararg length limitation here. Tuple reducer using It should be noted that the current version gives a stronger guarantee of inlining. As There's also some concern about inferability of nested |
Could you please suggest a few
Are you talking about the |
I believe map itself is not designed to handle tuples this long, though. It would give up recursion and fall back to a collect to As for inlining, the current code is more likely to be inlined. In fact, |
Indeed, sounds like And it would be really great if you could suggest more inference or related tests for StructArrays, so that we could catch (some of) these potential issues automatically in the future. |
In fact, the limitation comes from As for inference, I believe a nested |
I also seem to remember we switched to this version as a |
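Pushed a straightforward Don't know how to add something relevant to tests, but here are benchmarks:
using StructArrays, StaticArrays, BenchmarkTools  # SMatrix comes from StaticArrays, @btime from BenchmarkTools
A = StructArray([SMatrix{6,6}(rand(6,6)) for _ in 1:10])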
@btime $A[5]
# before:
# 6.948 μs (155 allocations: 5.36 KiB)
# after:
# 1.021 μs (39 allocations: 1.45 KiB)
B = StructArray(a=StructArray(b=StructArray(c=StructArray(d=StructArray(e=StructArray(f=1:10))))))
@btime $B[5]
# before:
# 1.500 ns (0 allocations: 0 bytes)
# after:
# 1.500 ns (0 allocations: 0 bytes)
C = StructArray(;[Symbol("x$i") => 1:10 for i in 1:50]...)
@btime $C[5]
# before:
# 42.375 μs (854 allocations: 45.86 KiB)
# after:
# 1.092 μs (3 allocations: 1.31 KiB) |
So there are still some allocations remaining? Perhaps it's a similar problem caused by a splatted Tuple. I guess you need to fix that too to make your case fully optimized. If we intend to optimize this package for eltypes that are big structs, it would be good to ensure it works in more cases. For example, inference would apparently be blocked by |
That's a bit above my head – tried to specialize more functions involved, but it didn't help. Suggestions welcome! And this PR is a strict improvement anyway, with more than an order of magnitude speedup. This can easily turn the overhead from "dominates the profview" to "barely visible there". |
I haven't tested it but the new |
I also like the generated function approach, it does seem very simple and elegant. I've left a couple minor suggestions, but overall looks good to me. |
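For readers unfamiliar with the technique, here is a minimal sketch of what a generated-function row getter can look like. It is an illustration under assumptions, not the code from this PR: the name `get_ith_sketch` is hypothetical, and it works on a plain tuple or named tuple of columns rather than on StructArrays internals. The idea is that the number of columns is known from the type, so the generated body is a flat tuple of `getindex` calls with no recursion or splatting over a long tuple.

```julia
# Hypothetical sketch, not the PR's implementation.
# Inside a @generated function the argument names refer to *types*,
# so the column count and column names are available at compile time.
@generated function get_ith_sketch(cols::NamedTuple, I...)
    # One `getfield(cols, i)[I...]` expression per column.
    args = [:(getfield(cols, $i)[I...]) for i in 1:fieldcount(cols)]
    names = fieldnames(cols)
    return :(NamedTuple{$names}(($(args...),)))
end

@generated function get_ith_sketch(cols::Tuple, I...)
    args = [:(getfield(cols, $i)[I...]) for i in 1:fieldcount(cols)]
    return :(($(args...),))
end

# Usage: fetch "row" 2 from three columns.
cols = (a = [1, 2, 3], b = [10.0, 20.0, 30.0], c = ["x", "y", "z"])
row = get_ith_sketch(cols, 2)
```

Whether such a sketch matches the inlining and inference behaviour discussed above would have to be checked with benchmarks like the ones posted earlier in the thread.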
Co-authored-by: Pietro Vertechi <pietro.vertechi@protonmail.com>
Followed those comments, should be ready! |
gentle bump... |
bump |
I'll take the liberty of merging this, following advice on Slack #arrays. The PR didn't get a strong "no" from maintainers, and the changes are non-breaking and quite local (i.e. not an overhaul). |
Makes getindex and related functions much faster on arrays with a few tens of components.
Narrow arrays:
Intermediate arrays:
I guess originally there was some reason why the current implementation was chosen instead of a more naive one? The latter turns out to be faster now.