-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Weird slowdowns when grouping on PooledArray #2564
Comments
I think it is a benchmarking artifact. A small thing is that we could replace Finally - and I believe this is a major potential gain: if Finally in DataAPI.jl we could add a function that for a given array returns information if Sorry for a long list of things, but in combination they potentially will improve speed in the most common case (my benchmarks show that at least ~10x speedup is to be gained in the common scenarios) + possibly also compile time latency. |
You mean that allocating a vector really takes about 25ms?
Sure we can do that.
Unfortunately in my tests I couldn't find a faster implementation than the current one for the example above (i.e. with only 10 groups). This is what prompted me to file this issue. Even removing all the code handling multiple columns,
Yes that's #2558 right? |
No I mean that I have checked several options with running
This is the speedup:
So:
Right - I just wanted to keep all things in one place. |
Update! I have checked PooledArrays.jl#master (which has
so indeed As you can see we can win ~50% of time (just by saving space with |
We could even go for:
and set The only challenge will be testing against What do you think? |
Note that We could imagine adding a field to One issue with using |
agreed, but I assumed it would be fast (maybe I am wrong).
I thought about it, but this is problematic, also with views (see JuliaData/PooledArrays.jl#44 as a related issue).
This is not a problem I think. We can check how many rows a data frame has. If it is less than Having said that - I think that the current code is quite fast. However, I believe that even if we keep the current code but switch it to |
OK, it seems that using
I've pushed changes to test |
This is fantastic - I will have a look and report back. Also I will test how later |
For your
can you please cross check that? If this is not some bug on my side this means that we have a significant problem in fast reductions implementation. EDIT: the performance problem is with |
@nalimilan - is this issue boiled down to using |
Yes I think so. Indeed it's not breaking so it can happen after 1.0. Though it could be worth implementing now in order to look good in benchmarks. I had missed your previous comment. Indeed I also see the very bad performance of |
OK - thank you for looking into this (both changing to At some point same thing can be done for joins, see #2688. |
I resolve the slowdown part in #2708. The |
I have a mostly working implementation in the (updated) nl/UInt32 branch. But actually I'm not sure it's really worth the increased complexity. It's about 30% faster for using Revise, DataFrames, BenchmarkTools, PooledArrays
N = 100_000_000;
df = DataFrame(x=PooledArray(rand(1:10, N), UInt32), y=rand(N));
# main
julia> @btime groupby(df, :x);
29.111 ms (40 allocations: 76.30 MiB)
julia> @btime combine(groupby(df, :x), :y => sum);
61.132 ms (240 allocations: 76.31 MiB)
# branch:
julia> @btime groupby(df, :x);
21.233 ms (45 allocations: 38.15 MiB)
julia> @btime combine(groupby(df, :x), :y => sum);
49.567 ms (244 allocations: 38.16 MiB) |
This was also my fear. Adding So what do we do. Close it for now and the same with #2688, or we keep them open for the future decision (then I would rename this issue to clearly indicate that using |
It seems
row_group_slots
for arrays implementingDataAPI.refpool
is much slower than it should. Profiling reveals that most of the time is spent on thegroups[i] = j
line. Yet, callingrow_group_slots
directly is twice faster than usinggroupby
. The difference seems to be due to the allocation of thegroups
vector, but allocating the array separately is very fast. Not sure whether that's a benchmarking artifact.This is not new as it also happens both on master and on 0.21.8, and both with Julia 1.5 and 1.6.
The text was updated successfully, but these errors were encountered: