Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve aliasing problems in getindex #60

Merged
merged 3 commits into from
Mar 1, 2021
Merged

Resolve aliasing problems in getindex #60

merged 3 commits into from
Mar 1, 2021

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented Mar 1, 2021

Life is hard. I am not sure how to write a fully generic code that would work on all supported versions of Julia and be fast.

Therefore I propose half-measure that resolves the following problem:

Timing on 1.2.0 release

julia> @time x = PooledArray(1:10^5);
  0.007226 seconds (59 allocations: 6.718 MiB)

julia> @time v = view(x, [1:10^5;]);
  0.000352 seconds (3 allocations: 781.375 KiB)

julia> @time s = zeros(10^5);
  0.000396 seconds (2 allocations: 781.328 KiB)

julia> @time copyto!(s, v);
  2.714555 seconds (298.98 k allocations: 7.614 MiB)

and this is timing with this PR:

julia> @time x = PooledArray(1:10^5);
  0.006874 seconds (59 allocations: 6.718 MiB)

julia> @time v = view(x, [1:10^5;]);
  0.000355 seconds (3 allocations: 781.375 KiB)

julia> @time s = zeros(10^5);
  0.000390 seconds (2 allocations: 781.328 KiB)

julia> @time copyto!(s, v);
  0.000191 seconds

What is the problem - we cannot use view for DataAPI.refarray in general if we want to be fast as if indices are a general AbstractVector then view must do unaliasing and this is extremely slow.

Therefore what I propose is that we use fast path for PooledArray or its view only if the first index in getindex is AbstractVector (it should cover 99.99% of use-cases I can think of - especially given that PooledArray did not work correctly for more than 1 dimension anyway in the past an no-one reported it before). In other cases we fall-back to mechanics of Julia Base that avoid allocations (and rewriting them in PooledArrays.jl would be an overkill especially that internals changed since Julia 1.0).

@bkamins bkamins requested review from nalimilan and quinnj March 1, 2021 15:28
@codecov
Copy link

codecov bot commented Mar 1, 2021

Codecov Report

Merging #60 (b523437) into main (040053c) will increase coverage by 0.06%.
The diff coverage is 90.90%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #60      +/-   ##
==========================================
+ Coverage   88.04%   88.10%   +0.06%     
==========================================
  Files           1        1              
  Lines         276      269       -7     
==========================================
- Hits          243      237       -6     
+ Misses         33       32       -1     
Impacted Files Coverage Δ
src/PooledArrays.jl 88.10% <90.90%> (+0.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 040053c...d1983de. Read the comment docs.

Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bkamins
Copy link
Member Author

bkamins commented Mar 1, 2021

I need to resolve one more thing before merging. I will report.

@bkamins
Copy link
Member Author

bkamins commented Mar 1, 2021

I incorrectly used extrema for empty PooledArray - a case that was not tested.

@bkamins bkamins linked an issue Mar 1, 2021 that may be closed by this pull request
@bkamins
Copy link
Member Author

bkamins commented Mar 1, 2021

@nalimilan - I will merge it to fix #61. Let us then later discuss if we can have a cleaner code please.

@bkamins bkamins merged commit 080771a into main Mar 1, 2021
@bkamins bkamins deleted the bk/faster_getindex branch March 1, 2021 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

similar(pa, 0) stopped working
2 participants