Adding an example of an array type wrapper without loss of performance. #33420
Sorry, I have no idea what white spaces are meant by the test.
The whitespace CI in the logs is from the previous commit. Not sure why a new one didn't start.
It might also be worth mentioning
julia> Base.size(A::MyArray) = size(A.a)
julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...) = getindex(A.a, i...)
Perhaps this should encourage the passing on of keywords, for named A[i=3] type indexing?
julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...) = getindex(A.a, i...)
julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...; kw...) = getindex(A.a, i...; kw...)
Keyword arguments often cause various performance problems. Just look at this for example: JuliaArrays/StaticArrays.jl#540. I don't think it should be encouraged in a section dedicated to fast wrappers.
Good point. But is this a concern with unused kw... being passed along, or only when they are actually used? I can’t detect any effect on things I tried.
(And, is there a good explanation of this issue somewhere?)
The problem occurs when you call a function with keyword arguments. It's OK to have default keyword arguments, but even passing these default kwargs slows things down. That's why in some places keyword arguments are forwarded as normal arguments (as named tuples). There might be some exceptions that can be optimized by the compiler, but in my experience it's a good rule of thumb.
(And, is there a good explanation of this issue somewhere?)
I don't know, it seems to generally be hard to find good explanations of such compiler details.
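A minimal sketch of the named-tuple forwarding mentioned above, in case it helps. The names `scale` and `_scale_impl` are hypothetical and do not come from this thread or from any package; the sketch only shows the shape of the pattern and makes no claim about how much it helps in a given case.

```julia
# Hypothetical inner function: takes its options as an ordinary positional
# argument (a NamedTuple) instead of as keyword arguments.
_scale_impl(x, opts::NamedTuple) = x * opts.factor + opts.offset

# Hypothetical public entry point: accepts keywords once, packs them into a
# NamedTuple, and forwards that positionally to the inner function.
scale(x; factor=2, offset=0) = _scale_impl(x, (factor=factor, offset=offset))

scale(3)                      # uses the default keywords
scale(3; factor=3, offset=1)  # keywords are packed and forwarded
```

Only the outermost call sees keywords here; everything below it works with a plain positional argument.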
Hm, I'm not the biggest fan of encouraging
However, note that it leads to problems only after explicitly using
It simply should not be the case that a wrapped array has speed drawbacks.
My point is that I want to encourage folks to use
julia> module M
       using Random
       struct ShuffledVector{A,T} <: AbstractVector{T}
           data::A
           shuffle::Vector{Int}
       end
       ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
       Base.size(A::ShuffledVector) = size(A.data)
       Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
           A.data[A.shuffle[i]]
       end
       end
julia> s = M.ShuffledVector(1:4);
julia> s[5]
ERROR: BoundsError: attempt to access 4-element Array{Int64,1} at index [5]
Stacktrace:
 [1] getindex at ./array.jl:728 [inlined]
 [2] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[4]:10
 [3] top-level scope at REPL[6]:1
vs.
julia> module M
       using Random
       struct ShuffledVector{A,T} <: AbstractVector{T}
           data::A
           shuffle::Vector{Int}
       end
       ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
       Base.size(A::ShuffledVector) = size(A.data)
       Base.@inline function Base.getindex(A::ShuffledVector, i::Int)
           @boundscheck checkbounds(A, i)
           A.data[A.shuffle[i]]
       end
       end
julia> s = M.ShuffledVector(1:4);
julia> s[5]
ERROR: BoundsError: attempt to access 4-element Main.M.ShuffledVector{UnitRange{Int64},Int64} at index [5]
Stacktrace:
 [1] throw_boundserror(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Tuple{Int64}) at ./abstractarray.jl:538
 [2] checkbounds at ./abstractarray.jl:503 [inlined]
 [3] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[7]:10
 [4] top-level scope at REPL[9]:1
Both implementations will have the same performance.
Yeah, maybe the approach with
Yes, it'll perform exactly the same. The reason-for-being for @propagate_inbounds getindex(A::AbstractArray, I...) = _getindex(IndexStyle(A), A, to_indices(A, I)...)
Just corrected the sentence to restart the checks (one of them was stuck for some reason). After thinking a bit about it, I would prefer the
I don't see what makes the test fail here? Could someone else look into it? It is only the documentation, so I don't see how it can fail...
I restarted the failing workers.
This is missing one facet. I either have to use
module M
using Random
struct ShuffledVector{A,T} <: AbstractVector{T}
    data::A
    shuffle::Vector{Int}
end
ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
Base.size(A::ShuffledVector) = size(A.data)
Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
    @boundscheck checkbounds(A, i)
    A.data[A.shuffle[i]]
end
end
julia> s = M.ShuffledVector(1:4);
Note that
Now for the wrapper without
Note how both cases use
Now my variant with
This has led to confusion before, JuliaArrays/StaticArrays.jl#564
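For what it's worth, here is a small sketch for checking which array a `BoundsError` names when the combined variant (`@propagate_inbounds` plus an explicit `@boundscheck checkbounds`) is used. The type is the `ShuffledVector` from this thread, written outside the module `M`; the helper `failing_array` is hypothetical and only exists for this illustration. It relies on `BoundsError` carrying the offending array in its `a` field.

```julia
using Random

struct ShuffledVector{A,T} <: AbstractVector{T}
    data::A
    shuffle::Vector{Int}
end
ShuffledVector(A::AbstractVector{T}) where {T} =
    ShuffledVector{typeof(A),T}(A, randperm(length(A)))

Base.size(A::ShuffledVector) = size(A.data)

# Combined variant: the wrapper checks its own bounds, and an `@inbounds`
# in the caller is still propagated to the inner indexing operations.
Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
    @boundscheck checkbounds(A, i)
    A.data[A.shuffle[i]]
end

# Hypothetical helper: index `v` at `i` and return the array named in the
# resulting BoundsError, or `nothing` if indexing succeeded.
function failing_array(v, i)
    try
        v[i]
        return nothing
    catch err
        err isa BoundsError || rethrow()
        return err.a
    end
end

s = ShuffledVector(1:4)
failing_array(s, 5) isa ShuffledVector  # the error names the wrapper itself
failing_array(s, 1) === nothing         # in-bounds access does not throw
```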
This is exactly what it boils down to. I'm not sure there's a categorically "better" choice here; I think the arguments for and against both are highly opinion-based. Here are my opinions: Personally, I disagree that option one "makes it harder to verify the implementation of the wrapper." It's
Further, I think of
We already have a section on the Array interface, and I agree with mbauman that this doesn't seem beneficial to show
I found it really handy to know that in order to wrap an array and keep the performance you have to use Base.@propagate_inbounds on the getindex and setindex! functions. As I think this is a typical option used for dispatching, it should be mentioned in the docs.
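A minimal sketch of what that could look like, assuming a hypothetical wrapper type `WrappedVector` and a hypothetical helper `sum_inbounds` (neither is taken from the PR text). It follows the constructor pattern of the `ShuffledVector` examples in this thread and forwards both `getindex` and `setindex!` to the parent array.

```julia
# Hypothetical wrapper around any AbstractVector.
struct WrappedVector{A,T} <: AbstractVector{T}
    data::A
end
WrappedVector(data::AbstractVector{T}) where {T} =
    WrappedVector{typeof(data),T}(data)

Base.size(A::WrappedVector) = size(A.data)
Base.IndexStyle(::Type{<:WrappedVector}) = IndexLinear()

# Forward indexing to the parent; an `@inbounds` in the caller is passed on
# to the parent's own bounds checks.
Base.@propagate_inbounds Base.getindex(A::WrappedVector, i::Int) = A.data[i]
Base.@propagate_inbounds Base.setindex!(A::WrappedVector, v, i::Int) =
    setindex!(A.data, v, i)

# Hypothetical caller that opts out of bounds checks for a loop it knows
# to be in bounds.
function sum_inbounds(A::AbstractVector)
    s = zero(eltype(A))
    @inbounds for i in eachindex(A)
        s += A[i]
    end
    return s
end

w = WrappedVector([1.0, 2.0, 3.0])
w[2] = 5.0        # goes through the forwarded setindex!
sum_inbounds(w)   # reads through the forwarded getindex
```

Without `@inbounds` in the caller, an out-of-range index still throws a `BoundsError`, raised by the parent array.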