GC.@preserve blocks SIMD: instruction cannot be vectorized #36803

Closed
chriselrod opened this issue Jul 26, 2020 · 1 comment · Fixed by #36809
Labels
compiler:simd instruction-level vectorization GC Garbage collector performance Must go faster

Comments

@chriselrod
Contributor

struct Wrapper{T} <: DenseVector{T}
    data::Vector{T}
end
Base.length(w::Wrapper) = length(w.data)
Base.size(w::Wrapper) = size(w.data)
Base.unsafe_convert(::Type{Ptr{T}}, w::Wrapper{T}) where {T} = Base.unsafe_convert(Ptr{T}, w.data)

@inline function Base.getindex(w::Wrapper, i::Integer)
    @boundscheck (0 < i ≤ length(w)) || throw(BoundsError(w, i))
    GC.@preserve w begin
        v = unsafe_load(pointer(w), i)
    end
    v
end
@inline function Base.setindex!(w::Wrapper{T}, v, i::Integer) where {T}
    @boundscheck (0 < i ≤ length(w)) || throw(BoundsError(w, i))
    GC.@preserve w begin
        unsafe_store!(pointer(w), v, i)
    end
    v
end

function mysum(x)
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        s += x[i]
    end
    s
end

By setting JULIA_LLVM_ARGS to --pass-remarks-analysis=loop-vectorize --pass-remarks-missed=loop-vectorize --pass-remarks=loop-vectorize, I get:

remark: simdloop.jl:75:0: loop not vectorized: instruction cannot be vectorized
remark: simdloop.jl:75:0: loop not vectorized
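
For anyone reproducing these remarks: the variable most likely has to be in the environment before Julia starts, since LLVM options are read at startup. The following is a minimal sketch of launching a child Julia process with the flags set. It assumes Julia 1.6 or later (for `addenv`), and the script name `repro.jl` is a hypothetical placeholder for a file containing the definitions above plus a call to `mysum`.

```julia
# Flags quoted in the issue, joined into one space-separated string.
flags = join([
    "--pass-remarks-analysis=loop-vectorize",
    "--pass-remarks-missed=loop-vectorize",
    "--pass-remarks=loop-vectorize",
], " ")

# Hypothetical script holding the Wrapper/mysum definitions and a call to mysum.
script = "repro.jl"

# Build a command that runs the same Julia binary with JULIA_LLVM_ARGS added
# to the inherited environment (addenv keeps the existing variables).
cmd = addenv(`$(Base.julia_cmd()) $script`, "JULIA_LLVM_ARGS" => flags)

# Uncomment once repro.jl exists; remarks are printed by the child process.
# run(cmd)
```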

Removing the GC.@preserve yields:

remark: simdloop.jl:75:0: vectorized loop (vectorization width: 4, interleaved count: 4)

I believe the GC.@preserve is necessary, so it'd be nice if it didn't carry a potential performance penalty.
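
Another way to check for vectorization, without pass remarks, is to look for vector types in the optimized LLVM IR. The sketch below repeats the read-only part of the definitions above so that it runs on its own. The regex test for `<N x double>` is a rough heuristic introduced here, not something from the issue, and its result depends on the Julia version and the CPU.

```julia
using InteractiveUtils  # provides code_llvm

struct Wrapper{T} <: DenseVector{T}
    data::Vector{T}
end
Base.length(w::Wrapper) = length(w.data)
Base.size(w::Wrapper) = size(w.data)
Base.unsafe_convert(::Type{Ptr{T}}, w::Wrapper{T}) where {T} = Base.unsafe_convert(Ptr{T}, w.data)

@inline function Base.getindex(w::Wrapper, i::Integer)
    @boundscheck (0 < i ≤ length(w)) || throw(BoundsError(w, i))
    # Keep w (and therefore w.data) rooted while the raw pointer is in use.
    GC.@preserve w begin
        v = unsafe_load(pointer(w), i)
    end
    v
end

function mysum(x)
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        s += x[i]
    end
    s
end

w = Wrapper(collect(1.0:1024.0))
total = mysum(w)

# Dump the optimized IR for mysum on a Wrapper{Float64} into a string.
ir = sprint(io -> code_llvm(io, mysum, (Wrapper{Float64},); debuginfo=:none))

# Heuristic (assumption): vectorized code contains types such as <4 x double>.
has_vector_ops = occursin(r"<\d+ x double>", ir)
println("sum = ", total, ", vector ops in IR: ", has_vector_ops)
```

Swapping the `GC.@preserve` block for a bare `unsafe_load(pointer(w), i)` and comparing the two results gives the same with/without comparison as the remarks above.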

@Keno
Member

Keno commented Jul 26, 2020

Yes, the GC preserve is necessary, but should be removable in optimization. Perhaps we need to reorder or strengthen the propagate_addressspaces pass.

yuyichao added a commit that referenced this issue Jul 26, 2020
@StefanKarpinski StefanKarpinski added GC Garbage collector performance Must go faster compiler:simd instruction-level vectorization labels Jul 28, 2020
simeonschaub pushed a commit to simeonschaub/julia that referenced this issue Aug 11, 2020