-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
next() performance regression in 0.5-rc3 #7
Comments
Using In Julia 0.5.0-rc2:
In Julia 0.4.6:
@yuyichao, any ideas? |
I can reproduce the performance regression of According to the profile, the 10x performance regression in 33336 /home/yuyichao/tmp/julia/3/Sobol.jl/script.jl:9; f(::Sobol.ScaledSobolSeq, ::Int64)
31593 ...e/yuyichao/projects/julia/master/src/gf.c:1887; jl_apply_generic
31551 ...rojects/julia/master/src/julia_internal.h:720; jl_typemap_assoc_exact
31498 ...ichao/projects/julia/master/src/typemap.c:862; jl_typemap_level_assoc_exact
8982 ...chao/projects/julia/master/src/typemap.c:798; jl_typemap_entry_assoc_exact
1872 ...chao/projects/julia/master/src/typemap.c:799; jl_typemap_entry_assoc_exact
582 ...chao/projects/julia/master/src/typemap.c:803; jl_typemap_entry_assoc_exact
1177 ...chao/projects/julia/master/src/typemap.c:813; jl_typemap_entry_assoc_exact
1169 ...chao/projects/julia/master/src/typemap.c:107; sig_match_leaf
16141 ...chao/projects/julia/master/src/typemap.c:817; jl_typemap_entry_assoc_exact
635 ...chao/projects/julia/master/src/typemap.c:124; sig_match_simple
524 ...chao/projects/julia/master/src/typemap.c:126; sig_match_simple
561 ...chao/projects/julia/master/src/typemap.c:129; sig_match_simple
1968 ...chao/projects/julia/master/src/typemap.c:135; sig_match_simple
1915 ...yichao/projects/julia/master/src/julia.h:941; jl_is_type_type
1403 ...chao/projects/julia/master/src/typemap.c:136; sig_match_simple
1403 ...yichao/projects/julia/master/src/julia.h:662; jl_svecref
588 ...chao/projects/julia/master/src/typemap.c:137; sig_match_simple
2832 ...chao/projects/julia/master/src/typemap.c:142; sig_match_simple
6758 ...chao/projects/julia/master/src/typemap.c:147; sig_match_simple
411 ...hao/projects/julia/master/src/jltypes.c:1715; jl_types_equal
411 ...hao/projects/julia/master/src/jltypes.c:1710; type_eqv_
411 ...hao/projects/julia/master/src/jltypes.c:1666; type_eqv__
1741 ...hao/projects/julia/master/src/jltypes.c:1664; type_eqv__
1018 ...hao/projects/julia/master/src/jltypes.c:1668; type_eqv__
432 ...hao/projects/julia/master/src/jltypes.c:1670; type_eqv__
422 ...hao/projects/julia/master/src/jltypes.c:1672; type_eqv__
955 ...hao/projects/julia/master/src/jltypes.c:1685; type_eqv__
717 ...hao/projects/julia/master/src/jltypes.c:1706; type_eqv__
1203 ...chao/projects/julia/master/src/typemap.c:831; jl_typemap_entry_assoc_exact
1130 ...e/yuyichao/projects/julia/master/src/gf.c:1928; jl_apply_generic
1089 ...rojects/julia/master/src/julia_internal.h:185; jl_call_method_internal
577 ...projects/julia/master/usr/lib/julia/sys.so:?;
560 ./boot.jl:330; Array{T,N}(::Type{Float64}, ::Int64)
532 ...uyichao/projects/julia/master/src/array.c:346; jl_alloc_array_1d
103 ...yichao/projects/julia/master/src/array.c:146; _new_array
353 ...yichao/projects/julia/master/src/array.c:149; _new_array
206 ...yichao/projects/julia/master/src/array.c:95; _new_array_
192 ...yuyichao/projects/julia/master/src/gc.c:1888; jl_gc_alloc
181 ...jects/julia/master/src/julia_internal.h:144; jl_gc_alloc_
469 ...uyichao/tmp/julia/3/Sobol.jl/src/Sobol.jl:133; next!(::Sobol.ScaledSobolSeq, ::Array{Float64,1})
334 .../yuyichao/projects/julia/master/src/gf.c:1928; jl_apply_generic
270 ...ojects/julia/master/src/julia_internal.h:185; jl_call_method_internal
174 ...ichao/tmp/julia/3/Sobol.jl/src/Sobol.jl:125; next!(::Sobol.SobolSeq{1}, ::Array{Float64,1}, ::Array{Float... |
Something like JuliaLang/julia#16418 is required to make the v0.5 version fast again (this is the penalty of the jb/functions branch), but I'm stalled there on enumerating which cases that code needs to handle to mimic jl_type_less_specific. But in the meantime, the following diff makes the code 2.5x faster and eliminates the performance regression: diff --git a/src/Sobol.jl b/src/Sobol.jl
index 73e5f48..fba7773 100644
--- a/src/Sobol.jl
+++ b/src/Sobol.jl
@@ -77,7 +77,7 @@ function next!{T<:AbstractFloat}(s::SobolSeq, x::AbstractVector{T})
end
return x
end
-next(s::SobolSeq) = next!(s, Array(Float64,ndims(s)))
+next(s::SobolSeq) = next!(s, Array{Float64,1}(ndims(s)))
# if we know in advance how many points (n) we want to compute, then
# adopt the suggestion of the Joe and Kuo paper, which in turn
@@ -88,7 +88,7 @@ function skip!(s::SobolSeq, n::Integer, x)
for unused=1:nskip; next!(s,x); end
return nothing
end
-skip(s::SobolSeq, n::Integer) = skip!(s, n, Array(Float64, ndims(s)))
+skip(s::SobolSeq, n::Integer) = skip!(s, n, Array{Float64,1}(ndims(s)))
function show(io::IO, s::SobolSeq)
print(io, "$(ndims(s))-dimensional Sobol sequence on [0,1]^$(ndims(s))")
@@ -115,7 +115,7 @@ type ScaledSobolSeq
new(SobolSeq(N), lb, ub)
end
SobolSeq(N::Integer, lb, ub) =
- ScaledSobolSeq(N, copy!(Array(Float64,N),lb), copy!(Array(Float64,N),ub))
+ ScaledSobolSeq(N, copy!(Array{Float64,1}(N),lb), copy!(Array{Float64,1}(N),ub))
ndims(s::ScaledSobolSeq) = ndims(s.s)
@@ -128,10 +128,10 @@ function next!(s::SobolSeq, x::Vector{Float64},
end
return x
end
-next(s::SobolSeq, lb::Vector, ub::Vector) = next!(s, Array(Float64,N), lb, ub)
+next(s::SobolSeq, lb::Vector, ub::Vector) = next!(s, Array{Float64,1}(N), lb, ub) # TODO: where did N come from?
next!(s::ScaledSobolSeq, x::Vector{Float64}) = next!(s.s, x, s.lb, s.ub)
-next(s::ScaledSobolSeq) = next!(s, Array(Float64,ndims(s)))
+next(s::ScaledSobolSeq) = next!(s, Array{Float64,1}(ndims(s)))
start(s::ScaledSobolSeq) = s
next(s::ScaledSobolSeq, s_::ScaledSobolSeq) = (next(s), s_) for a bonus 30% extra speed we can even eliminate a dynamic dispatch by hardcoding the type of the size of the SobolSeq to diff --git a/src/Sobol.jl b/src/Sobol.jl
index 73e5f48..4c81ee2 100644
--- a/src/Sobol.jl
+++ b/src/Sobol.jl
@@ -17,7 +17,7 @@ type SobolSeq{N}
n::UInt32 #number of x's generated so far
end
-ndims{N}(s::SobolSeq{N}) = N
+ndims{N}(s::SobolSeq{N}) = N::Int
function SobolSeq(N::Int)
(N < 0 || N > 1111) && error("invalid Sobol dimension") |
Thanks @vtjnash, I will implement those suggestions. (This will require us to drop 0.3 support in the next version, but I don't think that's a problem.) |
There seems to be a bit of a performance dip (10x) in 0.5-rc3 compared to 0.4.6, especially for scaled Sobol Sequences
e.g.
The benchmark on 0.5-rc3
S=SobolSeq(1,4,6)
@benchmark next(S)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 9
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 96.00 bytes
allocs estimate: 1
minimum time: 2.81 μs (0.00% GC)
median time: 2.91 μs (0.00% GC)
mean time: 2.90 μs (0.00% GC)
maximum time: 6.16 μs (0.00% GC)
While on 0.4.6
S=SobolSeq(1,4,6)
@benchmark next(S)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 251
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 80.00 bytes
allocs estimate: 1
minimum time: 306.00 ns (0.00% GC)
median time: 321.00 ns (0.00% GC)
mean time: 329.61 ns (1.27% GC)
maximum time: 6.40 μs (93.92% GC)
Any ideas?
The text was updated successfully, but these errors were encountered: