Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

next() performance regression in 0.5-rc3 #7

Closed
J-Revell opened this issue Sep 5, 2016 · 4 comments
Closed

next() performance regression in 0.5-rc3 #7

J-Revell opened this issue Sep 5, 2016 · 4 comments

Comments

@J-Revell
Copy link

J-Revell commented Sep 5, 2016

There seems to be a bit of a performance dip (10x) in 0.5-rc3 compared to 0.4.6, especially for scaled Sobol Sequences

e.g.
The benchmark on 0.5-rc3

S=SobolSeq(1,4,6)
@benchmark next(S)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 9
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 96.00 bytes
allocs estimate: 1
minimum time: 2.81 μs (0.00% GC)
median time: 2.91 μs (0.00% GC)
mean time: 2.90 μs (0.00% GC)
maximum time: 6.16 μs (0.00% GC)

While on 0.4.6

S=SobolSeq(1,4,6)
@benchmark next(S)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 251
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 80.00 bytes
allocs estimate: 1
minimum time: 306.00 ns (0.00% GC)
median time: 321.00 ns (0.00% GC)
mean time: 329.61 ns (1.27% GC)
maximum time: 6.40 μs (93.92% GC)

Any ideas?

@stevengj
Copy link
Member

stevengj commented Sep 6, 2016

Using next! or skip to isolate the actual Sobol code from the allocation of the array, I don't get a 10x slowdown, but I do get about a 35% slowdown.

In Julia 0.5.0-rc2:

julia> using Sobol

julia> S=SobolSeq(1,4,6)
1-dimensional scaled Sobol sequence on [4.0,6.0]

julia> @time skip(S, 1) # isolate the time to compile skip
  0.136152 seconds (183.29 k allocations: 7.836 MB)

julia> @time skip(S, 10^6)
  0.007619 seconds (7 allocations: 288 bytes)

In Julia 0.4.6:

julia> using Sobol

julia> S=SobolSeq(1,4,6)
1-dimensional scaled Sobol sequence on [4.0,6.0]

julia> @time skip(S, 1) # isolate the time to compile skip
  0.174235 seconds (288.54 k allocations: 11.935 MB)

julia> @time skip(S, 10^6)
  0.005546 seconds (7 allocations: 272 bytes)

julia> @time skip(S, 10^6)
  0.005500 seconds (7 allocations: 272 bytes)

@yuyichao, any ideas?

@yuyichao
Copy link

yuyichao commented Sep 6, 2016

I can reproduce the performance regression of next but not skip on 0.6.0-dev.501 (current master).

According to the profile, the 10x performance regression in next seems to be caused by dynamic dispatch so @vtjnash. Profile below

33336 /home/yuyichao/tmp/julia/3/Sobol.jl/script.jl:9; f(::Sobol.ScaledSobolSeq, ::Int64)
 31593 ...e/yuyichao/projects/julia/master/src/gf.c:1887; jl_apply_generic
  31551 ...rojects/julia/master/src/julia_internal.h:720; jl_typemap_assoc_exact
   31498 ...ichao/projects/julia/master/src/typemap.c:862; jl_typemap_level_assoc_exact
    8982  ...chao/projects/julia/master/src/typemap.c:798; jl_typemap_entry_assoc_exact
    1872  ...chao/projects/julia/master/src/typemap.c:799; jl_typemap_entry_assoc_exact
    582   ...chao/projects/julia/master/src/typemap.c:803; jl_typemap_entry_assoc_exact
    1177  ...chao/projects/julia/master/src/typemap.c:813; jl_typemap_entry_assoc_exact
     1169 ...chao/projects/julia/master/src/typemap.c:107; sig_match_leaf
    16141 ...chao/projects/julia/master/src/typemap.c:817; jl_typemap_entry_assoc_exact
     635  ...chao/projects/julia/master/src/typemap.c:124; sig_match_simple
     524  ...chao/projects/julia/master/src/typemap.c:126; sig_match_simple
     561  ...chao/projects/julia/master/src/typemap.c:129; sig_match_simple
     1968 ...chao/projects/julia/master/src/typemap.c:135; sig_match_simple
      1915 ...yichao/projects/julia/master/src/julia.h:941; jl_is_type_type
     1403 ...chao/projects/julia/master/src/typemap.c:136; sig_match_simple
      1403 ...yichao/projects/julia/master/src/julia.h:662; jl_svecref
     588  ...chao/projects/julia/master/src/typemap.c:137; sig_match_simple
     2832 ...chao/projects/julia/master/src/typemap.c:142; sig_match_simple
     6758 ...chao/projects/julia/master/src/typemap.c:147; sig_match_simple
      411  ...hao/projects/julia/master/src/jltypes.c:1715; jl_types_equal
       411 ...hao/projects/julia/master/src/jltypes.c:1710; type_eqv_
        411 ...hao/projects/julia/master/src/jltypes.c:1666; type_eqv__
      1741 ...hao/projects/julia/master/src/jltypes.c:1664; type_eqv__
      1018 ...hao/projects/julia/master/src/jltypes.c:1668; type_eqv__
      432  ...hao/projects/julia/master/src/jltypes.c:1670; type_eqv__
      422  ...hao/projects/julia/master/src/jltypes.c:1672; type_eqv__
      955  ...hao/projects/julia/master/src/jltypes.c:1685; type_eqv__
      717  ...hao/projects/julia/master/src/jltypes.c:1706; type_eqv__
    1203  ...chao/projects/julia/master/src/typemap.c:831; jl_typemap_entry_assoc_exact
 1130  ...e/yuyichao/projects/julia/master/src/gf.c:1928; jl_apply_generic
  1089 ...rojects/julia/master/src/julia_internal.h:185; jl_call_method_internal
   577 ...projects/julia/master/usr/lib/julia/sys.so:?; 
    560 ./boot.jl:330; Array{T,N}(::Type{Float64}, ::Int64)
     532 ...uyichao/projects/julia/master/src/array.c:346; jl_alloc_array_1d
      103 ...yichao/projects/julia/master/src/array.c:146; _new_array
      353 ...yichao/projects/julia/master/src/array.c:149; _new_array
       206 ...yichao/projects/julia/master/src/array.c:95; _new_array_
        192 ...yuyichao/projects/julia/master/src/gc.c:1888; jl_gc_alloc
         181 ...jects/julia/master/src/julia_internal.h:144; jl_gc_alloc_
    469 ...uyichao/tmp/julia/3/Sobol.jl/src/Sobol.jl:133; next!(::Sobol.ScaledSobolSeq, ::Array{Float64,1})
     334 .../yuyichao/projects/julia/master/src/gf.c:1928; jl_apply_generic
      270 ...ojects/julia/master/src/julia_internal.h:185; jl_call_method_internal
        174 ...ichao/tmp/julia/3/Sobol.jl/src/Sobol.jl:125; next!(::Sobol.SobolSeq{1}, ::Array{Float64,1}, ::Array{Float...

@vtjnash
Copy link

vtjnash commented Sep 10, 2016

Something like JuliaLang/julia#16418 is required to make the v0.5 version fast again (this is the penalty of the jb/functions branch), but I'm stalled there on enumerating which cases that code needs to handle to mimic jl_type_less_specific.

But in the meantime, the following diff makes the code 2.5x faster and eliminates the performance regression:

diff --git a/src/Sobol.jl b/src/Sobol.jl
index 73e5f48..fba7773 100644
--- a/src/Sobol.jl
+++ b/src/Sobol.jl
@@ -77,7 +77,7 @@ function next!{T<:AbstractFloat}(s::SobolSeq, x::AbstractVector{T})
     end
     return x
 end
-next(s::SobolSeq) = next!(s, Array(Float64,ndims(s)))
+next(s::SobolSeq) = next!(s, Array{Float64,1}(ndims(s)))

 # if we know in advance how many points (n) we want to compute, then
 # adopt the suggestion of the Joe and Kuo paper, which in turn
@@ -88,7 +88,7 @@ function skip!(s::SobolSeq, n::Integer, x)
     for unused=1:nskip; next!(s,x); end
     return nothing
 end
-skip(s::SobolSeq, n::Integer) = skip!(s, n, Array(Float64, ndims(s)))
+skip(s::SobolSeq, n::Integer) = skip!(s, n, Array{Float64,1}(ndims(s)))

 function show(io::IO, s::SobolSeq)
     print(io, "$(ndims(s))-dimensional Sobol sequence on [0,1]^$(ndims(s))")
@@ -115,7 +115,7 @@ type ScaledSobolSeq
         new(SobolSeq(N), lb, ub)
 end
 SobolSeq(N::Integer, lb, ub) =
-    ScaledSobolSeq(N, copy!(Array(Float64,N),lb), copy!(Array(Float64,N),ub))
+    ScaledSobolSeq(N, copy!(Array{Float64,1}(N),lb), copy!(Array{Float64,1}(N),ub))

 ndims(s::ScaledSobolSeq) = ndims(s.s)

@@ -128,10 +128,10 @@ function next!(s::SobolSeq, x::Vector{Float64},
     end
     return x
 end
-next(s::SobolSeq, lb::Vector, ub::Vector) = next!(s, Array(Float64,N), lb, ub)
+next(s::SobolSeq, lb::Vector, ub::Vector) = next!(s, Array{Float64,1}(N), lb, ub) # TODO: where did N come from?

 next!(s::ScaledSobolSeq, x::Vector{Float64}) = next!(s.s, x, s.lb, s.ub)
-next(s::ScaledSobolSeq) = next!(s, Array(Float64,ndims(s)))
+next(s::ScaledSobolSeq) = next!(s, Array{Float64,1}(ndims(s)))

 start(s::ScaledSobolSeq) = s
 next(s::ScaledSobolSeq, s_::ScaledSobolSeq) = (next(s), s_)

for a bonus 30% extra speed we can even eliminate a dynamic dispatch by hardcoding the type of the size of the SobolSeq to Int:

diff --git a/src/Sobol.jl b/src/Sobol.jl
index 73e5f48..4c81ee2 100644
--- a/src/Sobol.jl
+++ b/src/Sobol.jl
@@ -17,7 +17,7 @@ type SobolSeq{N}
     n::UInt32 #number of x's generated so far
 end

-ndims{N}(s::SobolSeq{N}) = N
+ndims{N}(s::SobolSeq{N}) = N::Int

 function SobolSeq(N::Int)
     (N < 0 || N > 1111) && error("invalid Sobol dimension")

@stevengj
Copy link
Member

stevengj commented Sep 11, 2016

Thanks @vtjnash, I will implement those suggestions. (This will require us to drop 0.3 support in the next version, but I don't think that's a problem.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants