
Seeking suggestions to move forward #29

Closed
y1my1 opened this issue Apr 26, 2020 · 6 comments

Comments

@y1my1

y1my1 commented Apr 26, 2020

Dear Tamas,

After a few days of trying to figure out a good way to move forward with my problem, I found some issues that are not directly related to DynamicHMC, but I thought it would be better to ask for your suggestions, given your expertise.

I have followed your suggestions in trying to solve my problem. The first two were relatively easy to follow, but the third took me more time. After tracking down my problem, I found that my log-likelihood calculation involves an eigendecomposition, which is currently not supported for ForwardDiff's Dual type. This can be seen in this trivial example:

julia> da = [ForwardDiff.Dual(1., 0.) ForwardDiff.Dual(2., 3.); ForwardDiff.Dual(0., 1.) ForwardDiff.Dual(2., 0.)]
 2×2 Array{ForwardDiff.Dual{Nothing,Float64,1},2}:
 Dual{Nothing}(1.0,0.0)  Dual{Nothing}(2.0,3.0)
 Dual{Nothing}(0.0,1.0)  Dual{Nothing}(2.0,0.0)

julia> eigen(da)
ERROR: MethodError: no method matching eigen!(::Array{ForwardDiff.Dual{Nothing,Float64,1},2}; permute=true, scale=true, sortby=LinearAlgebra.eigsortby)
...

I found some discussions related to this issue here and here. I have also tried the Zygote AD, but ran into try/catch-related errors like this.

I have been thinking about alternative ways to move forward. These are a few routes I was considering:

  1. implement the eigenvalue/eigenvector calculation in some more pure-Julia way, or see whether I can avoid eigen entirely
  2. see whether it is possible to calculate the gradient manually
  3. fall back to a less efficient traditional MCMC method such as Metropolis–Hastings
  4. maybe check out this method after reading this discourse discussion

To be honest, I still think my problem is best suited to DynamicHMC, and I am not yet sure whether the alternative routes above would solve it. So I wonder whether you could give me some suggestions/comments on how to move forward? Any ideas and comments would be greatly appreciated. Thanks a lot.

Best regards,
Jack

@tpapp
Owner

tpapp commented Apr 26, 2020

I would try to stick with NUTS/HMC and work around the eigenvalues problem, e.g. make a small kernel function for that part that injects the derivatives manually. I usually look up these things in

@inproceedings{giles2008extended,
  title={An extended collection of matrix derivative results for forward and reverse mode algorithmic differentiation. An extended version of a paper that appeared in the proceedings of AD2008},
  author={Giles, M},
  booktitle={the 5th International Conference on Automatic Differentiation},
  year=2008
}
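
To make the "inject the derivatives manually" idea concrete, here is a minimal sketch of the forward-mode rules for a symmetric matrix with distinct eigenvalues, as given in Giles (2008). The function name `eigen_pushforward` and the finite-difference-free interface are my own illustration, not an existing API: given the primal matrix and a perturbation direction, it returns the eigendecomposition together with its directional derivatives.

```julia
using LinearAlgebra

# Forward-mode derivative rules for eigen of a symmetric A with distinct
# eigenvalues, given a perturbation direction dA (Giles 2008):
#   dλᵢ = vᵢ' dA vᵢ
#   dvᵢ = Σ_{j≠i} (vⱼ' dA vᵢ) / (λᵢ - λⱼ) * vⱼ
# `eigen_pushforward` is a hypothetical helper name used for illustration.
function eigen_pushforward(A::AbstractMatrix, dA::AbstractMatrix)
    F = eigen(Symmetric(A))
    λ, V = F.values, F.vectors
    n = length(λ)
    dλ = [V[:, i]' * dA * V[:, i] for i in 1:n]
    dV = zero(V)
    for i in 1:n, j in 1:n
        i == j && continue
        dV[:, i] .+= ((V[:, j]' * dA * V[:, i]) / (λ[i] - λ[j])) .* V[:, j]
    end
    return λ, V, dλ, dV
end
```

A kernel for ForwardDiff would then unpack the Dual entries of the input into a primal matrix and a perturbation matrix, call something like the above, and repack the results into Duals, so the eigendecomposition itself only ever sees ordinary floats.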

@y1my1
Author

y1my1 commented Apr 26, 2020

This looks really promising. I will try to implement it. Thanks a lot.

@y1my1
Author

y1my1 commented Apr 27, 2020

Hi @tpapp,

I checked the injection method. It looks like the ForwardDiff package has undergone some changes: methods like get_jacobian and get_value have been renamed or removed, and honestly I am not sure how to calculate the derivative now.
Then I looked at my specific problem again and found that the matrices I want to decompose are symmetric, so I programmed this (not very efficient) QR algorithm to calculate eigenvalues and eigenvectors:

using LinearAlgebra

# calculate eigenvalues and eigenvectors of a symmetric matrix
# with the (unshifted) QR algorithm
function qr_eigen(A; tol = 1e-7, maxiter = 1000)
    X = copy(A)
    previous = similar(X)
    Qv = Matrix{eltype(A)}(I, size(A)...)  # accumulates the eigenvectors
    for i in 1:maxiter
        previous .= X
        Q, R = qr(X)
        Qv = Qv * Q
        X = R * Q   # similarity transform; X converges to a diagonal matrix
        isapprox(X, previous; atol = tol) && break
    end
    return diag(X), Qv
end

which turns out to be compatible with the ForwardDiff.Dual type. But now I encounter an ArgumentError, which I saw someone mention here before:

julia> mcmc_with_warmup(Random.GLOBAL_RNG, ∇tP, 1_00)           
┌ Debug: total iterations
│   iter = 224
└ @ Main ~/Programming/rdexp_hmc/code/likelihood.jl:247
┌ Debug: total iterations
│   iter = 224
└ @ Main ~/Programming/rdexp_hmc/code/likelihood.jl:247
[ Info: finding initial optimum
┌ Debug: total iterations
│   iter = 224
└ @ Main ~/Programming/rdexp_hmc/code/likelihood.jl:247
┌ Debug: total iterations
│   iter = 224
└ @ Main ~/Programming/rdexp_hmc/code/likelihood.jl:247
ERROR: ArgumentError: Value and slope at step length = 0 must be finite.
Stacktrace:
 [1] (::LineSearches.HagerZhang{Float64,Base.RefValue{Bool}})(::Function, ::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}},Array{Float64,1},Array{Float64,1},Array{Float64,1}}, ::Float64, ::Float64, ::Float64) at /Users/jack/.julia/packages/LineSearches/WrsMD/src/hagerzhang.jl:117
 [2] HagerZhang at /Users/jack/.julia/packages/LineSearches/WrsMD/src/hagerzhang.jl:101 [inlined]
 [3] perform_linesearch!(::Optim.LBFGSState{Array{Float64,1},Array{Array{Float64,1},1},Array{Array{Float64,1},1},Float64,Array{Float64,1}}, ::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}, ::Optim.ManifoldObjective{NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}}) at /Users/jack/.julia/packages/Optim/UkDyx/src/utilities/perform_linesearch.jl:53
 [4] update_state!(::NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Optim.LBFGSState{Array{Float64,1},Array{Array{Float64,1},1},Array{Array{Float64,1},1},Float64,Array{Float64,1}}, ::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}) at /Users/jack/.julia/packages/Optim/UkDyx/src/multivariate/solvers/first_order/l_bfgs.jl:198
 [5] optimize(::NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}, ::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}, ::Optim.Options{Float64,Nothing}, ::Optim.LBFGSState{Array{Float64,1},Array{Array{Float64,1},1},Array{Array{Float64,1},1},Float64,Array{Float64,1}}) at /Users/jack/.julia/packages/Optim/UkDyx/src/multivariate/optimize/optimize.jl:57
 [6] optimize(::NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}, ::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}, ::Optim.Options{Float64,Nothing}) at /Users/jack/.julia/packages/Optim/UkDyx/src/multivariate/optimize/optimize.jl:33
 [7] warmup at /Users/jack/.julia/packages/DynamicHMC/10IZC/src/mcmc.jl:151 [inlined]
 [8] #31 at /Users/jack/.julia/packages/DynamicHMC/10IZC/src/mcmc.jl:445 [inlined]

I plan to run the LogDensityProblems.stresstest you mentioned here tomorrow. I wonder whether you have any other suggestions/comments about where I should check? Any help is greatly appreciated. Thanks.

@tpapp
Owner

tpapp commented May 2, 2020

The above usually happens when your likelihood/posterior is discontinuous.

I would recommend avoiding the eigendecomposition step altogether. This may be possible by reparametrizing the problem, but I can't be more specific without some context.
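
As one illustration of what such a reparametrization could look like (a guess without knowing the actual model): instead of sampling the matrix entries and calling `eigen`, sample the spectral data directly and build the matrix from it, so the eigendecomposition never has to be differentiated. For the 2×2 symmetric case, a rotation angle and two eigenvalues suffice:

```julia
using LinearAlgebra

# Illustrative reparametrization for a 2×2 symmetric matrix: sample an
# angle θ and eigenvalues λ, then construct A = R(θ) Λ R(θ)'. The
# eigendecomposition of A is available by construction, so `eigen` is
# never called inside the log density.
rotation(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]
spectral_matrix(θ, λ) = rotation(θ) * Diagonal(λ) * rotation(θ)'
```

Whether this is viable depends on the model; larger matrices would need a parametrization of the orthogonal factor (e.g. products of Givens rotations).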

@y1my1
Author

y1my1 commented May 2, 2020

Thanks a lot for your suggestions.
You are right. I see two issues with AD in my calculation of the likelihood/posterior; both involve some kind of iterative process.
The first one must be super familiar: like in macroeconomics, an iterative process to obtain a value function, starting from some initial guess of V. I see you have started a discussion about this on the Julia discourse. I see some similarities, but still need to find a way to do this.
The second one is the eigendecomposition step you mentioned. I will try to put together an MWE later. Your suggestions are always super helpful.
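
For the value-function issue, a common trick is to differentiate the fixed point implicitly rather than through the iteration loop: if V* = T(V*, θ), then dV*/dθ = (I − ∂T/∂V)⁻¹ ∂T/∂θ at V*. A minimal scalar sketch, where the Babylonian square-root map stands in for the Bellman operator and central finite differences stand in for applying ForwardDiff to a single evaluation of T:

```julia
# Implicit differentiation of a fixed point v* = T(v*, θ):
#   dv*/dθ = (∂T/∂θ) / (1 - ∂T/∂v),  both partials evaluated at v*.
# T here is the Babylonian square-root map, so v*(θ) = √θ.
T(v, θ) = 0.5 * (v + θ / v)

function fixed_point(θ; v = 1.0, iters = 100)
    for _ in 1:iters
        v = T(v, θ)   # iterate to convergence; the loop is never differentiated
    end
    return v
end

# Central finite differences are used here for brevity; in practice one
# would compute ∂T/∂v and ∂T/∂θ with ForwardDiff on a single call to T.
function fixed_point_derivative(θ; h = 1e-6)
    v = fixed_point(θ)
    dT_dv = (T(v + h, θ) - T(v - h, θ)) / 2h
    dT_dθ = (T(v, θ + h) - T(v, θ - h)) / 2h
    return dT_dθ / (1 - dT_dv)
end
```

For θ = 4 this recovers v* = 2 and dv*/dθ = 1/(2√θ) = 0.25 without ever pushing Dual numbers through the iteration.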

@tpapp
Owner

tpapp commented Sep 2, 2022

Closing this because there was no activity for a while. Feel free to comment here if you want to reopen.

@tpapp tpapp closed this as completed Sep 2, 2022