StackOverflowError (during training) #116

Closed
StepHaze opened this issue May 10, 2022 · 5 comments

@StepHaze

I am trying to implement a new game.
Scripts.test_game and Scripts.dummy_run both passed.

When I run julia --project -e 'using AlphaZero; Scripts.train("best")',
I get this error message:

Initializing a new AlphaZero environment

Initial report

Number of network parameters: 1,544,262
Number of regularized network parameters: 1,539,712
Memory footprint per MCTS node: 240 bytes

Running benchmark: AlphaZero against MCTS (1000 rollouts)

Progress:   2%|█▌                                                                           |  ETA: 1:25:28
StackOverflowError:
Stacktrace:
[1] Array
@ ./boot.jl:457 [inlined]
[2] Array
@ ./boot.jl:466 [inlined]
[3] similar
@ ./array.jl:378 [inlined]
[4] similar
@ ./abstractarray.jl:783 [inlined]
[5] _unsafe_getindex(#unused#::IndexLinear, A::Vector{Int64}, I::Base.LogicalIndex{Int64, StaticArrays.SVector{5, Bool}})
@ Base ./multidimensional.jl:851
[6] _getindex
@ ./multidimensional.jl:839 [inlined]
[7] getindex
@ ./abstractarray.jl:1218 [inlined]
[8] available_actions(game::AlphaZero.Examples.Best.GameEnv)
@ AlphaZero.GameInterface ~/Downloads/Ju/AlphaZero.jl/src/game.jl:320
[9] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool)
@ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:204
[10] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool) (repeats 11814 times)
@ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:218
[11] explore!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv, nsims::Int64)
@ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:243
[12] think(p::MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, game::AlphaZero.Examples.Best.GameEnv)
@ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:198
[13] think
@ ~/Downloads/Ju/AlphaZero.jl/src/play.jl:259 [inlined]
[14] play_game(gspec::AlphaZero.Examples.Best.GameSpec, player::TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.Batchifier.BatchedOracle{AlphaZero.Batchifier.var"#8#9"}}}, MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}}; flip_probability::Float64)
@ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:308
[15] (::AlphaZero.var"#simulate_game#70"{TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.Batchifier.BatchedOracle{AlphaZero.Batchifier.var"#8#9"}}}, MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}}, AlphaZero.Benchmark.var"#5#9"{ProgressMeter.Progress}, Simulator{AlphaZero.Benchmark.var"#4#8"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, AlphaZero.Benchmark.var"#net#6"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, typeof(record_trace)}, AlphaZero.Examples.Best.GameSpec, SimParams})(sim_id::Int64)
@ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/simulations.jl:232
[16] macro expansion
@ ~/Downloads/Ju/AlphaZero.jl/src/util.jl:189 [inlined]
[17] (::AlphaZero.Util.var"#9#10"{AlphaZero.var"#68#69"{AlphaZero.Benchmark.var"#5#9"{ProgressMeter.Progress}, Simulator{AlphaZero.Benchmark.var"#4#8"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, AlphaZero.Benchmark.var"#net#6"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, typeof(record_trace)}, AlphaZero.Examples.Best.GameSpec, SimParams, AlphaZero.var"#48#49"{Channel{Any}}, AlphaZero.var"#make#65"{Channel{Any}}}, UnitRange{Int64}, typeof(vcat), ReentrantLock})()
@ AlphaZero.Util ~/.julia/packages/ThreadPools/hwwUU/src/macros.jl:261

Please help
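
The `(repeats 11814 times)` frame indicates that a single MCTS simulation recursed through `run_simulation!` thousands of times without returning, which suggests that play can continue for a very long time or loop without reaching a terminal state. A minimal sketch of a check for this follows, assuming the game is exposed through four callbacks. The names `random_playout_length`, `init`, `terminated`, `actions` and `play!` are hypothetical; with AlphaZero.jl they would presumably wrap `GI.init`, `GI.game_terminated`, `GI.available_actions` and `GI.play!`. The toy `Counter` game exists only to make the snippet runnable.

```julia
using Random

# Play uniformly random moves until the game ends.
# Returns the number of moves played, or `nothing` if the game is still
# running after `max_moves` moves.
function random_playout_length(init, terminated, actions, play!;
                               max_moves=10_000, rng=Random.default_rng())
    game = init()
    n = 0
    while !terminated(game)
        n >= max_moves && return nothing
        play!(game, rand(rng, actions(game)))
        n += 1
    end
    return n
end

# Toy game state used for illustration only.
mutable struct Counter
    value::Int
end

# Terminating game: each move adds 1 or 2, and the game ends at 10 or more.
ending = random_playout_length(
    () -> Counter(0),
    g -> g.value >= 10,
    g -> [1, 2],
    (g, a) -> (g.value += a))

# Looping game: moves toggle between 0 and 1, so it never terminates.
looping = random_playout_length(
    () -> Counter(0),
    g -> false,
    g -> [1],
    (g, a) -> (g.value = 1 - g.value);
    max_moves=1_000)
```

If a check like this returns `nothing` for a real game, adding a move limit or a repetition rule to the game's termination condition is one option to consider.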

@StepHaze
Author

StepHaze commented May 11, 2022

OK. I took scripts/mcts.jl, changed "tictactoe" to my game, and ran it.
After a few moves against the AI, I got this error message:

ERROR: LoadError: StackOverflowError:
Stacktrace:
[1] objectid
@ ./reflection.jl:302 [inlined]
[2] hash
@ ./hashing.jl:25 [inlined]
[3] hash(t::Tuple{AlphaZero.Examples.Best.Board, Int64}, h::UInt64)
@ Base ./tuple.jl:417
[4] hash
@ ./namedtuple.jl:195 [inlined]
[5] hash
@ ./hashing.jl:20 [inlined]
[6] hashindex
@ ./dict.jl:169 [inlined]
[7] ht_keyindex(h::Dict{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.StateInfo}, key::NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}})
@ Base ./dict.jl:284
[8] haskey
@ ./dict.jl:552 [inlined]
[9] state_info(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, state::NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}})
@ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:166
[10] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool)
@ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:205
[11] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool) (repeats 23785 times)
@ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:218
[12] explore!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv, nsims::Int64)
@ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:243
[13] think(p::MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, game::AlphaZero.Examples.Best.GameEnv)
@ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:202
[14] select_move
@ ~/Downloads/Ju/AlphaZero.jl/src/play.jl:49 [inlined]
[15] select_move(p::TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, Human}, game::AlphaZero.Examples.Best.GameEnv, turn::Int64)
@ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:265
[16] interactive!(game::AlphaZero.Examples.Best.GameEnv, player::TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, Human})
@ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:364
[17] interactive!
@ ~/Downloads/Ju/AlphaZero.jl/src/play.jl:375 [inlined]
[18] interactive!(game::AlphaZero.Examples.Best.GameSpec, white::MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, black::Human)
@ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:377
[19] top-level scope
@ ~/Downloads/Ju/AlphaZero.jl/scripts/mcts.jl:8
[20] include(fname::String)
@ Base.MainInclude ./client.jl:451
[21] top-level scope
@ none:1
in expression starting at /home/haze/Downloads/Ju/AlphaZero.jl/scripts/mcts.jl:8

Please point me in the right direction.
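
Frames [1] to [8] of this trace show that the MCTS tree is a `Dict` keyed by the game state, and that `Board` is hashed through the generic `objectid` fallback rather than a custom `hash` method. Whether that matters depends on how `Board` is defined, which the thread does not show. The sketch below uses two hypothetical board types (not the ones from this issue) to illustrate how default hashing differs between immutable plain-data structs and mutable structs.

```julia
# Hypothetical board types for illustration only.
struct TupleBoard            # immutable with plain-data fields: hashed by content
    cells::NTuple{3, Int}
end

mutable struct MutableBoard  # mutable: hashed and compared by identity by default
    cells::Vector{Int}
end

# Two separately built but identical immutable boards act as the same key.
a, b = TupleBoard((1, 0, 2)), TupleBoard((1, 0, 2))
same_content_key = haskey(Dict(a => :seen), b)

# A mutable board is only found when it is the very same object.
m = MutableBoard([1, 0, 2])
tree = Dict(m => :seen)
copy_found = haskey(tree, MutableBoard([1, 0, 2]))  # equal content, different object
m.cells[1] = 2                                       # mutate the stored key in place
mutated_found = haskey(tree, m)                      # same object, different position
```

Here `same_content_key` and `mutated_found` are `true` while `copy_found` is `false`. If a game returned a mutable state object that it keeps updating in place, every later position could look like one already stored in the tree, so a simulation might never stop at a new node. This is only one possible cause; a game whose rules allow positions to repeat could produce the same deep recursion.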

@EngrStudent

I am getting GPU memory overflow errors too.

@EngrStudent

I re-ran with the tic-tac-toe game and used System Monitor and the nvidia-smi tool to track hardware usage up to the freeze. My display froze while memory usage was very high, so I suspected a memory overflow problem.

I then looked through the parameters (params.jl) and found a "memory buffer size" variable, which I reduced by an order of magnitude (80k --> 8k). The code ran without crashing, but learning was poor. I increased the size to 40k, and it both ran and learned. There is a trade-off between "small enough to not crash" and "big enough to not act like a lobotomy". Tuning that by hand is going to be a pain, but it is one way to limp forward.
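
The trade-off described above can be roughed out before a run instead of being found by trial and error. The sketch below assumes each stored sample holds a state tensor and a policy vector of `Float32` values plus a few scalars. The function name `buffer_bytes`, the sample layout and the example dimensions are all illustrative assumptions, not values taken from AlphaZero.jl.

```julia
# Approximate size in bytes of a replay buffer holding `nsamples` samples.
# Each sample is assumed to store `state_len` + `policy_len` Float32 values
# plus `nscalars` extra Float32 scalars (for example a value target and a weight).
function buffer_bytes(nsamples; state_len, policy_len, nscalars=3)
    per_sample = (state_len + policy_len + nscalars) * sizeof(Float32)
    return nsamples * per_sample
end

# Hypothetical game: 1000 values per encoded state and 100 possible actions.
for n in (8_000, 40_000, 80_000)
    mib = buffer_bytes(n; state_len=1000, policy_len=100) / 2^20
    println(n, " samples ≈ ", round(mib; digits=1), " MiB")
end
```

The estimate grows linearly with the buffer size, so it only covers the buffer itself and leaves out network weights, MCTS trees and GPU batches.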

@jonathan-laurent
Owner

Thanks for the feedback. I will be adding an option to store memory buffer samples on disk.
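
One way such an option could work is to write samples to disk in fixed-size chunks and read them back one chunk at a time. The following is only a sketch of that idea using Julia's `Serialization` standard library. The names `spill_to_disk` and `load_chunks` and the dummy sample layout are hypothetical, and this is not the implementation planned for AlphaZero.jl.

```julia
using Serialization

# Write `samples` to `dir` in chunks of at most `chunk_size` items.
# Returns the paths of the files written, in order.
function spill_to_disk(samples::AbstractVector, dir::AbstractString; chunk_size=1_000)
    paths = String[]
    for (i, chunk) in enumerate(Iterators.partition(samples, chunk_size))
        path = joinpath(dir, "chunk_$(i).jls")
        serialize(path, collect(chunk))
        push!(paths, path)
    end
    return paths
end

# Lazily load the chunks back, one file at a time.
load_chunks(paths) = (deserialize(p) for p in paths)

# Usage with dummy samples (named tuples standing in for real training samples).
samples = [(state=rand(Float32, 4), value=Float32(i)) for i in 1:2_500]
dir = mktempdir()
paths = spill_to_disk(samples, dir; chunk_size=1_000)
restored = reduce(vcat, load_chunks(paths))
```

Iterating over `load_chunks(paths)` directly, rather than concatenating as in the usage lines, keeps only one chunk in memory at a time.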

@smart-fr

> Thanks for the feedback. I will be adding an option to store memory buffer samples on disk.

Yes please! 😘
