Supervised learning & AlphaZero.jl #2

Open · StepHaze opened this issue Jun 8, 2022 · 0 comments

StepHaze commented Jun 8, 2022

Sorry to disturb you again.
I'm building a project with AlphaZero.jl. I created the files for a new board game and started training, but learning is VERY slow and I'm afraid it will take an eternity.
So I decided to train on games played by strong players instead (i.e. supervised learning).

Jonathan Laurent wrote:
"I guess what you've have to do is generate many samples of the kind that are stored in AlphaZero's memory buffer. You can take these samples either from human play data or have other players play against each other to generate data. If you do so, be careful to add some exploration so that the same game is not played again and again and that you get some diversity in your data. Once you've got the data, you can either use the Trainer utility in learning.jl or just write your training procedure yourself in Flux."

I still don't understand in what format the games and moves are stored in the memory buffer.
Please help me.
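
For reference, here is a minimal, self-contained sketch of the kind of supervised training the quoted advice describes: (state, target policy, game outcome) samples used to fit a two-headed network in Flux. This is not AlphaZero.jl's actual API; the `Sample` struct, `STATE_DIM`, `NUM_ACTIONS`, and the helper functions are hypothetical placeholders, it assumes a recent Flux version (explicit-style optimisers via `Flux.setup`), and the exact layout of the samples stored in AlphaZero.jl's memory buffer should be checked against the package source (e.g. its memory and learning code).

```julia
# A minimal sketch (NOT AlphaZero.jl's actual API) of supervised training on
# expert-game samples with Flux. All names below are hypothetical placeholders.

using Flux
using Random: shuffle

# Hypothetical supervised sample: a flattened board encoding, a probability
# distribution over actions (e.g. normalized visit counts or a one-hot of the
# expert move), and the game outcome from the current player's perspective.
struct Sample
    state  :: Vector{Float32}
    policy :: Vector{Float32}
    value  :: Float32          # in [-1, 1]
end

const STATE_DIM   = 9 * 3   # placeholder: adapt to your game's encoding
const NUM_ACTIONS = 9       # placeholder: number of action slots

# Two-headed network: shared trunk, policy head (logits), value head (tanh).
make_net() = Chain(
    Dense(STATE_DIM => 64, relu),
    Dense(64 => 64, relu),
    Parallel(tuple,
        Dense(64 => NUM_ACTIONS),   # policy logits
        Dense(64 => 1, tanh)),      # scalar value estimate
)

# AlphaZero-style loss: policy cross-entropy + value mean squared error.
function loss(net, states, policies, values)
    p_logits, v = net(states)
    Flux.logitcrossentropy(p_logits, policies) + Flux.mse(vec(v), values)
end

# Stack samples into batch matrices (one sample per column, Flux convention).
function make_batch(samples)
    states   = reduce(hcat, [s.state  for s in samples])
    policies = reduce(hcat, [s.policy for s in samples])
    values   = Float32[s.value for s in samples]
    states, policies, values
end

function train!(net, samples; epochs = 10, batchsize = 32)
    opt = Flux.setup(Adam(1e-3), net)
    for _ in 1:epochs, batch in Iterators.partition(shuffle(samples), batchsize)
        states, policies, values = make_batch(batch)
        grads = Flux.gradient(m -> loss(m, states, policies, values), net)
        Flux.update!(opt, net, grads[1])
    end
    net
end

# Usage with random placeholder data; real samples would come from expert games.
samples = [Sample(rand(Float32, STATE_DIM),
                  Flux.softmax(rand(Float32, NUM_ACTIONS)),
                  rand([-1f0, 0f0, 1f0]))
           for _ in 1:256]
net = train!(make_net(), samples)
```

The loss mirrors the usual AlphaZero objective (policy cross-entropy plus value mean squared error); when generating the samples, you would still want the exploration and diversity the quoted advice mentions so the buffer is not filled with near-identical games.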
