🧠 NN: Aurora-5b3145691f #34

TheBlackPlague · 2023-07-29T21:45:38Z

🎯 Summary

This PR improves on the previous network by a considerable margin (+14.75 Elo at STC, +9.86 Elo at LTC), thanks largely to the Berserk 11 depth-9 data added to the mix. The new dataset is generated Lc0-style, with a few bad moves added to increase uniformity. The initial sample was meant as an experiment but led to great results. With more research, a fully revamped dataset should lead to considerable Elo gains.

🧬 Architecture and Hyper-Parameters

       IN          ACCUMULATOR                               HIDDEN                                    OUT
 ______________      _______      ______________________________________________________________      _____
| WHITE: (768) | -> | (384) | -> | ClippedReLU(0, 1) -> (384) \                                 |    |     |
|              |    |       |    |                             \                                |    |     |
|              |    |       |    |                              CONCATENATE(ColorToMove): (768) | -> | (1) |
|              |    |       |    |                             /                                |    |     |
| BLACK: (768) | -> | (384) | -> | ClippedReLU(0, 1) -> (384) /                                 |    |     |
 --------------      -------      --------------------------------------------------------------      -----
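For readers who prefer code to the diagram, here is a minimal forward-pass sketch of the same shape (768 -> 384 per perspective, ClippedReLU, concatenate, 1 output). It is not StockDory's implementation. The feature layout in `feature_index`, the sharing of accumulator weights between the two perspectives, and the "side to move first" concatenation order are all assumptions made for illustration, and the weights are random.

```python
import random

INPUT, HIDDEN = 768, 384  # sizes taken from the diagram


def feature_index(perspective, color, piece, square):
    """Hypothetical 768-feature layout: 2 colors x 6 piece types x 64 squares."""
    if perspective == 1:   # black's view: mirror the board and swap colors
        square ^= 56
        color ^= 1
    return color * 384 + piece * 64 + square


def accumulate(weights, bias, features):
    """Accumulator: bias plus the weight row of every active input feature."""
    acc = list(bias)
    for f in features:
        row = weights[f]
        for i in range(HIDDEN):
            acc[i] += row[i]
    return acc


def crelu(x):
    """ClippedReLU(0, 1)."""
    return min(max(x, 0.0), 1.0)


def forward(net, pieces, side_to_move):
    """pieces: list of (color, piece, square); side_to_move: 0 white, 1 black."""
    accs = []
    for perspective in (0, 1):
        feats = [feature_index(perspective, c, p, s) for c, p, s in pieces]
        accs.append(accumulate(net["ft_w"], net["ft_b"], feats))
    # Assumed ordering: side to move first, opponent second (768 values total).
    hidden = [crelu(v) for v in accs[side_to_move] + accs[side_to_move ^ 1]]
    return sum(w * h for w, h in zip(net["out_w"], hidden)) + net["out_b"]


def random_net(seed=0):
    """Random weights, only so the sketch can be run end to end."""
    rng = random.Random(seed)
    return {
        "ft_w": [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
                 for _ in range(INPUT)],
        "ft_b": [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)],
        "out_w": [rng.uniform(-0.1, 0.1) for _ in range(2 * HIDDEN)],
        "out_b": 0.0,
    }
```

Under these assumptions, a position and its color-flipped mirror with the other side to move produce the same output, which is the usual motivation for the two-perspective design.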

Codename: Aurora
ID: 5b3145691f

Data: 2B FEN --- DEPTH: 9 (1.2B) + NODES: 5K (500M) + DEPTH: 9 (300M - B11_0000/d9_300m)
Batch Size: 750K
Epochs: 152
LR: 9e-3
LR Drop Step: 1
LR Drop Last: 140
LR Gamma: 0.985
WDL: 0.2
Scale: 400
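The hyper-parameters above can be read in more than one way, so the sketch below shows one plausible interpretation rather than the trainer's actual code. It assumes that the LR is multiplied by `LR Gamma` every `LR Drop Step` epochs until `LR Drop Last`, and that `WDL` blends the game result into a sigmoid of the search score divided by `Scale`. The function names are hypothetical.

```python
import math


def learning_rate(epoch, base_lr=9e-3, gamma=0.985, drop_step=1, drop_last=140):
    """LR for a 1-based epoch, assuming step decay that stops after drop_last drops."""
    drops = min(epoch - 1, drop_last) // drop_step
    return base_lr * gamma ** drops


def training_target(eval_cp, result, wdl=0.2, scale=400.0):
    """Assumed target: (1 - wdl) * sigmoid(eval / scale) + wdl * result.

    eval_cp is the search score in centipawns; result is 1.0, 0.5 or 0.0.
    """
    expected = 1.0 / (1.0 + math.exp(-eval_cp / scale))
    return (1.0 - wdl) * expected + wdl * result
```

With these assumptions, `learning_rate(1)` is the base LR of 9e-3, and the rate stays constant from epoch 141 through the final epoch 152.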

👏 Acknowledgements

Lee Durbin: Lee Durbin trained this marvelous new network and assisted with hyper-parameter optimization and data improvement. Without Lee, many of StockDory's Neural Networks would never have been realized. Thank you for your overwhelming support. 🙌
Jay Honnold: The data behind StockDory's Neural Networks is generated using the Berserk Chess Engine, with the latest data generated using Berserk 11. Without Jay's sustained assistance and support, whether through mentoring or assistance with Berserk, it would be hard for StockDory to generate good data for Neural Network Training. 🙌
Viren: Viren explained Lc0's high-quality data-generation methods, where bad moves are added to increase data uniformity. Had Viren not mentioned it, this wouldn't have been tried in the foreseeable future. More of the Lc0 data-generation ideas Viren mentioned will be tried in the future. 🙌

📈 ELO

STC:

ELO   | 14.75 +- 8.25 (95%)
SPRT  | 10.0+0.10s Threads=1 Hash=16MB
LLR   | 2.95 (-2.94, 2.94) [0.00, 5.00]
GAMES | N: 3960 W: 1236 L: 1068 D: 1656

LTC:

ELO   | 9.86 +- 6.33 (95%)
SPRT  | 60.0+0.60s Threads=1 Hash=256MB
LLR   | 2.94 (-2.94, 2.94) [0.00, 5.00]
GAMES | N: 6096 W: 1695 L: 1522 D: 2879
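As a sanity check, the Elo figures above can be recomputed from the W/L/D counts with the standard logistic Elo formula. The sketch below does that, and also derives the LLR bounds from the usual Wald formulas. The error margin uses a normal approximation on the per-game score variance, and `alpha = beta = 0.05` is an assumption that happens to match the (-2.94, 2.94) bounds shown; the testing tool may compute these differently.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference for an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, losses, draws, z=1.959964):
    """Return (elo, half-width of an approximate 95% interval)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half = z * math.sqrt(var / n)
    elo = elo_from_score(score)
    margin = (elo_from_score(score + half) - elo_from_score(score - half)) / 2.0
    return elo, margin


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's lower and upper log-likelihood-ratio stopping bounds."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


stc = elo_with_margin(wins=1236, losses=1068, draws=1656)
ltc = elo_with_margin(wins=1695, losses=1522, draws=2879)
```

Here `stc[0]` and `ltc[0]` come out at roughly 14.75 and 9.86, in line with the reported values.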

Bench: 7323439

Replace the old network with Aurora-5b3145691f.

553e96d

Bench: 7323439

TheBlackPlague added the "+ ELO" (This change gains ELO.) and "= DOC" (This change doesn't improve the documentation.) labels Jul 29, 2023

TheBlackPlague self-assigned this Jul 29, 2023

TheBlackPlague marked this pull request as ready for review July 29, 2023 22:05

TheBlackPlague merged commit 4e50b9f into master Jul 29, 2023

TheBlackPlague deleted the Aurora-5b3145691f branch July 29, 2023 22:06
