AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Posted: Wed Dec 06, 2017 9:22 am
Computer Chess Club
https://talkchess.com/
Thanks for the PDF file. I watched two games, included in your PDF, where AlphaZero won as Black. Quite impressive games, actually.
clumma wrote: A truly stunning result. Matthew Lai is a coauthor!
https://arxiv.org/pdf/1712.01815.pdf
-Carl
As Daniel explains: no hard-coded evaluation (software)... its gameplay is based on learning (experience) from previous self-play games, applied to a neural network.
Lyudmil Tsvetkov wrote:
- Alpha had considerable hardware advantage
- SF played with version 8
- what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
Program     Chess     Shogi     Go
AlphaZero   80k       40k       16k
Stockfish   70,000k   -         -
Elmo        -         35,000k   -
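For scale, the figures in the table can be turned into ratios. This is only a minimal Python sketch, assuming the entries are positions evaluated per second and that "k" means thousands; the names `SPEEDS` and `speed_ratio` are made up for illustration.

```python
# Search speeds from the table above.
# Assumption: values are positions per second and "k" means thousands.
SPEEDS = {
    "chess": {"AlphaZero": 80_000, "Stockfish": 70_000_000},
    "shogi": {"AlphaZero": 40_000, "Elmo": 35_000_000},
}


def speed_ratio(game: str, engine: str) -> float:
    """Positions `engine` evaluates for every one AlphaZero evaluates."""
    return SPEEDS[game][engine] / SPEEDS[game]["AlphaZero"]


if __name__ == "__main__":
    print(speed_ratio("chess", "Stockfish"))  # 875.0
    print(speed_ratio("shogi", "Elmo"))       # 875.0
```

In both games the conventional engine looks at 875 times as many positions per second as AlphaZero.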
It actually is. Instead of the 4 TPUs required to run Alpha0 so far, on x64 hardware one would need around 2000 Haswell cores to achieve the same NN speed (80k patterns evaluated per second). Since the NNs are huge, with smaller resources the matrix multiplication would have to be broken into smaller sub-matrices, which would exponentially slow down the calculation.
kranium wrote: As Daniel explains: no hard-coded evaluation (software)... its gameplay is based on learning (experience) from previous self-play games, applied to a neural network.
Lyudmil Tsvetkov wrote:
- Alpha had considerable hardware advantage
- SF played with version 8
- what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
5,000 first-generation TPUs to generate self-play games
and 64 second-generation TPUs to train the neural networks
The hardware advantage is not such an important factor during gameplay as one would imagine.
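To put the numbers from this exchange side by side, here is a rough Python sketch. It takes the 4 TPUs, 2000 Haswell cores and 80k evaluations per second exactly as quoted in the post, and adds a toy tiled matrix multiplication to show what "breaking into sub-matrices" means. All function names are hypothetical, and the sketch does not model memory traffic or scheduling, so it neither confirms nor refutes the slowdown estimate; it only shows that tiling leaves the result and the multiply count unchanged.

```python
# Figures as quoted in the post above (not independently verified).
TPUS = 4
HASWELL_CORES = 2000
NN_EVALS_PER_SEC = 80_000


def cores_per_tpu() -> float:
    """Haswell cores claimed to be equivalent to one TPU."""
    return HASWELL_CORES / TPUS


def evals_per_core() -> float:
    """Network evaluations per second each core would have to deliver."""
    return NN_EVALS_PER_SEC / HASWELL_CORES


def blocked_matmul(a, b, tile):
    """Multiply matrices (lists of rows) one tile x tile block at a time.

    Returns the product and the number of scalar multiplications performed.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    mults = 0
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Accumulate the contribution of one pair of sub-matrices.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        for p in range(p0, min(p0 + tile, k)):
                            c[i][j] += a[i][p] * b[p][j]
                            mults += 1
    return c, mults


if __name__ == "__main__":
    print(cores_per_tpu())   # 500.0
    print(evals_per_core())  # 40.0
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(blocked_matmul(a, b, tile=1))  # ([[58, 64], [139, 154]], 12)
    print(blocked_matmul(a, b, tile=2))  # ([[58, 64], [139, 154]], 12)
```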
Milos wrote: It actually is. Instead of the 4 TPUs required to run Alpha0 so far, on x64 hardware one would need around 2000 Haswell cores to achieve the same NN speed (80k patterns evaluated per second). Since the NNs are huge, with smaller resources the matrix multiplication would have to be broken into smaller sub-matrices, which would exponentially slow down the calculation.
Alpha0 is basically behaving like a huge, highly selective opening book.
kranium wrote: AlphaZero is very selectively evaluating 80k vs Stockfish's 70,000k positions/sec, probably achieving tremendous depths at such speeds,
but I'd guess it's the deep (learned) positional eval which is primarily adding strength...
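The link between selectivity and depth can be sketched with the usual back-of-the-envelope model: a search that expands `b` moves per position on average visits roughly `b**d` positions to reach depth `d`. The Python below is only an illustration under that assumption. The one-second budget and the branching factor of 2.0 are hypothetical inputs, not measurements of either engine; only the 80k and 70,000k figures come from the thread.

```python
import math


def reachable_depth(nodes: float, branching: float) -> float:
    """Depth d with branching**d == nodes (simple uniform-tree model)."""
    return math.log(nodes) / math.log(branching)


def matching_branching(nodes_slow: float, nodes_fast: float,
                       branching_fast: float) -> float:
    """Branching factor the slower searcher needs to reach the same depth."""
    depth = reachable_depth(nodes_fast, branching_fast)
    return nodes_slow ** (1.0 / depth)


if __name__ == "__main__":
    # Positions searched in one second, from the thread.
    alphazero_nodes = 80_000
    stockfish_nodes = 70_000_000
    # Hypothetical effective branching factor for the faster searcher.
    assumed_branching = 2.0
    print(reachable_depth(stockfish_nodes, assumed_branching))
    print(matching_branching(alphazero_nodes, stockfish_nodes,
                             assumed_branching))
```

Under this model, a searcher with 875 times fewer positions per second keeps pace in depth only if it considers noticeably fewer moves per position, which is one way to read "very selectively evaluating".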