AlphaZero performance

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

mar
Posts: 2001
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

AlphaZero performance

Post by mar » Mon Dec 25, 2017 1:44 am

Just for fun, I generated a virtual PGN of all A0-SF8 games (including the opening positions, so we have a total of 1300 games) and ran it through Ordo.
(after having seen this nonsense: https://www.youtube.com/watch?v=eN7BMWl_mpw)

Input (from A0's POV, all games):

Code: Select all

white:
267W 378D 5L
black:
51W 580D 19L
All games:

Code: Select all


   # PLAYER         : RATING  ERROR   POINTS  PLAYED    (%)
   1 AlphaZero      : 3383.5   14.1    797.0    1300   61.3%
   2 Stockfish 8    : 3300.0   ----    503.0    1300   38.7%

White advantage = 66.25 +/- 6.98
Draw rate (equal opponents) = 50.00 % +/- 0.00
The match:

Code: Select all


   # PLAYER         : RATING  ERROR   POINTS  PLAYED    (%)
   1 AlphaZero      : 3406.8   53.9     64.0     100   64.0%
   2 Stockfish 8    : 3300.0   ----     36.0     100   36.0%

White advantage = 85.72 +/- 26.87
Draw rate (equal opponents) = 50.00 % +/- 0.00
Ordo commandline:

Code: Select all

ordo-win32.exe -a 3300 -A "Stockfish 8" -W -p a0_sf8.pgn -s1000 -o rating_a0.txt
Generated PGN files (result only):
All: http://www.crabaware.com/Test/a0_sf8.pgn
Match: http://www.crabaware.com/Test/a0_sf8_match.pgn
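As a rough sanity check on Ordo's numbers, the Elo difference implied by a raw score fraction can be computed directly from the standard logistic model (a minimal sketch that ignores draws and the white-advantage term, which is why it lands a little below Ordo's fitted values):

```python
import math

def elo_diff(points, games):
    # Elo difference implied by a score fraction under the standard
    # logistic model: score = 1 / (1 + 10^(-diff/400))
    score = points / games
    return -400.0 * math.log10(1.0 / score - 1.0)

# All 1300 games: AlphaZero scored 797/1300 (61.3%)
print(round(elo_diff(797, 1300), 1))  # -> 80.0
# The 100-game match: 64/100 (64.0%)
print(round(elo_diff(64, 100), 1))    # -> 100.0
```

Ordo reports slightly larger gaps (83.5 and 106.8) because it also fits a white-advantage term to the same games.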

So we can probably conclude that A0 is at most 100 Elo stronger than SF8, which is actually a pretty good result for SF (certainly not crushing or devastating).
By that I don't want to play down A0's amazing achievement (even if the match had ended the other way, it would still be huge).

The question remains how far DeepMind could go by training longer and by using more than one 4-TPU machine (assuming it would scale better than SF).

However, DeepMind certainly has much higher ambitions than claiming superiority in computer chess :), so I wonder why so many people are upset.

The traditional tedious way of doing computer chess still lives (at least for the top dogs) and we should be grateful to DeepMind for showing us that there's still room for improvement.

Ovyron
Posts: 2701
Joined: Tue Jul 03, 2007 2:30 am

Re: AlphaZero performance

Post by Ovyron » Mon Dec 25, 2017 2:04 am

Thanks. Gotta love results from virtual PGNs, not actually worse than IPON results :)

hgm
Posts: 23713
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: AlphaZero performance

Post by hgm » Mon Dec 25, 2017 9:29 am

The easiest way to improve on AlphaZero is not to train it longer, or to make it search more nodes by using faster hardware during play. It is to start the learning from scratch with a better NN, e.g. one better adapted to chess rather than general enough to also play Go, by offering it efficiently pre-processed features of the position, such as SEE values for each square, X-rays, pins, etc.
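To illustrate the idea (purely a hypothetical sketch, not anything from DeepMind's paper): hand-crafted features like these are typically fed to such a network as extra 8x8 input planes alongside the piece planes. A binary plane marking the squares attacked by white knights, for instance, could be built like this:

```python
def knight_attack_plane(knights):
    """Build an 8x8 binary plane of squares attacked by knights.

    knights: iterable of (file, rank) squares, each coordinate 0-7.
    Returns plane[rank][file] = 1 where some knight attacks the square.
    """
    plane = [[0] * 8 for _ in range(8)]
    deltas = [(1, 2), (2, 1), (2, -1), (1, -2),
              (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
    for f, r in knights:
        for df, dr in deltas:
            nf, nr = f + df, r + dr
            if 0 <= nf < 8 and 0 <= nr < 8:
                plane[nr][nf] = 1
    return plane

# Knight on g1 (file 6, rank 0) attacks e2, f3 and h3
plane = knight_attack_plane([(6, 0)])
```

SEE values, X-rays, or pin masks would work the same way, just with more involved board logic per plane; the open question is how much such planes buy over letting the network discover the features itself.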

mar
Posts: 2001
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: AlphaZero performance

Post by mar » Mon Dec 25, 2017 3:01 pm

hgm wrote:The easiest way to improve on Alpha Zero is not to train it longer, or make it search more nodes by using faster hardware during playing. It is starting the learning from scratch with a better NN. E.g. one that is better adapted to Chess, rather than general enough to also do go. By offering it efficiently pre-processed features of the position, such as SEE values for each square, X-rays, pins, etc.
I hadn't thought about this, and it makes perfect sense. It would be exciting to know how much they could improve with this approach (I guess adding more features might correlate with NN size and thus performance, but that could be tuned as well).

I guess DeepMind will move on to other challenges, so we'll probably have to wait until similarly powerful hardware becomes common (or maybe build a training network of tens of thousands of volunteers, which seems unlikely).

hgm
Posts: 23713
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: AlphaZero performance

Post by hgm » Mon Dec 25, 2017 3:16 pm

I am not sure how unrealistic that is. We don't need to train it in 4 hours. One year would be fine.

CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 7:49 pm

Re: AlphaZero performance

Post by CheckersGuy » Mon Dec 25, 2017 4:41 pm

Just take a look at the Leela Zero project for inspiration. Training progress is made daily even though they are using an inferior neural network compared to AlphaGo Zero's.

Leo
Posts: 832
Joined: Fri Sep 16, 2016 4:55 pm
Location: USA/Minnesota
Full name: Leo

Re: AlphaZero performance

Post by Leo » Mon Dec 25, 2017 5:32 pm

mar wrote:Just for fun, I generated virtual pgn of all A0-SF8 games (including opening positions, so we have a total of 1300 games) and ran through Ordo.
(after having seen this nonsense: https://www.youtube.com/watch?v=eN7BMWl_mpw)
[...]
The traditional tedious way of doing computer chess still lives (at least for the top dogs) and we should be grateful to DeepMind for showing us that there's still room for improvement.
I didn't like AZ's arrogant, in-your-face attitude in their press releases, plus the fact that they handicapped SF.
Advanced Micro Devices fan.

Leo
Posts: 832
Joined: Fri Sep 16, 2016 4:55 pm
Location: USA/Minnesota
Full name: Leo

Re: AlphaZero performance

Post by Leo » Mon Dec 25, 2017 5:47 pm

mar wrote:
hgm wrote:The easiest way to improve on Alpha Zero is not to train it longer [...]
[...]
I guess DeepMind will move on to other challenges so we'll probably have to wait until similar poweful hardware becomes common (or maybe building a training network of tens of thousands of volunteers which seems unlikely).
I think they have moved on already. I am sure there is enough brainpower among the people in this forum to set up and develop a similar approach.
Advanced Micro Devices fan.

Rebel
Posts: 4699
Joined: Thu Aug 18, 2011 10:04 am

Re: AlphaZero performance

Post by Rebel » Tue Dec 26, 2017 9:45 am

mar wrote:Just for fun, I generated virtual pgn of all A0-SF8 games (including opening positions, so we have a total of 1300 games) and ran through Ordo. (after having seen this nonsense: https://www.youtube.com/watch?v=eN7BMWl_mpw)
I turned it off halfway, but the guy at the beginning claims the diagrams (page 6) are AZ vs SF games. Looking at the result (64.4%) that seems likely, but from the text in the paper it's all not so obvious, rather confusing, YMMV.

Table 2: Analysis of the 12 most popular human openings (played more than 100,000 times in an online database (1)). Each opening is labelled by its ECO code and common name. The plot shows the proportion of self-play training games in which AlphaZero played each opening, against training time. We also report the win/draw/loss results of 100 game AlphaZero vs. Stockfish matches starting from each opening, as either white (w) or black (b), from AlphaZero's perspective. Finally, the principal variation (PV) of AlphaZero is provided from each opening.

So who played whom, and can we conclude that each of these 12 matches of 100 games started from the position in the diagram?

Albert Silver
Posts: 2860
Joined: Wed Mar 08, 2006 8:57 pm
Location: Rio de Janeiro, Brazil

Re: AlphaZero performance

Post by Albert Silver » Tue Dec 26, 2017 7:23 pm

Rebel wrote:
[...]
So who played who and can we conclude each of these 12 matches of each 100 games started from the position in the diagram?
Not sure I understand the first question, 'Who played who?'

I don't think much can be concluded from the reported results. Start with the assumption that A0 is about 100 Elo stronger, and that this holds in all positions (which I'm certain is not correct): that alone strongly influences the results, so you cannot ascribe any inherent value to the opening's worth. Then there is the opening itself: some openings might simply be worse objectively, or require a more exploitative approach, contrary to the equilibrium one that a NN would inevitably use.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
