Alphazero news

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Alphazero news

Post by Laskos »

matthewlai wrote: Fri Dec 14, 2018 12:52 am
Laskos wrote: Fri Dec 14, 2018 12:34 am
Laskos wrote: Thu Dec 13, 2018 11:11 pm
matthewlai wrote: Thu Dec 13, 2018 7:42 pm
Yes, the same 12. The openings are the most commonly played openings from a large set of high level human games (I don't remember which set off the top of my head). There is no cherry-picking. They were the top 12. If they favour AZ, that just means AZ plays common human openings better.

Yes, the score is more equal with TCEC openings, because there are many TCEC openings where both SF and AZ agree that one side is significantly ahead out of the opening. In the extreme case, if every opening starts with 95% winrate for one side, you'll get 0 Elo between any two reasonably strong players, even if one is 500 Elo stronger than the other. We also explained that in the paper. For someone with your statistical background, it really shouldn't be difficult to see.
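
A toy illustration of the extreme case described above, not taken from the paper. It assumes every opening is played with both colours, that a fraction `decided` of games is settled by the opening no matter which engine has the favoured side, and that the remaining games follow the usual logistic Elo expectation. The function names and the mixture model itself are assumptions made for this sketch.

```python
import math


def expected_score(elo_diff):
    """Logistic Elo expectation for the stronger side."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score):
    """Invert the logistic Elo formula (score must be strictly between 0 and 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def measured_elo(true_elo_diff, decided):
    """Elo difference a match would show under the toy mixture model.

    Games settled by the opening split evenly over a colour-swapped pair,
    so they contribute a 50% score; the rest follow the true Elo difference.
    """
    score = decided * 0.5 + (1.0 - decided) * expected_score(true_elo_diff)
    return elo_from_score(score)


for decided in (0.0, 0.5, 0.95, 1.0):
    print(decided, round(measured_elo(500.0, decided), 1))
```

Under these assumptions the measured gap shrinks toward 0 Elo as `decided` approaches 1, even though the true gap stays at 500.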
As I have very limited options to answer, I just want to remark that previous TCEC superfinals, between engines that obey the Elo model and are nowhere near 500 Elo points apart, had even more skewed results from these "outrageous" openings.
Now that I am back home right to go to sleep after a crazy evening, I have to say a bit:

From those "outrageous" TCEC 2016 (Season 9) openings, which even an "expert on statistics" (I am no expert on statistics, I just know what I need) should understand are useless, Stockfish 8 won the superfinal against Houdini 5 with exactly the same score as A0 beat SF8:

SF8 vs Houdini 5
+17 -8 =75

A0 vs SF8
+17 -8 =75

So, SF8, rated only some 30-40 Elo points above Houdini 5 on several rating lists, with engines obeying the Elo model, somehow from these "outrageous" TCEC openings, managed to beat convincingly (LOS=96%) Houdini 5 in 100 LTC games. It was either an accident, or these TCEC openings do have some significant resolution power.
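
The LOS figure and the implied rating gap can be checked from the +17 -8 =75 score. This is a minimal sketch, assuming the usual normal approximation for likelihood of superiority (draws do not enter) and the logistic Elo formula; the function names are chosen for this example.

```python
import math


def likelihood_of_superiority(wins, losses):
    """Normal approximation: LOS = Phi((wins - losses) / sqrt(wins + losses))."""
    z = (wins - losses) / math.sqrt(wins + losses)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def elo_difference(wins, losses, draws):
    """Elo difference implied by the match score under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))


los = likelihood_of_superiority(17, 8)
elo = elo_difference(17, 8, 75)
print(f"LOS = {los:.1%}, Elo difference = {elo:+.1f}")
```

With these formulas the score gives an LOS of about 96% and a gap of roughly 31 Elo, which is consistent with the 30-40 Elo quoted from the rating lists.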

That you used the same 12 "human openings" is bad. Everyone who has tried them with Lc0 knows that Lc0 overperforms on them by almost 100 Elo points. I am not claiming that you did it intentionally, but as Javier Ross said, most lead to closed positions with few tactics, very favorable to Lc0 (A0).

All in all, I take your "outrageous TCEC openings" result as the most reliable (you introduce a huge systematic error in all the other results), and am of the opinion that from the usual openings used by me or other testers, or from TCEC openings, SF10 and A0 are pretty closely matched. And I would bet 1:1 on SF10 to beat A0 (that one, not the improved one) from TCEC 2016 Season 9 openings in your conditions.
If you say those human openings are bad, what are you comparing it to? Clearly not start position, but another set of openings?

So if you have a set of openings that SF performs better on, and a set of openings that AZ performs better on, why is it unfair to use the set AZ performs better on, but not the set that SF performs better on? Everything is relative here. We are basically defining a new game that is almost exactly like chess, but starting from arbitrary positions. The positions are part of the rules of this new game, and obviously changing the rules will change how different engines do.

The TCEC openings are all open and tactical openings, favouring SF. Why do you say they are more reliable?

If AZ can always play into closed openings from start position no matter what the opponent does, why should its performance on open openings be reflected in its Elo rating?

The most fair way I can think of to modify the rules to introduce diversity is to let engines play from start position, but force them to diverge from games already played in the match, maybe in the first 10 moves or so. The details would still need to be worked out, but that way engines still have a lot of power to choose openings they want to play (just like humans), but there's enough diversity to get low error on Elo. For example, maybe neither engine is allowed to repeat the same first 8 move sequence from its side, unless the opponent diverges first. With swapping colours this would be fair. So if both players repeat a past game, white loses on move 8.
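
One possible reading of the rule above, as a rough sketch only, since the details are explicitly left open. It assumes games are stored as flat lists of plies (White, Black, White, ...), compares move strings literally so transpositions are ignored, and treats White as the side that forfeits, because White is the first to complete its 8th move while the game still matches an earlier one. The function name and the `depth` parameter are hypothetical.

```python
def white_forfeits(past_games, current_plies, depth=8):
    """Return True if White has just repeated an earlier game for `depth` moves.

    White completes its `depth`-th move at ply 2 * depth - 1. If every ply up
    to that point equals the start of some earlier game, neither side has
    diverged, so under this reading White loses.
    """
    prefix_len = 2 * depth - 1
    if len(current_plies) < prefix_len:
        return False  # White has not yet played its depth-th move
    prefix = tuple(current_plies[:prefix_len])
    return any(tuple(game[:prefix_len]) == prefix for game in past_games)


# Placeholder move labels, not real chess moves.
earlier = [f"ply{i}" for i in range(20)]
repeat = list(earlier)
black_diverges = list(earlier)
black_diverges[3] = "other"  # Black changes its 2nd move

print(white_forfeits([earlier], repeat))
print(white_forfeits([earlier], black_diverges))
```

If Black diverges first, White may keep repeating its own moves, which matches the "unless the opponent diverges first" clause.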

Or just make it part of the game. Engines are allowed access to past games in the match, and if they decide to repeat games, the loser will just keep on losing. Just like human games.
I am really in my pajamas, going to sleep. Play that A0 (not the improved one) as it wants, really as it wants, against the current free BrainFish version using the UCI option for diversified openings, in your TCEC conditions. It's that easy. I would bet 1:1 that BrainFish wins. I may lose, but the bet seems favorable to me.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Alphazero news

Post by jp »

matthewlai wrote: Fri Dec 14, 2018 12:52 am If AZ can always play into closed openings from start position no matter what the opponent does, why should its performance on open openings be reflected in its Elo rating?
What makes you believe that?
clumma
Posts: 186
Joined: Fri Oct 10, 2014 10:05 pm
Location: Berkeley, CA

Re: Alphazero news

Post by clumma »

Laskos wrote: Fri Dec 14, 2018 1:04 am I am really in my pajamas, going to sleep. Play that A0 (not the improved one) as it wants, really as it wants, against the current free BrainFish version using the UCI option for diversified openings, in your TCEC conditions. It's that easy. I would bet 1:1 that BrainFish wins. I may lose, but the bet seems favorable to me.
Agreed.

-Carl
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Alphazero news

Post by yanquis1972 »

If you want to measure a self-learning NN's true strength, you'd ideally play all games from move one. The 12 forced openings are varied enough imo. Maybe even further truncated positions could replace some or be added (1.e4 e5 as an example). In general I don't believe NNs are meant to be used in isolation for broad analysis; they're literally designed to win chess games from the opening move.

I hope in the future we’ll see nets trained on specific openings; I think it could be revolutionary.
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Alphazero news

Post by hgm »

noobpwnftw wrote: Thu Dec 13, 2018 11:43 pm Well the "nearly" here is a form of compromise to truth because if we have perfect knowledge then the chance that "nearly" happens is zero.
So that means that a perfect player would perform very poorly from nearly won positions, almost never be able to convert one into a win, and always quickly let them degenerate into nearly lost ones before it starts to fight. While a heuristic player against the same set of imperfect opponents might win 90% or better.

A perfect player will play like crap against good heuristic opponents, (which understand the 'nearly'), in drawn positions. So it would be a really bad idea to use it to judge the quality of common opening lines, which should be all well within the draw zone.
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: Alphazero news

Post by noobpwnftw »

hgm wrote: Fri Dec 14, 2018 10:06 am
noobpwnftw wrote: Thu Dec 13, 2018 11:43 pm Well the "nearly" here is a form of compromise to truth because if we have perfect knowledge then the chance that "nearly" happens is zero.
So that means that a perfect player would perform very poorly from nearly won positions, almost never be able to convert one into a win, and always quickly let them degenerate into nearly lost ones before it starts to fight. While a heuristic player against the same set of imperfect opponents might win 90% or better.

A perfect player will play like crap against good heuristic opponents, (which understand the 'nearly'), in drawn positions. So it would be a really bad idea to use it to judge the quality of common opening lines, which should be all well within the draw zone.
I agree that it may play very ugly moves in the openings, but that does not mean, or at least we do not know, that those "nearly won/lost" positions are the majority of the positions across the entire game. If so, then your approximation of those "nearly won/lost" positions would be very wrong.

Either way, this proves my point that introducing diversity to the opening moves does not seem to harm an engine's performance in any way, even if the engine is a perfect player.
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: Alphazero news

Post by Werewolf »

@ Matthew

One thing I never understood from the original match was why SF8 had hash set to 1 GB. This may raise nps, but it surely lowered strength. Also the decision to have HT on was questionable.

What were the reasons for this?
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Alphazero news

Post by matthewlai »

Werewolf wrote: Fri Dec 14, 2018 11:36 am @ Matthew

One thing I never understood from the original match was why SF8 had hash set to 1 GB. This may raise nps, but it surely lowered strength. Also the decision to have HT on was questionable.

What were the reasons for this?
We set it to 1GB in the very beginning because we wanted to play many games in parallel on the same machine (with much lower core count per game). It slipped through to the final version and no one noticed (realistically speaking, I was probably the only person who could have noticed, since I am the only person on the team familiar with conventional chess engines, so it was my fault) until we went back to look at configs to write the paper. In the interest of full disclosure and academic integrity, we included that information in the preprint, and fixed it for the final paper. We tried to copy SF TCEC config as much as possible for the final paper.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Alphazero news

Post by Michel »

If AZ can always play into closed openings from start position no matter what the opponent does, why should its performance on open openings be reflected in its Elo rating?
It is a question of philosophy. As 100% of the practical use of chess engines consists of analysis one can argue that a chess engine should be able to play good chess in any (reasonable) position...
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Alphazero news

Post by matthewlai »

Michel wrote: Fri Dec 14, 2018 1:20 pm
If AZ can always play into closed openings from start position no matter what the opponent does, why should its performance on open openings be reflected in its Elo rating?
It is a question of philosophy. As 100% of the practical use of chess engines consists of analysis one can argue that a chess engine should be able to play good chess in any (reasonable) position...
That is totally true, but the claim of the paper is that we have created an engine that is good at playing chess (and Go and shogi), and not one that is useful for analysis :).
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.