AlphaZero Chess is not that strong ...

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Dec 20, 2017 6:12 pm

Werewolf wrote:
Laskos wrote:

1/ Elo terminology is a bit misleading here.

Take these 3 results:

+2 -0 =8
+20 -0 =80
+200 -0 =800

...

And all in all, all your objections of handicapping SF8 are almost irrelevant.
I think your argument is excellent, I'd not thought of that, but I'm not persuaded by your conclusion.

If SF was raised by 100 elo (and this seems possible taking a variety of steps: hash, opening book, TB, HT OFF on a faster machine, TC and so on)
then the result would be much closer.

If it was (say) +10, =90

and this followed through to

+100
=900

then I would agree with you, but we don't know that.

Secondly, a really good book can sometimes catch the opponent out with a deep line. It could be that without its book SF just didn't get positions it could pressure A0 with.

Don't listen to Kai, he does not know much about chess.
It is more than obvious most of the games were decided in the early opening, so that was crucial.
Alpha repetitively won 2 French Defence games in the very same manner, 2 QID games in the very same manner, 4 more games, when it got huge advantage out of the opening. So, it is good to have your conclusions, but to know your facts: the opening book was by far the most important factor in the match, even weightier than the hardware advantage.

So that, people are drawing some conclusions using general rules, while they fail to consider the most important facts: that could not be very scientific.

PS. and those were only the published ones; if the 10 published games contain repetitive wins, I don't know what the unpublished games contain...

CheckersGuy · Post by **CheckersGuy** » Wed Dec 20, 2017 6:12 pm

If that's your mentality then you go around the world learning only a fraction of what you could learn

Ras · Post by **Ras** » Wed Dec 20, 2017 6:31 pm

Lyudmil Tsvetkov wrote:the opening book was by far the most important factor in the match

Nonsense. At 1 minute per move, Stockfish has quite some calculation depth on that hardware. If Stockfish willingly heads into positions it cannot handle well, then this is simply an engine shortcoming of Stockfish.

Werewolf · Post by **Werewolf** » Wed Dec 20, 2017 7:06 pm

Lyudmil Tsvetkov wrote:
PS. and those were only the published ones; if the 10 published games contain repetitive wins, I don't know what the unpublished games contain...

That's an interesting possibility. It really gives a whole new perspective on why they didn't publish the other 18 wins....perhaps they were so similar we'd smell a rat.

Werewolf · Post by **Werewolf** » Wed Dec 20, 2017 7:08 pm

hgm wrote: It only gives 5 Elo per doubling as long as the hash is still not large enough. Which is typically 10% of the size of the tree.) After that, there is no further advantage in enlarging the hash table (or it is even detrimental).

64 threads would fill 1 GB hash in seconds. I don't know what would be optimal but it must be quite a bit above this.

Evert · Post by **Evert** » Wed Dec 20, 2017 7:31 pm

Lyudmil Tsvetkov wrote: In what way is it strong?
Now, you have a weightlifter pulling 300 kilos from the ground, SF.
Then, you have Alpha, pulling 50. It is much weaker.
Then you add up 10 Alphas to pull the same weight, and they outperform SF.

In what way is this strong?

I don't understand, what the hell NN means.

You don't say. Gosh golly, I'd never have guessed.

You know, it's really a shame that A0 beat Stockfish. It means that people have inane discussions about what tweaks can give Stockfish a 5 elo edge. Meanwhile the real point passes over their head in orbit.

It does not matter if A0 is 100 elo above or below Stockfish. It wouldn't matter if it scored on-par with, say, Hiarcs, or Crafty: it'd still be a massive breakthrough. If you don't understand why that is so, go educate yourself.

mjlef · Post by **mjlef** » Wed Dec 20, 2017 7:49 pm

gladius wrote:
hgm wrote:1GB hash seems pretty big. Have you actually measured if there is any advantage whatsoever of increasing the hash size?

And how much would AlphaZero gain by using a book, and not having to run at fixed time per move, in your estimate?
1GB hash is far too low. I have immense respect for the deepmind team, but this was a pretty serious error.

At 70 million positions/second, that 1GB of hash is being overwritten once every second (10 bytes per entry = 700MB overwritten/second). Even worse, it's being contended by all 64 threads on the same 1GB of RAM. To give some context, we usually play our 60 sec games with 64Mb of hash, with one thread!

In these games, at 1 minute/move, the hash table is all but useless. It should probably have been 32 or 64GB, and the machine certainly had that much RAM.

Based on the Komodo recommendation of a 40% fill rate per move, I estimate about 128 GB is about right assuming a reasonably fast 64 core machine.

So, if someone has a 64 core machine (or even 32 core), we could try this:

a. machine 1 uses 1 GB Hash, Stockfish 8, no opening book. No Syzygy.
b. machine 2 uses the latest Stockfish, 64 GB or 128 GB of hash, with Syzygy. And a decent opening book.

It is unclear how they kept the programs from constantly playing the first few moves the same. We could use some very small opening book for the run.

Just use a fixed time of 60 seconds per move. Assuming a typical game lasts 60 moves, it would take 2 hours per game, or 200 hours total. I am interested in this enough to donate some money to cover machine rental on Amazon EC2 or some other suitable machine.

Mark
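The time estimate in the post above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 100-game match length is not stated in the post and is inferred here from "200 hours total" at 2 hours per game, and the function names are made up for illustration.

```python
SECONDS_PER_MOVE = 60   # fixed time per move, from the post
MOVES_PER_GAME = 60     # typical game length (moves by each side), from the post
GAMES = 100             # assumption: inferred from 200 hours / 2 hours per game


def game_hours(moves_per_game=MOVES_PER_GAME, seconds_per_move=SECONDS_PER_MOVE):
    """Hours for one game when both engines use the full fixed time on every move."""
    plies = 2 * moves_per_game          # each move is played by both sides
    return plies * seconds_per_move / 3600


def match_hours(games=GAMES):
    """Hours for the whole match, games played one after another on one machine."""
    return games * game_hours()


if __name__ == "__main__":
    print(f"per game: {game_hours():.1f} h, match: {match_hours():.0f} h")
```

Running games in parallel on several rented machines would cut the wall-clock time but not the total machine hours.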

CheckersGuy · Post by **CheckersGuy** » Wed Dec 20, 2017 7:52 pm

mjlef wrote:
gladius wrote:
hgm wrote:1GB hash seems pretty big. Have you actually measured if there is any advantage whatsoever of increasing the hash size?

And how much would AlphaZero gain by using a book, and not having to run at fixed time per move, in your estimate?
1GB hash is far too low. I have immense respect for the deepmind team, but this was a pretty serious error.

At 70 million positions/second, that 1GB of hash is being overwritten once every second (10 bytes per entry = 700MB overwritten/second). Even worse, it's being contended by all 64 threads on the same 1GB of RAM. To give some context, we usually play our 60 sec games with 64Mb of hash, with one thread!

In these games, at 1 minute/move, the hash table is all but useless. It should probably have been 32 or 64GB, and the machine certainly had that much RAM.
Based on the Komodo recommendation of a 40% fill rate per move, I estimate about 128 GB is about right assuming a reasonably fast 64 core machine.

So, if someone has a 64 core machine (or even 32 core), we could try this:

a. machine 1 uses 1 GB Hash, Stockfish 8, no opening book. No Syzygy.
b. machine 2 uses the latest Stockfish, 64 GB or 128 GB of hash, with Syzygy. And a decent opening book.

It is unclear how they kept the programs from constantly playing the first few moves the same. We could use some very small opening book for the run.

Just use a fixed time of 60 seconds per move. Assuming a typical game lasts 60 moves, it would take 2 hours per game, or 200 hours total. I am interested in this enough to donate some money to cover machine rental on Amazon EC2 or some other suitable machine.
Mark

Why would you want someone else to do this and donate?

Don't you (and your team) have some hardware to run these tests on?

hgm · Post by **hgm** » Wed Dec 20, 2017 7:54 pm

Werewolf wrote:64 threads would fill 1 GB hash in seconds. I don't know what would be optimal but it must be quite a bit above this.

Let's do an actual calculation: Stockfish was doing 70Mnps, and uses a 10-byte hash entry. That is 700MB/s. So it takes 14 sec before the overload factor of 10 is reached, and the search starts to suffer.

Doubling the table to 2GB (for +5 Elo) would put this at 28 sec. Total thinking time was 60 sec. Because it was fixed time per move, the move in most cases would already have been decided on long before the 60 sec, with not enough time left to change it. So the final 15 sec or so were usually wasted (perhaps only producing some hash entries that would benefit it on the next move, if it was lucky enough to guess the opponent move right). So whether it was handicapped by hash starvation in these final 15 sec should have little effect. But the preceding 45 sec is still longer than 28 sec. So one doubling is not enough, we need 1.5 doublings, to ~3GB. Which would then give +7.5 Elo.

I cannot talk it up much more than that, sorry...
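For anyone who wants to redo the arithmetic in the post above with other numbers, here is a rough Python sketch. The 70 Mnps, 10 bytes per entry, overload factor of 10, 45 useful seconds and 5 Elo per doubling are taken from the post; the function names and the use of decimal gigabytes are assumptions made for illustration. Done without rounding, it gives about 1.66 doublings (roughly 3.15 GB, a bit over +8 Elo), which is in line with the rounded "1.5 doublings, ~3GB, +7.5 Elo" above.

```python
import math

NODES_PER_SEC = 70_000_000  # Stockfish speed quoted in the post (70 Mnps)
BYTES_PER_ENTRY = 10        # hash entry size quoted in the post
OVERLOAD_FACTOR = 10        # tree may grow to ~10x the table before search suffers
ELO_PER_DOUBLING = 5        # rule of thumb quoted in the post
GB = 10**9                  # assumption: decimal GB, matching the 700 MB/s figure


def write_rate():
    """Bytes of hash entries generated per second (70M * 10 = 700 MB/s)."""
    return NODES_PER_SEC * BYTES_PER_ENTRY


def seconds_until_overload(hash_bytes):
    """Search time after which the tree exceeds OVERLOAD_FACTOR times the table."""
    return hash_bytes * OVERLOAD_FACTOR / write_rate()


def doublings_needed(hash_bytes, useful_seconds):
    """Table doublings required so that overload is only reached at useful_seconds."""
    return math.log2(useful_seconds / seconds_until_overload(hash_bytes))


if __name__ == "__main__":
    for size_gb in (1, 2):
        t = seconds_until_overload(size_gb * GB)
        print(f"{size_gb} GB hash: overload after {t:.1f} s")
    d = doublings_needed(1 * GB, useful_seconds=45)
    print(f"doublings needed: {d:.2f}, table size: {2**d:.2f} GB, "
          f"gain: +{d * ELO_PER_DOUBLING:.1f} Elo")
```

The Elo figure is only as good as the 5-Elo-per-doubling rule of thumb, which by the same argument stops applying once the table is large enough.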

jhellis3 · Post by **jhellis3** » Wed Dec 20, 2017 8:13 pm

I would say 7.5 Elo is underrating the gain from the additional hash. Once saturated, it governs depth reached in search over a given period of time quite strongly. Thus, the question to ask, is how many additional ply could be gained, on average. Scaling at 2x which is quite conservative given SF's branching factor, we can estimate that SF could achieve at least 2-3 more ply on average (over a 1 min search). While I don't know the exact resulting Elo gain, I'd hazard a guess somewhere north of 7.5 Elo.

I think the important part is that, while there were many factors which disadvantaged SF in the testing conditions, the ultimate result would not have changed too much. The draw rate could have been quite a bit higher, but that is about it.

The reason SF lost many of the games it did had very little to do with search depth and quite a bit with holes in its eval (which would not be fixed with more favorable conditions).
