SF was more seriously handicapped than I thought

Laskos · Post by **Laskos** » Tue Jan 02, 2018 1:22 pm

Sorry for bringing up the A0 topic again, but this remarkable achievement of using NN + MCTS and categorically beating the top conventional engine is worth another short look.

First, I observed that hash size matters more than I thought. At 3s/move (1 thread) on my PC, hash fills to some 50MB, so the optimal hash size would be 128MB (40% hashfull). It was derived that the optimal hash needed against A0 was 128GB, but only 1GB was used. So, on my PC, I measured the effect of 128MB hash (optimal) against 1MB hash with SF dev at 3s/move. The result in 1000 games was pretty surprising to me:

+189 -66 =745
or +43 Elo points
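For anyone who wants to check the conversion, here is a minimal sketch of how a win/loss/draw record maps to an Elo difference. It assumes the usual logistic Elo model; the function name `elo_from_wdl` is just illustrative and not taken from any testing tool.

```python
import math

def elo_from_wdl(wins, losses, draws):
    """Elo difference implied by a match score under the logistic model."""
    games = wins + losses + draws
    # Each draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))

# 128MB hash vs 1MB hash, 1000 games at 3s/move
print(round(elo_from_wdl(189, 66, 745)))  # prints 43
```

The score here is 561.5/1000, which is where the +43 comes from.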

In A0 versus SF8, that effect would be smaller (diminishing gains at LTC and hardware used), but not negligible.

Then, I decided to test SF dev at 12s/move (as an emulation of A0) versus SF dev at 3s/move, using 2moves_v1.epd for diversity of openings. The result was:

+19 -1 =20

So, an actual overshoot of what A0 did to SF8, but that doesn't bother me, as again, diminishing gains are at work in that real A0 match.

I left A0 as it was (SF dev at 12s/move), giving it the general and solid 3moves_GM.epd openings for variety, but now pitted it against a full-panoply SF dev. This full-panoply SF dev is BrainFish + Cerebellum + a time control of 105'' + 1'' (equivalent in total time used to 3s/move) + Syzygy-6 from SSD. And the result is:

+3 -2 =35 for A0 (or SF dev at 12s/move).
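To put the two 40-game matches on the same scale, here is a rough sketch that converts each result to an Elo difference with an approximate 95% interval. It assumes the logistic Elo model and a normal approximation to the per-game score, which is crude at 40 games; the helper names are hypothetical.

```python
import math

def elo(score):
    """Elo difference for an expected score under the logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_summary(wins, losses, draws, z=1.96):
    """Return (point estimate, low, high) in Elo for a W/L/D record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(var / games)
    # Clamp so the logarithm stays defined for lopsided results.
    low = max(score - margin, 1e-6)
    high = min(score + margin, 1.0 - 1e-6)
    return elo(score), elo(low), elo(high)

for label, wdl in [("12s/move vs 3s/move", (19, 1, 20)),
                   ("12s/move vs full panoply", (3, 2, 35))]:
    point, low, high = match_summary(*wdl)
    print(f"{label}: {point:+.0f} Elo (roughly {low:+.0f} to {high:+.0f})")
```

Under these assumptions the first match comes out near +168 Elo and the second near +9 Elo, with the second interval straddling zero, so with only 40 games the small remaining edge is not statistically distinguishable from none.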

The change from the previous result is pretty drastic. It is probably exaggerated by the fact that the Cerebellum book is an anti-Stockfish book, but nevertheless, the draw rate increases dramatically, and the strength difference is now small.

That was just nitpicking, as I am sure that if DeepMind seriously tries to improve upon A0, it will dramatically surpass any conventional engine anyway. Their achievement is remarkable; this is just a reminder to myself that this "panoply" of engines is not that unimportant.

Laskos · Post by **Laskos** » Tue Jan 02, 2018 2:16 pm

Forgot to mention that in the first match, SF was with 1MB hash, in the second, 128MB hash (close to optimal).

pilgrimdan · Post by **pilgrimdan** » Tue Jan 02, 2018 4:34 pm

Laskos wrote:Forgot to mention that in the first match, SF was with 1MB hash, in the second, 128MB hash (close to optimal).

makes you wonder why was DeepMind in such a hurry ... they could have taken a little bit more time ... beefed up a0 a little more ... and then gave Stockfish optimal conditions ... with the same result ... guess they didn't care what response would come from the chess community ...

CheckersGuy · Post by **CheckersGuy** » Tue Jan 02, 2018 6:02 pm

pilgrimdan wrote:
Laskos wrote:Forgot to mention that in the first match, SF was with 1MB hash, in the second, 128MB hash (close to optimal).
makes you wonder why was DeepMind in such a hurry ... they could have taken a little bit more time ... beefed up a0 a little more ... and then gave Stockfish optimal conditions ... with the same result ... guess they didn't care what response would come from the chess community ...

I think the paper had very low priority for DeepMind. I don't think they care at all what the computer-chess community thinks about this

Jouni · Post by **Jouni** » Tue Jan 02, 2018 6:09 pm

But where is the promised "more details in a full peer-reviewed paper coming soon" (8.12.2017)? Didn't it pass?

Vinvin · Post by **Vinvin** » Tue Jan 02, 2018 7:05 pm

+ In the case of 64 threads, don't forget that the hash table is where all the threads exchange their information. When memory is low, the weakening will be even bigger with multithreading.

Milos · Post by **Milos** » Tue Jan 02, 2018 7:54 pm

CheckersGuy wrote:I think the paper had very low priority for DeepMind. I don't think they care at all what the computer-chess community thinks about this

Of course they don't care. The only thing they actually care about is shameless self-promotion. What else would you expect from a multibillion-dollar advertising company very well known for highly unethical behaviour?
The naivety of some forum members here is really staggering.

Jesse Gersenson · Post by **Jesse Gersenson** » Tue Jan 02, 2018 9:46 pm

Vinvin wrote:+ In the case of 64 threads, don't forget that the hash table is where all the threads exchange their information. When memory is low, the weakening will be even bigger with multithreading.

What does 64 threads mean?
32-core xeon with HT?
64 xeon cores?
32-core amd?
64 'google compute' virtual cores on multiple machines?
Something else?

Ovyron · Post by **Ovyron** » Tue Jan 02, 2018 10:07 pm

Milos wrote:Of course they don't care. The only thing they actually care about is shameless self-promotion. What else would you expect from a multibillion-dollar advertising company very well known for highly unethical behaviour?
The naivety of some forum members here is really staggering.

Have you seen the AlphaZero game analysis on YouTube? It's done with the latest Stockfish, which gives a -1.20 eval to a position completely won by White, and stays stubborn with its 0.00 evals for lost positions even when A0's moves are shown to it. It gives itself an edge whenever A0 sacrifices material for mobility, because it can't tell that its bishop or queen is uselessly stuck.

A stronger Stockfish wouldn't have changed anything, because its problems are at the core of its approach. The whole point was to show how A0 taught itself to be so strong, and it's not like Stockfish played on a Pentium 4 with 1MB RAM.

Some people can't see past a Google logo, and if the paper hadn't come from DeepMind, many people's reaction would have been different (unfortunately, many more people would have gone the skeptical way...)

JJJ · Post by **JJJ** » Wed Jan 03, 2018 3:02 am

Ovyron wrote:
Milos wrote:Of course they don't care. The only thing they actually care about is shameless self-promotion. What else would you expect from a multibillion-dollar advertising company very well known for highly unethical behaviour?
The naivety of some forum members here is really staggering.
Have you seen the AlphaZero game analysis on YouTube? It's done with the latest Stockfish, which gives a -1.20 eval to a position completely won by White, and stays stubborn with its 0.00 evals for lost positions even when A0's moves are shown to it. It gives itself an edge whenever A0 sacrifices material for mobility, because it can't tell that its bishop or queen is uselessly stuck.

A stronger Stockfish wouldn't have changed anything, because its problems are at the core of its approach. The whole point was to show how A0 taught itself to be so strong, and it's not like Stockfish played on a Pentium 4 with 1MB RAM.

Some people can't see past a Google logo, and if the paper hadn't come from DeepMind, many people's reaction would have been different (unfortunately, many more people would have gone the skeptical way...)

very nicely said !
