We are doomed - AlphaGo Zero, learning only from basic rules

Michel · Post by **Michel** » Fri Oct 20, 2017 11:08 am

Interestingly, the new AlphaGo uses residual neural networks. These supposedly do not suffer from layer saturation, the phenomenon where performance stops improving or even degrades as more layers are added to the network.

https://arxiv.org/pdf/1512.03385.pdf

It might be interesting to try this in Giraffe. I think the author reported that adding more layers did not improve Giraffe. Perhaps converting to a residual neural network can fix this.

Note that the aim should be to have an eval which is _much_ better than Stockfish's, to offset the fact that NNs are inherently slower.
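As a rough illustration of the residual idea from the paper linked above, here is a minimal pure-Python sketch. The function names, the tiny 3-element input and the all-zero weights are assumptions chosen for illustration; this is not Giraffe's or AlphaGo Zero's code. It only shows the core trick: a residual block computes relu(F(x) + x), so a block that has learned nothing still passes its input through, while a plain block in the same situation wipes it out.

```python
def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Multiply a square weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def plain_block(w1, w2, v):
    """Two stacked layers with no shortcut: relu(W2 * relu(W1 * v))."""
    return relu(linear(w2, relu(linear(w1, v))))


def residual_block(w1, w2, v):
    """Same two layers, but the input is added back before the final ReLU."""
    f = linear(w2, relu(linear(w1, v)))
    return relu([a + b for a, b in zip(f, v)])


def zeros(n):
    """An n-by-n all-zero weight matrix (an 'untrained' layer, for illustration)."""
    return [[0.0] * n for _ in range(n)]


x = [0.5, 1.0, 2.0]          # hypothetical non-negative input features
w_zero = zeros(len(x))

out_plain = x
out_res = x
for _ in range(20):          # stack 20 blocks of each kind
    out_plain = plain_block(w_zero, w_zero, out_plain)
    out_res = residual_block(w_zero, w_zero, out_res)

print("plain   :", out_plain)
print("residual:", out_res)
```

With zero weights the residual stack behaves as the identity on this input, so extra blocks cannot make things worse by construction. Whether that would actually help an engine like Giraffe remains an open question here.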

Vinvin · Post by **Vinvin** » Fri Oct 20, 2017 1:12 pm

24-minute video from DeepMind: https://www.youtube.com/watch?v=WXHFqTvfFSw

reflectionofpower · Post by **reflectionofpower** » Fri Oct 20, 2017 1:24 pm

Vinvin wrote: 24-minute video from DeepMind: https://www.youtube.com/watch?v=WXHFqTvfFSw

Nice heads-up, good video.

Uri Blass · Post by **Uri Blass** » Fri Oct 20, 2017 9:29 pm

Daniel Shawul wrote:
PK wrote:The only direct comparison of Giraffe eval with a decently strong engine that I am aware of:

http://talkchess.com/forum/viewtopic.ph ... 39&t=64096
I wasn't actually aware of that test, only of the one given in Giraffe's paper on page 25: https://arxiv.org/pdf/1509.01549v1.pdf

Peter's test also confirms that Giraffe's eval is close to Stockfish's, but it is not equally efficient due to the 10x slowdown incurred by the NN evaluation. So it seems Giraffe has already learned (probably not tabula rasa?) all the human chess knowledge of the past century...
Giraffe (1s)             2400  258570   9641
Giraffe (0.5s)           2400  119843   9211
Giraffe (0.1s)           2400   24134   8526
Stockfish 5              3387  108540  10505
Senpai 1.0               3096   86711   9414
Texel 1.04               2995  119455   8494
Arasan 17.5              2847   79442   7961
Scorpio 2.7.6            2821  139143   8795
Crafty 24.0              2801  296918   8541
GNU Chess 6 / Fruit 2.1  2685   58552   8307
Sungorus 1.4             2309  145069   7729

I do not think there is an engine that has learned all the human knowledge of the past century.

Knowledge also includes tactical patterns, where you do not need to calculate all of the opponent's legal moves like a stupid computer.

For example

[D]8/2q5/8/4k3/8/8/7Q/4K3 b - - 0 1

Humans know that Black loses the queen without calculating all the legal moves of the black king, because they know the pattern.

I do not know whether Giraffe or Stockfish knows this from evaluation alone.

Of course, the situation is different in the following position, where you know that Black saves the queen without calculating Black's possible legal moves.

[D]8/4q3/8/4k3/8/8/4Q3/4K3 b - - 0 1

Another example is the following

[D]4k3/4Q3/8/6B1/8/8/8/4K3 b - - 0 1

You memorize the pattern: a queen defended by something, with the black king one square away from it on the last rank. You do not need to calculate Black's moves, because you simply know it is mate from the pattern.

I do not know whether there are chess programs that know it, and that also know the following is the same pattern, of course:

[D]8/8/8/6Qk/5P2/8/8/4K3 b - - 0 1

but not the following, which is not the same pattern:

[D]5B2/8/7Q/7k/8/8/8/4K3 b - - 0 1

Uri
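To make the first two diagrams above concrete, here is a small pure-Python sketch of how such a pattern could be tested statically, without generating moves. It is deliberately narrow and rests on loud assumptions: it only handles king and queen versus king and queen, it ignores interpositions and captures of the checking queen, and the function names are made up for illustration. It is not how Giraffe or Stockfish evaluates anything.

```python
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1),
              (1, 1), (1, -1), (-1, 1), (-1, -1)]


def parse_board(fen):
    """Return {(file, rank): piece} from the board field of a FEN string."""
    board = {}
    for row_index, row in enumerate(fen.split()[0].split("/")):
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)          # run of empty squares
            else:
                file += 1
                board[(file, 8 - row_index)] = ch
    return board


def find(board, piece):
    """Square of the first piece with the given FEN letter."""
    return next(sq for sq, p in board.items() if p == piece)


def on_board(sq):
    return 1 <= sq[0] <= 8 and 1 <= sq[1] <= 8


def neighbours(sq):
    """The up to eight squares adjacent to sq."""
    cand = ((sq[0] + dx, sq[1] + dy) for dx, dy in DIRECTIONS)
    return {c for c in cand if on_board(c)}


def queen_attacks(board, origin):
    """Squares a queen on origin attacks, stopping at the first blocker."""
    attacked = set()
    for dx, dy in DIRECTIONS:
        sq = (origin[0] + dx, origin[1] + dy)
        while on_board(sq):
            attacked.add(sq)
            if sq in board:
                break
            sq = (sq[0] + dx, sq[1] + dy)
    return attacked


def skewer_wins_queen(fen):
    """True if White's queen skewers king and queen and the king cannot guard the queen."""
    board = parse_board(fen)
    wq, wk = find(board, "Q"), find(board, "K")
    bq, bk = find(board, "q"), find(board, "k")
    seen = queen_attacks(board, wq)
    if bk not in seen or bq in seen:
        return False                     # no check, or the queen is not shielded by the king
    no_king = {sq: p for sq, p in board.items() if sq != bk}
    through = queen_attacks(no_king, wq)  # lines as they look once the king steps away
    if bq not in through:
        return False                     # queen is not behind the king on the checking line
    white_control = through | neighbours(wk)
    # A rescue square is empty, adjacent to both king and queen, and not controlled by White.
    rescue = (neighbours(bk) & neighbours(bq)) - white_control - set(board)
    return not rescue


DIAGONAL_SKEWER = "8/2q5/8/4k3/8/8/7Q/4K3 b - - 0 1"
FILE_SKEWER = "8/4q3/8/4k3/8/8/4Q3/4K3 b - - 0 1"
lost_on_diagonal = skewer_wins_queen(DIAGONAL_SKEWER)
lost_on_file = skewer_wins_queen(FILE_SKEWER)
print(lost_on_diagonal, lost_on_file)
```

In the diagonal case the only square next to both king and queen (d6) lies on the checking line, so there is no rescue. In the file case d6 and f6 are available, which matches the distinction drawn between the two diagrams.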

Dirt · Post by **Dirt** » Fri Oct 20, 2017 9:49 pm

Vinvin wrote: 24-minute video from DeepMind: https://www.youtube.com/watch?v=WXHFqTvfFSw

That was 24 minutes? It seemed more like 2:40 to me.

Rebel · Post by **Rebel** » Fri Oct 20, 2017 10:14 pm

Leo wrote:
Cardoso wrote: For that one I think it will take them millennia, or they will simply never manage it.
They can't cure migraines or diabetes, much less stop aging.
And I think some people expect too much from science.
My mother had a severe skin disease on her feet called hyperkeratosis, with deep cracks in the skin which hurt badly. She was treated by the best doctor in the field in the country (Portugal) with very aggressive medications, and none of the several treatments worked. Desperate, my mother tried a plant called "malvas" in Portuguese; after 2 weeks she was much better, and after 6 weeks she was completely cured and the problem went away entirely.
I think the human body is too complex for science.
Even a single cell is tremendously complex.
I very much agree. I am tired of the bombast from Google on what great things they are going to do for humanity. AI is overrated.

[a bit off-topic] I used to think so, but not any longer. Google CRISPR/Cas9, for instance, and consider the potential.

Vinvin · Post by **Vinvin** » Tue Oct 24, 2017 3:22 pm

Dirt wrote:
Vinvin wrote: 24-minute video from DeepMind: https://www.youtube.com/watch?v=WXHFqTvfFSw
That was 24 minutes? It seemed more like 2:40 to me.

Sorry, I mixed it up with the first game demo here: https://www.youtube.com/watch?v=-Wh4CfsWDyM

Vinvin · Post by **Vinvin** » Tue Oct 24, 2017 3:39 pm

Daniel Shawul wrote:
Vinvin wrote: I hope for such an experiment for chess: starting very deep learning with only basic rules, piece coordinates and piece interaction.
Not even a hardcoded "piece value" concept.
Giraffe already did that with the NN evaluation. It used hand-selected features as inputs (presence and location of pieces, etc.) and was able to compete with Stockfish's eval. It is most likely possible to have a Giraffe-zero, at least for the evaluation only, i.e. it will learn everything the chess world knows about good static evaluation (not search) from self-play games only, in a couple of hours.

If I read the Giraffe paper correctly ( https://arxiv.org/pdf/1509.01549.pdf ), it uses 3 network layers.

The AlphaGo Zero paper, by comparison, mentions 12 layers (for AlphaGo Lee, in the passage quoted below)!

nature.com wrote: (2) AlphaGo Lee is the program that defeated Lee Sedol 4–1 in March 2016. It was previously unpublished, but is similar in most regards to AlphaGo Fan (12). However, we highlight several key differences to facilitate a fair comparison. First, the value network was trained from the outcomes of fast games of self-play by AlphaGo, rather than games of self-play by the policy network; this procedure was iterated several times, an initial step towards the tabula rasa algorithm presented in this paper. Second, the policy and value networks were larger than those described in the original paper, using 12 convolutional layers of 256 planes, and were trained for more iterations. This player was also distributed over many machines using 48 TPUs, rather than GPUs, enabling it to evaluate neural networks faster during search.

Cardoso · Post by **Cardoso** » Tue Oct 24, 2017 6:41 pm

We are doomed all right, but it's not because of AlphaGo Zero, or Google, or anything of the kind.
If we are doomed, it's for other reasons that 99% of us don't give a damn about, even when advised or warned.
We adults often complain that it's difficult to raise our kids: we complain that they don't respond to our best efforts, don't listen to us, and don't care about our advice.
Well, adults behave the same way. They also don't respond to the best advice, they also insist on doing things their own way, and of course the result can't be good. So in this respect many adults didn't really grow up. We have too many mental barriers to sound advice.
Sorry if I sound too generic and cryptic, but I wouldn't like to give further details. Just look at the news tonight and think about today's society and its problems and the chaos families live in, and maybe you will agree with me.

duncan · Post by **duncan** » Thu Oct 26, 2017 2:36 pm

Does anyone know whether AlphaGo Zero has hit a plateau, or is it still gaining Elo points?
