In chess, AlphaZero outperformed Stockfish after just 4 hours

Pio · Post by **Pio** » Mon Dec 18, 2017 10:34 pm

Hi Ed!

The thing is that the engine will not learn openings first. It will learn the endings first just like a human.

I do not find anything strange about their achievement. Of course, you could do even better by giving the ANN some very efficient patterns and changing the topology of the network to match the piece movements (as suggested by H.G.M). You could also use symmetries to improve it further, and have special castling networks for positions where castling is still available.

BR
Pio

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Mon Dec 18, 2017 11:10 pm

Henk wrote: I am waiting for the news that chess has already been solved.

By the way, it is demotivating to work on your old-school engine if you know that the new A0 approach gives so much better results. Pity for all the work you spent on your alpha-beta algorithm.

Time to realign to NN-nomen nescio.
And also, start saving for a TPU...

yurikvelo · Post by **yurikvelo** » Mon Dec 18, 2017 11:12 pm

hgm wrote: But I have little doubt that it would also be very strong in positions it never saw before (e.g. Chess960 start positions). If it did not understand the general principles of Chess, it could never have played well enough during the entire game to beat Stockfish.

Is it a pure NN engine, or does it switch to some classic engine (e.g. Stockfish) once it reaches a certain point in the game (e.g. a +3 eval by the classic engine, or when the game has simplified to a low piece count)?

I can imagine an NN doing well in the early game, but I cannot imagine the late middlegame or endgame being trained in 4 hours.
Can 6-man Syzygy ever be understood by an NN, other than by being fully loaded into 'neuron memory'?

hgm · Post by **hgm** » Mon Dec 18, 2017 11:40 pm

It does not switch, but it is never a pure NN. It does a search of some sort. Not an alpha-beta search, but more like book building. It starts in the current game position, and then it starts building a tree, by stepping through the existing tree to the most promising leaf so far, adding one ply on top of that leaf, having the NN calculate win probabilities in the new leaves, and propagating the best new win probability back towards the root. It repeats that 80,000 times per second. And after one minute it then plays the move in the root that leads to the most promising leaf.

This way it can see complex tactics that would be way too difficult for the NN to grasp. It keeps doing this from start to end of the game. There is no opening book or EGT.
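
The tree-building loop described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions, not DeepMind's code: the toy game (take 1 to 3 stones, whoever takes the last stone wins), the `evaluate` stub standing in for the NN, and all names (`Node`, `grow_once`, `search`) are hypothetical. It follows the simplified description above (always descend to the most promising leaf, back up the best value); the published AlphaZero search additionally averages values and adds an exploration bonus guided by the network's move priors.

```python
from dataclasses import dataclass, field

MOVES = (1, 2, 3)  # toy game: remove 1-3 stones; taking the last stone wins


def evaluate(pile: int) -> float:
    """Stand-in for the NN: win probability for the side to move."""
    if pile == 0:
        return 0.0  # the opponent just took the last stone, so we have lost
    return 0.5      # this stub knows nothing about non-terminal positions


@dataclass
class Node:
    pile: int
    value: float                                   # win probability, side to move
    children: dict = field(default_factory=dict)   # move -> Node


def best_move(node: Node) -> int:
    # A child's value is seen from the opponent's side, so we minimise it.
    return min(node.children, key=lambda m: node.children[m].value)


def grow_once(root: Node) -> None:
    # 1. Step through the existing tree to the most promising leaf.
    path = [root]
    while path[-1].children:
        node = path[-1]
        path.append(node.children[best_move(node)])
    leaf = path[-1]
    # 2. Add one ply on top of that leaf and evaluate the new leaves.
    for m in MOVES:
        if m <= leaf.pile:
            leaf.children[m] = Node(leaf.pile - m, evaluate(leaf.pile - m))
    # 3. Propagate the best win probability back towards the root.
    for node in reversed(path):
        if node.children:
            node.value = 1.0 - node.children[best_move(node)].value


def search(pile: int, iterations: int) -> int:
    """Grow the tree from a non-empty pile, then return the root move to play."""
    root = Node(pile, evaluate(pile))
    for _ in range(iterations):
        grow_once(root)
    return best_move(root)
```

Calling `search(5, 200)` grows the tree until the root is resolved and returns the move to play from a pile of five stones. Even with an evaluator that knows nothing beyond "the game is over", the search finds the tactic, which is the point hgm makes about search covering for what the NN cannot grasp.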

corres · Post by **corres** » Tue Dec 19, 2017 12:56 am

[quote="Rebel"]

No mobility, no king safety, no passed pawn evaluation, no castling knowledge, not even piece values?

[/quote]

If you imagine the neural network as an imprint of all the games played during the learning phase, you can see why AlphaZero does not need any of the old-fashioned categories. Just as an opening book or an endgame database contains no information about these categories, the neural network (NN) of AlphaZero does not have them either. Based on these characteristics, one can think of the NN as a kind of learning book or learning file. But the NN does not only give moves; it also gives a win probability for every move.
Basically, I can believe that a strong chess-playing machine can be made through self-play and self-learning, starting only from the FIDE rules. But the very small number of learning games and the short learning time, the human-like playing style, the very small number of games against Stockfish, the heavy weakening of Stockfish's playing strength, and the great media fuss around the DeepMind group's result make me skeptical that they used "tabula rasa" only.

corres · Post by **corres** » Tue Dec 19, 2017 1:08 am

[quote="Henk"]

Probably, when doing many simulations, endgame positions are evaluated correctly.
So in self-play the player that evaluates the endgame best wins. And once the endgame is evaluated correctly, the next stage is the phase before it, and so forth until the opening stage is reached.
Each time the neural network improves, so the games get less random.

[/quote]

This is a very believable idea.
This method was also used to create endgame databases.
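
The endgame-database comparison refers to retrograde analysis: start from the finished positions and work backwards, labelling each earlier position once its successors are known. A minimal sketch on a toy game, assuming a pile of stones where each player removes 1 to 3 and taking the last stone wins (the game, the function name `retrograde` and the labels are all hypothetical and only illustrate the backward direction, not any real tablebase generator):

```python
from collections import deque


def retrograde(max_pile: int, moves=(1, 2, 3)) -> dict:
    """Label every pile size 'win' or 'loss' for the side to move."""
    result = {0: "loss"}  # no stones left: the side to move has already lost
    # Number of successors of each position not yet known to be wins
    # for the opponent.
    remaining = {p: sum(1 for m in moves if m <= p)
                 for p in range(1, max_pile + 1)}
    queue = deque([0])
    while queue:
        pos = queue.popleft()
        for m in moves:
            pred = pos + m  # "un-move": a position that can reach pos
            if pred > max_pile or pred in result:
                continue
            if result[pos] == "loss":
                # pred can move into a lost position, so pred is won.
                result[pred] = "win"
                queue.append(pred)
            else:
                remaining[pred] -= 1
                if remaining[pred] == 0:
                    # Every move from pred hands the opponent a win.
                    result[pred] = "loss"
                    queue.append(pred)
    return result
```

The labels spread outward from the terminal position, which mirrors the order Henk proposes for self-play learning: the positions closest to the end of the game are settled first.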

corres · Post by **corres** » Tue Dec 19, 2017 9:27 am

[quote="Lyudmil Tsvetkov"]

[quote="Henk"]
By the way, it is demotivating to work on your old-school engine if you know that the new A0 approach gives so much better results. Pity for all the work you spent on your alpha-beta algorithm.
[/quote]

Time to realign to NN-nomen nescio.
And also, start saving for a TPU...

[/quote]

Because there is no chance of getting an AlphaZero as a home machine, people will keep using the classical chess engines for a long time yet. For this reason, the developers' work on these engines is not wasted at all.
Hobby chess engine makers are also not able to use such a big system as DeepMind does.
We need Stockfish, Houdini, Komodo, ... to play chess on our PCs.

Lyudmil, you should save up to rent working time from a firm owning an AlphaZero-type machine, maybe from ChessBase.

corres · Post by **corres** » Tue Dec 19, 2017 9:49 am

[quote="hgm"]

Playing full games from a prepared book is completely impossible. The game tree of Chess is just waaaaay too big for that. You really must be able to have an algorithm to select a winning move in positions you have never seen before, because you will be out of book before one quarter of the game is over, most likely in a nearly equal position if you have a serious opponent.

[/quote]

This is why they need a machine with MCTS not only for learning but also for playing chess.
Another reason is that the NN sometimes makes big mistakes, so MCTS acts as a check on the results obtained from the NN.

corres · Post by **corres** » Tue Dec 19, 2017 9:59 am

[quote="kranium"]

Monte Carlo search does not use a traditional eval as we know it, so mobility, king safety etc. are irrelevant.
It uses a struct to hold info like wins, losses, draws, win %, etc.,
then simply references accumulated data for the current position to select the move with the highest probability of winning.
Ivanhoe has a Montecarlo search implementation (with which I'm fairly familiar) and it works quite well.
The default implementation uses a sort of 'searchmoves' algorithm:
go montecarlo cpus 8 min -25 max 325 length 40 depth 10 moves c2c4 d2d4 e2e4 g1f3
Years ago I experimented with a version that would obtain the root move list from current position and actually play a strong game.
If you send it all 20 possible moves from the traditional start position, you'd be amazed how quickly the potential move choices are narrowed down...and it usually plays 1. c4 or 1. e4
I still have it if anyone is interested (but it does crash once in a while).

[/quote]

Dear Norman,
I am interested in the source of the Ivanhoe version you mentioned.
Where can I download it?
What is its approximate Elo rating?
Thanks
Robert
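
The bookkeeping described in the quoted post (a record of wins, losses, draws and win % per move, then choosing the move with the highest win probability) could look something like the sketch below. It is not taken from the Ivanhoe source: the names `MoveStats`, `record` and `pick_move` are hypothetical, and counting a draw as half a win is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MoveStats:
    wins: int = 0
    losses: int = 0
    draws: int = 0

    @property
    def games(self) -> int:
        return self.wins + self.losses + self.draws

    @property
    def win_pct(self) -> float:
        # Assumption: a draw counts as half a win; an unplayed move scores 0.
        if self.games == 0:
            return 0.0
        return 100.0 * (self.wins + 0.5 * self.draws) / self.games


def record(stats: dict, move: str, result: str) -> None:
    """Add one playout result ('win', 'loss' or 'draw') for a root move."""
    entry = stats.setdefault(move, MoveStats())
    if result == "win":
        entry.wins += 1
    elif result == "loss":
        entry.losses += 1
    elif result == "draw":
        entry.draws += 1
    else:
        raise ValueError(f"unknown result: {result!r}")


def pick_move(stats: dict) -> str:
    """Return the move with the highest accumulated win percentage."""
    return max(stats, key=lambda move: stats[move].win_pct)
```

In use, each simulated game from a root move would end with one `record(stats, move, result)` call, and `pick_move(stats)` would be consulted once enough playouts have accumulated.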

Ozymandias · Post by **Ozymandias** » Tue Dec 19, 2017 10:52 am

syzygy wrote:
Ozymandias wrote: I didn't think they could output so many games per second. People argued that the HW difference between the PC used by SF and the machine on which the NN ran wasn't that high. Unless they replaced the machine for the match, it clearly was the case.
They used lots of HW for training.

They used a big PC with 4 TPU expansion cards (each using just 28-40 Watt) for playing. SF likely played on the same big PC but obviously did not use the TPUs.

It's all documented quite well.

(Note that SF also uses lots of HW for tuning *and* a lot of human brains. The AlphaZero approach seems to be far more suitable for massive parallelisation.)

I haven't even read many of the other threads about A0, much less the paper. It's easier to ask any given question and trust the answer from select people, who have actually done that. Thnx.

From the document - In chess, AlphaZero outperformed Stockfish after just 4 hours. How believable is that?
