It's working!! The first dozen moves are not random anymore! Proof is that White's dark-squared bishop retreated twice when threatened by Black's h6 and g5 pawn moves, and likewise the knight on f3 moved to e5 when threatened by g4.
Interestingly, the remaining moves go back to random, suggesting that the learning concentrates on the opening phase first and then moves forward, meaning it may master the endgame last.
Are you sure? When I saw this game it looked like the opening was fed,
and once the given opening was over the blunderfest started...
The opening wasn’t fed; the network had learned it. That’s what I mean when I say it’s starting to understand chess. No opening books were used.
And to show that the training process is actually increasing strength, the version that trained overnight is now beating the previous best 49-13-38, or ~131 Elo.
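For anyone who wants to check the number: here is a minimal sketch of the usual logistic Elo estimate from a match score. It assumes the 49-13-38 result is read as wins-losses-draws, and the function name `elo_diff` is just something made up for this illustration.

```python
import math

def elo_diff(wins, losses, draws):
    """Estimate the Elo difference from a match result (logistic model)."""
    games = wins + losses + draws
    # Each draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400 * math.log10(score / (1 - score))

# Assumption: 49-13-38 means 49 wins, 13 losses, 38 draws (a 68% score).
print(round(elo_diff(49, 13, 38)))  # prints 131
```

Reading the result as wins-draws-losses instead would give a much smaller difference, so the wins-losses-draws reading is the one that matches the ~131 quoted.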
Here is an example game, with the new network as black.
[pgn]
[Event "?"]
[Site "?"]
[Date "2018.01.15"]
[Round "8"]
[White "lc_base"]
[Black "lc_new"]
[Result "0-1"]
[ECO "D15"]
[Opening "QGD Slav"]
[PlyCount "90"]
[TimeControl "inf"]
[Variation "4.Nc3"]
To clarify more: the current training data is a big set of SF-SF games where a book was used, in order to be able to debug the training and the search.
It seems the network can already "remember" many book lines from its training. It did not invent them from self-play. But it does not have a book itself either.
pferd wrote:This seems like a very interesting project.
I am playing some 5 minute games against it right now and it moves instantly every single time. Is this the expected behaviour?
Looking at the games, it seems that something is wrong.
Google got to a level above GM strength.
I understand that they had a hardware advantage, but still
I think that it should be possible to get at least above a level of 2000 by similar methods, and the level that I see, with simple stupid mistakes of losing pieces, is not even close.
Beating the old version by a result that is not close to 100-0 is too slow progress.
Google got something that gets better if you give it more time.
If I understand correctly, lczero does not get better with more time, because it plays immediately and the time control is not relevant.
I think there is no search yet. That would explain why the program moves instantly. The eval is in theory as strong as a quiescence search (just my guess), but even for Alpha Zero that is not yet GM level, I suppose. And the neural network is much, much smaller, because the community does not have the massive resources, especially the floating-point hardware, that Google threw at this. It simply is not available yet for us simple amateurs.
If lczero had learned to play 2. c4 in Indian openings or other d4 openings totally from scratch, without even knowing some piece values, that would have been a miracle, nothing short of that. I almost believed it, though, but what Gian-Carlo is saying explains it better and is much, much more realistic.
lczero is obviously learning.
You cannot expect it to duplicate Alpha Zero. Somebody said it would take 1700 years on his slow laptop to duplicate Alpha Zero. (If I remember that right; I should look it up.) And there aren't even learning clients yet; at least that is what Gary said a few days ago.
pferd wrote:This seems like a very interesting project.
I am playing some 5 minute games against it right now and it moves instantly every single time. Is this the expected behaviour?
This is because the default number of playouts is set to 800, and there is no time management.
You can increase the number of playouts by passing the -p argument on the command line, like -p20000, and the engine will think for longer, and play stronger.
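To put rough numbers on that: with a fixed playout budget and no time management, the time per move depends only on the budget and on how fast your machine does playouts, not on the clock. A small sketch follows. The speed of 4000 playouts per second is a made-up figure for illustration, not a measurement of lczero, and `seconds_per_move` is a hypothetical helper, not part of the engine.

```python
def seconds_per_move(playouts, playouts_per_second):
    """Thinking time per move when the engine always runs a fixed playout budget."""
    if playouts_per_second <= 0:
        raise ValueError("playouts_per_second must be positive")
    return playouts / playouts_per_second

# Assumed speed, for illustration only; measure it on your own hardware.
ASSUMED_PLAYOUTS_PER_SECOND = 4000

for budget in (800, 20000):  # the default, and the -p20000 example
    t = seconds_per_move(budget, ASSUMED_PLAYOUTS_PER_SECOND)
    print(f"{budget} playouts -> about {t:.1f} s per move")
```

At that assumed speed the default 800 playouts take a fraction of a second, which would look like an instant move in a 5 minute game.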