Announcing lczero

Guenther · Post by **Guenther** » Mon Jan 15, 2018 8:48 pm

Jhoravi wrote:
gladius wrote:Training from SF self-play games seems to be working well. Here is a self-play game on the latest network:
[pgn]
1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. Nc3 c6 5. Bg5 h6 6. Bh4 dxc4 7. e4 g5 8.
Bg3 b5 9. h4 g4 10. Ne5 a5 11. f4 gxf3 12. Qxf3 a4 13. Bf4 Qxd4 14. Rc1 Nbd7
15. Nxc6 Qxc3+ 16. bxc3 Bb7 17. Bxc4 bxc4 18. Bxh6 Bxc6 19. Bg5 Ne5 20.
Qxf6 Bxe4 21. h5 Rxh5 22. Rxh5 Ng6 23. Qxg6 fxg6 24. Rh4 Bxg2 25. Rh6 Bxh6
26. Bxh6 Bh3 27. Bg7 Bg4 28. Bf6 Kf7 29. Bh8 Rxh8 30. Kf2 Rh2+ 31. Kg3 a3
32. Kxh2 Kf6 33. Kg3 Kf5 34. Re1 e5 35. Rxe5+ Kxe5 36. Kxg4 Kf6 37. Kf4 g5+
38. Ke4 Ke6 39. Kd4 g4 40. Kxc4 g3 41. Kb5 g2 42. c4 g1=Q 43. c5 Qb1+ 44.
Ka5 Kd7 45. Ka4 Qb2 46. Ka5 Kc6 47. Ka6 Kxc5 48. Ka7 Qxa2 49. Ka6 Qb1 50.
Ka7 Qa2 51. Ka8 Qb1 52. Ka7 a2 53. Ka8 a1=Q+
[/pgn]

It is starting to understand chess :). Still, a long, long ways to go of course.

The weights are available for download from https://github.com/glinscott/lczero-wei ... _64.txt.gz if you want to try at home. It has working UCI support, so it could even play against other engines now!
It's working!! The first dozen moves are not random anymore! The proof is that White's dark-squared bishop retreated twice when threatened by Black's h6 and g5 pawn moves, and likewise the knight on f3 moved to e5 when threatened by g4.

Interestingly, the remaining moves go back to random, proving that the learning concentrates on the opening phase first and then moves forward, meaning it may master the endgame last.

Are you sure? When I saw this game it looked like the opening was fed,
and once the given opening was over the blunderfest started...

gladius · Post by **gladius** » Mon Jan 15, 2018 9:45 pm

Guenther wrote:
Jhoravi wrote:
gladius wrote:Training from SF self-play games seems to be working well. Here is a self-play game on the latest network:
[pgn]
1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. Nc3 c6 5. Bg5 h6 6. Bh4 dxc4 7. e4 g5 8.
Bg3 b5 9. h4 g4 10. Ne5 a5 11. f4 gxf3 12. Qxf3 a4 13. Bf4 Qxd4 14. Rc1 Nbd7
15. Nxc6 Qxc3+ 16. bxc3 Bb7 17. Bxc4 bxc4 18. Bxh6 Bxc6 19. Bg5 Ne5 20.
Qxf6 Bxe4 21. h5 Rxh5 22. Rxh5 Ng6 23. Qxg6 fxg6 24. Rh4 Bxg2 25. Rh6 Bxh6
26. Bxh6 Bh3 27. Bg7 Bg4 28. Bf6 Kf7 29. Bh8 Rxh8 30. Kf2 Rh2+ 31. Kg3 a3
32. Kxh2 Kf6 33. Kg3 Kf5 34. Re1 e5 35. Rxe5+ Kxe5 36. Kxg4 Kf6 37. Kf4 g5+
38. Ke4 Ke6 39. Kd4 g4 40. Kxc4 g3 41. Kb5 g2 42. c4 g1=Q 43. c5 Qb1+ 44.
Ka5 Kd7 45. Ka4 Qb2 46. Ka5 Kc6 47. Ka6 Kxc5 48. Ka7 Qxa2 49. Ka6 Qb1 50.
Ka7 Qa2 51. Ka8 Qb1 52. Ka7 a2 53. Ka8 a1=Q+
[/pgn]

It is starting to understand chess :). Still, a long, long ways to go of course.

The weights are available for download from https://github.com/glinscott/lczero-wei ... _64.txt.gz if you want to try at home. It has working UCI support, so it could even play against other engines now!
It's working!! The first dozen moves are not random anymore! The proof is that White's dark-squared bishop retreated twice when threatened by Black's h6 and g5 pawn moves, and likewise the knight on f3 moved to e5 when threatened by g4.

Interestingly, the remaining moves go back to random, proving that the learning concentrates on the opening phase first and then moves forward, meaning it may master the endgame last.
Are you sure? When I saw this game it looked like the opening was fed,
and once the given opening was over the blunderfest started...

The opening wasn’t fed, the network had learned it. That’s what I mean when I say it’s starting to understand chess. No opening books used :)

Guenther · Post by **Guenther** » Mon Jan 15, 2018 9:48 pm

gladius wrote:
Guenther wrote: Are you sure? When I saw this game it looked like the opening was fed,
and once the given opening was over the blunderfest started...
The opening wasn’t fed, the network had learned it. That’s what I mean when I say it’s starting to understand chess. No opening books used :)

Ok, thanks for the confirmation.

gladius · Post by **gladius** » Mon Jan 15, 2018 9:56 pm

And to show that the training process is actually increasing strength, the version that trained overnight is now beating the previous best 49-13-38, or ~131 elo.
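The ~131 Elo figure can be reproduced from the match score with the standard logistic Elo formula. The sketch below assumes that 49-13-38 is given in wins-losses-draws order (the order that reproduces the quoted number); the function name `elo_diff` is only illustrative and is not part of lczero.

```python
import math


def elo_diff(wins: int, losses: int, draws: int) -> float:
    """Estimate the Elo difference implied by a match result.

    Uses the standard logistic Elo model: a score fraction s
    corresponds to a rating difference of 400 * log10(s / (1 - s)).
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    # A draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))


# Assumed reading of the result above: 49 wins, 13 losses, 38 draws.
print(round(elo_diff(49, 13, 38)))  # 131
```

A 68% score over 100 games carries a wide error margin, so the estimate is only a rough indication of progress.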

Here is an example game, with the new network as black.
[pgn]
[Event "?"]
[Site "?"]
[Date "2018.01.15"]
[Round "8"]
[White "lc_base"]
[Black "lc_new"]
[Result "0-1"]
[ECO "D15"]
[Opening "QGD Slav"]
[PlyCount "90"]
[TimeControl "inf"]
[Variation "4.Nc3"]

1. d4 d5 2. c4 c6 3. Nf3 Nf6 4. Nc3 e6 5. e3 Nbd7 6. Bd3 dxc4 7. Bxc4 b5 8. Bd3
Bb7 9. e4 b4 10. Na4 c5 11. e5 Nd5 12. h4 h6 13. h5 cxd4 14. Bg6 fxg6 15. hxg6
Be7 16. Bd2 Qa5 17. a3 O-O 18. Bxh6 gxh6 19. axb4 Qxb4+ 20. Nc3 a5 21. Rxh6 Kg7
22. Rh5 a4 23. Qxa4 Rxa4 24. Rxa4 Ra8 25. Rxb4 Bxb4 26. Nxd4 Nxe5 27. Rg5 Nxc3
28. bxc3 Bxc3+ 29. Ke2 Bxd4 30. f3 Nxg6 31. g3 Ra2+ 32. Kd3 Bf6 33. Rh5 Bh4
34. g4 Bxf3 35. g5 Nf4+ 36. Ke3 Nxh5 37. Kxf3 Bxg5 38. Kg4 Kg6 39. Kf3 Kg7
40. Kg4 Kg6 41. Kf3 Nf4 42. Kg4 e5 43. Kf3 Kf5 44. Kg3 Ke4 45. Kg4
Rg2# {Black mates} 0-1
[/pgn]

Rebel · Post by **Rebel** » Tue Jan 16, 2018 1:01 am

I am waiting for the Steinitz gambit to pop up as best for white

Gian-Carlo Pascutto · Post by **Gian-Carlo Pascutto** » Tue Jan 16, 2018 10:52 am

Guenther wrote: Ok, thanks for the confirmation.

To clarify more: the current training data is a big set of SF-SF games where a book was used, in order to be able to debug the training and the search.

It seems the network can already "remember" many book lines from its training. It did not invent them from self-play. But it does not have a book itself either.

pferd · Post by **pferd** » Tue Jan 16, 2018 3:47 pm

This seems like a very interesting project.

I am playing some 5 minute games against it right now and it moves instantly every single time. Is this the expected behaviour?

Uri Blass · Post by **Uri Blass** » Tue Jan 16, 2018 6:04 pm

pferd wrote:This seems like a very interesting project.

I am playing some 5 minute games against it right now and it moves instantly every single time. Is this the expected behaviour?

Looking at the games, it seems that something is wrong.

Google reached a level above GM strength.
I understand that they had a hardware advantage, but still
I think that it should be possible to get to at least a 2000 level by similar methods, and the level that I see, with simple stupid mistakes that lose pieces, is not even close.

Beating the old version by a result that is not close to 100-0 is too slow progress.

Google got something that gets better if you give it more time.
If I understand correctly, lczero does not get better with more time, because it plays immediately regardless of the time control.

Eelco de Groot · Post by **Eelco de Groot** » Tue Jan 16, 2018 6:53 pm

Uri Blass wrote:
pferd wrote:This seems like a very interesting project.

I am playing some 5 minute games against it right now and it moves instantly every single time. Is this the expected behaviour?
Looking at the games, it seems that something is wrong.

Google reached a level above GM strength.
I understand that they had a hardware advantage, but still
I think that it should be possible to get to at least a 2000 level by similar methods, and the level that I see, with simple stupid mistakes that lose pieces, is not even close.

Beating the old version by a result that is not close to 100-0 is too slow progress.

Google got something that gets better if you give it more time.
If I understand correctly, lczero does not get better with more time, because it plays immediately regardless of the time control.

I think there is no search yet. That would explain why the program moves instantly. The eval is in theory as strong as a quiescence search (just my guess), but even for Alpha Zero that is not yet GM level, I suppose. And the neural network is much, much smaller, because the community does not have the massive resources, especially floating-point hardware, that Google threw at this. It simply is not available yet for us simple amateurs.

If lczero had learned to play 2. c4 in Indian openings or other d4 openings, totally from scratch without even knowing some piece values, that would have been a miracle. Nothing short of that. I almost believed it, though, but what Gian-Carlo is saying explains it better and is much, much more realistic.

lczero is obviously learning.

You cannot expect it to duplicate Alpha Zero. Somebody said it would take 1700 years on his slow laptop to duplicate Alpha Zero. (If I remember that right; I should look it up.) And there aren't even learning clients yet, at least that is what Gary said a few days ago.

gladius · Post by **gladius** » Tue Jan 16, 2018 7:24 pm

pferd wrote:This seems like a very interesting project.

I am playing some 5 minute games against it right now and it moves instantly every single time. Is this the expected behaviour?

This is because the default number of playouts is set to 800, and there is no time management.

You can increase the number of playouts by passing the -p argument on the command line, like -p20000, and the engine will think for longer, and play stronger.
