
LCZero is learning

Posted: Tue Jan 30, 2018 7:43 pm
by gladius
Currently, we are using supervised learning (training on expert games) to ensure there are no major bugs in the code. Next up will be the self-play learning phase, where we start from scratch. It should be fun seeing it learn to play chess :).

With the supervised network, we are starting to see some good chess! Here is a nice attacking game against GnuChess (LCZero as white):
[pgn]
1. e4 Nc6 2. Nf3 Nf6 3. Nc3 d5 4. exd5 Nxd5 5. Bb5 Nf4 6. O-O Bf5 7. d4 Nd5 8. Ne5 Qd6 9. Nxd5 Qxd5 10. Bxc6+ bxc6 11. c4 Qd6 12. Qf3 g6 13. Nxc6 Bg7 14. Bf4 Qe6 15. Rfe1 Qxc4 16. Rxe7+ Kf8 17. Rae1 Kg8 18. b3 Qc2 19. Qd5 Be6 20. R7xe6 fxe6 21. Qxe6+ Kf8 22. Bh6 Bxh6 23. Qf6+ Kg8 24. Ne7#
[/pgn]

You can download the weights here: https://github.com/glinscott/leela-ches ... bb-net.zip and run it against other engines if you want, as there is UCI support. There aren't any Windows builds currently, but it should be fairly straightforward to compile on Windows.
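For anyone who wants to script games against other engines, here is a rough sketch of the UCI exchange that the "UCI support" above refers to. It only uses standard UCI commands (uci, isready, position, go, quit); the helper names and the engine path passed to ask_engine are hypothetical, and nothing here is taken from the LCZero code itself.

```python
import subprocess


def position_command(moves):
    """Build the UCI 'position' command from a list of long-algebraic moves."""
    if moves:
        return "position startpos moves " + " ".join(moves)
    return "position startpos"


def go_command(movetime_ms):
    """Ask the engine to think for a fixed number of milliseconds."""
    return f"go movetime {movetime_ms}"


def parse_bestmove(line):
    """Extract the move from a 'bestmove e2e4 [ponder e7e5]' reply, else None."""
    parts = line.split()
    if len(parts) >= 2 and parts[0] == "bestmove":
        return parts[1]
    return None


def ask_engine(engine_path, moves, movetime_ms=1000):
    """Start a UCI engine (engine_path is a placeholder) and request one move."""
    with subprocess.Popen(
        [engine_path],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,
    ) as proc:

        def send(command):
            proc.stdin.write(command + "\n")
            proc.stdin.flush()

        def wait_for(prefix):
            # Read engine output until a line starts with the expected token.
            for line in proc.stdout:
                if line.startswith(prefix):
                    return line.strip()
            return None  # engine closed its output before answering

        send("uci")
        wait_for("uciok")
        send("isready")
        wait_for("readyok")
        send(position_command(moves))
        send(go_command(movetime_ms))
        reply = wait_for("bestmove")
        if proc.poll() is None:
            send("quit")
    return parse_bestmove(reply) if reply else None
```

In practice a GUI or a tournament manager handles this exchange, so the sketch is mainly useful for seeing what the engine is expected to answer.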

Re: LCZero is learning

Posted: Tue Jan 30, 2018 7:52 pm
by Rodolfo Leoni
It's great! And a nice game too, so it looks like everyone who can participate in the learning process will have a lot of fun. As you already know, I like the project and I hope I can contribute CPU time. Waiting... :D

Re: LCZero is learning

Posted: Tue Jan 30, 2018 8:01 pm
by AlvaroBegue
Can you describe what kind of search is being used here?

Re: LCZero is learning

Posted: Tue Jan 30, 2018 8:16 pm
by gladius
AlvaroBegue wrote:Can you describe what kind of search is being used here?
It's a standard UCT search, as described in the AlphaGo paper.

Nothing chess-specific in it, which is the fun part :).
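To make the search description a bit more concrete, here is a minimal sketch of the selection and backup steps, assuming the prior-weighted UCT variant (often called PUCT) used in the AlphaGo papers. The Node class, the function names, and the c_puct value of 1.5 are illustrative assumptions, not LCZero's actual code or settings.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float              # P(s, a): move probability from the policy network
    visits: int = 0           # N(s, a): how often this move was explored
    value_sum: float = 0.0    # W(s, a): summed values, seen by the player who made the move
    children: dict = field(default_factory=dict)  # move -> Node

    def q(self):
        """Mean value Q(s, a); unvisited moves count as 0 (an assumption)."""
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node, c_puct=1.5):
    """Return the (move, child) pair maximising Q + U.

    U = c_puct * P * sqrt(sum of sibling visits) / (1 + N)
    """
    total_visits = sum(child.visits for child in node.children.values())

    def score(child):
        exploration = c_puct * child.prior * math.sqrt(total_visits) / (1 + child.visits)
        return child.q() + exploration

    return max(node.children.items(), key=lambda item: score(item[1]))


def backup(path, value):
    """Propagate a leaf evaluation back up the visited path.

    `value` is from the point of view of the player who moved into the
    last node of `path`; the sign flips at every ply on the way up.
    """
    for node in reversed(path):
        node.visits += 1
        node.value_sum += value
        value = -value
```

Note that nothing in the sketch knows about chess: the moves are just dictionary keys, and the priors and leaf values would come from the network.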

Re: LCZero is learning

Posted: Tue Jan 30, 2018 9:05 pm
by CMCanavessi
Can it learn only by playing itself, or can it also learn while playing others?

Let's say I enter LCZero in a tournament without any previous training. After the tournament has finished, will it have learned from those games and use what it learned, or is the "learning mode" completely separate from the "playing mode"?

Re: LCZero is learning

Posted: Tue Jan 30, 2018 10:04 pm
by Daniel Shawul
Good progress, policy network seems to avoid blunders now.

A quick game:

[pgn]
[Event "Tour13"]
[Site "daniel-Satellite-C855"]
[Date "2018.01.30"]
[Round "1"]
[White "ScorpioMCTS"]
[Black "Lczero"]
[Result "1-0"]
[BlackElo "2200"]
[Time "13:17:03"]
[WhiteElo "2200"]
[TimeControl "240+4"]
[SetUp "1"]
[FEN "rq2kb1r/p2p1ppp/1pb1pn2/8/2P1P3/P1N5/1PQ1BPPP/R1B1K2R w KQkq - 0 1"]
[Termination "adjudication"]
[PlyCount "46"]
[WhiteType "program"]
[BlackType "program"]

1. Be3 {(Bc1-e3 Bf8-d6 Ke1-c1 Ke8-g8 h2-h3 Bd6-e5 Be2-f3 Be5xc3 Qc2xc3
Bc6xe4) -0.08/17 18} Bc5 2. Bxc5 {(Be3xc5 b6xc5 Be2-f3 Qb8-c7 Ke1-g1 Ke8-g8
b2-b4 d7-d6) -0.08/17 9} bxc5 3. Bf3 {(Be2-f3 Qb8-e5 h2-h3 Ke8-g8 Ke1-g1
d7-d6 Rf1-e1 Ra8-b8) -0.07/15 9} g5 4. h3 {(h2-h3 Ke8-g8 Ra1-d1 Qb8-f4
Ke1-g1 d7-d6 b2-b4 Ra8-b8 b4xc5) -0.12/16 9} h5 5. Rd1 {(Ra1-d1 Ke8-g8
g2-g3 Qb8-c7 _a1-a1 Ra8-c8 Ke1-g1 d7-d6) -0.12/15 9} g4 6. hxg4 {(h3xg4
Qb8-e5 g4xh5 d7-d5 c4xd5 e6xd5 Ke1-g1 d5-d4 Nc3-d5) -0.26/13 9} h4 7. g5
{(g4-g5 Nf6-g8 g2-g3 Ng8-e7 b2-b3 h4-h3 Bf3-g4 d7-d6 Bg4xh3) -0.40/15 9}
Qe5 8. gxf6 {(g5xf6 Qe5xf6 Nc3-b5 Ra8-b8 Nb5-c7 Ke8-d8 _a1-a1) -0.75/16 9}
Rb8 9. Bg4 {(Bf3-g4 Qe5xf6 f2-f3 h4-h3 g2-g3 Qf6-g5 Qc2-f2 d7-d6) -0.73/14
9} Qxf6 10. Qd2 {(Qc2-d2 Ke8-g8 Ke1-g1 Qf6-g6 Bg4-f3 _a1-a1 Rd1-e1 h4-h3)
-0.70/15 9} Rg8 11. Bf3 {(Bg4-f3 Rg8-g6 Qd2-c2 Rg6-g8 b2-b3 Qf6-f4 Nc3-e2)
-0.70/13 9} Rb3 12. Qc2 {(Qd2-c2 Rb3-b7 _a1-a1 Qf6-f4 b2-b3) -0.73/13 9}
Rb8 13. b3 {(b2-b3 Rb8-d8 _a1-a1 Qf6-f4 Qc2-e2 d7-d6) -0.72/14 9} Rg7 14.
Rd6 {(Rd1-d6 Qf6-f4 Rd6-d2 a7-a5 Rd2-d1) -0.72/15 9} Ke7 15. Rd2 {(Rd6-d2
Ke7-e8 _a1-a1 Ke8-f8 Rh1-h3 Qf6-f4) -0.70/16 9} d6 16. e5 {(e4-e5 Qf6xe5
Rd2-e2 Qe5-d4 Bf3xc6 Ke7-f8 Ke1-g1 Kf8-g8) -1.18/12 9} Qxe5+ 17. Re2
{(Rd2-e2 Bc6xf3 Re2xe5 d6xe5 g2xf3 Rb8-h8 Qc2-e4 f7-f6) -1.18/9 9} Bxf3 18.
Rxe5 {(Re2xe5 Bf3xg2 Nc3-d5 Bg2xd5 c4xd5 d6xe5 Qc2xc5 Ke7-f6 d5xe6 f7xe6
Rh1xh4 Rg7-g1 Ke1-d2 Rb8xb3 Qc5xa7 Rb3-b2) -0.76/17 9} dxe5 19. gxf3
{(g2xf3 Rg7-g8 Rh1xh4 Rg8-h8 Rh4-g4 Rh8-h1) -1.21/11 9} Rh8 20. Qe4
{(Qc2-e4 Ke7-f6 Rh1xh4 Rg7-g1 Ke1-e2 Rh8xh4) -1.22/10 9} f6 21. Qb7+
{(Qe4-b7 Ke7-f8 Qb7-a8 Kf8-f7 Qa8xa7 Kf7-f8 Qa7-b8 Kf8-f7 Qb8xh8 Rg7-g8
Qh8-h5) -2.20/13 9} Kf8 22. Qb8+ {(Qb7-b8 Kf8-f7 Qb8xa7 Kf7-g8 Qa7-a8
Kg8-f7 Qa8xh8 e5-e4) -2.26/14 9} Kf7 23. Qxa7+ {(Qb8xa7 Kf7-g8 Qa7-a8
Kg8-f7 Qa8xh8 Rg7-g8 Qh8xh4 f6-f5 Ke1-d2 Kf7-e8) -2.29/14 8} Kg8 {User
Adjudication} 1-0
[/pgn]
In the meantime, I also improved ScorpioMCTS with alpha-beta rollouts.

Good luck
Daniel

Re: LCZero is learning

Posted: Wed Jan 31, 2018 5:34 am
by gladius
Daniel Shawul wrote:Good progress, policy network seems to avoid blunders now.
Yes, definitely promising. Still a long way to go though.
Daniel Shawul wrote:In the meantime, I also improved ScorpioMCTS with alpha-beta rollouts.
Nice, yes, that looks like a very promising approach!

Re: LCZero ready for windows compiles..

Posted: Wed Jan 31, 2018 8:04 am
by supersharp77
gladius wrote:There aren't any windows compiles currently, but it should be fairly straightforward to compile on windows.
Yes, the project is looking very, very good... it just needs some Windows compiles.
w32, w64, and old Windows builds would be nice, then real testing can begin!
Just waiting... Thx AR :D :wink:

Re: LCZero is learning

Posted: Wed Jan 31, 2018 9:17 am
by Rebel
gladius wrote:Currently, we are using supervised learning (training off of expert games), to ensure there are no major bugs in the code. Next up will be the self-play learning phase where we start from scratch, should be fun seeing it learn to play chess :).

Fascinating.

Re: LCZero is learning

Posted: Wed Jan 31, 2018 10:40 am
by peter
Great, thanks
:!: