LCZero update (2)
Moderators: hgm, Rebel, chrisw
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: LCZero update
A fixed number of playouts is the same as using a fixed number of nodes or depth, so there shouldn't be any difference in strength between CPU and GPU with that setup.
-
- Posts: 291
- Joined: Wed May 08, 2013 6:49 am
Re: LCZero update
ID55 is the new update on http://play.lczero.org/ with a 100 Elo jump. It preferred the King's Indian Defense in slow mode!
[pgn]1. d4 Nf6
2. c4 g6
3. Nc3 Bg7
4. e4 d6
5. Nf3 c6
6. h3 O-O
7. Bg5 h6
8. Be3 Nbd7
9. Qd2 e5
10. d5 Nc5
11. Qc2 Qe7
12. g4 Bd7
13. g5 hxg5
14. Bxg5 Rae8
15. Be2 a5
16. O-O-O a4
17. Rdg1 Kh8
18. h4 Qd8
19. h5 gxh5
20. Rxh5+ Kg8
21. Bxf6 Qxf6
22. Rhg5 Qh6
23. Qd2 f5
24. Rxg7+ Qxg7
25. Rxg7+ Kxg7
26. Qg5+ Kh7
27. Qh5+ Kg7
28. Ng5 f4
29. Bg4 Bxg4
30. Qxg4 Kf6
31. Nh7+ Ke7
32. Qg5+ Kd7
33. Qg7+ Re7
34. Nxf8+ Ke8
35. Qg8 Nd3+
36. Kd2 Nxb2
37. Ne6+ Kd7
38. Qd8#[/pgn]
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: LCZero update
Daniel Shawul wrote: Fixed number of playouts is the same as using fixed number of nodes or depth, so there shouldn't be any difference in strength on CPU and GPU with that setup.
True. But there was doubt whether the GPU version was working correctly.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero update
CMCanavessi wrote: New official version released:
https://github.com/glinscott/leela-ches ... s/tag/v0.4
Finally includes a windows build with all the dlls, and a working windows CPU-Only build as well
Laskos wrote: I have a weak video card, but I didn't expect that:
http://www.talkchess.com/forum/viewtopi ... 45&start=5
CPU version is performing much better. Is LCZero using the GPU card properly?
Guenther wrote: I had a very different experience with the (finally) working cpu version.
Here it was around 4 times slower on one thread despite having a cheap
gpu card. May be I create exact numbers again. Currently I have already
deleted the cpu version after my measurement.
Joost Buijs wrote: Over here the CPU only version does about 400 n/s on a single core (Broadwell 3.8 GHz.), when I use my cheap GT-720 GPU with 192 Cuda cores this figure drops down to 250 n/s. On my GTX-1080Ti it runs at ~3500 n/s (when running 2 instances of the client).
I have the feeling that the new v4 client has problems uploading the games. This morning I let v4 run for some time, it produced about 30 games but only a few of them appear in the server statistics. Running the client with -debug doesn't give any extra information at all, so I really don't know what is going on.
Still intrigues me that using full CPU (4 cores), I can get speeds (NPS) achievable only with the best GPUs. Shouldn't top GPU be an order of magnitude faster than full CPU? On 4 cores, CPU version seems to be 1800+ CCRL Elo level. Gauntlet of games at 1s/move:
Games Completed = 40 of 40 (Avg game length = 72.851 sec)
Settings = Gauntlet/64MB/1000ms per move/M 500cp for 3 moves, D 140 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
Time = 3721 sec elapsed, 0 sec remaining
1. LCZero CPU 4 threads 22.0/40 19-15-6 (L: m=15 t=0 i=0 a=0) (D: r=6 i=0 f=0 s=0 a=0) (tpm=960.5 d=17.49 nps=3767)
2. Predateur 2.2.1 (1786) 10.0/20 10-10-0 (L: m=10 t=0 i=0 a=0) (D: r=0 i=0 f=0 s=0 a=0) (tpm=887.6 d=55.72 nps=3133895)
3. Zurichess Appenzeller (1821) 8.0/20 5-9-6 (L: m=9 t=0 i=0 a=0) (D: r=6 i=0 f=0 s=0 a=0) (tpm=23.0 d=4.49 nps=959911)
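As a rough cross-check of the "1800+" estimate, the gauntlet score can be converted to a rating difference with the standard logistic Elo formula. This is only a sketch: the function name is made up for illustration, the opponent ratings are the CCRL figures shown in the gauntlet table, averaging them is a simplification, and 40 games leave a wide error margin.

```python
import math

def elo_diff_from_score(points, games):
    """Rating difference implied by a score, using the logistic Elo model."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score must be strictly between 0% and 100%")
    return 400.0 * math.log10(p / (1.0 - p))

# LCZero CPU (4 threads) scored 22.0/40 in the gauntlet.
diff = elo_diff_from_score(22.0, 40)

# Simplification: treat both opponents as one opponent with the average rating.
opponent_avg = (1786 + 1821) / 2

print(f"Elo difference: {diff:+.0f}")
print(f"Estimated rating: {opponent_avg + diff:.0f}")
```

A 55% score works out to roughly +35 Elo over the opposition, which puts the estimate in the high 1830s and is consistent with the "1800+" figure.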
-
- Posts: 2801
- Joined: Mon Feb 11, 2008 3:53 pm
- Location: Denmark
- Full name: Damir Desevac
Re: LCZero update
Here is my game against the latest network.
I was White.
History:
1. d4 d5
2. Nf3 Nf6
3. Bf4 g6
4. e3 Bg7
5. c4 O-O
6. Nc3 c6
7. Bd3 Nbd7
8. Ne5 Nxe5
9. dxe5 Ng4
10. h4 Nxe5
11. h5 Nxd3+
12. Qxd3 e5
13. Bg3 e4
14. Qd2 Qg5
15. hxg6 fxg6
16. cxd5 cxd5
17. Qxd5+ Qxd5
18. Nxd5 Bf5
19. Ne7+ Kh8
20. Nxf5 Rxf5
21. O-O-O Rb5
22. b3 a5
23. Rh4 Re8
24. Rd7 Kg8
25. Bf4 Rf8
26. Bh6 Bxh6
27. Rxh6 Rf7
28. Rxf7 Kxf7
29. Rxh7+ Kf6
30. Kc2 Rf5
31. Rxb7 Rxf2+
32. Kc3 Rxg2
33. a4 g5
34. Rb5 Rh2
35. Rxa5 g4
36. Ra8 g3
37. Rg8 g2
38. a5 Kf7
39. Rg3 Ke6
40. Kb4 Kd5
41. Kb5 Kd6
42. b4 Kc7
43. Rg7+ Kc8
44. Kb6 Rh6+
45. Kb5 Rh2
46. Rg6 Kc7
47. Kc4 Kb8
48. Kd4 Kc8
49. Kxe4 Kc7
50. Kd5 Kd8
51. b5 Kc7
52. Rg7+ Kb8
53. e4 Ka8
54. b6 Kb8
55. a6 Ka8
56. Rg6 Rh5+
57. Kc6 Rh6
58. Rxh6 g1=Q
59. Rh8+ Qg8
60. Rxg8#
-
- Posts: 1563
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: LCZero update
Laskos wrote: Still intrigues me that using full CPU (4 cores), I can get speeds (NPS) achievable only with the best GPUs. Shouldn't top GPU be an order of magnitude faster than full CPU?
My expectation was that LCZero on GPU would run a lot faster than on CPU. On my i7-6950x (using 10 cores) the CPU version does ~2500 nps, my GTX-1080Ti does ~3500 nps, so not much difference at all.
I don't have experience with matrix multiplication on a GPU, but when I use the 1080Ti for 'public key encryption' it runs an order of magnitude faster than the 6950x and somehow I expected LCZero to perform in the same way. Maybe the OpenCL code is not optimal yet, and probably there are other things that can be optimized as well, the project is very new and my guess is that the code will mature over time.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: LCZero update
@HG, Ok got it.
Laskos wrote: Still intrigues me that using full CPU (4 cores), I can get speeds (NPS) achievable only with the best GPUs. Shouldn't top GPU be an order of magnitude faster than full CPU?
Joost Buijs wrote: My expectation was that LCZero on GPU would run a lot faster than on CPU. On my i7-6950x (using 10 cores) the CPU version does ~2500 nps, my GTX-1080Ti does ~3500 nps, so not much difference at all.
I don't have experience with matrix multiplication on a GPU, but when I use the 1080Ti for 'public key encryption' it runs an order of magnitude faster than the 6950x and somehow I expected LCZero to perform in the same way. Maybe the OpenCL code is not optimal yet, and probably there are other things that can be optimized as well, the project is very new and my guess is that the code will mature over time.
It is probably because matrix-matrix multiplication is memory-bound not compute-bound. If you don't do much computation per byte loaded, your speedup over the CPU (using all cores) is probably not going to go above 5-6X. Moreover DGEMM etc have been optimized for years for vector CPU machines so they are hard to beat.
Daniel
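The "computation per byte loaded" argument can be made concrete with a roofline-style estimate. The sketch below is illustrative only: the function names are invented here, the matrix shapes are arbitrary, and the peak-FLOPS and bandwidth figures are placeholders, not measurements of any card or CPU mentioned in this thread.

```python
def arithmetic_intensity(m, n, k, bytes_per_elem=4):
    """FLOPs per byte for C = A @ B with A (m x k) and B (k x n).

    Assumes each matrix crosses the memory bus exactly once (ideal caching).
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved

def attainable_flops(peak_flops, bandwidth_bytes_per_s, intensity):
    """Roofline model: limited either by compute or by memory traffic."""
    return min(peak_flops, bandwidth_bytes_per_s * intensity)

# Placeholder hardware figures, chosen only to illustrate the model.
PEAK_FLOPS = 10e12   # hypothetical 10 TFLOP/s
BANDWIDTH = 400e9    # hypothetical 400 GB/s

shapes = {
    "single position (1 x 1024 x 1024)": (1, 1024, 1024),
    "large batch (1024 x 1024 x 1024)": (1024, 1024, 1024),
}
for label, (m, n, k) in shapes.items():
    ai = arithmetic_intensity(m, n, k)
    frac = attainable_flops(PEAK_FLOPS, BANDWIDTH, ai) / PEAK_FLOPS
    print(f"{label}: {ai:.2f} FLOP/byte, {frac:.0%} of peak attainable")
```

Under these assumptions the single-position case does well under one FLOP per byte and is limited by bandwidth, while the large square case reaches the compute limit. Whether a given workload is memory-bound therefore depends heavily on how many positions are evaluated per batch.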
-
- Posts: 1563
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: LCZero update
Daniel Shawul wrote: It is probably because matrix-matrix multiplication is memory-bound not compute-bound. If you don't do much computation per byte loaded, your speedup over the CPU (using all cores) is probably not going to go above 5-6X. Moreover DGEMM etc have been optimized for years for vector CPU machines so they are hard to beat.
Daniel
You are right, but the performance seems to be lower than it can be.
I'm running under Windows, the project builds fine with MSVC-2017 but atm. I can't get it to work with the Intel compiler which is clearly a better compiler for FP work. I also would like to replace OpenBlas with Intel MKL, just because I'm curious to know which library performs better. When I have some time this weekend I will take a closer look at it.
-
- Posts: 1627
- Joined: Thu Mar 09, 2006 12:35 pm
Re: LCZero update
CMCanavessi wrote: New official version released:
https://github.com/glinscott/leela-ches ... s/tag/v0.4
Finally includes a windows build with all the dlls, and a working windows CPU-Only build as well
Perhaps the graph is starting to show a point of diminishing returns?
Is this somewhat worrisome for the project?
http://lczero.org/
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....
-
- Posts: 1627
- Joined: Thu Mar 09, 2006 12:35 pm
Re: LCZero update
CMCanavessi wrote: It IS working, you just don't know how to use it. You need to specify the network file with -w <file>
jpqy wrote: It's indeed working when explained well.. for using it into Cutechess you need to make a play.bat file then the engine get loaded..Thanks with the help from Aloril and other guys on LCZero chat!
Hi, can you give some complete instructions (1, 2, 3, 4, etc.) about that?
For example, what should the bat file contain, among other things?
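A minimal sketch of what such a play.bat might contain, based only on the `-w <file>` hint quoted in this post. The executable name `lczero.exe` and the weights file name `weights.txt` are placeholders, not names confirmed anywhere in this thread; substitute whatever the download actually contains. The Python helper only builds the file's text so it can be inspected before saving.

```python
def play_bat_text(engine_exe="lczero.exe", weights_file="weights.txt"):
    """Return the contents of a play.bat that starts the engine with a network file.

    Both default names are hypothetical placeholders.
    """
    lines = [
        "@echo off",
        # %~dp0 expands to the folder containing the .bat file, so the engine
        # and weights are found whatever the GUI's working directory is.
        'cd /d "%~dp0"',
        # -w <file> is the only engine option taken from the thread.
        f'"{engine_exe}" -w "{weights_file}"',
    ]
    return "\n".join(lines) + "\n"

print(play_bat_text())
```

Saving that text as play.bat in the same folder as the engine and the network file, then registering the .bat as the engine in Cutechess, appears to be the setup jpqy describes. Anything beyond the `-w` option is an assumption.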