
Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Thu Mar 29, 2018 12:32 pm
by Laskos
Now, with this post of Carlos Canavessi http://www.talkchess.com/forum/viewtopi ... 9&start=44 and the release of CPU and GPU versions, I have something odd happening. Although my GPU is weak, probably some 10 times slower than the best GTX cards, I didn't expect the GPU engine to perform so badly compared to the CPU-only engine. In other GPU-intensive tasks, it helps greatly over my Haswell four-core i7 CPU + integrated GPU. Is LCZero using the GPU efficiently?

Old LCZero shows the following:

Code:

Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.75
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   GeForce GT 730
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 388.13
Device speed:  1400 MHz
Device cores:  2 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GT 730
with OpenCL 1.2 capability.
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
BLAS Core: Haswell

New GPU LCZero shows the following (identical):

Code:

Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.75
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   GeForce GT 730
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 388.13
Device speed:  1400 MHz
Device cores:  2 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GT 730
with OpenCL 1.2 capability.
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
BLAS Core: Haswell

New CPU version shows:

Code:

Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
BLAS Core: Haswell


Using the old LCZero (actually the one from 1-2 days ago), and new CPU and GPU Windows binaries, I got the following:

Code:

Games Completed = 300 of 300 (Avg game length = 97.535 sec)
Settings = RR/64MB/1000ms per move/M 500cp for 3 moves, D 140 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
Time = 8427 sec elapsed, 0 sec remaining
 1.  LCZero Old               	53.0/200	45-139-16  	(L: m=139 t=0 i=0 a=0)	(D: r=6 i=9 f=1 s=0 a=0)	(tpm=972.7 d=11.29 nps=138)
 2.  LCZero GPU               	54.0/200	45-137-18  	(L: m=137 t=0 i=0 a=0)	(D: r=9 i=8 f=1 s=0 a=0)	(tpm=972.9 d=11.29 nps=98)
 3.  LCZero CPU               	193.0/200	190-4-6  	(L: m=4 t=0 i=0 a=0)	(D: r=3 i=3 f=0 s=0 a=0)	(tpm=960.7 d=14.66 nps=498)





   # PLAYER        : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)
   1 LCZero CPU    :  387.5   81.1     193.0     200    96.5     100    
   2 LCZero GPU    : -192.1   51.5      54.0     200    27.0      54    
   3 LCZero Old    : -195.4   52.6      53.0     200    26.5     ---    

White advantage = 0.00 +/- 28.99
Draw rate (equal opponents) = 12.69 % +/- 2.87
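
The points and percentages in the tables above can be cross-checked from the raw result strings. The following is a minimal sketch, assuming the LittleBlitzer triples are in wins-losses-draws order (which is consistent with the reported points); the helper name `summarize` is hypothetical and not part of any tool used in this thread.

```python
def summarize(wld, nps):
    """Turn a 'wins-losses-draws' string into games, points and score percent."""
    wins, losses, draws = (int(part) for part in wld.split("-"))
    games = wins + losses + draws
    points = wins + 0.5 * draws          # a draw counts as half a point
    return {
        "games": games,
        "points": points,
        "percent": 100.0 * points / games,
        "nps": nps,
    }

# Result strings and nps values copied from the LittleBlitzer output above.
results = {
    "LCZero Old": summarize("45-139-16", 138),
    "LCZero GPU": summarize("45-137-18", 98),
    "LCZero CPU": summarize("190-4-6", 498),
}

# Ratio of average nodes per second between the CPU and GPU binaries.
nps_ratio = results["LCZero CPU"]["nps"] / results["LCZero GPU"]["nps"]

for name, row in results.items():
    print(f"{name}: {row['points']}/{row['games']} = {row['percent']:.1f}%")
print(f"CPU/GPU nps ratio: {nps_ratio:.2f}")
```

Under that reading, the computed points match the 53.0, 54.0 and 193.0 shown in the output, and the nps ratio between the CPU and GPU builds comes out at roughly 5.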

CPU LCZero has some 5 times higher NPS than GPU LCZero, and is much stronger. The old and the new Windows GPU versions seem to perform comparably. With the old LCZero, I got:

LCZero Old versus Zurichess Appenzeller (Elo 1821 CCRL):
1s/move:
9.0 - 91.0



Now, with the CPU version, I get a much better result:

LCZero CPU (one core) versus Zurichess Appenzeller (Elo 1821 CCRL):
1s/move:
33.0 - 67.0

This gives a rating of about 1700 CCRL Elo points for the CPU version. I have to revise my plot for progress and extrapolation, using the CPU version, which is stronger on my PC. From the 27th to the 28th of March, LCZero advanced very little; maybe it is a sign of plateauing? If not, the extrapolation on one core (for the CPU version on my Haswell) would look as follows:

Image
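
The rating of about 1700 quoted above can be reproduced from the match score with the standard logistic Elo formula. This is a minimal sketch, assuming 100 games per match (inferred from the scores summing to 100) and treating the 1821 CCRL rating of Zurichess Appenzeller as exact; the function names are hypothetical.

```python
import math

def elo_difference(score_fraction):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

def estimated_rating(opponent_rating, points, games):
    """Performance rating against a single opponent of known rating."""
    return opponent_rating + elo_difference(points / games)

ZURICHESS_CCRL = 1821  # rating quoted in the post

# Match scores from the post; 100 games per match is an assumption.
cpu_rating = estimated_rating(ZURICHESS_CCRL, 33.0, 100)
old_rating = estimated_rating(ZURICHESS_CCRL, 9.0, 100)

print(f"LCZero CPU: {cpu_rating:.0f}")
print(f"LCZero Old: {old_rating:.0f}")
```

With these inputs the CPU version lands at about 1698 and the old GPU build at about 1419. Both figures carry a wide error margin at this number of games, so they are rough estimates only.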

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Thu Mar 29, 2018 12:42 pm
by Werewolf
When you analyse in Arena, how many rollouts / sec are you getting with the CPU and GPU versions, please?

I'm getting 2000 - 5000 on GPU

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Thu Mar 29, 2018 1:02 pm
by Laskos
Werewolf wrote:When you analyse in Arena, how many rollouts / sec are you getting with the CPU and GPU versions, please?

I'm getting 2000 - 5000 on GPU
Interesting. I am getting about 150 on GPU and 600 on CPU.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Thu Mar 29, 2018 1:05 pm
by Werewolf
Laskos wrote:
Werewolf wrote:When you analyse in Arena, how many rollouts / sec are you getting with the CPU and GPU versions, please?

I'm getting 2000 - 5000 on GPU
Interesting. I am getting about 150 on GPU and 600 on CPU.

That does seem low for the GPU, though it might just be your card. The Titan V is at least 3x faster than my card, which is eye-opening.

By the way, do you know why there is such a big regression on this site from time to time?

http://lczero.org/networks

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Thu Mar 29, 2018 1:13 pm
by Jhoravi
Werewolf wrote:
By the way, do you know why there is such big regression on this site from time to time?

http://lczero.org/networks
It was caused by a bug discussed here.
https://groups.google.com/forum/#!topic ... vf9p3biLRk
Now it's fixed. :D

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Thu Mar 29, 2018 1:16 pm
by Werewolf
Amazing. I'll wait for the training before doing tests on the new one.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Fri Mar 30, 2018 11:22 am
by petero2
duncan wrote:
Laskos wrote:
duncan wrote: Do you know the Elo of the strongest AI chess software today?
http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Look at 1-core performances, as LCZero is on one core too. Stockfish 9 on one core is about 3500 CCRL Elo.
I meant software which uses neural networks. Is it Giraffe?
I think Texel Gi is the strongest that can run without a GPU/TPU. In my tests it is around 2820 on the CCRL 40/4 rating scale, so about 400 Elo above Giraffe.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Fri Mar 30, 2018 7:14 pm
by Milos
Laskos wrote:Now, with this post of Carlos Canavessi http://www.talkchess.com/forum/viewtopi ... 9&start=44 and the release of CPU and GPU versions, I have something odd happening. Although my GPU is weak, probably some 10 times slower than the best GTX cards, I didn't expect the GPU engine to perform so badly compared to the CPU-only engine. In other GPU-intensive tasks, it helps greatly over my Haswell four-core i7 CPU + integrated GPU. Is LCZero using the GPU efficiently?

This is totally normal since your GPU is really crap (pardon my French :)).
The GT 730 is like 15 times slower than a 1080 in terms of dot product ops, and the 1080 is just slightly faster than a 5960X with all 8 cores, so I guess a 4-core Haswell (4770K, is it?) being 5 times faster than the 730 is really spot on ;).
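
The back-of-the-envelope reasoning above can be written out explicitly. This is only a sketch under loudly stated assumptions: throughput is taken to scale linearly with core count, a Haswell core in the 4-core CPU is taken to match a 5960X core, and the "slightly faster" margin of the 1080 over the 8-core 5960X is a free parameter, since no exact figure is given. The function name is hypothetical.

```python
def cpu_over_gt730(gt730_slowdown_vs_1080, gtx1080_over_5960x,
                   cpu_cores=4, reference_cores=8):
    """Estimate how many times faster a 4-core CPU is than the GT 730.

    gt730_slowdown_vs_1080: how many times slower the GT 730 is than a 1080.
    gtx1080_over_5960x:     speed of the 1080 relative to the 8-core 5960X
                            (1.0 means equal; an assumed value, not measured).
    Assumes linear scaling with cores and equal per-core throughput.
    """
    return (gt730_slowdown_vs_1080 * cpu_cores) / (reference_cores * gtx1080_over_5960x)

# The 15x figure is from the post; the 1080-vs-5960X margins are assumptions.
for margin in (1.0, 1.2, 1.5):
    ratio = cpu_over_gt730(15, margin)
    print(f"1080 = {margin:.1f}x 5960X -> 4-core CPU ~{ratio:.2f}x GT 730")
```

Depending on the assumed margin, the estimate falls between roughly 5x and 7.5x, which is the same ballpark as the measured nps gap between the CPU and GPU builds.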

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Sat Mar 31, 2018 7:01 am
by lkaufman
petero2 wrote:
duncan wrote:
Laskos wrote:
duncan wrote: Do you know the Elo of the strongest AI chess software today?
http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Look at 1-core performances, as LCZero is on one core too. Stockfish 9 on one core is about 3500 CCRL Elo.
I meant software which uses neural networks. Is it Giraffe?
I think Texel Gi is the strongest that can run without a GPU/TPU. In my tests it is around 2820 on the CCRL 40/4 rating scale, so about 400 Elo above Giraffe.
I'd like to ask a somewhat different question: what is the Elo of the strongest non-private chess software (using CPU, not GPU) today that uses Monte-Carlo Tree Search, regardless of its evaluation function?

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Sat Mar 31, 2018 7:12 am
by CMCanavessi
lkaufman wrote:
petero2 wrote:
duncan wrote:
Laskos wrote:
duncan wrote: Do you know the Elo of the strongest AI chess software today?
http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Look at 1-core performances, as LCZero is on one core too. Stockfish 9 on one core is about 3500 CCRL Elo.
I meant software which uses neural networks. Is it Giraffe?
I think Texel Gi is the strongest that can run without a GPU/TPU. In my tests it is around 2820 on the CCRL 40/4 rating scale, so about 400 Elo above Giraffe.
I'd like to ask a somewhat different question: what is the Elo of the strongest non-private chess software (using CPU, not GPU) today that uses Monte-Carlo Tree Search, regardless of its evaluation function?
Scorpio maybe?