Now, with this post by Carlos Canavessi
http://www.talkchess.com/forum/viewtopi ... 9&start=44 and the release of CPU and GPU versions, I see something odd happening. Although my GPU is weak, probably some 10 times slower than the best GTX cards, I didn't expect the GPU engine to perform so badly compared to the CPU-only engine. In other GPU-intensive tasks, it helps greatly over my four-core Haswell i7 CPU + integrated GPU. Is LCZero using the GPU efficiently?
Old LCZero shows the following:
Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.75
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 0
Device name: GeForce GT 730
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 388.13
Device speed: 1400 MHz
Device cores: 2 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GT 730
with OpenCL 1.2 capability.
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
BLAS Core: Haswell
New GPU LCZero shows the following (identical):
Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.75
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 0
Device name: GeForce GT 730
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 388.13
Device speed: 1400 MHz
Device cores: 2 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GT 730
with OpenCL 1.2 capability.
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
BLAS Core: Haswell
New CPU version shows:
Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
BLAS Core: Haswell
Using the old LCZero (actually the one from 1-2 days ago), and new CPU and GPU Windows binaries, I got the following:
Games Completed = 300 of 300 (Avg game length = 97.535 sec)
Settings = RR/64MB/1000ms per move/M 500cp for 3 moves, D 140 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
Time = 8427 sec elapsed, 0 sec remaining
1. LCZero Old 53.0/200 45-139-16 (L: m=139 t=0 i=0 a=0) (D: r=6 i=9 f=1 s=0 a=0) (tpm=972.7 d=11.29 nps=138)
2. LCZero GPU 54.0/200 45-137-18 (L: m=137 t=0 i=0 a=0) (D: r=9 i=8 f=1 s=0 a=0) (tpm=972.9 d=11.29 nps=98)
3. LCZero CPU 193.0/200 190-4-6 (L: m=4 t=0 i=0 a=0) (D: r=3 i=3 f=0 s=0 a=0) (tpm=960.7 d=14.66 nps=498)
   # PLAYER        :  RATING  ERROR  POINTS  PLAYED   (%)  CFS(next)
   1 LCZero CPU    :   387.5   81.1   193.0     200  96.5        100
   2 LCZero GPU    :  -192.1   51.5    54.0     200  27.0         54
   3 LCZero Old    :  -195.4   52.6    53.0     200  26.5        ---
White advantage = 0.00 +/- 28.99
Draw rate (equal opponents) = 12.69 % +/- 2.87
The CPU LCZero has about 5 times the NPS of the GPU LCZero (498 vs. 98) and is much stronger. The old and the new Windows GPU versions seem to perform comparably. With the old LCZero, I got:
LCZero Old versus Zurichess Appenzeller (Elo 1821 CCRL):
1s/move:
9.0 - 91.0
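For anyone who wants to check how a match score translates into a rating, here is a minimal sketch using the standard logistic Elo formula. The function names are my own for illustration, and I assume the match was 100 games (9.0 + 91.0 points) and that the CCRL rating of the opponent can be used directly as the reference; the actual CCRL conditions differ from my setup, so treat the output as a rough estimate only.

```python
import math

def elo_diff(score_fraction):
    """Elo difference implied by a score fraction (0 < s < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

def performance_rating(points, games, opponent_elo):
    """Rough rating estimate from a match against one opponent of known rating."""
    return opponent_elo + elo_diff(points / games)

# Old LCZero vs Zurichess Appenzeller (Elo 1821 CCRL), 9.0 - 91.0
old_estimate = performance_rating(9.0, 100, 1821)
print(round(old_estimate))
```

With 9.0 points out of 100 this lands roughly 400 Elo below the opponent. The same function can be applied to any other match result against a rated opponent; it breaks down for scores of 0% or 100%, where the logarithm is undefined.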
Now, with the CPU version, I get a much better result:
LCZero CPU (one core) versus Zurichess Appenzeller (Elo 1821 CCRL):
1s/move:
33.0 - 67.0
This gives a rating of about 1700 CCRL Elo for the CPU version. I have to revise my plot of progress and extrapolation, using the CPU version, which is stronger on my PC. From the 27th to the 28th of March, LCZero advanced very little; maybe it is a sign of plateauing? If not, the extrapolation on one core (for the CPU version on my Haswell) would look as follows: