JJJ wrote:
What do you mean by the full tune on it?
From the starting position, my Leela needed 1 min 26 s to reach depth 26. I don't know whether that is OK for my card or below par.
Anyway, my Leela is winning against Hakkapeliitta, so that's nice already.

Guenther wrote:
At first start, LCZero automatically tunes the settings to use with your GPU. This is, of course, only a standard tuning.
By doing a full tuning you can get a speed increase of up to perhaps 150-200% in some cases.
Delete the automatically created file named leelaz_opencl_tuning and start the process as described below.
Run something like this (adapt names/files) from the command line, or add it to a batch file on Windows.
This might run for some time, and you can see how each tried setting gets more GFLOPS out of your card.
Example for my very weak GPU, which must be retuned now for the new NN size (NN ID222 is now at 15*192):
Code:
lczero07.exe --tune-only --full-tuner -w ID222
Code:
C:\Engines\UCIPG\LCZero_07ID222>lczero07.exe --tune-only --full-tuner -w ID222
Using 2 thread(s).
Detecting residual layers...v2...192 channels...15 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.75
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 0
Device name: GeForce GT 710
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 388.13
Device speed: 954 MHz
Device cores: 1 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GT 710
with OpenCL 1.2 capability.
Started OpenCL SGEMM tuner.
RNG seed: 0xec7a02 (thread: 701728073)
Will try 5279 valid configurations.
(1/5279) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 ST ...

Albert Silver wrote:
Yes, here is what I got on my laptop:
Code:
C:\Users\Albert\Chess\Leela Zero\GPU>lczero.exe -t3 -w weights.txt --full-tuner
Using 3 thread(s).
Detecting residual layers...v2...192 channels...15 blocks.
Initializing OpenCL.
Device name: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
Device type: CPU
Device vendor: Intel(R) Corporation
Device driver: 7.6.0.611
Device speed: 2600 MHz
Device cores: 8 CU
Device score: 521
Platform version: OpenCL 1.2 CUDA 9.1.84
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 2
Device name: GeForce GTX 980M
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 391.35
Device speed: 1126 MHz
Device cores: 12 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GTX 980M
with OpenCL 1.2 capability.
Started OpenCL SGEMM tuner.
RNG seed: 0x65d47141 (thread: 2783254248)
Will try 5117 valid configurations.
(1/5117) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1067 ms (177.0 GFLOPS)
(6/5117) KWG=32 KWI=2 MDIMA=8 MDIMC=16 MWG=32 NDIMB=8 NDIMC=16 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0946 ms (199.5 GFLOPS)
(9/5117) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0894 ms (211.1 GFLOPS)
(79/5117) KWG=16 KWI=8 MDIMA=32 MDIMC=32 MWG=64 NDIMB=16 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0651 ms (289.9 GFLOPS)
(566/5117) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=1 STRN=0 VWM=2 VWN=2 0.0594 ms (317.6 GFLOPS)
(853/5117) KWG=16 KWI=2 MDIMA=8 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=1 VWM=2 VWN=2 0.0571 ms (330.4 GFLOPS)
(1276/5117) KWG=32 KWI=2 MDIMA=16 MDIMC=16 MWG=32 NDIMB=16 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0551 ms (342.8 GFLOPS)
(1278/5117) KWG=32 KWI=2 MDIMA=16 MDIMC=8 MWG=32 NDIMB=16 NDIMC=16 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0530 ms (356.2 GFLOPS)
(1306/5117) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0501 ms (377.0 GFLOPS)
(1348/5117) KWG=16 KWI=2 MDIMA=32 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=2 VWN=1 0.0484 ms (390.2 GFLOPS)
(1404/5117) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=16 NDIMC=16 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=4 VWN=1 0.0444 ms (424.8 GFLOPS)
(1504/5117) KWG=16 KWI=8 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=4 VWN=2 0.0424 ms (444.7 GFLOPS)
(1837/5117) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=1 STRN=0 VWM=4 VWN=2 0.0421 ms (447.9 GFLOPS)
(3906/5117) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=1 0.0399 ms (473.1 GFLOPS)
(3921/5117) KWG=16 KWI=2 MDIMA=32 MDIMC=32 MWG=64 NDIMB=16 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=1 0.0374 ms (504.1 GFLOPS)
(3942/5117) KWG=32 KWI=8 MDIMA=16 MDIMC=32 MWG=64 NDIMB=16 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=1 0.0348 ms (542.6 GFLOPS)
(4400/5117) KWG=32 KWI=8 MDIMA=8 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=1 STRN=0 VWM=2 VWN=2 0.0332 ms (568.9 GFLOPS)
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
BLAS Core: Haswell

Curious: when I try to run the full tune I get the message "could not open weights file : network".
Any ideas? I just renamed the id226 file to network.
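The full-tune procedure Guenther describes in the quote above (delete the cached tuning file, then run the tuner) could be scripted roughly as follows. This is only a sketch: the helper names and the idea of wrapping the steps in Python are made up for illustration. Only the flags --tune-only, --full-tuner and -w and the file name leelaz_opencl_tuning come from the thread, and the engine and weights names have to be adapted to your own setup.

```python
import subprocess
from pathlib import Path
from typing import List

TUNING_FILE = "leelaz_opencl_tuning"  # file name as given in the thread


def remove_cached_tuning(engine_dir: Path) -> bool:
    """Delete the automatically created tuning file; return True if one was removed."""
    cached = engine_dir / TUNING_FILE
    if cached.is_file():
        cached.unlink()
        return True
    return False


def build_tuner_command(engine: str, weights: str) -> List[str]:
    """Build the command line quoted in the thread for a given engine and weights file."""
    return [engine, "--tune-only", "--full-tuner", "-w", weights]


def run_full_tune(engine_dir: Path, engine: str, weights: str) -> int:
    """Hypothetical wrapper: clear the cached tuning, then run the tuner inside engine_dir."""
    remove_cached_tuning(engine_dir)
    cmd = build_tuner_command(str(engine_dir / engine), weights)
    # cwd=engine_dir so that a relative weights name is looked up next to the engine
    return subprocess.run(cmd, cwd=engine_dir).returncode
```

For Guenther's example this would be called as run_full_tune(Path(r"C:\Engines\UCIPG\LCZero_07ID222"), "lczero07.exe", "ID222").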
how good is a GeForce GTX 1060 6GB for Leela ?
-
- Posts: 1280
- Joined: Tue Aug 18, 2009 3:06 am
Re: how good is a GeForce GTX 1060 6GB for Leela ?
-
- Posts: 6340
- Joined: Mon Mar 13, 2006 2:34 pm
- Location: Acworth, GA
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Robert Flesher wrote:
Curious: when I try to run the full tune I get the message "could not open weights file : network".
Any ideas? I just renamed the id226 file to network.
I got that message once. It was because I forgot to extract the weights file from the weights_###.txt.gz package.
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
-
- Posts: 4606
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Robert Flesher wrote:
Curious: when I try to run the full tune I get the message "could not open weights file : network".
Any ideas? I just renamed the id226 file to network.
If you renamed it to network you must of course add -network to the command line.
Otherwise, check whether your system has added a .txt file extension, and make sure that you are able to see extensions.
(They should always be set to visible anyway, if you want to work with your computer.)
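The hidden-extension check suggested above can also be done without touching the file manager settings. The sketch below lists what is really in the engine folder and reports files that only look like the expected name. The function name and messages are invented for this example, and "network" is simply the name used in this thread.

```python
from pathlib import Path


def diagnose_weights_name(folder: Path, expected: str = "network") -> str:
    """Report whether `expected` exists exactly, or only with an extra extension."""
    exact, similar = [], []
    for entry in folder.iterdir():
        if not entry.is_file():
            continue
        name = entry.name.lower()
        if name == expected.lower():
            exact.append(entry.name)
        elif name.startswith(expected.lower() + "."):
            # e.g. network.txt or network.txt.gz
            similar.append(entry.name)
    if exact:
        return f"OK: found '{exact[0]}' exactly as named."
    if similar:
        return f"Not found as '{expected}', but similar files exist: {', '.join(sorted(similar))}"
    return f"No file named '{expected}' (with or without extension) in {folder}"
```

If this reports network.txt, either rename the file or pass the full name, extension included, after -w.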
-
- Posts: 4606
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: how good is a GeForce GTX 1060 6GB for Leela ?
AdminX wrote:
I got that message once. It was because I forgot to extract the weights file from the weights_###.txt.gz package.
Well, that has not been necessary for quite a while.
LCZero by now reads the compressed file directly too.
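If it is unclear whether a weights file on disk is still compressed or has already been extracted, the first two bytes settle it: gzip files start with the bytes 0x1f 0x8b. A minimal sketch using only the Python standard library follows; the function names are made up, and nothing here says anything about how LCZero itself detects the format.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip file


def is_gzip_file(path: str) -> bool:
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def read_first_line(path: str) -> str:
    """Read the first text line, transparently handling gzip-compressed files."""
    opener = gzip.open if is_gzip_file(path) else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        return fh.readline().rstrip("\n")
```

This is also a quick way to spot a file that was renamed but never extracted, or the other way round.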
-
- Posts: 5228
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Albert Silver wrote:
Yes, here is what I got on my laptop: [...]
Here are numbers from my GTX 750 Ti:
Code:
>lczero.exe --tune-only --full-tuner -w weights.txt
Using 2 thread(s).
Detecting residual layers...v2...192 channels...15 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.75
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 0
Device name: GeForce GTX 750 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 388.13
Device speed: 1110 MHz
Device cores: 5 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GTX 750 Ti
with OpenCL 1.2 capability.
Started OpenCL SGEMM tuner.
RNG seed: 0x41912f66 (thread: 886403780)
Will try 5128 valid configurations.
(1/5128) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=16 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1962 ms (96.2 GFLOPS)
(15/5128) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=16 NDIMB=16 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1960 ms (96.3 GFLOPS)
(20/5128) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=32 NDIMB=16 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1816 ms (104.0 GFLOPS)
(26/5128) KWG=16 KWI=2 MDIMA=32 MDIMC=8 MWG=32 NDIMB=16 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1613 ms (117.0 GFLOPS)
(54/5128) KWG=16 KWI=8 MDIMA=8 MDIMC=8 MWG=32 NDIMB=16 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1612 ms (117.1 GFLOPS)
(55/5128) KWG=32 KWI=8 MDIMA=8 MDIMC=8 MWG=64 NDIMB=16 NDIMC=8 NWG=32 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1466 ms (128.8 GFLOPS)
(71/5128) KWG=16 KWI=8 MDIMA=32 MDIMC=8 MWG=64 NDIMB=16 NDIMC=8 NWG=32 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.1436 ms (131.5 GFLOPS)
(95/5128) KWG=16 KWI=2 MDIMA=16 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=2 VWN=1 0.1121 ms (168.4 GFLOPS)
(136/5128) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=4 VWN=1 0.1038 ms (181.8 GFLOPS)
(257/5128) KWG=32 KWI=8 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=8 VWN=2 0.1007 ms (187.5 GFLOPS)
(1304/5128) KWG=32 KWI=2 MDIMA=32 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0841 ms (224.4 GFLOPS)
(1339/5128) KWG=32 KWI=8 MDIMA=32 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.0768 ms (245.7 GFLOPS)
(1376/5128) KWG=16 KWI=2 MDIMA=16 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=2 VWN=1 0.0733 ms (257.5 GFLOPS)
(1441/5128) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=16 NDIMC=8 NWG=16 SA=1 SB=0 STRM=0 STRN=0 VWM=4 VWN=1 0.0716 ms (263.4 GFLOPS)
(1742/5128) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=1 STRN=0 VWM=4 VWN=1 0.0574 ms (328.7 GFLOPS)
(1755/5128) KWG=16 KWI=8 MDIMA=16 MDIMC=8 MWG=64 NDIMB=16 NDIMC=8 NWG=16 SA=1 SB=0 STRM=1 STRN=0 VWM=4 VWN=1 0.0566 ms (333.5 GFLOPS)
(2501/5128) KWG=16 KWI=8 MDIMA=16 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=0 STRM=1 STRN=1 VWM=2 VWN=2 0.0562 ms (335.9 GFLOPS)
(4276/5128) KWG=16 KWI=2 MDIMA=16 MDIMC=8 MWG=64 NDIMB=16 NDIMC=8 NWG=16 SA=1 SB=1 STRM=1 STRN=0 VWM=4 VWN=1 0.0542 ms (348.5 GFLOPS)
(5068/5128) KWG=16 KWI=8 MDIMA=32 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=1 STRN=1 VWM=2 VWN=2 0.0538 ms (350.6 GFLOPS)
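To compare cards from logs like the one above, the result lines can be parsed mechanically. The sketch below assumes the line layout seen in this thread, "(i/N) PARAMS time ms (X GFLOPS)", and that the tuner prints a line only when a configuration beats the previous best, which is what the steadily rising numbers suggest. The function names are invented for this example.

```python
import re
from typing import Dict, List, Tuple

# (index/total) parameters time-in-ms (GFLOPS), as printed in the logs above
LINE_RE = re.compile(
    r"^\((\d+)/(\d+)\)\s+(.*?)\s+([\d.]+) ms \(([\d.]+) GFLOPS\)\s*$"
)


def parse_tuner_log(text: str) -> List[Tuple[int, str, float, float]]:
    """Return (index, parameters, milliseconds, gflops) for every complete result line."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            index, _total, params, ms, gflops = match.groups()
            results.append((int(index), params, float(ms), float(gflops)))
    return results


def summarize(results: List[Tuple[int, str, float, float]]) -> Dict[str, object]:
    """Compare the first tried configuration with the best one found."""
    first = results[0]
    best = max(results, key=lambda r: r[3])
    return {
        "first_gflops": first[3],
        "best_gflops": best[3],
        "ratio": best[3] / first[3],
        "best_params": best[1],
    }
```

For the GTX 750 Ti log above this gives 96.2 GFLOPS for the first configuration and 350.6 GFLOPS for the best one, a factor of roughly 3.6. Note that this compares the best configuration with the first one tried, not with the standard tuning, so it is not the same figure as the 150-200% mentioned earlier in the thread.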
-
- Posts: 4606
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Vinvin wrote:
Code:
... (5068/5128) KWG=16 KWI=8 MDIMA=32 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=1 STRN=1 VWM=2 VWN=2 0.0538 ms (350.6 GFLOPS)
We still don't know how the GFLOPS correlate with nps. GFLOPS alone don't determine the speed; memory and clock speed are relevant too.
Maybe I will try to derive a formula from the data, if you also run the by now 'self-established' benchmark of 'go infinite' and report the time to depth 26.
(Note that I already asked for a way to establish benchmark stats five weeks ago on the LCZero GitHub site. The result was a bit disappointing, and so were the ways of measurement.)
-
- Posts: 1280
- Joined: Tue Aug 18, 2009 3:06 am
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Guenther wrote:
Well, that has not been necessary for quite a while.
LCZero by now reads the compressed file directly too.
I have no idea what I am doing wrong, but I cannot get it to run. I get the same message over and over!
-
- Posts: 4606
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Robert Flesher wrote:
I have no idea what I am doing wrong, but I cannot get it to run. I get the same message over and over! :evil:
Can you describe exactly what you are doing and what files are there?
-
- Posts: 1280
- Joined: Tue Aug 18, 2009 3:06 am
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Guenther wrote:
Can you describe exactly what you are doing and what files are there?
C:\users\robert\desktop\lczero\lczero.exe --tune-only --full-tuner -w network
The ID file is in the LCZero folder and is named network.
-
- Posts: 4606
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: how good is a GeForce GTX 1060 6GB for Leela ?
Robert Flesher wrote:
C:\users\robert\desktop\lczero\lczero.exe --tune-only --full-tuner -w network
The ID file is in the LCZero folder and is named network.
Did you check that it is really renamed to network without any extension, as I already wrote earlier? (File manager: display extensions for known file types.)
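One more thing that may be worth ruling out, although nobody in the thread confirms it as the cause: the command above calls lczero.exe by its full path but passes the weights file as the bare name network. A relative name like that is normally looked up in the directory the command was started from, which is not necessarily the folder containing the exe. The sketch below only illustrates that path arithmetic. The starting directory C:\users\robert is a hypothetical example, and the assumption that LCZero resolves the name this way is mine.

```python
from pathlib import PureWindowsPath


def resolve_weights(cwd: str, weights_arg: str) -> PureWindowsPath:
    """Show where a weights argument points when the command is started from `cwd`.

    A relative name is joined onto cwd; an absolute path replaces it entirely.
    """
    return PureWindowsPath(cwd) / weights_arg


# Started from a hypothetical prompt at C:\users\robert:
print(resolve_weights(r"C:\users\robert", "network"))
# Giving the full path avoids the ambiguity:
print(resolve_weights(r"C:\users\robert", r"C:\users\robert\desktop\lczero\network"))
```

If that turns out to be the issue, either change into the LCZero folder before running the command or pass the full path after -w.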