kasinp wrote: I should rephrase: the performance will be close, but the GPU will achieve it without breaking a sweat (it will never exceed 80% utilization); in my case it doesn't even go above silent mode. OTOH, the i9 will behave like a decent space heater to achieve a similar level of play. In that sense I consider the GPU to be a hands-down better option.
PK
While running on the GPU, does LCZero use the GPU memory too, or just the system RAM?
LCZero is using my cores, not my GPU.
Moderators: hgm, Rebel, chrisw
-
- Posts: 291
- Joined: Wed May 08, 2013 6:49 am
Re: LCZero is using my cores, not my GPU.
-
- Posts: 141
- Joined: Thu Mar 09, 2006 12:58 am
Re: LCZero is using my cores, not my GPU.
Milton wrote: I am getting a new computer around May 3 (a Dell Alien Area 51) and I would like to participate in this project.
Based on this configuration, would it be best for me to download the GPU file or the CPU file?
Intel® Core™ i9 7980XE
NVIDIA® GeForce® GTX 1080 with 8GB GDDR5X
32GB Dual Channel HyperX™ DDR4 XMP at 2933MHz
kasinp wrote: I would say GPU hands down. In my case a 1080 Ti outperforms the CPU version running on a 14-core Xeon at 3.00GHz. It seems it really prefers the GPU architecture.
PK
kasinp wrote: I should rephrase: the performance will be close, but the GPU will achieve it without breaking a sweat (it will never exceed 80% utilization); in my case it doesn't even go above silent mode. OTOH, the i9 will behave like a decent space heater to achieve a similar level of play. In that sense I consider the GPU to be a hands-down better option.
PK
OK. So I will download both files, but I will only run the GPU file unless we get cold weather here.
-
- Posts: 3026
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: LCZero is using my cores, not my GPU.
kasinp wrote: I should rephrase: the performance will be close, but the GPU will achieve it without breaking a sweat (it will never exceed 80% utilization); in my case it doesn't even go above silent mode. OTOH, the i9 will behave like a decent space heater to achieve a similar level of play. In that sense I consider the GPU to be a hands-down better option.
PK
Jhoravi wrote: While running on the GPU, does LCZero use the GPU memory too, or just the system RAM?
I cannot say whether it uses the GPU memory or not, though I presume it does, but it definitely uses the system RAM as well, even when running on the GPU.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 2748
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: LCZero is using my cores, not my GPU.
Is there a command line flag or something to tell it to use the GPU?
Code:
--gpu arg    ID of the OpenCL device(s) to use (disables autodetection).
Running lczero with --tune-only will output a list of your OpenCL devices with their OpenCL IDs.
Then you can select the device with --gpu arg and run it with --tune-only and --full-tuner,
which will try thousands of configuration options and create a config file for your device.
E.g.
Code:
lczero --tune-only
lczero --gpu 0 --tune-only --full-tuner
Srdja
-
- Posts: 12606
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: LCZero is using my cores, not my GPU.
It does not see my GPU at all.
I have no file weights.txt so I copied network.txt to weights.txt
G:\chess\LCZero>lczero --tune-only
Using 2 thread(s).
Detecting residual layers...v1...64 channels...6 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 2.0 AMD-APP (1912.5)
Platform profile: FULL_PROFILE
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
Device ID: 0
Device name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Device type: CPU
Device vendor: GenuineIntel
Device driver: 1912.5 (sse2,avx)
Device speed: 3392 MHz
Device cores: 8 CU
Device score: 520
Selected platform: AMD Accelerated Parallel Processing
Selected device: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
with OpenCL 2.0 capability.
Loaded existing SGEMM tuning.
G:\chess\LCZero>lczero --gpu 0 --tune-only --full-tuner
Using 2 thread(s).
Detecting residual layers...v1...64 channels...6 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 2.0 AMD-APP (1912.5)
Platform profile: FULL_PROFILE
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
Device ID: 0
Device name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Device type: CPU
Device vendor: GenuineIntel
Device driver: 1912.5 (sse2,avx)
Device speed: 3392 MHz
Device cores: 8 CU
Device score: 520
Selected platform: AMD Accelerated Parallel Processing
Selected device: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
with OpenCL 2.0 capability.
Started OpenCL SGEMM tuner.
RNG seed: 0xb1e44da6 (thread: 3568476555)
Will try 5238 valid configurations.
(1/5238) KWG=16 KWI=2 MDIMA=8 MDIMC=16 MWG=16 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.7277 ms (2.9 GFLOPS)
(4/5238) KWG=32 KWI=2 MDIMA=8 MDIMC=16 MWG=64 NDIMB=8 NDIMC=16 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.6078 ms (3.5 GFLOPS)
(5/5238) KWG=16 KWI=2 MDIMA=16 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.4344 ms (4.8 GFLOPS)
(78/5238) KWG=32 KWI=2 MDIMA=8 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=2 VWN=1 0.3680 ms (5.7 GFLOPS)
(107/5238) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=2 VWN=1 0.3461 ms (6.1 GFLOPS)
(127/5238) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=16 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=4 VWN=1 0.3450 ms (6.1 GFLOPS)
(129/5238) KWG=32 KWI=2 MDIMA=16 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=4 VWN=1 0.3315 ms (6.3 GFLOPS)
(133/5238) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=16 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=4 VWN=1 0.2737 ms (7.7 GFLOPS)
(519/5238) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=1 STRN=0 VWM=8 VWN=1 0.2609 ms (8.0 GFLOPS)
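The tuner lines in the log above all share one layout: a counter, a list of KEY=value kernel parameters, a time in ms and a GFLOPS figure. As a rough sketch (not part of LCZero; the names `parse_tuner_log` and `best_config` are made up for illustration), a few lines of Python can pull the fastest configuration out of such a log. The sample reuses the first and last progress lines from this post.

```python
import re

# One tuner progress line looks like:
#   (519/5238) KWG=16 ... VWN=1 0.2609 ms (8.0 GFLOPS)
TUNER_LINE = re.compile(
    r"\((\d+)/(\d+)\)\s+(.*?)\s+([\d.]+) ms \(([\d.]+) GFLOPS\)"
)


def parse_tuner_log(text):
    """Return one dict per tuner progress line found in `text`."""
    results = []
    for match in TUNER_LINE.finditer(text):
        index, total, params, ms, gflops = match.groups()
        results.append({
            "index": int(index),
            "total": int(total),
            # "KWG=16 KWI=2 ..." -> {"KWG": 16, "KWI": 2, ...}
            "params": {k: int(v) for k, v in (p.split("=") for p in params.split())},
            "ms": float(ms),
            "gflops": float(gflops),
        })
    return results


def best_config(results):
    """The fastest configuration is the one with the lowest kernel time."""
    return min(results, key=lambda r: r["ms"])


# First and last progress lines copied from the log in this post.
SAMPLE = """\
(1/5238) KWG=16 KWI=2 MDIMA=8 MDIMC=16 MWG=16 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=0 STRN=0 VWM=1 VWN=1 0.7277 ms (2.9 GFLOPS)
(519/5238) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=0 SB=0 STRM=1 STRN=0 VWM=8 VWN=1 0.2609 ms (8.0 GFLOPS)
"""

results = parse_tuner_log(SAMPLE)
best = best_config(results)
speedup = results[0]["ms"] / best["ms"]
print(f"best so far: #{best['index']} at {best['ms']} ms, "
      f"{speedup:.2f}x faster than #{results[0]['index']}")
```

On these two lines the tuned kernel is roughly 2.8 times faster than the first configuration tried, but at 8.0 GFLOPS it is still running on the CPU device, which is the actual problem in this post.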
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 2748
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: LCZero is using my cores, not my GPU.
"It does not see my GPU at all."
Yes, it lists only the CPU.
On Linux, the command "lspci | grep VGA" should list your GPU as a PCI device.
Install and/or run the command "clinfo"; it should list all of your installed OpenCL devices.
Linux distros usually use the open source driver "Nouveau" for Nvidia devices,
so you have to install the proprietary Nvidia driver packages from your repository
(driver, libopencl, opencl-icd) to get OpenCL support.
"I have no file weights.txt so I copied network.txt to weights.txt
G:\chess\LCZero>lczero --tune-only
Using 2 thread(s).
Detecting residual layers...v1...64 channels...6 blocks."
64 channels and 6 blocks: it looks like you are still running version 0.5 or 0.4.
--
Srdja
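The check made in this post (does the startup log list anything other than a CPU?) can also be done with a small script. This is only a sketch: the function names are hypothetical, the sample is copied from the log posted earlier in the thread, and it assumes that a GPU would show up as "Device type: GPU", which the thread itself never shows.

```python
import re


def parse_opencl_devices(log_text):
    """Collect the 'Device <field>: <value>' lines of the startup log.

    A new device record starts at every 'Device ID:' line.
    """
    devices = []
    current = None
    for line in log_text.splitlines():
        # re.match anchors at the start of the line, so
        # "Selected device: ..." is deliberately not picked up.
        match = re.match(r"\s*Device (\w+):\s*(.*)", line)
        if not match:
            continue
        key, value = match.group(1).lower(), match.group(2).strip()
        if key == "id":
            current = {"id": int(value)}
            devices.append(current)
        elif current is not None:
            current[key] = value
    return devices


def gpu_ids(devices):
    """IDs usable with --gpu, assuming GPUs report 'Device type: GPU'."""
    return [d["id"] for d in devices if d.get("type", "").upper() == "GPU"]


# Excerpt of the log posted in this thread.
SAMPLE_LOG = """\
Detected 1 OpenCL platforms.
Platform version: OpenCL 2.0 AMD-APP (1912.5)
Platform name: AMD Accelerated Parallel Processing
Device ID: 0
Device name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Device type: CPU
Device vendor: GenuineIntel
Selected device: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
"""

devices = parse_opencl_devices(SAMPLE_LOG)
print("devices:", [(d["id"], d["type"], d["name"]) for d in devices])
print("GPU ids:", gpu_ids(devices))
```

For the posted log the GPU list comes back empty: the only OpenCL platform detected is AMD's, and its only device is the Intel CPU. In that situation "--gpu 0" just selects the CPU again, so the fix has to come from the driver side.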
-
- Posts: 2748
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: LCZero is using my cores, not my GPU.
"G:\chess\LCZero>lczero --tune-only"
Ah, I just saw that you are running Windows, so maybe a GPU driver update will help?
AFAIK LC0 needs OpenCL >= 1.2; I am not sure whether lower versions will work.
--
Srdja
-
- Posts: 251
- Joined: Sat Dec 02, 2006 10:47 pm
- Location: Toronto
- Full name: Peter Kasinski
Re: LCZero is using my cores, not my GPU.
kasinp wrote: I should rephrase: the performance will be close, but the GPU will achieve it without breaking a sweat (it will never exceed 80% utilization); in my case it doesn't even go above silent mode. OTOH, the i9 will behave like a decent space heater to achieve a similar level of play. In that sense I consider the GPU to be a hands-down better option.
PK
Jhoravi wrote: While running on the GPU, does LCZero use the GPU memory too, or just the system RAM?
Albert Silver wrote: I cannot say whether it uses the GPU memory or not, though I presume it does, but it definitely uses the system RAM as well, even when running on the GPU.
The GPU version 0.7 on a 1080 Ti also keeps 2-3 threads busy on my 14-core Xeon. During a 10-15 min. analysis these typically allocate 3-4GB of RAM.
PK
-
- Posts: 12606
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: LCZero is using my cores, not my GPU.
I downloaded the GPU version, so I would assume it would use my GPU.
But right now I see LCZero using between 13 and 90% of my CPU.
It was not like that when I first started it. My CPU was quiet for some time.
I wonder if it downloads the CPU version by itself and then runs that.
What I see does not make sense to me.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 228
- Joined: Sun Mar 12, 2006 3:11 pm
Re: LCZero is using my cores, not my GPU.
Even the GPU version uses the CPU for the MCTS playouts (the default is one thread, I think).
And as you have seen, there are separate executables for GPU and CPU. The GPU version uses OpenCL and the CPU version OpenBLAS, IIRC.
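To make that split a bit more concrete, here is a toy sketch of a PUCT-style MCTS loop. It is not LCZero's code: the game (three legal moves everywhere, no terminal positions), the `CountingEvaluator` class and the constant `c_puct=1.5` are all invented for illustration. The point is only that selection, expansion and backup are ordinary CPU work, and that each playout makes a single call into the network evaluator, which is the part a GPU build would hand off to OpenCL.

```python
import math


class Node:
    def __init__(self, prior):
        self.prior = prior      # policy probability assigned by the evaluator
        self.visits = 0
        self.value_sum = 0.0    # seen from the player who moved into this node
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


class CountingEvaluator:
    """Hypothetical stand-in for the neural network; counts its calls."""

    def __init__(self):
        self.calls = 0

    def evaluate(self, position):
        self.calls += 1
        moves = [0, 1, 2]                            # toy game: always 3 moves
        priors = {m: 1.0 / len(moves) for m in moves}
        value = ((sum(position) % 5) - 2) / 2.0      # in [-1, 1], side to move
        return priors, value


def select_child(node, c_puct=1.5):
    """PUCT rule: mean value plus a prior-weighted exploration bonus (CPU work)."""
    sqrt_total = math.sqrt(node.visits)
    return max(
        node.children.items(),
        key=lambda kv: kv[1].q()
        + c_puct * kv[1].prior * sqrt_total / (1 + kv[1].visits),
    )


def run_playouts(evaluator, playouts):
    root = Node(prior=1.0)
    for _ in range(playouts):
        node, position, path = root, (), [root]
        while node.children:                         # walk down to a leaf
            move, node = select_child(node)
            position += (move,)
            path.append(node)
        priors, value = evaluator.evaluate(position)  # the only network call
        for move, p in priors.items():               # expand the leaf
            node.children[move] = Node(p)
        for visited in reversed(path):               # backup, flipping sides
            value = -value
            visited.value_sum += value
            visited.visits += 1
    return root


evaluator = CountingEvaluator()
root = run_playouts(evaluator, playouts=200)
print("playouts:", root.visits, "network calls:", evaluator.calls)
```

In this toy every playout costs exactly one evaluator call, and everything else in the loop runs on the CPU, which fits the observation that the GPU executable still keeps some CPU threads busy.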