lc0-win-20180512-cuda90-cudnn712-00

Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by Milos »

Albert Silver wrote: Sun May 20, 2018 12:23 am
Albert Silver wrote: Sat May 19, 2018 9:29 pm
Laskos wrote: Sat May 19, 2018 9:16 pm
Wow, thanks for the tips; I am new to GPU settings. Indeed, with these settings, with ID237 and the May 19 LC0 CUDA build, against Zurichess Neuchatel, a modern AB engine, LC0 performs at a 3120 Elo level in CCRL 40/4' conditions (still not very many games), a rating I have never seen even remotely with any LC0 (master or CUDA) at this TC. I have the same GTX 1060 6GB card as yours and use 2 i7 threads. Thanks also for the revised WAC and the result with PUCT values on it. You really seem to have hit a sweet spot, as tactics are the most important cause of its misses. Time and again, a tactical blunder gives half a point or a full point away.
I actually have interesting news about settings, based on a discussion on Discord, the official LC0 channel. Someone ran CLOP on it to fine-tune all the settings and find optimal values. He came up with slowmover (the time management setting) best at 2.2-2.3, cPUCT at about 2.8, and FPU Reduction at -0.08 (yes, a negative value). I have not tested this myself, but am sharing:

Image
I misread: slowmover should be about 2.75.
Again the same thing. Those are simply the parameters that give the highest nps (except slowmover).
Nps depends strongly on FPUR. Just run a test: run go nodes from the starting position with FPUR = 0 and with FPUR = 0.2 (keeping PUCT at the default 1.2) and you'll see a drastic difference. For me it was more than 30% in nps. On the other hand, it seems smaller FPUR values don't affect strength much.
Similarly for PUCT: the higher the PUCT, the higher the nps. However, in this case strength will depend strongly on the TC, i.e. the number of nodes searched. With short TCs it will be beneficial; with longer TCs you'd start seeing a regression compared to lower PUCT values.
And finally slowmover. Its effect depends primarily on the type of TC. If the increment is large compared to the base clock, or you use a fixed time per move, slowmover values above 2 work best. However, if you test it at a TC without increment, you'll notice that it affects strength negatively. The way the time manager is implemented at the moment (very poorly), slowmover is nothing but a very crude way to tune it.
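To make the roles of cPUCT and FPU reduction concrete, here is a minimal sketch of AlphaZero-style PUCT child selection in Python. It is not lc0's actual code: the names Child, puct_score and select_child are hypothetical, and treating FPU reduction as a flat amount subtracted from the parent's value for unvisited moves is an assumption made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    prior: float            # policy probability assigned by the network
    visits: int = 0         # number of times this move has been searched
    value_sum: float = 0.0  # sum of backed-up values, from the parent's side


def puct_score(child, parent_visits, parent_q, cpuct, fpu_reduction):
    """Return Q + U for one child under the assumed PUCT formula."""
    if child.visits > 0:
        q = child.value_sum / child.visits
    else:
        # First-play urgency (assumed form): an unvisited move starts from
        # the parent's value minus a flat reduction.
        q = parent_q - fpu_reduction
    # Exploration term: scaled by cpuct and the prior, shrinks with visits.
    u = cpuct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return q + u


def select_child(children, parent_q, cpuct=1.2, fpu_reduction=0.0):
    """Return the index of the child with the highest PUCT score."""
    parent_visits = max(1, sum(c.visits for c in children))
    return max(
        range(len(children)),
        key=lambda i: puct_score(
            children[i], parent_visits, parent_q, cpuct, fpu_reduction
        ),
    )


# One well-explored move and two unvisited ones (made-up numbers).
children = [
    Child(prior=0.6, visits=10, value_sum=3.0),
    Child(prior=0.3),
    Child(prior=0.1),
]
low_cpuct_pick = select_child(children, parent_q=0.3, cpuct=0.1, fpu_reduction=0.2)
high_cpuct_pick = select_child(children, parent_q=0.3, cpuct=3.0, fpu_reduction=0.2)
```

In this toy position a small cpuct keeps the search on the already visited move, while a large cpuct sends it to the unvisited move with the better prior. A larger fpu_reduction works in the opposite direction, since it lowers the starting value of every unvisited move.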
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by Albert Silver »

It's a good theory, except that I tested all my PUCT values at 3+0 and 5+0, and then proposed them to GCP. He in turn tested them at very fast TCs, but stopped the test early due to disastrous results. The lower PUCT value was stronger at very short TCs, while the higher PUCT values only shone at longer TCs. Mind you, these CLOP-optimized values are not the ones I had hit on, just similar. Here are the two sets for comparison, though I have not tested the optimized ones against mine.

CLOP

Slowmover: 2.75
cPUCT: 2.8
FPU: -0.08

Mine

Slowmover: 2.0
cPUCT: 3.0
FPU: 0.0
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
pohl4711
Posts: 2434
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by pohl4711 »

This graph is outdated.

This is the latest version: https://cdn.discordapp.com/attachments/ ... 4_n400.png

So, I use these numbers:
fpu-reduction: -0.06
cpuct: 3.1
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by yanquis1972 »

One thing that's clear to me is that higher PUCT = higher GPU usage. In other words, if GPU usage is at 90% at one level and 98% at another, wouldn't the result simply be higher performance?
shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by shrapnel »

Hmm... you could be on to something there. Using cPUCT 3.1, as Pohl suggested, put barely 83-85% load on my GPU, and the speed of lc0 was 8k.
Image
Raising cPUCT to 4.9 put around 90-92% load on my 1080 Ti and increased the speed to 11k.
There is still lots of headroom on my 1080 Ti, I think, as the tester was only using a 1060.
So maybe the settings will vary according to the GPU used and the TC?
Anyway, I've barely started.
i7 5960X @ 4.1 Ghz, 64 GB G.Skill RipJaws RAM, Twin Asus ROG Strix OC 11 GB Geforce 2080 Tis
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by Albert Silver »

Bear in mind that with a card as powerful as the 1080 Ti you should be using the -t3 flag to get the most from it. As to GPU usage, be careful not to use the Windows Task Manager as a reference; it does not show it properly unless you know how. With my GTX 1060 I get 95%+ on average even with plain LCZero. I also think cPUCT 4.9 may be overkill and hurt performance. It is definitely not a case of 'more is better'.
shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by shrapnel »

Oh, OK.
I will use the -t3 flag in future.
But I'm not using the Windows Task Manager to gauge usage. I'm using the GPU-Z utility, and specifically the version specially tuned for my particular card, an Asus ROG Strix 11 GB Gaming GeForce GTX 1080 Ti.
So, I think it is very accurate.
As for results, of course lots of Testing still to be done.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by Milos »

Lol, specially tuned. That is just a skin around regular GPU-Z, man. :lol:
shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by shrapnel »

OK. Still, I suppose it's more accurate than just using Windows Task Manager.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: lc0-win-20180512-cuda90-cudnn712-00

Post by Milos »

Windows Task Manager shows CPU usage, and doesn't even detect any GPU usage unless it is related to the Windows Aero theme.
GPU-Z is OK for showing GPU usage, but GPU usage itself is certainly not a good indicator of how strong a parameter setting is.