How good is the RTX 2080 Ti for Leela?

Hai
Posts: 361
Joined: Sun Aug 04, 2013 11:19 am

How good is the RTX 2080 Ti for Leela?

Post by Hai » Sat Sep 15, 2018 9:12 pm

How good is the RTX 2080 Ti for Leela?

Robert Pope
Posts: 433
Joined: Sat Mar 25, 2006 7:27 pm

Re: How good is the RTX 2080 Ti for Leela?

Post by Robert Pope » Sat Sep 15, 2018 11:43 pm

Nobody has one, so nobody knows. According to Milos, it's only going to gain from the clock speed, so maybe 10%-15% faster. We'll know for sure once the cards are actually on the market.

ankan
Posts: 61
Joined: Sun Apr 21, 2013 1:29 pm
Location: Pune, India

Re: How good is the RTX 2080 Ti for Leela?

Post by ankan » Sun Sep 16, 2018 3:46 am

It should be very similar to a Titan V for lc0.

It has tensor cores enabled, and its peak fp16 tensor math throughput is almost exactly the same as a Titan V's (114 Tflops vs 110 Tflops):
https://www.anandtech.com/show/13282/nv ... eep-dive/6

I have one, but I can't post any benchmarks before reviews are out :)

Note that right now lc0 can't make use of int8 (or int4) math, but Google did it with A0 on their TPUs, so it's something the lc0 team wants to try in the future. If successful, we hope to get another 2x speedup.

Werewolf
Posts: 1063
Joined: Thu Sep 18, 2008 8:24 pm

Re: How good is the RTX 2080 Ti for Leela?

Post by Werewolf » Mon Sep 17, 2018 4:05 am

ankan wrote:
Sun Sep 16, 2018 3:46 am
It should be very similar to a Titan V for lc0.

It has tensor cores enabled, and its peak fp16 tensor math throughput is almost exactly the same as a Titan V's (114 Tflops vs 110 Tflops):
https://www.anandtech.com/show/13282/nv ... eep-dive/6

I have one, but I can't post any benchmarks before reviews are out :)

Note that right now lc0 can't make use of int8 (or int4) math, but Google did it with A0 on their TPUs, so it's something the lc0 team wants to try in the future. If successful, we hope to get another 2x speedup.
I’m not accusing you of lying, but why would Nvidia cripple the CUDA cores on the 2080 Ti for FP16 (presumably to protect Quadro) and then allow the tensor cores to run at full speed?

In a week or two Lc0’s speed on this card will finally be revealed. I hope you’re right.

ankan
Posts: 61
Joined: Sun Apr 21, 2013 1:29 pm
Location: Pune, India

Re: How good is the RTX 2080 Ti for Leela?

Post by ankan » Mon Sep 17, 2018 4:25 am

Werewolf wrote:
Mon Sep 17, 2018 4:05 am
I’m not accusing you of lying, but why would Nvidia cripple the CUDA cores on the 2080 Ti for FP16 (presumably to protect Quadro) and then allow the tensor cores to run at full speed?

In a week or two Lc0’s speed on this card will finally be revealed. I hope you’re right.
I don't know where people got the rumor that Nvidia crippled non-tensor fp16 math on the 2080 Ti.

See page 8/9 of this document for full specs:
https://www.nvidia.com/content/dam/en-z ... epaper.pdf

The only spec that differs from Quadro is "Peak FP16 Tensor TFLOPS with FP32 Accumulate", which lc0 doesn't use.

Milos has no idea what he is talking about...

Werewolf
Posts: 1063
Joined: Thu Sep 18, 2008 8:24 pm

Re: How good is the RTX 2080 Ti for Leela?

Post by Werewolf » Mon Sep 17, 2018 4:37 am

ankan wrote:
Mon Sep 17, 2018 4:25 am

I don't know where people got the rumor that Nvidia crippled non-tensor fp16 math on the 2080 Ti.

Milos has no idea what he is talking about...
But it's not from Milos, it's from Wikipedia:

https://en.wikipedia.org/wiki/List_of_N ... _20_series

Unless you're saying the wiki page is wrong, the CUDA cores are crippled for FP16. In addition to that, there's also the debate as to whether Lc0 can use tensor cores, but that's not something I know about.

Error323
Posts: 10
Joined: Sun Jun 17, 2018 4:35 pm

Re: How good is the RTX 2080 Ti for Leela?

Post by Error323 » Mon Sep 17, 2018 7:26 am

Werewolf wrote:
Mon Sep 17, 2018 4:37 am
ankan wrote:
Mon Sep 17, 2018 4:25 am

I don't know where people got the rumor that Nvidia crippled non-tensor fp16 math on the 2080 Ti.

Milos has no idea what he is talking about...
But it's not from Milos, it's from Wikipedia:

https://en.wikipedia.org/wiki/List_of_N ... _20_series

Unless you're saying the wiki page is wrong, the CUDA cores are crippled for FP16. In addition to that, there's also the debate as to whether Lc0 can use tensor cores, but that's not something I know about.
It's not the CUDA cores that will be doing the FP16 computations, but the tensor cores. They are specifically designed for neural network inference, because that's what the new ray tracing technique uses to make it work in real time. Fortunately for us, those cores are perfect for Lc0, as we use a very similar neural network architecture for chess (convolutional layers).

Also, you should listen to ankan, he's got that 2080 for a reason ;) And he wrote our cuDNN backend!

Werewolf
Posts: 1063
Joined: Thu Sep 18, 2008 8:24 pm

Re: How good is the RTX 2080 Ti for Leela?

Post by Werewolf » Mon Sep 17, 2018 8:40 am

Error323 wrote:
Mon Sep 17, 2018 7:26 am

It's not the CUDA cores that will be doing the FP16 computations, but the tensor cores. They are specifically designed for neural network inference, because that's what the new ray tracing technique uses to make it work in real time. Fortunately for us, those cores are perfect for Lc0, as we use a very similar neural network architecture for chess (convolutional layers).

Also, you should listen to ankan, he's got that 2080 for a reason ;) And he wrote our cuDNN backend!
Well, if that's correct it's great news for everyone, I'm not complaining!
However, there do seem to be some differences in the CUDA cores between Quadro and GeForce.

Werewolf
Posts: 1063
Joined: Thu Sep 18, 2008 8:24 pm

Re: How good is the RTX 2080 Ti for Leela?

Post by Werewolf » Mon Sep 17, 2018 10:25 am

Werewolf wrote:
Mon Sep 17, 2018 8:40 am

However, there do seem to be some differences with the CUDA cores between Quadro and Geforce.
Forget that comment - I see why it's wrong now.

What's FP16 accumulate?

ankan
Posts: 61
Joined: Sun Apr 21, 2013 1:29 pm
Location: Pune, India

Re: How good is the RTX 2080 Ti for Leela?

Post by ankan » Mon Sep 17, 2018 3:29 pm

Werewolf wrote:
Mon Sep 17, 2018 10:25 am
Werewolf wrote:
Mon Sep 17, 2018 8:40 am

However, there do seem to be some differences with the CUDA cores between Quadro and Geforce.
Forget that comment - I see why it's wrong now.

What's FP16 accumulate?
Tensor cores perform small matrix multiply-accumulate operations. See https://devblogs.nvidia.com/programming ... es-cuda-9/ for more details.
They support two modes: either you do everything in fp16, or you do the multiply in fp16 and the accumulation in fp32. From the whitepaper, it seems that for gaming cards (RTX 20xx) the performance of the fp32 accumulate mode has been cut in half compared to Quadro cards. AFAIK, 32-bit accumulation is more useful for training. For inference, doing everything in fp16 is generally sufficient (and that's what we use for lc0).
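
To make the difference between the two accumulate modes concrete, here is a rough numerical sketch in plain Python. It is not lc0 code and it does not model how the hardware is built; it only imitates the rounding of the running sum. The helper names (`to_fp16`, `to_fp32`, `dot`) are made up for this illustration, and the all-ones input is a deliberately extreme case chosen so the effect is easy to see.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot(a, b, accumulate):
    """Dot product of fp16 inputs; `accumulate` rounds the running sum.

    Hypothetical helper: the inputs are rounded to fp16, each product is
    formed in double precision, and only the accumulator is rounded to
    the chosen format after every addition.
    """
    acc = 0.0
    for x, y in zip(a, b):
        product = to_fp16(x) * to_fp16(y)
        acc = accumulate(acc + product)
    return acc

# Extreme case for illustration: a long sum of ones.
a = [1.0] * 4096
b = [1.0] * 4096

fp16_sum = dot(a, b, to_fp16)  # accumulate in fp16
fp32_sum = dot(a, b, to_fp32)  # accumulate in fp32

print(fp16_sum)  # 2048.0: above 2048 the fp16 spacing is 2, so +1 is rounded away
print(fp32_sum)  # 4096.0
```

With short sums and small, well-scaled values the two modes agree closely, which fits the point above that fp16 accumulation is usually enough for inference, while long sums of many terms are where the fp32 accumulator helps.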
