CUDA benchmarks

corres · Post by **corres** » Mon Aug 13, 2018 11:55 pm

CMCanavessi wrote: ↑Mon Aug 13, 2018 10:49 pm
corres wrote: ↑Mon Aug 13, 2018 10:38 pm I would like your opinion about the CPU frequency of the remote server.
How much effect does the CPU frequency have on the playing strength of LC0?
Why not use more than two threads for LC0?
1) Almost nothing if you're using GPU
2) Because 2 are enough to drive the GPU to 100%. If you use 2 GPUs like in TCEC, then 4 threads are ideal.

0.16.0 is the latest official release. The one currently playing in TCEC is an experimental version with some improvements that will probably make it into the next official release.

Thanks for your answers.

Werewolf · Post by **Werewolf** » Tue Aug 14, 2018 9:20 pm

CMCanavessi wrote: ↑Mon Aug 13, 2018 7:21 pm
Google's 4 TPUs would probably be 20x more powerful than 2x 1080 Ti; however, the Leela ratio is based on the SF vs A0 search speed ratio in their match. With the Lc0 crem compile, it is about 8x faster than the previous Leela Zero. So in practical speed terms, the Leela ratio becomes 0.35 (it would be 0.043 with Leela Zero).
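
The arithmetic behind the two ratios in the quote can be checked with a short sketch. It assumes the ratio scales linearly with search speed, which the post implies but does not state; the variable names are illustrative only. The result comes out at roughly 0.34, in line with the 0.35 quoted.

```python
# Figures quoted in the post above; variable names are illustrative only.
leela_zero_ratio = 0.043   # "Leela ratio" quoted for the older Leela Zero
lc0_speedup = 8            # "about 8x faster" claimed for the newer Lc0 build

# Assumption: the ratio scales linearly with search speed.
lc0_ratio = leela_zero_ratio * lc0_speedup

print(f"Lc0 Leela ratio ~ {lc0_ratio:.3f}")
```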

I'm curious: I read in a recent post that Lc0 now supports half precision. Is this where the 8x speedup comes from?

Does that mean that, as well as really fancy cards like the GV100, a (still very expensive but much cheaper) Titan would also benefit from the 8x speedup? And presumably normal cards, like a 1060, will be unaffected?

Werewolf · Post by **Werewolf** » Tue Aug 14, 2018 10:49 pm

Sorry the question above was muddled.
Half precision can only give 2x speed up.

However, any card which supports half precision would do, right? Including Titan?

Does it just happen by itself or do the settings need tweaking?
I might be able to test a Titan at work.
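
As background on the 2x figure, a minimal sketch of what half precision means at the storage level: a half-precision float takes 2 bytes against 4 for single precision, so the same memory traffic moves twice as many values. This uses only Python's standard `struct` module and says nothing about actual GPU throughput, which depends on the hardware and on how Lc0 uses it.

```python
import struct

# Size in bytes of IEEE 754 half ("e") and single ("f") precision floats.
half_bytes = struct.calcsize("<e")
single_bytes = struct.calcsize("<f")
size_ratio = single_bytes / half_bytes

# Round-trip a value through each format to see the precision cost.
value = 0.1
as_half = struct.unpack("<e", struct.pack("<e", value))[0]
as_single = struct.unpack("<f", struct.pack("<f", value))[0]

print(f"half: {half_bytes} bytes, single: {single_bytes} bytes")
print(f"size ratio: {size_ratio}")
print(f"0.1 as half: {as_half!r}, as single: {as_single!r}")
```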

Milos · Post by **Milos** » Wed Aug 15, 2018 12:13 am

Werewolf wrote: ↑Tue Aug 14, 2018 10:49 pm Sorry the question above was muddled.
Half precision can only give 2x speed up.

However, any card which supports half precision would do, right? Including Titan?

Does it just happen by itself or do the settings need tweaking?
I might be able to test a Titan at work.

Which Titan?
If it is the Titan V, then it is essentially the same thing as the Tesla V100 and Quadro GV100.
All 3 use exactly the same GPU architecture (Volta, or in CUDA naming, compute capability 7.0 devices) in different packages. The only real difference is the amount of RAM (and RAM bandwidth to some extent).
The Titan V is a consumer-type graphics card and has the lowest amount of RAM, 12GB (and costs 3k$). The Quadro GV100 is advertised as a workstation GPU and has 32GB of RAM (and costs 9k$). Finally, the Tesla V100 is advertised as a data centre GPU for HPC, has 16GB or 32GB of RAM, also comes in an NVLink version (not only PCIe), and costs over 10k$.
Since Lc0 only uses 2GB of RAM, its performance would be virtually identical on all 3 cards.
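
A small sketch of the RAM comparison above, using only the figures from this post. The dictionary and the helper name are made up for illustration. It shows that even the smallest configuration of each card leaves plenty of room beyond Lc0's roughly 2GB.

```python
# RAM options in GB per card, as listed in the post above.
VOLTA_CARDS = {
    "Titan V": [12],
    "Quadro GV100": [32],
    "Tesla V100": [16, 32],
}
LC0_RAM_GB = 2  # Lc0's approximate usage, per the post


def headroom_gb(ram_options, needed_gb=LC0_RAM_GB):
    """Return spare RAM (GB) on the smallest configuration of a card."""
    return min(ram_options) - needed_gb


headroom = {name: headroom_gb(opts) for name, opts in VOLTA_CARDS.items()}
for name, spare in headroom.items():
    print(f"{name}: {spare} GB spare on the smallest configuration")
```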

Werewolf · Post by **Werewolf** » Wed Aug 15, 2018 10:12 am

Yes I meant Titan V, and that info is very interesting - thanks.

It looks likely a new Titan (RTX) will be announced by the end of the month btw.
