Is Blas, or OpenCL really the right way to go

AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Is Blas, or OpenCL really the right way to go

Post by AdminX »

I've been wondering: is BLAS or OpenCL really the right way to go for LC0 on CPU? When I look at the performance of Komodo MCTS on CPU, Komodo appears to be light years ahead of LC0 as far as MCTS is concerned. Of course, Komodo is no match for LC0 on GPU, but on CPU, LC0 would get smashed by Komodo.
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Is Blas, or OpenCL really the right way to go

Post by dkappe »

AdminX wrote: Tue Jun 18, 2019 2:14 pm I've been wondering is Blas or OpenCL really the right way to go for LC0 on CPU? Because when I look at the performance of Komodo MCTS on CPU, Komodo appears to be light years ahead of LC0 as far as MCTS is concerned. Of course Komodo is no match for LC0 on GPU, but on CPU, LC0 would get smashed by Komodo.
You are assuming that Komodo MCTS is using a NN. I don’t believe that is the case.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: Is Blas, or OpenCL really the right way to go

Post by AdminX »

dkappe wrote: Tue Jun 18, 2019 2:18 pm
AdminX wrote: Tue Jun 18, 2019 2:14 pm I've been wondering is Blas or OpenCL really the right way to go for LC0 on CPU? Because when I look at the performance of Komodo MCTS on CPU, Komodo appears to be light years ahead of LC0 as far as MCTS is concerned. Of course Komodo is no match for LC0 on GPU, but on CPU, LC0 would get smashed by Komodo.
You are assuming that Komodo MCTS is using a NN. I don’t believe that is the case.
No, I am not. But I am wondering whether BLAS or OpenCL are really the best options LC0 has for CPU.
Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: Is Blas, or OpenCL really the right way to go

Post by Robert Pope »

AdminX wrote: Tue Jun 18, 2019 2:21 pm
dkappe wrote: Tue Jun 18, 2019 2:18 pm
AdminX wrote: Tue Jun 18, 2019 2:14 pm I've been wondering is Blas or OpenCL really the right way to go for LC0 on CPU? Because when I look at the performance of Komodo MCTS on CPU, Komodo appears to be light years ahead of LC0 as far as MCTS is concerned. Of course Komodo is no match for LC0 on GPU, but on CPU, LC0 would get smashed by Komodo.
You are assuming that Komodo MCTS is using a NN. I don’t believe that is the case.
No I am not. But I am wondering if Blas or OpenCL are really the best options for CPU that LC0 has.
Then what does comparing it to Komodo MCTS have to do with anything?
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Is Blas, or OpenCL really the right way to go

Post by dkappe »

AdminX wrote: Tue Jun 18, 2019 2:21 pm
dkappe wrote: Tue Jun 18, 2019 2:18 pm
You are assuming that Komodo MCTS is using a NN. I don’t believe that is the case.
No I am not. But I am wondering if Blas or OpenCL are really the best options for CPU that LC0 has.
OK. You had me confused.

If we use a NN for Leela, then the best way to compute it is with a really, really fast linear algebra library. On CPU, BLAS is one of the top performers.
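
To make the linear algebra point concrete, here is a minimal sketch of one fully connected layer written as a plain matrix-vector product. The weights and input are made-up toy values, not taken from any Lc0 network, and the naive loops only illustrate the multiply-accumulate work that a BLAS routine (GEMV/GEMM) does with vectorised, cache-blocked kernels; this is not how Lc0 itself is implemented.

```python
# Naive dense layer: y = relu(W x + b).
# A BLAS library performs the same multiply-accumulate arithmetic,
# only far faster than these pure-Python loops.

def matvec(weights, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def dense_relu(weights, bias, x):
    """One fully connected layer followed by a ReLU activation."""
    return [max(0.0, s + b) for s, b in zip(matvec(weights, x), bias)]

# Hypothetical toy values, chosen only for illustration.
W = [[1.0, -2.0, 0.5],
     [0.0, 1.0, 1.0]]
b = [0.5, -3.0]
x = [2.0, 1.0, 4.0]

y = dense_relu(W, b, x)

# An m-by-n layer costs m * n multiply-adds per evaluated position.
multiply_adds = len(W) * len(W[0])
print(y, multiply_adds)
```

A real network repeats this kind of product many times per position with far larger matrices, which is why the speed of the underlying linear algebra library dominates nodes per second on CPU.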

Now if you want to do something other than BLAS, then you’ll have to abandon the NN. Maybe use something like a shallow alpha-beta search to drive the MCTS. But at that point it’s a very different animal from Leela.
AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: Is Blas, or OpenCL really the right way to go

Post by AdminX »

dkappe wrote: Tue Jun 18, 2019 5:42 pm
AdminX wrote: Tue Jun 18, 2019 2:21 pm
dkappe wrote: Tue Jun 18, 2019 2:18 pm
You are assuming that Komodo MCTS is using a NN. I don’t believe that is the case.
No I am not. But I am wondering if Blas or OpenCL are really the best options for CPU that LC0 has.
OK. You had me confused.

If we use a NN for leela, then the best way of computing that is a really, really fast linear algebra library. On CPU, blas is one of the top performers.

Now if you want to do something other than blas, then you’ll have to abandon the NN. Maybe use something like a shallow ab search to drive the mcts. But at this point it’s a very different animal than leela.
Thank you. So for people with Haswell or later Intel CPUs, does MKL perform much better than the BLAS-only version, or just slightly better? I had not been testing the CPU versions much until recently, but I would like to get as much out of the CPU version as I can.
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Is Blas, or OpenCL really the right way to go

Post by Dann Corbit »

It is an important and interesting question.
Here is a benchmark link for various blas type bundles:
https://medium.com/datathings/benchmark ... 57fb1c6dc7
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Is Blas, or OpenCL really the right way to go

Post by Dann Corbit »

Also interesting is the performance-per-dollar calculation for various cards here:
https://timdettmers.com/2019/04/03/whic ... -learning/
Max
Posts: 247
Joined: Tue Apr 13, 2010 10:41 am

Re: Is Blas, or OpenCL really the right way to go

Post by Max »

I recently compiled Lc0 0.21.2 with three different BLAS variants on my Mac and ran a small test on network 11258-32x4-se.pb.gz.

My MacBook Air with Intel Core i5 from 2015 shows after "go nodes 50000" about
  • 2400 nps with Apple vecLib, threads=1
  • 3500 nps with Apple vecLib, threads=2
  • 2350 nps with OpenBLAS 0.3.6 from Homebrew, threads=1
  • 4000 nps with OpenBLAS 0.3.6 from Homebrew, threads=2
  • 2750 nps with MKL 2019u4 from Intel, threads=1
  • 4600 nps with MKL 2019u4 from Intel, threads=2
With network netT40.T8.610 only ~50 nps are possible with BLAS.
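
For the 32x4 network figures above, the ratios are easy to work out. The sketch below uses only the nps numbers quoted in this post; they come from one machine and one short run, so the resulting ratios are rough indications rather than a general benchmark. The function and variable names are just illustrative.

```python
# nps figures quoted in the post (network 11258-32x4-se, "go nodes 50000").
# Inner keys are thread counts.
nps = {
    "Apple vecLib":   {1: 2400, 2: 3500},
    "OpenBLAS 0.3.6": {1: 2350, 2: 4000},
    "MKL 2019u4":     {1: 2750, 2: 4600},
}

def thread_scaling(results):
    """Speedup from going from 1 thread to 2 threads, per library."""
    return {name: r[2] / r[1] for name, r in results.items()}

def relative_to(results, baseline, threads):
    """nps of each library relative to a baseline at a given thread count."""
    base = results[baseline][threads]
    return {name: r[threads] / base for name, r in results.items()}

scaling = thread_scaling(nps)
vs_openblas_1t = relative_to(nps, "OpenBLAS 0.3.6", 1)
vs_openblas_2t = relative_to(nps, "OpenBLAS 0.3.6", 2)

for name in nps:
    print(f"{name}: x{scaling[name]:.2f} from 2 threads, "
          f"{vs_openblas_2t[name]:.2f}x OpenBLAS at 2 threads")
```

On these numbers MKL is roughly 15 to 17 percent faster than OpenBLAS, and a second thread gives between about 1.46x (vecLib) and 1.70x (OpenBLAS).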

On my MacBook Air, the Intel HD 6000 GPU runs slower with OpenCL than the CPU does with BLAS. Not to mention that the 35+ watts are "killing" the notebook within seconds. Using Intel Power Gadget for this scenario seems an absolute must.
Hope we're not just the biological boot loader for digital super intelligence. Unfortunately, that is increasingly probable - Elon Musk
AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: Is Blas, or OpenCL really the right way to go

Post by AdminX »

Thanks, Dann.

I was aware of the second link as I saw it posted on Discord, but not the first. Good info.