A more interesting question about GPU versus CPU

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

A more interesting question about GPU versus CPU

Post by Dann Corbit »

For a chess engine, which gives more Elo per dollar?
The other measurements are not nearly so important.

An even better measure would be the quality of analysis per dollar, but that would be much harder to pin down.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: A more interesting question about GPU versus CPU

Post by Sesse »

Right now, obviously CPU, as Stockfish on CPU is better than any engine on GPU.

In a thought experiment where Leela Zero gets to a similar efficiency as Alpha Zero: It's going to depend on how many dollars. Stockfish on a $50 CPU is going to outperform Leela Zero on a $50 GPU, but if you're talking about $1000 CPU vs. $1000 GPU, it might very well be different.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: A more interesting question about GPU versus CPU

Post by syzygy »

In my view the more important measure is Elo per Watt (which happens to be equivalent to Elo per dollar if you ignore the initial investment).
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: A more interesting question about GPU versus CPU

Post by Sesse »

Well, Google claims TPUs run at 75 W, and AlphaZero ran on four TPUs… I think that would be hard to beat.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: A more interesting question about GPU versus CPU

Post by Dann Corbit »

syzygy wrote: Sat Aug 11, 2018 8:45 pm In my view the more important measure is Elo per Watt (which happens to be equivalent to Elo per dollar if you ignore the initial investment).
That is a very important consideration.

The real calculation should be:

Elo / (initial_price + electricity_cost / device_duration)
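A minimal sketch of how such a figure could be computed, assuming the denominator is the total cost of ownership: purchase price plus the electricity used over the device's service life. This is only one dimensionally consistent reading of the formula above. The function names and every number below (prices, wattages, hours, tariff, Elo values) are hypothetical and chosen only to show the arithmetic; they are not measurements.

```python
def total_cost(initial_price, watts, hours_used, price_per_kwh):
    """Purchase price plus electricity consumed over the service life."""
    energy_kwh = watts / 1000.0 * hours_used
    return initial_price + energy_kwh * price_per_kwh


def elo_per_dollar(elo, initial_price, watts, hours_used, price_per_kwh):
    """Elo divided by total cost of ownership (hypothetical metric)."""
    return elo / total_cost(initial_price, watts, hours_used, price_per_kwh)


# Hypothetical inputs: same Elo, different purchase price and power draw.
cpu = elo_per_dollar(elo=3400, initial_price=400, watts=100,
                     hours_used=10_000, price_per_kwh=0.10)
gpu = elo_per_dollar(elo=3400, initial_price=800, watts=250,
                     hours_used=10_000, price_per_kwh=0.10)
print(f"CPU: {cpu:.2f} Elo/$, GPU: {gpu:.2f} Elo/$")
```

Setting `initial_price` to zero leaves only the electricity term, so for a fixed tariff and usage time the ranking is the same as Elo per Watt, which is the simplification suggested in the quoted post.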
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: A more interesting question about GPU versus CPU

Post by Daniel Shawul »

Dann Corbit wrote: Sat Aug 11, 2018 2:58 am For a chess engine, which gives more Elo per dollar?
The other measurements are not nearly so important.

An even better measure would be the quality of analysis per dollar, but that would be much harder to pin down.
For me the most important measure is teraflops, as I am only interested in algorithmic comparisons.
In that regard, the 2 x 1080 Ti deliver about 22 TFLOPS while a 44-core machine delivers about 1 TFLOPS, roughly a 22x difference,
and even then lczero still won't match Stockfish. That clearly shows there is no comparison between NN engines and CPU engines
on the same hardware. The hand-written evaluation is a much more efficient way of doing things, algorithmically speaking.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: A more interesting question about GPU versus CPU

Post by hgm »

To 'compare' algorithms you should first define a metric for algorithmic complexity. You seem to focus (completely arbitrarily) on the number of multiplications. One might just as well only consider the number of branches. With the same number of branches per second, Stockfish would not be a match for LC0 at all.

So without an objective criterion for selecting the metric, you can make the comparison come out any way you want.
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: A more interesting question about GPU versus CPU

Post by Uri Blass »

Sesse wrote: Sat Aug 11, 2018 10:17 am Right now, obviously CPU, as Stockfish on CPU is better than any engine on GPU.

In a thought experiment where Leela Zero gets to a similar efficiency as Alpha Zero: It's going to depend on how many dollars. Stockfish on a $50 CPU is going to outperform Leela Zero on a $50 GPU, but if you're talking about $1000 CPU vs. $1000 GPU, it might very well be different.
I think that it is not obvious, because Stockfish on the slowest CPU is weaker than Lc0 on the fastest GPU.

I even read about the following 50% result, in spite of the fact that Stockfish ran on 6 threads rather than 1, so Stockfish was not running on the slowest CPU.

viewtopic.php?f=2&t=67719&start=230
Leela 10520 - SF dev +8 -8 =69
Leela Lc0 with 10520 net running on 1 GV100, versus Stockfish dev 18080112 running on 6 threads i7 5820k @4.2 GHz
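As a quick check of the 50% figure, here is a small sketch that turns a win/loss/draw record into a score fraction and an implied Elo difference. It assumes the standard logistic Elo model; the function names are hypothetical, and the only data used is the +8 -8 =69 result quoted above.

```python
import math


def match_score(wins, losses, draws):
    """Score fraction: a win counts 1 point, a draw half a point."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Elo difference implied by a score fraction (logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


score = match_score(wins=8, losses=8, draws=69)  # 85 games in total
diff = elo_difference(score)
print(f"score = {score:.3f}, implied Elo difference = {diff:+.1f}")
```

With equal wins and losses the score is exactly one half, so the implied difference is zero. The sketch does not estimate the error margin of an 85-game match.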
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: A more interesting question about GPU versus CPU

Post by syzygy »

1 flop is more than enough for Stockfish since it doesn't use floating point operations at all (except for some time-management calculations but they could be avoided easily).

Clearly, algorithmic considerations are not enough, and the suitability of the algorithm for the current state of technology has to be taken into account. I can't see any better measure than Elo per Watt. The cost of the hardware depends more on market factors than on technology.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: A more interesting question about GPU versus CPU

Post by Daniel Shawul »

hgm wrote: Sun Aug 12, 2018 4:42 pm To 'compare' algorithms you should first define a metric for algorithmic complexity. You seem to focus (completely arbitrarily) on the number of multiplications. One might just as well only consider the number of branches. With the same number of branches per second, Stockfish would not be a match for LC0 at all.

So without an objective criterion for selecting the metric, you can make the comparison come out any way you want.
Well, LC0's eval is a gigantic matrix multiplication and clearly needs hardware optimized for that to perform well. As far as I am concerned, a GPU is specialty hardware; one could also design an FPGA, in the spirit of Deep Blue's custom chess chips, to accelerate Stockfish's eval by the same amount. Sure, the FPGA would be much more expensive than the GPU, but I am not concerned about cost. Cost is driven by need anyway, and an assessment of algorithms based on cost is going to vary with it.

LC0 probably does the same amount of branching in the search part as Stockfish. The only difference is in the massively vectorized evaluation function that is suitable for the GPU. LC0 resorted to a vectorized eval that is inefficient in terms of number of operations in order to exploit GPUs; Stockfish could benefit from different hardware while keeping the branching in its eval.