
A more interesting question about GPU versus CPU

Posted: Sat Aug 11, 2018 2:58 am
by Dann Corbit
For a chess engine, which gives more Elo per dollar?
The other measurements are not nearly so important.

An even better measure would be the quality of analysis per dollar, but that would be much harder to pin down.

Re: A more interesting question about GPU versus CPU

Posted: Sat Aug 11, 2018 10:17 am
by Sesse
Right now, obviously CPU, as Stockfish on CPU is better than any engine on GPU.

In a thought experiment where Leela Zero gets to a similar efficiency as AlphaZero: It's going to depend on how many dollars. Stockfish on a $50 CPU is going to outperform Leela Zero on a $50 GPU, but if you're talking about a $1000 CPU vs. a $1000 GPU, it might very well be different.

Re: A more interesting question about GPU versus CPU

Posted: Sat Aug 11, 2018 8:45 pm
by syzygy
In my view the more important measure is Elo per Watt (which happens to be equivalent to Elo per dollar if you ignore the initial investment).

Re: A more interesting question about GPU versus CPU

Posted: Sat Aug 11, 2018 9:55 pm
by Sesse
Well, Google claims TPUs run at 75 W, and AlphaZero ran on four TPUs… I think that would be hard to beat.

Re: A more interesting question about GPU versus CPU

Posted: Sun Aug 12, 2018 2:58 am
by Dann Corbit
syzygy wrote: Sat Aug 11, 2018 8:45 pm In my view the more important measure is Elo per Watt (which happens to be equivalent to Elo per dollar if you ignore the initial investment).
That is a very important consideration.

The real calculation should be Elo per total cost of ownership, something like:

Elo / (initial_price + electricity_cost_over_device_lifetime)
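
A minimal sketch of that calculation in Python; the prices, wattages, Elo figures and electricity rate below are purely illustrative assumptions, not measurements:

# Elo per dollar of total cost of ownership (purchase + electricity).
# All figures here are illustrative assumptions, not benchmarks.
def elo_per_dollar(elo, price_usd, watts, hours, usd_per_kwh=0.15):
    electricity_usd = watts / 1000.0 * hours * usd_per_kwh
    return elo / (price_usd + electricity_usd)

hours = 3 * 365 * 24  # assume a three-year useful life, running continuously
print(elo_per_dollar(elo=3400, price_usd=1000, watts=150, hours=hours))  # hypothetical CPU setup
print(elo_per_dollar(elo=3350, price_usd=1000, watts=300, hours=hours))  # hypothetical GPU setup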

Re: A more interesting question about GPU versus CPU

Posted: Sun Aug 12, 2018 4:00 pm
by Daniel Shawul
Dann Corbit wrote: Sat Aug 11, 2018 2:58 am For a chess engine, which gives more Elo per dollar?
The other measurements are not nearly so important.

An even better measure would be the quality of analysis per dollar, but that would be much harder to pin down.
For me the most important measure is teraflops, as I am only interested in algorithmic comparisons.
In that regard, 2 x 1080 Ti are about 22 Tflops while a 44-core machine is about 1 Tflops, roughly a 22x difference,
and even then lczero still won't match Stockfish. That clearly shows there is no comparison between NN engines and CPU engines
on the same hardware. The hand-written evaluation is a much more efficient way of doing things algorithm-wise.
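
A back-of-envelope version of those numbers; the clock speeds and per-cycle throughputs below are rough assumptions, so the exact ratio will vary:

# Rough peak-FP32 estimates behind the "22x" figure; all constants are approximate.
def gpu_peak_tflops(cuda_cores, clock_ghz):
    return cuda_cores * 2 * clock_ghz / 1000.0       # one FMA (2 flops) per core per cycle

def cpu_peak_tflops(cores, clock_ghz, fp32_flops_per_cycle):
    return cores * clock_ghz * fp32_flops_per_cycle / 1000.0

gpus = 2 * gpu_peak_tflops(cuda_cores=3584, clock_ghz=1.58)              # two 1080 Ti cards: ~22.6 Tflops
cpu = cpu_peak_tflops(cores=44, clock_ghz=2.2, fp32_flops_per_cycle=16)  # AVX2 FMA server: ~1.5 Tflops
print(gpus, cpu, gpus / cpu)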

Re: A more interesting question about GPU versus CPU

Posted: Sun Aug 12, 2018 4:42 pm
by hgm
To 'compare' algorithms you should first define a metric for algorithmic complexity. You seem to focus (completely arbitrarily) on the number of multiplications. One might just as well only consider the number of branches. With the same number of branches per second, Stockfish would not be a match for LC0 at all.

So without an objective criterion for selecting the metric, you can make the comparison come out any way you want.

Re: A more interesting question about GPU versus CPU

Posted: Sun Aug 12, 2018 10:06 pm
by Uri Blass
Sesse wrote: Sat Aug 11, 2018 10:17 am Right now, obviously CPU, as Stockfish on CPU is better than any engine on GPU.

In a thought experiment where Leela Zero gets to a similar efficiency as AlphaZero: It's going to depend on how many dollars. Stockfish on a $50 CPU is going to outperform Leela Zero on a $50 GPU, but if you're talking about a $1000 CPU vs. a $1000 GPU, it might very well be different.
I think that it is not obvious, because Stockfish on the slowest CPU is weaker than Lc0 on the fastest GPU.

I even read the following result of 50%, in spite of the fact that Stockfish ran on 6 threads and not on 1 thread, so Stockfish did not run on the slowest CPU.

viewtopic.php?f=2&t=67719&start=230
Leela 10520 - SF dev +8 -8 =69
Lc0 with the 10520 net running on 1 GV100, versus Stockfish dev 18080112 running on 6 threads on an i7 5820K @ 4.2 GHz
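
For reference, the score and implied Elo difference of that result can be checked directly; a minimal sketch using the standard logistic Elo model:

import math

def score_and_elo_diff(wins, losses, draws):
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    elo = -400 * math.log10(1 / score - 1)   # Elo difference implied by the score fraction
    return score, elo

print(score_and_elo_diff(8, 8, 69))   # 85 games, 50% score, 0 Elo difference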

Re: A more interesting question about GPU versus CPU

Posted: Sun Aug 12, 2018 10:12 pm
by syzygy
1 flop is more than enough for Stockfish, since it doesn't use floating-point operations at all (except for some time-management calculations, but those could easily be avoided).

Clearly, algorithmic considerations alone are not enough; the suitability of the algorithm for the current state of technology has to be taken into account. I can't see any better measure than Elo per Watt. The cost of the hardware depends more on market factors than on technology.

Re: A more interesting question about GPU versus CPU

Posted: Mon Aug 13, 2018 3:15 pm
by Daniel Shawul
hgm wrote: Sun Aug 12, 2018 4:42 pm To 'compare' algorithms you should first define a metric for algorithmic complexity. You seem to focus (completely arbitrarily) on the number of multiplications. One might just as well only consider the number of branches. With the same number of branches per second, Stockfish would not be a match for LC0 at all.

So without an objective criterion for selecting the metric, you can make the comparison come out any way you want.
Well, LC0's eval is a gigantic matrix multiplication and clearly needs hardware optimized for that to perform well. As far as I am concerned, the GPU is specialty hardware; one could also design an FPGA, like Deep Blue did, to accelerate Stockfish's eval by the same amount. Sure, the FPGA would be much more expensive than the GPU, but I am not concerned about cost. Cost is driven by need anyway, and your assessment of algorithms based on that is going to vary with it.

LC0 probably does about the same amount of branching in the search part as Stockfish. The only difference is the massively vectorized evaluation function, which is well suited to the GPU. LC0 resorted to an inefficient vectorized eval (in terms of number of operations) to exploit GPUs; Stockfish could benefit from different hardware while keeping the branching in its eval.
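
To make the contrast concrete, here is a toy illustration (not either engine's actual code) of the two evaluation styles: a dense matrix-multiply "network" eval versus a branchy hand-written term sum. The layer sizes and evaluation terms are arbitrary assumptions.

import numpy as np

# Toy NN-style eval: nearly all the work is dense matrix multiplication,
# which maps well onto GPUs/TPUs. The layer sizes are arbitrary.
W1 = np.random.randn(256, 64)
W2 = np.random.randn(1, 256)

def nn_eval(features):                      # features: vector of 64 position features
    hidden = np.maximum(W1 @ features, 0)   # one ReLU layer
    return float(W2 @ hidden)

# Toy hand-written eval: a few integer terms with branches,
# cheap on a CPU but with no big matmul to hand off to a GPU.
def handwritten_eval(material, mobility, king_exposed):
    score = material + 4 * mobility
    if king_exposed:                        # branching instead of multiplication
        score -= 50
    return score

print(nn_eval(np.random.randn(64)), handwritten_eval(100, 12, True))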