Higher than expected by me efficiency of Ponder ON

Laskos · Post by **Laskos** » Mon Mar 06, 2017 3:26 pm

In Cutechess-Cli, I compared the benefit of Ponder ON to the benefit of doubling time control (effective speed-up of 2) for Komodo 10.3:

Ponder ON vs OFF, 5s+0.05s:

Score of Komodo vs Komodo: 836 - 174 - 990  [0.665] 2000
ELO difference: 119.50 +/- 10.74

Score of Komodo vs Komodo: 813 - 202 - 985  [0.653] 2000
ELO difference: 109.64 +/- 10.80
Finished match

Ponder OFF, 10s+0.1s vs 5s+0.05s:

Score of Komodo vs Komodo: 942 - 125 - 933  [0.704] 2000
ELO difference: 150.72 +/- 11.06

Score of Komodo vs Komodo: 935 - 133 - 932  [0.701] 2000
ELO difference: 147.60 +/- 11.07
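
For reference, the Elo differences above can be reproduced from the win/loss/draw counts with the usual logistic Elo formula. This is only a sketch: the function name `elo_diff` is made up for illustration, and the formula is assumed (not confirmed here) to be the one Cutechess-Cli applies.

```python
import math

def elo_diff(wins, losses, draws):
    """Elo difference implied by a match score under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    return -400.0 * math.log10(1.0 / score - 1.0)

# Ponder ON vs OFF, first 2000 games
print(round(elo_diff(836, 174, 990), 2))
# Ponder OFF at doubled time control, first 2000 games
print(round(elo_diff(942, 125, 933), 2))
```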

I first played 2000 games in each match, then decided to play another 2000. So, over 4000 games per match: 115 Elo points for Ponder ON, 149 Elo points for doubling the time control. That is an efficiency of Ponder ON of 115/149 ~ 77%, significantly higher than the Ponder hit rate of about 65%. A hit rate of 65% is also what is to be expected in LTC and multicore games among top engines. So those 35% Ponder misses were not completely useless: searching a wrong move still fills the hash table with useful information, for example positions that are reached later by transposition.

That raises the issue of the best use of multicore PCs for play and testing. The effective speed-up due to pondering is 2^0.77 ~ 1.71. The speed-up due to doubling the number of threads with Lazy SMP, as used in Stockfish and probably some other engines, is given by Amdahl's law, as derived here:
http://www.talkchess.com/forum/viewtopic.php?t=62146

Effective Speed-Up = 1 / (1 - 0.955 + 0.955/n_cores)

This gives the following speed-ups from doubling the number of cores:

2 --> 4 cores: 1.84
4 --> 8 cores: 1.73
8 --> 16 cores: 1.57
16 --> 32 cores: 1.40
32 --> 64 cores: 1.25
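
The table above can be recomputed directly from the formula. A minimal sketch, with the parallel fraction 0.955 taken from the post; the function names are just illustrative:

```python
PARALLEL_FRACTION = 0.955  # value used in the formula above

def speedup(n_cores, p=PARALLEL_FRACTION):
    """Effective speed-up on n_cores relative to one core (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n_cores)

def doubling_gain(n_cores):
    """Extra speed-up from going from n_cores to 2 * n_cores."""
    return speedup(2 * n_cores) / speedup(n_cores)

for n in (2, 4, 8, 16, 32):
    print(f"{n} --> {2 * n} cores: {doubling_gain(n):.2f}")
```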

Therefore 4 --> 8 cores is almost even with Ponder ON on 4 cores each, and for anything above, say 6 --> 12 or 8 --> 16, it is better to play Ponder ON on half the cores than Ponder OFF on the full number of cores. In the TCEC case, the improvement from Ponder ON over doubling the number of cores with Ponder OFF is substantial (1.71 compared to 1.40 or 1.25). Also worth noting that the Ponder hit rate only increases going to LTC and more cores, and with it the benefit of Ponder ON.
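
The comparison can be sketched in a few lines. It assumes, as the argument above does, that an efficiency of 115/149 relative to a time doubling translates into an effective speed-up of 2^efficiency, and that the Amdahl fit with 0.955 holds at every core count. The helper names are hypothetical.

```python
def speedup(n_cores, p=0.955):
    """Effective speed-up on n_cores relative to one core (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n_cores)

# Elo gain of Ponder ON divided by Elo gain of doubling the time control
ponder_efficiency = 115 / 149
# Assumption: efficiency acts as an exponent on the factor-2 speed-up
ponder_speedup = 2 ** ponder_efficiency

def ponder_beats_doubling(n_cores):
    """True if Ponder ON on n_cores beats Ponder OFF on 2 * n_cores."""
    return ponder_speedup > speedup(2 * n_cores) / speedup(n_cores)

for n in (2, 4, 6, 8, 16, 32):
    gain = speedup(2 * n) / speedup(n)
    print(f"{n} --> {2 * n}: doubling {gain:.2f}, "
          f"ponder {ponder_speedup:.2f}, ponder better: {ponder_beats_doubling(n)}")
```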

Nordlandia · Post by **Nordlandia** » Mon Mar 06, 2017 3:53 pm

Interesting Kai

It looks like 60% ponder accuracy comes with a 20% penalty in overall playing strength, if the cores are shared.

"If the two engines share CPU resources, then you never really get increased time. Half your time is spent handicapped to half the CPU, and the other half is spent handicapped to half the CPU and the penalty for having to guess at the opponent's move. You'll never do as well as having the entire CPU (and no guessing penalty) for half the length of time."

If you could guess opponent's moves with 100% accuracy then pondering would be the same as just splitting CPU time. But anything less than 100% you are performing worse.

-----------

In my opinion TCEC should consider renting two above average servers so they can benefit from ponder.

2x above mediocre servers should be affordable.

In reality, ponder needs identical hardware to make testing results reliable.

http://chess.stackexchange.com/question ... nent-brain

Laskos · Post by **Laskos** » Mon Mar 06, 2017 3:58 pm

Nordlandia wrote:Interesting Kai

It looks like 60% ponder accuracy comes with a 20% penalty in overall playing strength, if the cores are shared.

"If the two engines share CPU resources, then you never really get increased time. Half your time is spent handicapped to half the CPU, and the other half is spent handicapped to half the CPU and the penalty for having to guess at the opponent's move. You'll never do as well as having the entire CPU (and no guessing penalty) for half the length of time. – intx13 Jun 28 at 20:48"

If you could guess opponent's moves with 100% accuracy then pondering would be the same as just splitting CPU time. But anything less than 100% you are performing worse.

That's wrong even if the engines share CPU resources. If the parallel speedup on many cores were perfect, it would be correct. But when the parallel speedup from doubling the number of cores is 1.3 or 1.6, which is the case for many cores, Ponder ON comes in handy with its 1.71 speedup.

-----------

In my opinion TCEC should consider renting two above average servers so they can benefit from ponder.

Above mediocre servers should be affordable.

In reality, ponder needs identical hardware to make testing results reliable.

http://chess.stackexchange.com/question ... nent-brain

Colin-G · Post by **Colin-G** » Mon Mar 06, 2017 7:35 pm

Since both engines were the same, and "think" in the same way, is it not likely that the move that is guessed during pondering is more likely to be correct than when different engines play each other?

Vinvin · Post by **Vinvin** » Mon Mar 06, 2017 8:33 pm

Colin-G wrote:Since both engines were the same, and "think" in the same way, is it not likely that the move that is guessed during pondering is more likely to be correct than when different engines play each other?

It's my opinion too: the pondering engine guesses the played move very often when playing against the same engine.

Laskos · Post by **Laskos** » Mon Mar 06, 2017 8:48 pm

Vinvin wrote:
Colin-G wrote:Since both engines were the same, and "think" in the same way, is it not likely that the move that is guessed during pondering is more likely to be correct than when different engines play each other?
It's my opinion too: the pondering engine guesses the played move very often when playing against the same engine.

The self-similarity of Komodo is not that high at ultra-fast time controls: about 65%, which is the same as the expected Ponder hit rate of top engines at longer time controls and on many cores.

Nordlandia · Post by **Nordlandia** » Mon Mar 06, 2017 9:33 pm

Someone with a dual-CPU mobo, please test ponder for us.

On a 24-core Xeon, simply use 12 cores per engine and enable ponder.

Adam Hair · Post by **Adam Hair** » Tue Mar 07, 2017 1:39 am

Laskos wrote:
Vinvin wrote:
Colin-G wrote:Since both engines were the same, and "think" in the same way, is it not likely that the move that is guessed during pondering is more likely to be correct than when different engines play each other?
It's my opinion too: the pondering engine guesses the played move very often when playing against the same engine.
The self-similarity of Komodo is not that high at ultra-fast time controls: about 65%, which is the same as the expected Ponder hit rate of top engines at longer time controls and on many cores.

I can confirm this is true for Komodo.

Laskos · Post by **Laskos** » Tue Mar 07, 2017 7:52 am

Vinvin wrote:
Colin-G wrote:Since both engines were the same, and "think" in the same way, is it not likely that the move that is guessed during pondering is more likely to be correct than when different engines play each other?
It's my opinion too: the pondering engine guesses the played move very often when playing against the same engine.

Overnight I left Stockfish 8 and Komodo 10.3 playing ultra-fast games with Ponder OFF, with Ponder ON, and with doubled time; the Ponder hit rate in these ultra-fast games is 55%. The efficiency of Ponder came out at 71%. If we consider that the Ponder hit rate between top engines in slow games is about 65%, we are again back to 77% or even higher efficiency of Ponder ON, as in the test with Komodo self-play.

3+0.03 vs 3+0.03, Ponder OFF
Score of Stockfish 8 vs Komodo 10.3: 464 - 169 - 367  [0.647] 1000
ELO difference: 105.63 +/- 17.42

3+0.03 vs 3+0.03 with Ponder ON
Score of Stockfish 8 vs Komodo 10.3: 284 - 313 - 403  [0.486] 1000
ELO difference: -10.08 +/- 16.63

3+0.03 vs 6+0.06, Ponder OFF
Score of Stockfish 8 vs Komodo 10.3: 214 - 375 - 411  [0.419] 1000
ELO difference: -56.43 +/- 16.58

Doubling: 162.06 Elo points
Ponder ON: 115.71 Elo points

Efficiency of Ponder ON: 71.4% in ultra-fast games with Ponder-hit rate 55%.
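
The arithmetic behind these figures, as a short sketch. It assumes my reading of the three matches: the second engine (Komodo) is the one given Ponder ON or the doubled time, and each gain is measured against the Ponder OFF 3+0.03 baseline. The variable names are illustrative only.

```python
# Elo of Stockfish 8 relative to Komodo 10.3 in the three matches above
baseline = 105.63             # both at 3+0.03, Ponder OFF
vs_komodo_ponder = -10.08     # assumed: Komodo with Ponder ON
vs_komodo_doubled = -56.43    # assumed: Komodo at 6+0.06, Ponder OFF

ponder_gain = baseline - vs_komodo_ponder     # Elo gained from Ponder ON
doubling_gain = baseline - vs_komodo_doubled  # Elo gained from doubled time
efficiency = ponder_gain / doubling_gain

print(f"Doubling:  {doubling_gain:.2f} Elo")
print(f"Ponder ON: {ponder_gain:.2f} Elo")
print(f"Efficiency of Ponder ON: {100 * efficiency:.1f}%")
```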

Laskos · Post by **Laskos** » Tue Mar 07, 2017 8:20 am

Interesting to note that if this 1.71 effective speedup from Ponder ON holds, then 16 core Ponder ON is not only better than 32 core Ponder OFF, it is equal to 64 core Ponder OFF.
