Tests: asmFish v. BrainFish & Stockfish 8

Gusev
Posts: 1476
Joined: Mon Jan 28, 2013 2:51 pm

Post by Gusev »

I have uploaded the results of three 2046-game matches played in November on an 8-core i7-5960X (TC 20"+1", hash=4096, threads=16, HT=ON, Ponder=ON), including the PGN files and the 1023-game 8-ply test suite, to http://www.firenzina.org/asmFishTests_2016-11.zip
The first match, completed Nov. 2, tested BrainFish_161025_x64_bmi2 with the up-to-date Cerebellum against AsmFishW_2016-10-17_bmi2 with the Fauzi 4.6 opening book. BrainFish prevailed, +203-172=1671, +5 Elo.
The second match, completed Nov. 7, tested Stockfish_8_x64_bmi2 against AsmFishW_2016-10-17_bmi2. AsmFish won, +195-169=1682, +4 Elo.
The third match, completed Nov. 16, tested Stockfish_8_x64_bmi2 against AsmFishW_2016-11-04_bmi2. AsmFish barely won, +194-187=1665, +1 Elo.
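
For readers who want to check the Elo figures, here is a minimal sketch that converts a +W-L=D result into an Elo difference. It assumes the standard logistic Elo model; the function name `elo_from_wld` and the match labels are made up for illustration and do not come from any testing tool.

```python
import math

def elo_from_wld(wins, losses, draws):
    """Elo difference implied by a match result under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    return 400.0 * math.log10(score / (1.0 - score))

# Results as posted above, from the winner's point of view (labels are illustrative).
matches = {
    "BrainFish vs AsmFishW_2016-10-17": (203, 172, 1671),
    "AsmFishW_2016-10-17 vs Stockfish 8": (195, 169, 1682),
    "AsmFishW_2016-11-04 vs Stockfish 8": (194, 187, 1665),
}

for name, (w, l, d) in matches.items():
    print(f"{name}: {elo_from_wld(w, l, d):+.1f} Elo")
```

Rounded to whole numbers, this should agree with the +5, +4 and +1 Elo quoted for the three matches.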
ernest
Posts: 2053
Joined: Wed Mar 08, 2006 8:30 pm

Re: Tests: asmFish v. BrainFish & Stockfish 8

Post by ernest »

Gusev wrote:The third match, completed Nov. 16, tested Stockfish_8_x64_bmi2 against AsmFishW_2016-11-04_bmi2. AsmFish barely won, +194-187=1665, +1 Elo.
Yes, impressive matches, Dmitri.

But I still don't understand why the faster AsmFish only got +1 Elo against Stockfish.

I just ran a match between pedantFishW_2016-11-04_base and SF8.
248 games on my 3 GHz dual-core, no ponder, 2'+1", Noomen Short Lines book.

As usual, pedantFish (or AsmFish) was running 20%-25% faster than Stockfish (the "official" fast Stockfish 8 compile).

And I got the "normal" Elo advantage for pedantFish: +20 Elo.
+28 -14 =206 over 248 games, result 131-117 or 52.8% for pedantFish, meaning +20 Elo.

This is the same old question: how does speed translate into Elo?
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Tests: asmFish v. BrainFish & Stockfish 8

Post by mjlef »

ernest wrote:
Gusev wrote:The third match, completed Nov. 16, tested Stockfish_8_x64_bmi2 against AsmFishW_2016-11-04_bmi2. AsmFish barely won, +194-187=1665, +1 Elo.
Yes, impressive matches, Dmitri.

But I still don't understand why the faster AsmFish only got +1 Elo against Stockfish.

I just ran a match between pedantFishW_2016-11-04_base and SF8.
248 games on my 3 GHz dual-core, no ponder, 2'+1", Noomen Short Lines book.

As usual, pedantFish (or AsmFish) was running 20%-25% faster than Stockfish (the "official" fast Stockfish 8 compile).

And I got the "normal" Elo advantage for pedantFish: +20 Elo.
+28 -14 =206 over 248 games, result 131-117 or 52.8% for pedantFish, meaning +20 Elo.

This is the same old question: how does speed translate into Elo?
The answer is "not enough data". The number of games you played only narrows the error margin to about +/- 30 Elo (this is about 2 standard deviations, or about 95% certainty that one program is stronger than the other if the Elo difference is more than 30). Your 20 Elo result falls inside that margin, so it is not enough. More games will (slowly) narrow the error.
ernest
Posts: 2053
Joined: Wed Mar 08, 2006 8:30 pm

Re: Tests: asmFish v. BrainFish & Stockfish 8

Post by ernest »

mjlef wrote:The answer is "not enough data". The number of games you played only narrows the error margin to about +/- 30 Elo (this is about 2 standard deviations, or about 95% certainty that one program is stronger than the other if the Elo difference is more than 30). Your 20 Elo result falls inside that margin, so it is not enough. More games will (slowly) narrow the error.
Thanks for your answer, Mark!
I agree that more games in that match would have been better, but I disagree with your +/- 30 Elo for 2 SD.
My calculation for 2 SD is sqrt(W+L) divided by N, the number of games
(a very good approximation when the result is in the 40% to 60% range),
which gives sqrt(42)/248 = 2.6%, or +/-18 Elo.
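
The calculation above can be sketched as follows. This is only an illustration of the sqrt(W+L)/N rule as stated in this post; the function names are hypothetical, and the conversion to Elo assumes the logistic Elo curve can be treated as a straight line near a 50% score (slope 1600/ln(10), about 6.9 Elo per percentage point).

```python
import math

def two_sd_score_margin(wins, losses, games):
    """2-SD margin on the score fraction, using the sqrt(W+L)/N rule."""
    return math.sqrt(wins + losses) / games

def score_margin_to_elo(margin):
    """Convert a small score margin to Elo via the slope of the logistic curve at 50%."""
    return margin * 1600.0 / math.log(10)

# pedantFish vs SF8: 28 wins, 14 losses, 248 games in total.
margin = two_sd_score_margin(28, 14, 248)
print(f"2 SD = {margin:.1%} of the score, about +/-{score_margin_to_elo(margin):.0f} Elo")
```

The rule follows from the per-game score variance being roughly (W+L)/(4N) when wins and losses are nearly balanced, so draws reduce the error margin compared with an estimate that ignores them.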

Anyway, the SPCC tests (Stefan Pohl) for asmFish give a 30 Elo advantage over the corresponding Stockfish, using his fast TC.
This is why Dmitri Gusev's result, only +1 Elo with his fast TC 20"+1", is hard to believe (the fact that ponder was used cannot explain the discrepancy).
And Andreas Strangmüller, in
http://www.talkchess.com/forum/viewtopi ... 84&start=0
reports a gain of 133 Elo for doubling the time (20 sec to 40 sec TC).
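
As a rough way to connect these numbers, here is a sketch that turns a speed ratio into an expected Elo gain. It rests on two assumptions that are not established in this thread: that a speedup is worth the same as the equivalent increase in thinking time, and that Elo grows linearly with the base-2 logarithm of that time. The function name and the default constant (taken from the 133 Elo per doubling figure quoted above) are illustrative only.

```python
import math

# Assumption: Elo gain per doubling of time, from the 20 sec to 40 sec result cited above.
ELO_PER_DOUBLING = 133.0

def elo_from_speedup(speed_ratio, elo_per_doubling=ELO_PER_DOUBLING):
    """Expected Elo gain if Elo scales linearly with log2 of the speed ratio."""
    return elo_per_doubling * math.log2(speed_ratio)

for ratio in (1.20, 1.25):
    print(f"{ratio:.2f}x speed: about {elo_from_speedup(ratio):+.0f} Elo under this model")
```

The gain per doubling is generally expected to depend on the time control and on draw rates, so this is at best an order-of-magnitude guide.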