SPCC: Testrun of Stockfish 210117 finished

pohl4711
Posts: 2433
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

SPCC: Testrun of Stockfish 210117 finished

Post by pohl4711 »

AB-testrun of Stockfish 210117 finished - a bad regression...

https://www.sp-cc.de

(You may need to clear your browser cache or reload the website.)
Ozymandias
Posts: 1532
Joined: Sun Oct 25, 2009 2:30 am

Re: SPCC: Testrun of Stockfish 210117 finished

Post by Ozymandias »

It falls inside the margin of error, but ncm testing also indicates that something went wrong on the 11th. Not the worst we've seen lately, though.
pohl4711
Posts: 2433
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Stockfish 210117 finished

Post by pohl4711 »

Ozymandias wrote: Wed Jan 20, 2021 11:42 am It falls inside the margin of error, but ncm testing also indicates that something went wrong on the 11th. Not the worst we've seen lately, though.
Stockfish 210111 is strong in my tests, so the regression came later. Only 2 patches followed, and one of them is a "non-functional" patch. So the regression must come from the latest patch, from 210117 ("Add penalty for doubled pawns in agile structure").
Jouni
Posts: 3281
Joined: Wed Mar 08, 2006 8:15 pm

Re: SPCC: Testrun of Stockfish 210117 finished

Post by Jouni »

NCM has its peak rating on 8.1., with "Update copyright years". Sadly, that will not be possible again before 2022 :)
Jouni
RubiChess
Posts: 584
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: SPCC: Testrun of Stockfish 210117 finished

Post by RubiChess »

pohl4711 wrote: Wed Jan 20, 2021 11:49 am
Ozymandias wrote: Wed Jan 20, 2021 11:42 am It falls inside the margin of error, but ncm testing also indicates that something went wrong on the 11th. Not the worst we've seen lately, though.
Stockfish 210111 is strong in my tests, so the regression came later. Only 2 patches followed, and one of them is a "non-functional" patch. So the regression must come from the latest patch, from 210117 ("Add penalty for doubled pawns in agile structure").
"Add penalty for doubled pawns in agile structure" is a patch to the handcrafted evaluation and should have almost no effect when NNUE is used.
I guess it is just some bad luck within the error bars.
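
To put a number on "inside the error bars", here is a minimal sketch of how an Elo difference and its approximate 95% margin can be estimated from a win/draw/loss record. It assumes independent games and a normal approximation (z = 1.96). The function name elo_with_margin and the example counts are made up for illustration; they are not SPCC or NCM data, and the rating lists may compute their error bars differently.

```python
import math


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo_difference, margin) from a W/D/L record.

    Hypothetical helper: normal approximation, games assumed independent.
    The margin is half the width of the confidence interval in Elo.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")

    # Mean score per game (win = 1, draw = 0.5, loss = 0).
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")

    # Variance of the per-game result, then standard error of the mean.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)

    def to_elo(s):
        # Logistic Elo model: expected score s corresponds to this Elo difference.
        return -400.0 * math.log10(1.0 / s - 1.0)

    # Clamp the interval ends so the logarithm stays defined.
    low = to_elo(max(score - z * stderr, 1e-9))
    high = to_elo(min(score + z * stderr, 1.0 - 1e-9))
    return to_elo(score), (high - low) / 2.0


# Made-up example record, not taken from any real test run.
elo, margin = elo_with_margin(wins=120, draws=800, losses=80)
print(f"Elo difference: {elo:+.1f} +/- {margin:.1f}")
```

If the measured difference between two versions is smaller than this margin, the result cannot be distinguished from noise with that number of games.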

Andreas