A question about SPRT


AndrewGrant
Posts: 1754
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

A question about SPRT

Post by AndrewGrant »

I'm looking at the implementation of SPRT posted here https://chessprogramming.wikispaces.com ... stics#toc7

I thought that the number of games needed would be related to the bounds of the test, elo0 and elo1. So if we ran a test using the bounds [0,3], we would be trying to prove that the new version was at least 3 elo better than the old version. The same idea applies to bounds of, say, [0,5]. Since it seems like it would be easier to prove a 3 elo gain than a 5 elo gain, I expected the test using [0,3] to take fewer games. However, in practice this appears to be wrong. Looking at that implementation, my assumption is also wrong mathematically.

So my question is, where am I going wrong?

Is this implementation flawed?
Do I misunderstand the meaning of elo0 and elo1?

Thanks
Andrew
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: A question about SPRT

Post by kbhearn »

Simply put, what SPRT is testing is 'is it more likely that this is an elo0 patch or an elo1 patch?' Patches whose true elo falls between the bounds will tend to take more games than patches outside the bounds, but variance being what it is, there is always a chance that any patch runs long or returns a false positive. Predicting the effect of changing bounds on various elo patches (likelihood they get accepted/rejected, average number of games to reach a conclusion) is difficult; you're best off running some simulations at a few different typical elos.
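
A minimal sketch of the kind of simulation suggested above, not taken from any implementation linked in this thread. It assumes logistic elo, a fixed and known draw ratio, and independent games; the function names (`outcome_probs`, `simulate_sprt`, `summarize`) and the default values are made up for illustration.

```python
import math
import random

def outcome_probs(elo, draw_ratio):
    """Win/draw/loss probabilities for a logistic elo difference.

    Assumption: the draw ratio is fixed and does not depend on elo.
    """
    score = 1.0 / (1.0 + 10.0 ** (-elo / 400.0))
    win = score - draw_ratio / 2.0
    loss = 1.0 - score - draw_ratio / 2.0
    if win <= 0.0 or loss <= 0.0:
        raise ValueError("draw_ratio is too large for this elo")
    return win, draw_ratio, loss

def simulate_sprt(true_elo, elo0, elo1, rng, alpha=0.05, beta=0.05,
                  draw_ratio=0.6, max_games=1_000_000):
    """Run one simulated SPRT. Returns (verdict, games_played).

    verdict is True if H1 (elo1) is accepted, False if H0 (elo0) is
    accepted, and None if max_games is reached first.
    """
    p0 = outcome_probs(elo0, draw_ratio)
    p1 = outcome_probs(elo1, draw_ratio)
    win, draw, _ = outcome_probs(true_elo, draw_ratio)
    # Log-likelihood-ratio increment for a win, a draw, and a loss.
    step_win, step_draw, step_loss = (math.log(b / a) for a, b in zip(p0, p1))
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = 0.0
    for games in range(1, max_games + 1):
        u = rng.random()
        if u < win:
            llr += step_win
        elif u < win + draw:
            llr += step_draw
        else:
            llr += step_loss
        if llr >= upper:
            return True, games
        if llr <= lower:
            return False, games
    return None, max_games

def summarize(true_elo, elo0, elo1, trials=100, seed=0, **kwargs):
    """Return (pass_rate, mean_games) over several simulated tests."""
    rng = random.Random(seed)
    results = [simulate_sprt(true_elo, elo0, elo1, rng, **kwargs)
               for _ in range(trials)]
    passes = sum(1 for verdict, _ in results if verdict is True)
    mean_games = sum(games for _, games in results) / trials
    return passes / trials, mean_games

if __name__ == "__main__":
    # Compare the two sets of bounds from the original question.
    for elo0, elo1 in ((0.0, 3.0), (0.0, 5.0)):
        for true_elo in (0.0, elo1):
            rate, games = summarize(true_elo, elo0, elo1, trials=20)
            print(f"[{elo0}, {elo1}] true={true_elo}: "
                  f"pass rate {rate:.2f}, mean games {games:.0f}")
```

Running this for [0,3] and [0,5] at the same true elo gives a feel for why the narrower bounds need more games: the two hypotheses are closer together, so each game carries less evidence for one over the other.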
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: A question about SPRT

Post by Michel »

kbhearn wrote: "Predicting the effect of changing bounds on various elo patches (likelihood they get accepted/rejected, average number of games to reach a conclusion) is difficult"
No, it is not difficult at all. There are standard formulas. They are implemented, for example, in this script:

http://hardy.uhasselt.be/Toga/sprta.py

It has been translated to JavaScript here:

http://chess-sprt-calc.azurewebsites.net/

Note that this script cannot be applied directly to the OP's problem, since it takes BayesElo inputs. However, it can easily be modified to take logistic elo inputs.
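
To illustrate what closed-form estimates of this kind can look like, here is a rough sketch that treats the running log-likelihood ratio as a Brownian motion with drift. It is not the linked script and makes no claim to match it: it assumes logistic elo inputs, a fixed draw ratio, and ignores overshoot at the stopping bounds. The names `wdl` and `sprt_characteristics` are hypothetical.

```python
import math

def wdl(elo, draw_ratio):
    """Win/draw/loss probabilities for a logistic elo difference.

    Assumption: the draw ratio is fixed and does not depend on elo.
    """
    score = 1.0 / (1.0 + 10.0 ** (-elo / 400.0))
    return score - draw_ratio / 2.0, draw_ratio, 1.0 - score - draw_ratio / 2.0

def sprt_characteristics(true_elo, elo0, elo1, alpha=0.05, beta=0.05,
                         draw_ratio=0.6):
    """Approximate (pass_probability, expected_games) for an SPRT.

    Uses the diffusion approximation: the LLR is modelled as a Brownian
    motion whose drift and variance per game are those of the LLR step.
    """
    p0, p1, pt = wdl(elo0, draw_ratio), wdl(elo1, draw_ratio), wdl(true_elo, draw_ratio)
    steps = [math.log(b / a) for a, b in zip(p0, p1)]
    mu = sum(p * x for p, x in zip(pt, steps))             # drift per game
    var = sum(p * x * x for p, x in zip(pt, steps)) - mu * mu
    a = math.log(beta / (1.0 - alpha))                     # lower bound (< 0)
    b = math.log((1.0 - beta) / alpha)                     # upper bound (> 0)
    k = 2.0 * mu / var
    if abs(k) < 1e-9:
        # Driftless limit of the formulas below.
        return -a / (b - a), -a * b / var
    # Probability of reaching b before a, written with expm1 for accuracy.
    pass_prob = math.expm1(-k * a) / (math.expm1(-k * a) - math.expm1(-k * b))
    # Wald's identity: E[LLR at stopping] = mu * E[games].
    expected_games = (pass_prob * b + (1.0 - pass_prob) * a) / mu
    return pass_prob, expected_games

if __name__ == "__main__":
    for bounds in ((0.0, 3.0), (0.0, 5.0)):
        for true_elo in (0.0, 1.5, 3.0, 5.0):
            p, n = sprt_characteristics(true_elo, *bounds)
            print(f"bounds={bounds} true={true_elo}: pass={p:.3f} games={n:.0f}")
```

Because overshoot is ignored, the estimates are most trustworthy when the bounds are narrow and the per-game LLR steps are small, which is the usual situation in engine testing.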
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
AndrewGrant
Posts: 1754
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: A question about SPRT

Post by AndrewGrant »

Thanks for the clarification.

So, as a general rule, should I be choosing the elo bounds based on how large I expect the gain to be? E.g., if I tweak a few values, use [-1, 1], and if I make some large change (add a bunch of evaluation terms for pawns), use [0, 20]?