cutechess-cli suggestion (to Ilari)

Zlaire
Posts: 62
Joined: Mon Oct 03, 2011 9:40 pm

Re: cutechess-cli suggestion (to Ilari)

Post by Zlaire »

lucasart wrote:Do you know where I can find a mathematical proof of that ? I can understand that one would want to believe this intuitively, but it doesn't seem trivial to me.
Did some practical tests on this and was quite surprised.

The setup: player 1 has exactly a 55% win rate over player 2, and a match is cut off as soon as the opponent has at least 20 wins and LOS has reached 95%.

Results over 10,000 iterations were:

Player 1 (determined the winner): 9,719
Player 2: 281

Average number of games to reach a conclusion was 193.

Removing the "warmup" requirement of at least 20 opponent wins is not so good, though:

Player 1: 7,690
Player 2: 2,310

(No draws here, but they didn't affect anything when I added them afterwards.)
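
For anyone who wants to try this at home, here is a minimal Python sketch of this kind of experiment. It is not the program that produced the numbers above. The LOS formula (the usual normal approximation), the reading of the warm-up rule as "the side that is behind needs at least `min_wins` wins", and every function and parameter name are assumptions, so the exact counts will differ.

```python
import math
import random


def los(wins, losses):
    """Likelihood of superiority, normal approximation (assumed formula)."""
    games = wins + losses
    if games == 0:
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * games)))


def play_match(rng, p_win=0.55, threshold=0.95, min_wins=20, max_games=10_000):
    """Play one match with early termination.

    Returns (winner, games_played); winner is 1 or 2, or 0 if undecided.
    """
    w1 = w2 = 0
    for games in range(1, max_games + 1):
        if rng.random() < p_win:
            w1 += 1
        else:
            w2 += 1
        # Warm-up (assumed reading): the side that is behind needs min_wins wins.
        if min(w1, w2) < min_wins:
            continue
        if los(w1, w2) >= threshold:
            return 1, games
        if los(w2, w1) >= threshold:
            return 2, games
    return 0, max_games


def simulate(iterations=10_000, seed=1, **match_options):
    """Run many matches; return (tally by winner, average games per match)."""
    rng = random.Random(seed)
    tally = {0: 0, 1: 0, 2: 0}
    total_games = 0
    for _ in range(iterations):
        winner, games = play_match(rng, **match_options)
        tally[winner] += 1
        total_games += games
    return tally, total_games / iterations
```

Calling `simulate(min_wins=20)` and `simulate(min_wins=0)` gives the two variants compared above; `tally[2]` counts the matches in which the weaker player was declared the winner.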
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: cutechess-cli suggestion (to Ilari)

Post by lucasart »

281 out of 10,000 (2.81%) is less than the 5% error. So your simulations are saying that my method is correct, contrary to what was suggested?

PS: This is really an interesting probability problem. I like it :-)
Zlaire
Posts: 62
Joined: Mon Oct 03, 2011 9:40 pm

Re: cutechess-cli suggestion (to Ilari)

Post by Zlaire »

lucasart wrote:281 is less than the 5% error. So your simulations are saying that my method is correct, contrary to what was suggested then ?

PS: This is really an interesting probability problem. I like it :-)
Yes, your method seems to be correct, at least in this simplified implementation.
Zlaire
Posts: 62
Joined: Mon Oct 03, 2011 9:40 pm

Re: cutechess-cli suggestion (to Ilari)

Post by Zlaire »

Raising LOS to 100% yielded (same setup as before):

Player 1: 9,973
Player 2: 0

Average number of games to reach a conclusion was 5,630.

(One iteration has a maximum of 10,000 "games" here, so the 27 missing results apparently needed more than that to reach 100% LOS.)
marcelk
Posts: 348
Joined: Sat Feb 27, 2010 12:21 am

Re: cutechess-cli suggestion (to Ilari)

Post by marcelk »

Zlaire wrote:
lucasart wrote:Do you know where I can find a mathematical proof of that ? I can understand that one would want to believe this intuitively, but it doesn't seem trivial to me.
Did some practical tests on this and was quite surprised.

The basis is player1 having exactly a 55% win rate over player2 and it's cut off if the opponent has at least 20 wins and LOS reached 95%.

Results over 10,000 iterations was:

Player 1 (determined the winner): 9,719
Player 2: 281

Average number of games to reach a conclusion was 193.

Removing the "warmup" with >20 opponent wins is not so good though:

Player 1: 7,690
Player 2: 2,310

(no draws here, but they didn't affect anything when I added them after this)
Can you explain how '281' can end up being so much smaller than 5%? At first sight it looks like you found a magic bullet.

What does it look like if you don't use intermediate termination? Does it match 5%?
John Major
Posts: 27
Joined: Fri Dec 11, 2009 10:23 pm

Re: cutechess-cli suggestion (to Ilari)

Post by John Major »

Zlaire
Posts: 62
Joined: Mon Oct 03, 2011 9:40 pm

Re: cutechess-cli suggestion (to Ilari)

Post by Zlaire »

marcelk wrote:Can you explain how '281' can end up be so much smaller than 5%? It looks that first sight that you found a magic bullet.

What does it look like if you don't use intermediate termination? Does it match 5%?
Without intermediate termination I got:

Player1: 9,646
Player2: 0

That was for 10,000 iterations of 1,000 "games" each. With 10,000 games per iteration I got:

Player1: 10,000
Player2: 0

So this has nothing to do with the 5%; with that many games it simply gets the LOS right every time.

-

Aren't the 281 wrong conclusions a result of the early termination? That is yet another variable it introduces.

(I'm not attempting any math here :), just letting my little program do the work)
marcelk
Posts: 348
Joined: Sat Feb 27, 2010 12:21 am

Re: cutechess-cli suggestion (to Ilari)

Post by marcelk »

Zlaire wrote:
marcelk wrote:Can you explain how '281' can end up be so much smaller than 5%? It looks that first sight that you found a magic bullet.

What does it look like if you don't use intermediate termination? Does it match 5%?
Without intermediate termination I got:

Player1: 9,646
Player2: 0

For 10,000 iterations of 1,000 "games". But for 10,000 games it got:

Player1: 10,000
Player2: 0

So this has nothing to do with the 5%, it's just getting the LOS correct all the time with that many games.

-

Isn't the 281 wrong conclusions a result of the early termination? That is yet another variable introduced due to it.

(I'm not attempting any math here :), just letting my little program do the work)
Shouldn't LOS be 100% in this case then? That is what worries me...
Zlaire
Posts: 62
Joined: Mon Oct 03, 2011 9:40 pm

Re: cutechess-cli suggestion (to Ilari)

Post by Zlaire »

marcelk wrote:Shouldn't LOS be 100% in this case then? That is what worries me...
Well it's a bit backwards.

When terminating early, I check after every "game" whether we have reached 95% LOS and, if so, break.

When not terminating early, I check whether we have reached 95% only after all 10,000 games have been played.

-

So basically, when not terminating early and playing 1,000 games, it can determine that LOS is >95% about 96.5% of the time. With 10,000 games it can determine it 100% of the time.

Bit messy.
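
To make the "no early termination" variant concrete, here is a rough Python sketch: play a fixed number of games, then check LOS once at the end. It is a sketch under assumptions and not the program behind the figures above. The LOS formula is the usual normal approximation, and the names `los` and `fixed_length_detection_rate` are made up for illustration, so the percentages it prints need not match the ones quoted.

```python
import math
import random


def los(wins, losses):
    """Likelihood of superiority, normal approximation (assumed formula)."""
    games = wins + losses
    if games == 0:
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * games)))


def fixed_length_detection_rate(games, iterations=1000, p_win=0.55,
                                threshold=0.95, seed=1):
    """Fraction of fixed-length matches whose final LOS for player 1 >= threshold."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(iterations):
        # Play every game first; LOS is looked at exactly once, at the end.
        w1 = sum(rng.random() < p_win for _ in range(games))
        if los(w1, games - w1) >= threshold:
            detected += 1
    return detected / iterations
```

The returned fraction is how often a match of that length is long enough to show player 1 as superior at the chosen threshold. It is a different quantity from the share of wrong conclusions in the early-termination runs.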
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: cutechess-cli suggestion (to Ilari)

Post by lucasart »

Zlaire wrote:
marcelk wrote:Shouldn't LOS be 100% in this case then? That is what worries me...
Well it's a bit backwards.

When terminating early I check every "game" if we reached 95% LOS, if so break.

When not terminating early I check if we reached 95% after all 10,000 games has been played.

-

So basically when not terminating early and 1,000 games, it can determine that LOS is >95%, 96.5% of the time. For 10,000 games it can determine it 100% of the time.

Bit messy.
For the LOS, you don't use 1.96*sigma, do you? That would be 97.5% instead of 95%, and would perhaps explain why 281 is so much smaller than 5%. Confusing one-sided (unilateral) and two-sided (bilateral) tests is a mistake one can easily make.
Also, LOS 100% doesn't really make sense. This will only stop after 20 games when the score is 20-0-0. Any other trajectory will be infinite, since it means that after 20 games the score of B is non-zero. And to compute this LOS(100%) limit you must have calculated qnorm(100%), where qnorm is the quantile function of the Gaussian distribution, no? (Of course this computation fails numerically, and mathematically the value is +infinity.)
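
The quantiles in question are easy to check with Python's standard library, where `statistics.NormalDist().inv_cdf` plays the role of qnorm. A small sketch follows; the helper name `critical_value` is made up for illustration.

```python
import math
from statistics import NormalDist, StatisticsError

std_normal = NormalDist()  # mean 0, standard deviation 1

one_sided_95 = std_normal.inv_cdf(0.95)   # about 1.645
two_sided_95 = std_normal.inv_cdf(0.975)  # about 1.960


def critical_value(confidence):
    """One-sided critical value; +infinity once confidence reaches 100%."""
    if confidence >= 1.0:
        return math.inf  # the quantile diverges, so no finite score reaches it
    return std_normal.inv_cdf(confidence)


try:
    std_normal.inv_cdf(1.0)
except StatisticsError as error:
    # The library refuses p = 1 outright instead of returning infinity.
    print("inv_cdf(1.0) failed:", error)
```

So a 95% one-sided threshold corresponds to about 1.645 sigma, 1.96 sigma corresponds to 97.5% one-sided, and a 100% threshold has no finite critical value at all.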