CCRL 40/4 lists updated (11th August 2012)

ernest · Post by **ernest** » Thu Aug 16, 2012 9:10 pm

Ajedrecista wrote: this is why I consider the worst case.

OK, now you make it clear.
But I would still find three results more useful, for 40%, 50% and 60% draw rates.

Of course, it is very simple to multiply your "worst case" result by sqrt(1 - d), d being the draw rate.

Ajedrecista · Post by **Ajedrecista** » Thu Aug 16, 2012 9:27 pm

Hi again!

ernest wrote:
Ajedrecista wrote: this is why I consider the worst case.
OK, now you make it clear.
But I would still find three results more useful, for 40%, 50% and 60% draw rates.

Of course, it is very simple to multiply your "worst case" result by sqrt(1 - d), d being the draw rate.

You were faster than me. With pencil and paper, I arrive at an expression valid for small Elo gains (that is, scores very close to 50%). Calling K = gain · sqrt(n) (the rule of thumb posted by Kai):

    z: parameter of the confidence interval in a normal distribution.
    D: the draw ratio.

    K(z, D) = 800z·sqrt(1 - D)/ln(10)
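One way to arrive at this expression is sketched below. It is a reconstruction under stated assumptions, not necessarily the derivation used in the post: it assumes the logistic Elo model, a score very close to 50% (so wins and losses are equally likely), independent games, and a normal approximation for the mean score. The symbols W and L (win and loss ratios) are introduced here for illustration only.

```latex
% Logistic Elo model: expected score s for a rating difference \Delta
s(\Delta) = \frac{1}{1 + 10^{-\Delta/400}}, \qquad
\left.\frac{ds}{d\Delta}\right|_{\Delta = 0} = \frac{\ln 10}{1600}

% Per-game score variance at a 50% score, with W = L = (1 - D)/2
\sigma_{\text{game}}^2 = W\left(\tfrac12\right)^2 + L\left(\tfrac12\right)^2 + D \cdot 0 = \frac{1 - D}{4}

% z-sigma margin on the mean score of n games, converted to Elo
\text{gain} = z \cdot \frac{\sqrt{1 - D}}{2\sqrt{n}} \cdot \frac{1600}{\ln 10}
\quad\Longrightarrow\quad
K = \text{gain} \cdot \sqrt{n} = \frac{800\,z\,\sqrt{1 - D}}{\ln 10}
```

The factor sqrt(1 - D) is the same draw-rate correction ernest describes, and D = 0 recovers the "worst case".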

For 95% confidence (z ~ 1.96) and different draw ratios:

    K(z = 1.96, D = 0) ~ 681
    ...
    K(z = 1.96, D = 0.2) ~ 609.1
    K(z = 1.96, D = 0.3) ~ 569.7
    K(z = 1.96, D = 0.4) ~ 527.5
    K(z = 1.96, D = 0.5) ~ 481.5
    K(z = 1.96, D = 0.6) ~ 430.7
    ...
    K(z = 1.96, D = 1) = 0 (bad model for high draw ratios).

    n = [K(z, D)/gain]²
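The two formulas can be tried out with a short Python sketch. The function names `k_factor` and `games_needed` are made up for this illustration, and rounding n up to a whole number of games is an added convention, not something stated in the post.

```python
import math


def k_factor(z: float, draw_ratio: float) -> float:
    """K(z, D) = 800*z*sqrt(1 - D)/ln(10), the small-gain approximation."""
    return 800.0 * z * math.sqrt(1.0 - draw_ratio) / math.log(10.0)


def games_needed(gain_elo: float, z: float = 1.96, draw_ratio: float = 0.0) -> int:
    """n = [K(z, D)/gain]^2, rounded up to a whole number of games."""
    return math.ceil((k_factor(z, draw_ratio) / gain_elo) ** 2)


if __name__ == "__main__":
    # The three draw rates ernest asked about, for a hypothetical 10 Elo gain.
    for d in (0.4, 0.5, 0.6):
        k = k_factor(1.96, d)
        n = games_needed(10.0, 1.96, d)
        print(f"D = {d:.1f}: K = {k:.1f}, games for a 10 Elo gain = {n}")
```

Because n scales with 1/gain², halving the gain you want to resolve roughly quadruples the number of games.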

My numbers differ slightly from Kai's (in the numerators), but they all agree well overall. Sorry for these off-topic posts! Thanks for your understanding.

Regards from Spain.

Ajedrecista.

Modern Times · Post by **Modern Times** » Thu Aug 16, 2012 11:30 pm

Well, for what it is worth, I ran just the Komodo 5 40/40 games through EloStat (because I'm not sure how to use bayeselo) with AMD and Intel separated out, and got this:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws
    Komodo 5 64-bit Intel Non-SSE  : 2462   21  21   457    58.4 %   2403   56.5 %
    Komodo 5 64-bit AMD-SSE4       : 2422   19  19   600    62.5 %   2333   51.7 %

Quite a big difference, but very big error margins too. Can't draw any conclusions, but it tells me that the AMD factor is worth exploring more.
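As a sanity check on the table, the Elo column is consistent with the usual logistic performance formula applied to the Score and Av.Op. columns. The sketch below assumes that relation; the thread does not say how EloStat actually computes its figures, and `performance_rating` is a name chosen here for illustration.

```python
import math


def performance_rating(avg_opponent: float, score: float) -> float:
    """Average opponent rating plus the logistic Elo difference for `score` (0 < score < 1)."""
    return avg_opponent + 400.0 * math.log10(score / (1.0 - score))


# Inputs copied from the table above (Av.Op. and Score columns).
intel = performance_rating(2403, 0.584)
amd = performance_rating(2333, 0.625)
print(round(intel), round(amd))
```

Rounded to whole Elo points, both values match the Elo column, so the 40-point gap comes entirely from the score and average-opponent figures of the two gauntlets.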

Laskos · Post by **Laskos** » Thu Aug 16, 2012 11:33 pm

Modern Times wrote: Well, for what it is worth, I ran just the Komodo 5 40/40 games through EloStat (because I'm not sure how to use bayeselo) with AMD and Intel separated out, and got this:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws
    Komodo 5 64-bit Intel Non-SSE  : 2462   21  21   457    58.4 %   2403   56.5 %
    Komodo 5 64-bit AMD-SSE4       : 2422   19  19   600    62.5 %   2333   51.7 %

Quite a big difference, but very big error margins too.

The difference is significant at the >95% confidence level, in fact something like 99%. Could you detail the test conditions?
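A rough way to reproduce that estimate is sketched below. It assumes the +/- columns are 95% margins (1.96 sigma), that the two rating estimates are independent and approximately normal, and that both ratings are on a common scale. None of this is stated in the thread, and the function names are illustrative only.

```python
import math


def difference_z(elo_a: float, margin_a: float,
                 elo_b: float, margin_b: float,
                 z_margin: float = 1.96) -> float:
    """z-score of the rating difference, treating each margin as z_margin sigmas."""
    sigma_a = margin_a / z_margin
    sigma_b = margin_b / z_margin
    # Independent errors add in quadrature.
    return (elo_a - elo_b) / math.hypot(sigma_a, sigma_b)


def two_sided_confidence(z: float) -> float:
    """Probability mass of a standard normal within +/- |z|."""
    return math.erf(abs(z) / math.sqrt(2.0))


z = difference_z(2462, 21, 2422, 19)
print(f"z = {z:.2f}, two-sided confidence = {two_sided_confidence(z):.3f}")
```

Under these assumptions the 40 Elo gap is a bit under 2.8 sigma, which is in line with "something like 99%". The result is only as good as the assumption that the two gauntlets are rated on the same scale.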

lkaufman · Post by **lkaufman** » Thu Aug 16, 2012 11:50 pm

Modern Times wrote: Well, for what it is worth, I ran just the Komodo 5 40/40 games through EloStat (because I'm not sure how to use bayeselo) with AMD and Intel separated out, and got this:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws
    Komodo 5 64-bit Intel Non-SSE  : 2462   21  21   457    58.4 %   2403   56.5 %
    Komodo 5 64-bit AMD-SSE4       : 2422   19  19   600    62.5 %   2333   51.7 %

Quite a big difference, but very big error margins too. Can't draw any conclusions, but it tells me that the AMD factor is worth exploring more.

What ratings are you using for the opponents? They show averages of 2403 and 2333, so obviously they are not normal CCRL ratings. Are you sure that the opponents are rated consistently between these two runs?

Larry

Modern Times · Post by **Modern Times** » Thu Aug 16, 2012 11:53 pm

Of course they aren't normal CCRL ratings; they use EloStat's default 2400 start rating. The numbers come from a single PGN of just the K5 games.

lkaufman · Post by **lkaufman** » Fri Aug 17, 2012 12:36 am

Modern Times wrote: Of course they aren't normal CCRL ratings; they use EloStat's default 2400 start rating. The numbers come from a single PGN of just the K5 games.

So just to be clear, the same opposing engine has the same rating on both lists, right? Then this is highly significant, though not in agreement with our own observations. It is not only 99% significant, but even more so because the Intel machines did not have SSE4 and the AMD machines did. Could you perhaps do the same thing with Komodo 4 data to confirm your finding? If confirmed, we need to investigate this further.

Modern Times · Post by **Modern Times** » Fri Aug 17, 2012 7:12 am

One list of 1057 games. It simply may not be valid to do an Elo calculation on that. Few or no common opponents, just two gauntlets in one PGN. But what I do know is this: on 40/40, Komodo 5 only started to show good Elo performance once the Intel games were added to the mix. It is worth you investigating further, doing some proper tests. I'm not spending any more time on it.

Modern Times · Post by **Modern Times** » Fri Aug 17, 2012 8:22 am

The only way to do this is to run an off-line calculation on the entire CCRL database, with Komodo 5 separated out between AMD and Intel. But with just 500 games roughly for each, the statistical error margins will be huge. If I get time I will post the result here, but in any case there is no substitute for some proper, controlled testing on this issue.

Sven · Post by **Sven** » Fri Aug 17, 2012 9:16 am

Modern Times wrote: One list of 1057 games. It simply may not be valid to do an Elo calculation on that. Few or no common opponents, just two gauntlets in one PGN.

No common opponents would indeed invalidate any Elo comparison between the two K5 versions, since their games would not be connected: you would have two disjoint rating pools.
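Whether two players share a rating pool can be checked by treating each game as an edge between its two players and asking whether a path links them. The sketch below is a minimal illustration of that idea. The player and opponent names are hypothetical, and real input would come from parsing the PGN, which is not shown.

```python
from collections import defaultdict


def connected(games, a, b):
    """Return True if players a and b are linked through a chain of games."""
    graph = defaultdict(set)
    for p, q in games:
        graph[p].add(q)
        graph[q].add(p)
    seen, stack = {a}, [a]
    while stack:  # depth-first search starting from a
        node = stack.pop()
        if node == b:
            return True
        for nxt in graph[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return False


# Two gauntlets with no shared opponents (hypothetical names).
gauntlets = [
    ("K5 Intel", "Engine A"), ("K5 Intel", "Engine B"),
    ("K5 AMD", "Engine C"), ("K5 AMD", "Engine D"),
]
print(connected(gauntlets, "K5 Intel", "K5 AMD"))

# A single common opponent is enough to join the two pools.
print(connected(gauntlets + [("K5 AMD", "Engine B")], "K5 Intel", "K5 AMD"))
```

Being connected is only the minimum requirement. A link through one or two opponents still leaves the relative offset between the pools poorly determined.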

Sven
