Hello dear chess friends!
What is your estimate:
- Which participant is stronger: Deep Rybka 4.1 x64 6c or Houdini 2.0b Pro x64 4c?
http://www.sedatcanbaz.com/chess/scct-auto232-poll/
Some notes:
- Testing of Deep Rybka 4.1 x64 6c has just begun in SCCT Auto232:
http://www.sedatcanbaz.com/chess/ratings/scct-auto232/
- Deep Rybka 4.1 x64 6 cores runs on an i7 980X @ 4.33 GHz (HT OFF/Large Pages ON)
- Houdini 2.0b Pro x64 4 cores runs on an i7 920 @ 3.0 GHz/QX9650 @ 3.66 GHz (HT OFF/Large Pages OFF)
Thanks in advance,
Sedat Canbaz
Poll:Which participant is stronger ?
Moderator: Ras
-
- Posts: 3018
- Joined: Thu Mar 09, 2006 11:58 am
- Location: Antalya/Turkey
Re: Poll:Which participant is stronger ?
Current standings: Deep Rybka 4.1 x64 6 cores is the leader (in both tables)

Best Wishes,
Sedat


-
- Posts: 3018
- Joined: Thu Mar 09, 2006 11:58 am
- Location: Antalya/Turkey
Re: Poll:Which participant is stronger ?
Deep Rybka 4.1 x64 6 cores' performance so far:
http://www.sedatcanbaz.com/chess/ratings/scct-auto232/
Some notes about the current match (it's too early for any conclusions; more games need to be played):
- Even so, Deep Rybka 4.1 x64 6 cores seems to be slightly stronger than Houdini 2.0b Pro x64 4 cores
- The current score shows that Deep Rybka 4.1 x64 6 cores is approximately 70 Elo stronger than Deep Rybka 4.1 x64 4 cores
- The latest release, Houdini 2.0b Pro x64, performs around 10 Elo weaker than Houdini 2.0 Pro x64 (probably due to its TB bug)
- Deep Rybka 4.1 x64 and Houdini 2.0b Pro x64 are playing with 4-men EGTB (DVD Endgame Turbo 3/Gaviota TB are disabled)
And here is the current voting table:

Best,
Sedat
-
- Posts: 2123
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Poll:Which participant is stronger ?
Hello Sedat:
Sedat Canbaz wrote:Current standings: Deep Rybka 4.1 x64 6 cores is the leader (in both tables)
Best Wishes,
Sedat
I agree with you: more games are needed.
I took a look at the PGN file and found this (hoping there are no typos and/or repeated games):
Code: Select all
Deep Rybka 4.1 x64 6c vs. Houdini 2.0b Pro x64 4c.
Rybka - Houdini: +40 -18 =45 (103 games).
Houdini - Rybka: +32 -18 =52 (102 games).
TOTAL: +58 -50 =97 (205 games). Rybka is ahead by ~ 14 Elo.
I hope I have counted correctly. A long time ago I read in the ImmortalChess Forum a way to calculate 'uncertainties' in a rating difference from the number of games and the draw ratio (only the formula for the standard deviation). Here is the complete method as I usually implement it (almost everything after the standard deviation is my own, so I may be making wrong assumptions):
Code: Select all
Number of games = n = 58 + 50 + 97 = 205
Draw_ratio = (number of draws)/n = 97/205
Score_Rybka = (58 + 97/2)/n = 106.5/205 = 1 - Score_Houdini
Score_Houdini = (50 + 97/2)/n = 98.5/205 = 1 - Score_Rybka
Rating_difference = rd = 400·log10(Score_Rybka/Score_Houdini) ~ 13.57
Standard_deviation = sd = sqrt{[(Score_Rybka)·(Score_Houdini) - (Draw_ratio)/4]/n} ~ 0.02531 ~ 2.531% of the n games.
(Referring to the n games, and taking a confidence interval of 95.45%, that is 2·sd):
2n·(sd) ~ 10.3773
rd(+) ~ 400·log10[(106.5 + 10.3773)/(98.5 - 10.3773)] ~ 49.06
rd(-) ~ 400·log10[(106.5 - 10.3773)/(98.5 + 10.3773)] ~ -21.64
Error(+) = rd(+) - rd ~ 35.49
Error(-) = rd(-) - rd ~ -35.21
|Error| = [|Error(+)| + |Error(-)|]/2 ~ 35.35 ; Error ~ ± 35.35
With confidence ~ 95.45%:
Deep Rybka 4.1 x64 6c is stronger than Houdini 2.0b x64 4c by the Elo points given in the following interval:
13.57 (+35.49, -35.21) ~ ]-21.64, +49.06[
Or taking the average error (± 35.35):
13.57 ± 35.35 ~ ]-21.78, +48.92[ (Few changes, as expected).
(Sorry for my unpleasant notation.) So the 'uncertainty' is around ± 35 Elo and the interval is ~ ]-22, +49[ for 205 games, with a draw ratio of ~ 47.32% and Rybka scoring ~ 51.95% (2-sigma confidence). Is that right, or am I wrong? I do not know whether the formula I applied comes from BayesElo, EloStat or is something that someone 'invented' at ImmortalChess and posted there. Could you give the rating-difference intervals for +58 -50 =97 (the result of this (unfinished?) match), please?
I do not think I have any programs to calculate this; the numbers in the code box were worked out on an ordinary Casio calculator (and the results are rounded), so please keep in mind that they may contain errors... although an uncertainty of ± 35 Elo seems a reasonable margin of error for 205 games. Thanks in advance, and please keep up this great work!
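For anyone who would rather not use a hand calculator, the steps in the code box above can be sketched in Python. This is only a minimal sketch of that same formula, whose origin is unclear; the function name `elo_interval` and its parameters are illustrative and are not taken from BayesElo or EloStat.

```python
import math


def elo_interval(wins, losses, draws, sigmas=2.0):
    """Return (rd, lower, upper) in Elo for a head-to-head match.

    Follows the hand calculation in the post: the standard deviation of
    the score is turned into a shift in points, and the shifted point
    totals are converted to Elo. Raises ValueError for extreme results
    where a shifted point total is not positive.
    """
    n = wins + losses + draws
    score = (wins + draws / 2) / n          # e.g. 106.5 / 205
    draw_ratio = draws / n                  # e.g. 97 / 205
    sd = math.sqrt((score * (1 - score) - draw_ratio / 4) / n)

    points = score * n                      # points scored by the first engine
    shift = sigmas * sd * n                 # e.g. 2n * sd ~ 10.3773

    def elo(p_for, p_against):
        # Base-10 logarithm, as in the usual Elo conversion.
        return 400 * math.log10(p_for / p_against)

    rd = elo(points, n - points)
    upper = elo(points + shift, n - points - shift)
    lower = elo(points - shift, n - points + shift)
    return rd, lower, upper


rd, lower, upper = elo_interval(wins=58, losses=50, draws=97)
print(f"{rd:.2f} Elo, interval ]{lower:.2f}, {upper:.2f}[")
```

With +58 -50 =97 this gives roughly 13.6 Elo and an interval of about ]-21.6, +49.1[, which agrees with the hand-calculated figures.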
Regards from Spain.
Ajedrecista.
-
- Posts: 3018
- Joined: Thu Mar 09, 2006 11:58 am
- Location: Antalya/Turkey
Re: Poll:Which participant is stronger ?
Hello dear Jesús Muñoz,
You are welcome, it's my pleasure...
This test will probably be helpful for those who are considering upgrading their hardware.
And now to the current issue:
Yes... I agree with you, your counts are right (as far as I've checked).
Individual statistics:
Code: Select all
2 Deep Rybka 4.1 x64 6c : 3380 328 (+101,=173,- 54), 57.2 %
IvanHoe 47c GH x64 4c : 123 (+ 43,= 76,- 4), 65.9 %
Houdini 2.0b Pro x64 4c : 205 (+ 58,= 97,- 50), 52.0 %
The Deep Rybka 4.1 x64 6 cores test is still running, and at least 400-500 games are needed for a better conclusion.
BTW, its current opponent is Deep Rybka 4.1 x64 4 cores.
And after 38 games, Rybka 6 cores is clearly ahead: +11 =25 -2 (+84 Elo)
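The +84 Elo figure is consistent with the usual logistic conversion from score percentage to rating difference. A minimal Python sketch, assuming that conversion is what was used (the function name is illustrative and is not part of EloStat or Bayeselo):

```python
import math


def elo_from_result(wins, draws, losses):
    """Convert a match result into an Elo difference (point estimate only)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games    # fraction of points scored
    # Logistic Elo model: score = 1 / (1 + 10 ** (-diff / 400)).
    return 400 * math.log10(score / (1 - score))


print(round(elo_from_result(wins=11, draws=25, losses=2)))
```

For +11 =25 -2 this gives about 84 Elo. It is only a point estimate; with 38 games the uncertainty around it is still large.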
Another interesting note: when we compare the two calculation programs (EloStat and Bayeselo),
we see a difference of approx. 15-20 Elo between them.
Calculation by Bayeselo 0056 (start Elo: 3270): approx. 70 Elo difference between i7 980X @ 4.33 GHz and i7 920 @ 3.0 GHz
Code: Select all
Rank Name Elo + - games score oppo. draws
1 Houdini 2.0b Pro x64 6c 3417 18 18 1009 69% 3298 46%
2 Deep Rybka 4.1 x64 6c 3364 29 29 328 57% 3323 53%
3 Houdini 2.0 Pro x64 4c 3360 16 16 1131 61% 3291 49%
4 Houdini 2.0b Pro x64 4c 3349 19 19 851 53% 3329 47%
5 Houdini 1.5a x64 4c 3342 17 17 1086 62% 3270 48%
6 Deep Rybka 4.1 x64 4c 3290 15 15 1318 49% 3297 56%
7 Critter 1.2 x64 4c 3286 16 16 1037 50% 3287 58%
8 IvanHoe 47c GH x64 4c 3279 16 16 1110 49% 3286 60%
9 Fire 2.2 xTreme x64 4c 3273 15 15 1165 41% 3326 59%
10 IvanHoe 0B.09.18 x64 4c 3269 17 17 924 47% 3286 58%
11 DeepSaros 2.3i x64 4c 3266 17 17 888 50% 3268 61%
12 IvanHoe B47d x64 4c 3265 18 18 886 44% 3302 57%
13 IvanHoe B47f02 x64 4c 3258 17 17 984 48% 3270 59%
14 Houdini 2.0 Pro x64 1c 3252 14 14 1543 54% 3228 48%
15 Stockfish 2.1.1 JA x64 4c 3251 17 17 941 47% 3266 55%
16 Strelka 5.1 x64 1c 3248 20 20 715 57% 3205 52%
17 Rybka 4.1 x64 1c 3194 20 20 677 49% 3202 54%
18 Komodo 3.0 x64 1c 3184 15 15 1360 40% 3250 44%
19 Ivanhoe B46a x64 1c 3176 20 20 716 44% 3214 54%
20 Stockfish 111026 x64 1c 3175 22 22 613 45% 3207 50%
21 Naum 4.2 x64 4c 3172 18 18 898 36% 3256 46%
Calculation by EloStat 1.3 (start Elo: 3270): approx. 80 Elo difference between i7 980X @ 4.33 GHz and i7 920 @ 3.0 GHz
Code: Select all
Program Elo + - Games Score Av.Op. Draws
1 Houdini 2.0b Pro x64 6c : 3440 16 16 1009 69.0 % 3301 46.0 %
2 Deep Rybka 4.1 x64 6c : 3380 26 26 328 57.2 % 3330 52.7 %
3 Houdini 2.0 Pro x64 4c : 3371 14 14 1131 61.1 % 3293 49.2 %
4 Houdini 2.0b Pro x64 4c : 3361 17 17 851 53.2 % 3339 47.4 %
5 Houdini 1.5a x64 4c : 3351 15 15 1086 61.6 % 3269 47.7 %
6 Deep Rybka 4.1 x64 4c : 3295 12 12 1318 49.2 % 3300 56.1 %
7 Critter 1.2 x64 4c : 3287 14 14 1037 49.8 % 3289 57.6 %
8 IvanHoe 47c GH x64 4c : 3280 13 13 1110 48.8 % 3288 60.4 %
9 Fire 2.2 xTreme x64 4c : 3273 13 13 1165 41.3 % 3333 59.0 %
10 IvanHoe 0B.09.18 x64 4c : 3268 15 15 924 47.3 % 3287 57.7 %
11 DeepSaros 2.3i x64 4c : 3265 14 14 888 49.7 % 3267 60.7 %
12 IvanHoe B47d x64 4c : 3264 15 15 886 44.0 % 3306 56.5 %
13 IvanHoe B47f02 x64 4c : 3255 14 14 984 48.0 % 3269 59.0 %
14 Houdini 2.0 Pro x64 1c : 3246 13 13 1543 53.8 % 3220 48.0 %
15 Stockfish 2.1.1 JA x64 4c : 3246 15 15 941 47.3 % 3265 54.8 %
16 Strelka 5.1 x64 1c : 3243 18 18 715 57.1 % 3193 51.6 %
17 Rybka 4.1 x64 1c : 3180 18 18 677 48.7 % 3189 53.6 %
18 Komodo 3.0 x64 1c : 3171 14 14 1360 39.6 % 3245 44.5 %
19 Stockfish 111026 x64 1c : 3159 19 19 613 44.8 % 3195 50.1 %
20 Ivanhoe B46a x64 1c : 3158 17 17 716 43.6 % 3203 53.9 %
21 Naum 4.2 x64 4c : 3155 17 17 898 36.4 % 3252 45.8 %
Best Regards,
Sedat
-
- Posts: 2123
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Poll:Which participant is stronger ?
Hello again:
Thanks for your fast answer. I did a quick calculation with 328 games and my average error is ~ ± 26.4 Elo (for 2-sigma confidence, ~ 95.45%) and ~ ± 25.86 Elo (for 95% confidence, ~ 1.96-sigma), so I guess that my uncertainties match EloStat better than BayesElo (I read a while ago that BayesElo is a bit better than EloStat). To get ± 29 Elo with the formula I apply, the confidence would have to be (hoping there are no errors in my calculation) around 2.2-sigma (or around 97.25% confidence, more or less).
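The link between the sigma levels and the confidence percentages quoted in this post can be checked against the normal distribution. A minimal Python sketch, assuming the score is approximately normally distributed (the function name is illustrative):

```python
import math


def two_sided_confidence(sigmas):
    """Probability that a normal variable lies within +/- `sigmas` of its mean."""
    return math.erf(sigmas / math.sqrt(2))


for k in (1.96, 2.0, 2.2):
    print(f"{k}-sigma -> {100 * two_sided_confidence(k):.2f}%")
```

This gives about 95.0% for 1.96 sigma, 95.45% for 2 sigma and 97.2% for 2.2 sigma.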
Regards from Spain.
Ajedrecista.