LS-rankinglist: Komodo 5

pohl4711
Posts: 2434
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

LS-rankinglist: Komodo 5

Post by pohl4711 »

The LS-rankinglist (LightSpeed-rankinglist), now with Komodo 5. +52 Elo. Nice. But still no MP-version – not nice...

Intel i7-2630QM (SSE42 support, Windows 7 64bit, Quadcore, FritzMark=20.2), 64 MB Hash, 1 core per engine (Hyperthreading off), no ponder, no endgame-bases, no resign. 500 selected opening-positions (all 8 moves deep, from Frank Q.-database)
Elo ratings calculated with bayeselo (mm 0 1), fixed point Robbolito 0.085g3 = 3000 Elo. LittleBlitzerGUI (gauntlet mode only, because in round-robin mode this GUI picks the opening positions at random from the PGN file...)
Time: 20''+250ms Fischerbonus (= 40 seconds per game/engine).
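
The "= 40 seconds" figure can be reproduced with a little arithmetic. A minimal sketch, assuming a game length of 80 moves per engine (that number is an assumption, chosen because it makes the total come out at 40 seconds; it is not stated above). The helper name is hypothetical:

```python
def time_per_engine(base_s: float, inc_s: float, moves: int) -> float:
    """Total clock time one engine receives in a game:
    base time plus one Fischer increment per move played."""
    return base_s + inc_s * moves


# 20'' base + 250 ms increment; 80 moves per engine is an assumed game length
total = time_per_engine(20.0, 0.25, 80)
print(total)  # 40.0
```

Shorter games use less than this and longer games more, so the 40 seconds is only a typical value.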

LS-rankinglist with best engine-versions only (no betas, no setting-tests, no development-versions):

Rank Name                    Elo    +    - games score oppo. draws 
   1 Houdini 2.0c x64       3119    5    6  9000   63%  3025   41% 
   2 Strelka 5.5 x64        3077    5    5  9000   57%  3030   49% (singlecore)
   3 Critter 1.6a x64       3067    5    5  9000   55%  3031   50% 
   4 Komodo 5 x64           3053    5    6  9000   53%  3033   41% (singlecore)
   5 Ivanhoe 46h x64        3027    5    5  9000   49%  3036   53% (best open source)
   6 Robbolito 0.10 x64s    3025    5    5  9000   48%  3036   53% 
   7 Rybka 4.1 x64s         3012    5    5  9000   46%  3037   43% 
   8 Robbolito 0.085g3 x64  3000    5    5  9000   44%  3039   50% (singlecore)(Ippolit 2009)
   9 Saros 3.0 x64          2994    5    5  9000   44%  3039   45% 
  10 Stockfish 2.2.2 x64s   2973    6    5  9000   41%  3041   40% 
The crosstable for that list:

                                 |     01    |     02    |     03    |     04    |     05    |     06    |     07    |     08    |     09    |     10    |
                                +-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+--------------
01) Houdini 2.0c x64            |     **    |548.5-451.5|563.0-437.0|591.0-409.0|619.0-381.0|642.5-357.5|644.0-356.0|685.5-314.5|674.5-325.5|692.5-307.5| 5660.5/9000
02) Strelka 5.5 x64             |451.5-548.5|     **    |498.5-501.5|527.5-472.5|586.0-414.0|565.5-434.5|597.0-403.0|610.0-390.0|638.0-362.0|635.5-364.5| 5109.5/9000
03) Critter 1.6a x64            |437.0-563.0|501.5-498.5|     **    |524.5-475.5|566.5-433.5|554.0-446.0|569.5-430.5|593.5-406.5|587.0-413.0|630.0-370.0| 4963.5/9000
04) Komodo 5 x64                |409.0-591.0|472.5-527.5|475.5-524.5|     **    |530.5-469.5|547.0-453.0|552.0-448.0|582.0-418.0|585.0-415.0|593.0-407.0| 4746.5/9000
05) Ivanhoe 46h x64             |381.0-619.0|414.0-586.0|433.5-566.5|469.5-530.5|     **    |505.0-495.0|527.0-473.0|543.0-457.0|550.5-449.5|564.5-435.5| 4388.0/9000
06) Robbolito 0.10 x64s         |357.5-642.5|434.5-565.5|446.0-554.0|453.0-547.0|495.0-505.0|     **    |507.0-493.0|555.0-445.0|543.0-457.0|566.0-434.0| 4357.0/9000
07) Rybka 4.1 x64s              |356.0-644.0|403.0-597.0|430.5-569.5|448.0-552.0|473.0-527.0|493.0-507.0|     **    |502.5-497.5|521.5-478.5|555.5-444.5| 4183.0/9000
08) Robbolito 0.085g3 x64       |314.5-685.5|390.0-610.0|406.5-593.5|418.0-582.0|457.0-543.0|445.0-555.0|497.5-502.5|     **    |521.0-479.0|545.0-455.0| 3994.5/9000
09) Saros 3.0 x64               |325.5-674.5|362.0-638.0|413.0-587.0|415.0-585.0|449.5-550.5|457.0-543.0|478.5-521.5|479.0-521.0|     **    |553.0-447.0| 3932.5/9000
10) Stockfish 2.2.2 x64s        |307.5-692.5|364.5-635.5|370.0-630.0|407.0-593.0|435.5-564.5|434.0-566.0|444.5-555.5|455.0-545.0|447.0-553.0|     **    | 3665.0/9000
                                +-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+--------------
LS-rankinglist with all engines/versions/settings tested so far:

Rank Name                    Elo    +    - games score oppo. draws 
   1 Houdini 2.0c x64       3119    5    5 13000   64%  3019   39% 
   2 Houdini 1.5a x64       3086    5    5 10000   58%  3030   42% (best freeware (multicore))
   3 Strelka 5.5 x64        3077    5    5 13000   58%  3023   47% (singlecore)
   4 Critter 1.6a x64       3068    5    5 13000   56%  3023   48% 
   5 Komodo 5 x64           3052    6    6  9000   53%  3033   41% (singlecore)
   6 Ivanhoe 46h x64        3029    5    4 13000   50%  3026   50% (best open source)
   7 Robbolito 0.10 x64s    3023    5    5 13000   49%  3027   51% 
   8 Rybka 4.1 x64s         3012    5    5 13000   48%  3028   42% 
   9 Robbolito 0.085g3 x64  3000    5    5 13000   46%  3029   48% (singlecore)(Ippolit 2009)
  10 Komodo 4 x64s          3000    5    5 12000   46%  3027   38% (singlecore)
  11 Saros 3.0 x64          2995    5    5 12000   44%  3035   44% 
  12 Stockfish 120622 x64s  2979    5    6  9000   42%  3036   41% 
  13 Stockfish 2.2.2 x64s   2974    5    5 12000   42%  3035   39% 
  14 Saros 3.2 x64          2959    6    5  9000   39%  3033   43% 
(x64=64bit version, x64s=64bit SSE42-version)

If you want the games from the LS-list, send me your e-mail address by PM. I will send the games (PGN file) as soon as possible...

Greetings – Stefan
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: LS-rankinglist: Komodo 5

Post by Sedat Canbaz »

Dear Stefan,

Thank you for your efforts...

But please don't waste your valuable time with such handicapped conditions.

Because Houdini 2.0c is not nearly 70 Elo stronger than Komodo 5.

I suggest you use a popular time control; ultra-fast ratings are meaningless!


Best,
Sedat
Uri Blass
Posts: 10281
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: LS-rankinglist: Komodo 5

Post by Uri Blass »

I disagree and have no problem with the fact that Stefan tests under his own conditions.

I do not think I have the right to tell testers which conditions they need to use, and any knowledge about results under given conditions is interesting as long as I know what the conditions are.
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: LS-rankinglist: Komodo 5

Post by Sedat Canbaz »

Uri Blass wrote:I disagree and have no problem with the fact that stefan test in his conditions.

I do not think that I have the right to tell testers the conditions that they need to use and every knowledge about results in some conditions when I know the conditions is interesting.
Sure, my posting was just a SUGGESTION, not a dictate!

It seems some engines don't play at full performance at ultra-fast time controls. For more details:
http://rybkaforum.net/cgi-bin/rybkaforu ... 25148;pg=1

Btw, what about 1-second games, is there such a rating?

Regards,
Sedat
gerold
Posts: 10121
Joined: Thu Mar 09, 2006 12:57 am
Location: van buren,missouri

Re: LS-rankinglist: Komodo 5

Post by gerold »

Many thanks Stefan.

Best,
Gerold.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: LS-rankinglist: Komodo 5

Post by lkaufman »

Sedat Canbaz wrote:
Dear Stefan,

Thank you for your efforts...

But please dont waste your valuable time with such handicapped conditions

Because, Houdini 2.0c is not nearly 70 Elo stronger than Komodo 5

I suggest you to use a popular time control,ultra fast ratings are meaningless!


Best,
Sedat
My own view is that such ultra-fast tests are useful for comparing upgrades of one engine to another (although the elo gain will be overstated a bit), but not so useful for comparing unrelated engines. All engines using the Ippo code or search algorithms seem to be extremely strong at such very fast levels, less so at normal levels.
As long as there is an increment large enough to make setup time and first iteration time unimportant the test is at least meaningful. I suppose 250 ms is enough for this. Tests like game/10" with no increment really tell us almost nothing.
pohl4711
Posts: 2434
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-rankinglist: Komodo 5

Post by pohl4711 »

Before this version of the LS-rankinglist, I had the TEB-rankinglist and an LS-rankinglist, which were identical except for the thinking time. TEB used 60''+750ms. The results were the same (all differences were inside the errorbar), so I decided the TEB speed is too slow: it is better to have more games than long games.
The only effect of more thinking time is that all results move closer to 50%, which means the Elo distances in a rankinglist with more thinking time are smaller. But the order in the list stays the same.
By the way: before the TEB-list, I had the two NEBB-lists (one with half of the TEB time and one with 4'+2'', a so-called "regular time"). The order of the engines in those lists was the same, too (except Komodo 4). So it really doesn't matter whether you use 4'+2'' or 60''+750ms or 20''+250ms. Only the distances between the engines in a list get bigger as the thinking time gets shorter. And that's not a big problem (in my opinion), because I am interested in the order of the engines, not in virtual "Elo" numbers...
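
To see why scores closer to 50% mean smaller Elo distances, here is a minimal sketch using the standard logistic Elo formula. It is not what bayeselo computes (bayeselo fits all games at once and models draws), and the function name is made up for illustration. The 591.0-409.0 example is the Houdini 2.0c vs. Komodo 5 result from the crosstable.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1)
    under the plain logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Houdini 2.0c vs. Komodo 5: 591.0 points out of 1000 games
print(round(elo_diff_from_score(0.591)))  # 64

# The same pairing with a result closer to 50% gives a smaller distance
print(elo_diff_from_score(0.55) < elo_diff_from_score(0.591))  # True
```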

With longer thinking times in engine testing you get only 1 benefit (better chess), but 2 problems: a) fewer games -> bigger errorbar and b) smaller distances between engines in a rankinglist, which is another source of uncertainty if some engines are at nearly the same level of playing strength... For that problem, see your own Sedat-rankinglist and the results of the Houdini T3 and z-settings: they are so close to default Houdini, and the errorbar is so big, that it is pure chance that T3 and z come out ahead of default Houdini 2. That is a big waste of your valuable time! Robert Houdart played a much bigger number of (ultrafast) games with the z-setting and it was not stronger (slightly weaker) than the default engine.

Best - Stefan

"Randomness is a monster and you beat it by volume." - Ed Schroeder
pohl4711
Posts: 2434
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-rankinglist: Komodo 5

Post by pohl4711 »

lkaufman wrote: My own view is that such ultra-fast tests are useful for comparing upgrades of one engine to another (although the elo gain will be overstated a bit), but not so useful for comparing unrelated engines. All engines using the Ippo code or search algorithms seem to be extremely strong at such very fast levels, less so at normal levels.
As long as there is an increment large enough to make setup time and first iteration time unimportant the test is at least meaningful. I suppose 250 ms is enough for this. Tests like game/10" with no increment really tell us almost nothing.
That's correct. In my LS-rankinglist the average thinking time (displayed by the LittleBlitzerGUI) is around 450ms per move (more at the beginning of a game, less in the endgame) and the average search depths of the engines are around 14-15 plies (Rybka less, Stockfish somewhat more). That is more than enough for "real" chess (whatever that means...).

Best - Stefan
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: LS-rankinglist: Komodo 5

Post by Sedat Canbaz »

lkaufman wrote:
My own view is that such ultra-fast tests are useful for comparing upgrades of one engine to another (although the elo gain will be overstated a bit), but not so useful for comparing unrelated engines. All engines using the Ippo code or search algorithms seem to be extremely strong at such very fast levels, less so at normal levels.
As long as there is an increment large enough to make setup time and first iteration time unimportant the test is at least meaningful. I suppose 250 ms is enough for this. Tests like game/10" with no increment really tell us almost nothing.

It's OK... such ultra-fast ratings can be useful for programmers, but not for users.

For example, after checking the LS-rankinglist's results, we see that Houdini 2.0c is 33 Elo stronger than Houdini 1.5a.

And it seems Robert Houdart is probably right that Houdini 2 is at least 25 Elo stronger than H1.5a, but only at ultra-fast time controls.

But unfortunately, in almost all major rating lists, we cannot see any significant Elo improvement between the two Houdini versions.

Probably we (SCCT, CEGT, CCRL...) need to run at least 8,000 games for such an Elo improvement to show up...
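
A rough way to estimate how the errorbar depends on the number of games: the sketch below assumes independent games and the plain logistic Elo curve, which is not bayeselo's model, so treat the numbers as ballpark only. Function and parameter names are hypothetical.

```python
import math


def elo_margin(games: int, score: float = 0.5,
               draw_rate: float = 0.4, z: float = 1.96) -> float:
    """Approximate +/- Elo margin (z = 1.96 for about 95%) for one
    engine's rating, given its number of games, score fraction and
    draw rate. Assumes independent games."""
    # Per-game variance of a result in {0, 0.5, 1} with mean `score`
    variance = score * (1.0 - score) - draw_rate / 4.0
    std_error = math.sqrt(variance / games)
    # Slope of the logistic Elo curve: Elo points per unit of score
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * slope * std_error


# 9000 games with 41% draws, as for Komodo 5 in the list
print(round(elo_margin(9000, draw_rate=0.41), 1))  # 5.5

# Four times as many games halve the margin
print(elo_margin(1000) / elo_margin(4000))
```

The result for 9000 games is in line with the +/- columns of the LS-list. The margin on the difference between two engines is larger than the margin on a single rating.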


Best,
Sedat
Uri Blass
Posts: 10281
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: LS-rankinglist: Komodo 5

Post by Uri Blass »

When the difference between engines is big, the order stays the same if you multiply the time control by 3.

When the difference is less than 20 Elo and still significant, I expect to see some differences when you multiply the time control by 3.