Komodo 2.01 is out!

Don · Post by **Don** » Sun Jun 12, 2011 11:16 pm

stevenaaus wrote:I've added per-game time control to Scid vs. PC's tournament feature if anyone wants to mess around. (Komodo's per-move time control still seems broken)

I will check that out, I would like for that to work.

At the moment it's alpha code, for UCI only... I think I'm doing it right. Code is in svn.

Laskos · Post by **Laskos** » Mon Jun 13, 2011 12:18 am

Don wrote:
Laskos wrote:
Adam Hair wrote:Here is a partial result from CCRL 40/4 1 CPU testing for Komodo 2.01 and Stockfish 2.1.1:
Code: Select all
 5 Houdini 1.5a 64-bit                      3268   15   14  1973   73%  3090   28%
17 Rybka 4.1 64-bit                         3208   21   21   910   75%  3027   33%
20 Stockfish 2.1.1 64-bit                   3190   39   38   242   70%  3051   40%
22 Stockfish 2.0.1 64-bit                   3188   20   20  1128   77%  2976   28%
24 Rybka 3 64-bit                           3179    9    8  6001   74%  2994   30%
25 Critter 1.01 64-bit                      3173   20   20  1056   77%  2961   28%
32 Komodo 2.01 64-bit                       3157   39   38   231   65%  3054   36%
35 Critter 0.90 64-bit                      3147   19   18  1209   75%  2959   30%
60 Komodo 1.3 64-bit                        3087   12   12  2422   55%  3046   39%
66 Naum 4.2 64-bit                          3084   13   13  2022   57%  3039   41%
Thanks. CCRL 40/4 gives 70 +/- 38 Elo points (95%) improvement and CEGT 40/20 gives 110 +/- 27 points (95%); maybe they will converge to something inside my 93 +/- 15 (95%) result. I am really curious whether my result is valid.

Kai
We have thousands of games that say it's 100 Elo, but that is at time controls of around 40 seconds per game on very fast hardware (an overclocked 6-core i7).

YMMV

Don

Don, I think that tests at 40s per game on fast hardware (with a solid increment, say equivalent to 2.5s + 0.25s) are very representative, excluding time management, which is another problem. I am using 1s + 0.1s (~15s per game) on not-so-fast hardware. For measuring the _difference_ between two engines, it has never let me down by more than 10 Elo points beyond the error margins (say 30,000 games, ~3 Elo points error margin at 95% confidence).

Kai
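A quick way to check whether the three estimates quoted in this post are mutually compatible is to look at the range their 95% intervals share. This is only a rough sketch: it treats each "+/-" figure as a symmetric interval, and the dictionary labels are made up for illustration.

```python
# Elo gain of Komodo 2.01 over 1.3 as quoted in the thread: (estimate, 95% margin).
estimates = {
    "CCRL 40/4": (70, 38),
    "CEGT 40/20": (110, 27),
    "Laskos (own test)": (93, 15),
}

# Turn each estimate into a (low, high) interval.
intervals = {name: (mid - err, mid + err) for name, (mid, err) in estimates.items()}

# The range shared by all intervals runs from the largest low to the smallest high.
common = (max(lo for lo, _ in intervals.values()),
          min(hi for _, hi in intervals.values()))

for name, (lo, hi) in intervals.items():
    print(f"{name:18s} {lo:4d} .. {hi:4d}")
print("shared range:", common if common[0] <= common[1] else "none")
```

The three intervals overlap between 83 and 108 Elo, so none of the lists contradicts the others at this sample size.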

Don · Post by **Don** » Mon Jun 13, 2011 12:18 am

Laskos wrote:Wow, seems a serious improvement. After 400 games 1s + 0.1s
Code: Select all
    Program                            Score     %      Elo    +   -    Draws

  1 Komodo64 2.01 64 bit           : 262.5/400  65.6    3256   29  29   31.2 %
  2 Komodo64 1.3 JA                : 137.5/400  34.4    3144   29  29   31.2 %
112 +/- 29 Elo points (95% confidence) improvement in self-play, probably a little less in a gauntlet, but the new Reptilian seems the level of SF 2.01. Will leave it for more games, then a gauntlet.

Kai
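The 112 +/- 29 figure in the quote can be reproduced from the match score alone. The sketch below assumes the usual logistic Elo model and a normal approximation for the error bar; the win/draw/loss split of 200/125/75 is inferred from the 262.5/400 score and the 31.2 % draw rate, not taken from the original post.

```python
import math

def elo_from_score(p):
    """Elo difference implied by a score fraction p (0 < p < 1), logistic model."""
    return 400.0 * math.log10(p / (1.0 - p))

def match_elo(wins, draws, losses, z=1.96):
    """Return (elo, lower, upper) for a match, using a normal approximation."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which is 1, 0.5 or 0 points.
    var = (wins * 1.0 + draws * 0.25) / n - score ** 2
    se = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))

# 262.5/400 with 125 draws (31.2 %) means 200 wins and 75 losses.
elo, low, high = match_elo(200, 125, 75)
print(f"{elo:.0f} Elo, 95% interval {low:.0f} to {high:.0f}")
```

The interval comes out slightly asymmetric (roughly -28/+30), which rounds to the quoted +/- 29.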

Kai,

When we test at that pace it makes a big difference how we set "move overhead milliseconds", and the default is currently 20. To be honest, on our tester we don't forfeit even when setting it to zero. It will DEFINITELY affect the result at 1 second games like you are testing. Komodo 1.3 did not have any overhead built in, so in your match Komodo 2 was playing handicapped.

The purpose of "move overhead milliseconds" is to deal with slower graphical interfaces which I believe could be inadvertently imposing a penalty on each move. And if you are manually operating the computer to play a game with a physical clock you could set it to 2000 or more.

Laskos · Post by **Laskos** » Mon Jun 13, 2011 12:24 am

Don wrote:
Laskos wrote:Wow, seems a serious improvement. After 400 games 1s + 0.1s
Code: Select all
    Program                            Score     %      Elo    +   -    Draws

  1 Komodo64 2.01 64 bit           : 262.5/400  65.6    3256   29  29   31.2 %
  2 Komodo64 1.3 JA                : 137.5/400  34.4    3144   29  29   31.2 %
112 +/- 29 Elo points (95% confidence) improvement in self-play, probably a little less in a gauntlet, but the new Reptilian seems the level of SF 2.01. Will leave it for more games, then a gauntlet.

Kai
Kai,

When we test at that pace it makes a big difference how we set "move overhead milliseconds", and the default is currently 20. To be honest, on our tester we don't forfeit even when setting it to zero. It will DEFINITELY affect the result at 1 second games like you are testing. Komodo 1.3 did not have any overhead built in, so in your match Komodo 2 was playing handicapped.

The purpose of "move overhead milliseconds" is to deal with slower graphical interfaces which I believe could be inadvertently imposing a penalty on each move. And if you are manually operating the computer to play a game with a physical clock you could set it to 2000 or more.

Interesting, the average time per move was 102ms or so for 2.01. Meaning 82ms? This is a handicap of ~25 Elo points. Thanks for the info.

Kai

Don · Post by **Don** » Mon Jun 13, 2011 12:31 am

Laskos wrote:
Don wrote:
Laskos wrote:
Adam Hair wrote:Here is a partial result from CCRL 40/4 1 CPU testing for Komodo 2.01 and Stockfish 2.1.1:
Code: Select all
 5 Houdini 1.5a 64-bit                      3268   15   14  1973   73%  3090   28%
17 Rybka 4.1 64-bit                         3208   21   21   910   75%  3027   33%
20 Stockfish 2.1.1 64-bit                   3190   39   38   242   70%  3051   40%
22 Stockfish 2.0.1 64-bit                   3188   20   20  1128   77%  2976   28%
24 Rybka 3 64-bit                           3179    9    8  6001   74%  2994   30%
25 Critter 1.01 64-bit                      3173   20   20  1056   77%  2961   28%
32 Komodo 2.01 64-bit                       3157   39   38   231   65%  3054   36%
35 Critter 0.90 64-bit                      3147   19   18  1209   75%  2959   30%
60 Komodo 1.3 64-bit                        3087   12   12  2422   55%  3046   39%
66 Naum 4.2 64-bit                          3084   13   13  2022   57%  3039   41%
Thanks. CCRL 40/4 gives 70 +/- 38 Elo points (95%) improvement and CEGT 40/20 gives 110 +/- 27 points (95%); maybe they will converge to something inside my 93 +/- 15 (95%) result. I am really curious whether my result is valid.

Kai
We have thousands of games that say it's 100 Elo, but that is at time controls of around 40 seconds per game on very fast hardware (an overclocked 6-core i7).

YMMV

Don
Don, I think that tests at 40s per game on fast hardware (with a solid increment, say equivalent to 2.5s + 0.25s) are very representative, excluding time management, which is another problem. I am using 1s + 0.1s (~15s per game) on not-so-fast hardware. For measuring the _difference_ between two engines, it has never let me down by more than 10 Elo points beyond the error margins (say 30,000 games, ~3 Elo points error margin at 95% confidence).

Kai

Hey, I just sent a response to an earlier post of yours. Good timing.

Our experience is that MOST of the time fast time controls are very representative. But we have found that sometimes they are misleading. We are often trying to measure just 2 or 3 Elo, and there are many ideas we have tried that work really nicely at game in 2 or 3 seconds but very poorly at long time controls. And some of the things we do absolutely require much longer time controls to really exercise certain algorithms. For example, if you look at Stockfish, they have progressively more aggressive LMR with depth, so you cannot fully test that algorithm without playing longer time controls. If, for instance, it were a really bad idea, it might still look great at 1 + 0.1. (Of course it's not a bad idea, but this was just an example.)

Sometimes I run 7-ply games to get a feel for an idea, but I don't really use the results. On my home-brewed tester I get 10 games per second at that depth. But a typical time control we run at is game in 3 seconds + 0.03 increment, which is probably faster than your 1 + 0.1. That gives us a feel, and from there we decide whether to proceed to longer tests. Keep in mind that most of the time we are only looking for 2 or 3 Elo, and we cannot afford to be off by 10 Elo.

But I agree with you in principle; we are always looking for ways to cheat in the testing, as this is the primary bottleneck in making progress.
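To put numbers on why 2 or 3 Elo is so expensive to measure: the sketch below estimates the 95% error margin of a head-to-head match between near-equal engines and the number of games needed for a given margin. It assumes a logistic Elo model linearised around a 50% score and a draw rate of 30%, which is a guess in the range of the tables in this thread, not a measured value.

```python
import math

# Slope of the Elo curve at a 50% score: d(Elo)/d(score) = 400 / (ln(10) * 0.25).
ELO_PER_SCORE = 400.0 / (math.log(10) * 0.25)

def margin_95(games, draw_rate):
    """Approximate 95% Elo margin after `games` games between near-equal engines."""
    var = (1.0 - draw_rate) / 4.0          # per-game variance around a 0.5 score
    return 1.96 * math.sqrt(var / games) * ELO_PER_SCORE

def games_needed(margin_elo, draw_rate):
    """Games required to shrink the 95% margin to `margin_elo` Elo."""
    var = (1.0 - draw_rate) / 4.0
    return math.ceil(var * (1.96 * ELO_PER_SCORE / margin_elo) ** 2)

DRAW_RATE = 0.30  # assumed, not taken from any specific test
print(f"30,000 games: +/- {margin_95(30000, DRAW_RATE):.1f} Elo")
for target in (10, 3, 2):
    print(f"+/- {target} Elo needs about {games_needed(target, DRAW_RATE)} games")
```

Under these assumptions 30,000 games give a margin of a little over 3 Elo, consistent with the figure quoted above, and the game count grows with the inverse square of the target margin.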

Don · Post by **Don** » Mon Jun 13, 2011 12:37 am

Laskos wrote:
Don wrote:
Laskos wrote:Wow, seems a serious improvement. After 400 games 1s + 0.1s
Code: Select all
    Program                            Score     %      Elo    +   -    Draws

  1 Komodo64 2.01 64 bit           : 262.5/400  65.6    3256   29  29   31.2 %
  2 Komodo64 1.3 JA                : 137.5/400  34.4    3144   29  29   31.2 %
112 +/- 29 Elo points (95% confidence) improvement in self-play, probably a little less in a gauntlet, but the new Reptilian seems the level of SF 2.01. Will leave it for more games, then a gauntlet.

Kai
Kai,

When we test at that pace it makes a big difference how we set "move overhead milliseconds", and the default is currently 20. To be honest, on our tester we don't forfeit even when setting it to zero. It will DEFINITELY affect the result at 1 second games like you are testing. Komodo 1.3 did not have any overhead built in, so in your match Komodo 2 was playing handicapped.

The purpose of "move overhead milliseconds" is to deal with slower graphical interfaces which I believe could be inadvertently imposing a penalty on each move. And if you are manually operating the computer to play a game with a physical clock you could set it to 2000 or more.
Interesting, the average time per move was 102ms or so for 2.01. Meaning 82ms? This is a handicap of ~25 Elo points. Thanks for the info.

It's not quite that bad. If the move overhead in reality is only 5 ms, it's not like you give away the other 15 ms; it is still left on your clock. However, there is a delay in getting to use that time, and we have found that it takes too long before the extra 15 ms has much of an impact (you only get a fraction of it back on each move), so in reality you are still playing a big part of the game too fast. So it does have a crippling effect, and you can lose the game before this extra time has built up enough to make much of a difference.

Kai
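The delayed payback described in this post can be illustrated with a toy simulation. This is not Komodo's time manager: it assumes a hypothetical scheme that budgets clock/30 + increment - overhead per move, a real per-move lag of 5 ms, and a 1s + 0.1s game lasting 60 moves.

```python
def simulate(overhead_ms, lag_ms=5.0, clock_ms=1000.0, inc_ms=100.0,
             horizon=30, moves=60):
    """Toy time manager: returns (list of thinking times, final clock) in ms."""
    thinks = []
    for _ in range(moves):
        # Budget a slice of the clock plus the increment, minus the reserved overhead.
        budget = clock_ms / horizon + inc_ms - overhead_ms
        think = max(0.0, min(budget, clock_ms - lag_ms))
        thinks.append(think)
        # The real lag is charged to the clock, then the increment is added.
        clock_ms += inc_ms - think - lag_ms
    return thinks, clock_ms

exact, exact_left = simulate(overhead_ms=5.0)     # setting matches the real lag
padded, padded_left = simulate(overhead_ms=20.0)  # 15 ms more than needed

print(f"move 1:  {exact[0]:.1f} ms vs {padded[0]:.1f} ms")
print(f"move 60: {exact[-1]:.1f} ms vs {padded[-1]:.1f} ms")
print(f"total thinking time lost: {sum(exact) - sum(padded):.0f} ms")
print(f"unused clock at the end:  {padded_left - exact_left:.0f} ms")
```

In this model the unspent 15 ms stays on the clock, but only 1/30 of the accumulated surplus flows back into each later budget, so the early moves are played noticeably faster and part of the time is still sitting unused when the game ends.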

swami · Post by **swami** » Tue Jun 14, 2011 3:33 am

El Gringo wrote:Hi,

results test suite sts 1-13 :
Code: Select all
Komodo 2.01 x64	90	85	79	85	92	86	76	66	81	85	79	76	83	1063
Komodo 1.3 x64 	94	80	77	77	86	84	79	71	77	81	76	83	81	1046
Q6660 @ 3Ghz

Best
Johan

Hi Johan,

Thanks for posting the results. So it is 9-4 in favor of Komodo 2.01, which is a good improvement indeed, considering how difficult it is to improve at this level (3000 Elo and above).

Note that STS 14 was released recently and is available for download.

Best regards,
Swami
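For anyone who wants to check the 9-4 tally and the totals, a small script using the per-theme scores from Johan's table (the variable names are just for illustration):

```python
# STS 1-13 scores per theme, copied from the table quoted above.
komodo_201 = [90, 85, 79, 85, 92, 86, 76, 66, 81, 85, 79, 76, 83]
komodo_13 = [94, 80, 77, 77, 86, 84, 79, 71, 77, 81, 76, 83, 81]

# Count the themes where each version scored higher.
themes_won_201 = sum(a > b for a, b in zip(komodo_201, komodo_13))
themes_won_13 = sum(a < b for a, b in zip(komodo_201, komodo_13))

print(f"Themes won: Komodo 2.01 {themes_won_201}, Komodo 1.3 {themes_won_13}")
print(f"Totals: {sum(komodo_201)} vs {sum(komodo_13)}")
```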

Andre · Post by **Andre** » Tue Jun 14, 2011 3:43 am

[D]4rk2/pp3p2/7N/2rp2pQ/qb2n3/3BP3/PP3PPP/R2K3R w - - 0 21
In this position from So-Grandelius... Komodo puts Bc2 as first choice with a +- score (#1998) when Qxc2 actually mates.

Interesting game by the way

[Event "19th Sigeman & Co"]
[Site "Malmo SWE"]
[Date "2011.06.12"]
[Round "4"]
[White "So, W."]
[Black "Grandelius, N."]
[Result "0-1"]
[ECO "E35"]
[WhiteElo "2667"]
[BlackElo "2547"]
[PlyCount "62"]
[EventDate "2011.06.09"]

1. c4 e6 2. Nc3 Nf6 3. d4 Bb4 4. Qc2 d5 5. cxd5 exd5 6. Bg5 c5 7. dxc5 h6 8.
Bh4 g5 9. Bg3 Ne4 10. e3 Qa5 11. Nge2 Bf5 12. Be5 O-O 13. Nd4 Re8 14. Bxb8 Nxc3
15. Nxf5 Ne4+ 16. Kd1 Raxb8 17. Nxh6+ Kf8 18. Bd3 Rbc8 19. Qe2 Rxc5 20. Qh5
Qa4+ 21. b3 Qd7 22. Ke2 Qe6 23. Rac1 Qf6 24. Ng4 Qb2+ 25. Kf3 Re6 26. Rxc5 Rf6+
27. Nxf6 Qxf6+ 28. Kg4 Qe6+ 29. Kf3 Qf5+ 30. Ke2 Qxf2+ 31. Kd1 Qd2# 0-1
