Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 9:33 am
by stevenaaus
Losing to a pawn attack by Stockfish-2.1
[D] 1r6/1P3k2/1PP1N2p/3nPP1K/8/8/8/8 b - - 0 68

[Event "Scid vs. PC"]
[Site "?"]
[Date "2011.06.15"]
[Round "5"]
[White "Stockfish 2.1 JA"]
[Black "Komodo64 2.03 JA"]
[Result "1-0"]
[TimeControl "1:06/0"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O Bc5 5. Bxc6 dxc6 6. d3 Qe7 7. Nbd2 O-O 8. Nc4 Nd7 9. Qe1 f6 10. Be3 Nb6 11. Bxc5 Qxc5 12. Ne3 Be6 13. Qd2 Rfd8 14. a4 a5 15. Rfd1 Nc8 16. Qe2 Nd6 17. Rdb1 Re8 18. c3 Rad8 19. Nd2 Re7 20. h3 Qb6 21. Nf3 Red7 22. b3 Qa6 23. Ra2 c5 24. h4 Nf7 25. Ne1 Qc6 26. Rd2 Rd6 27. g3 R6d7 28. f4 exf4 29. gxf4 Nd6 30. Rdb2 b5 31. axb5 Nxb5 32. Rc1 Nd6 33. c4 f5 34. e5 Nf7 35. Ra1 Rd4 36. Qf2 Qb6 37. N3c2 R4d7 38. Rba2 Ra8 39. Qg2 c6 40. Qd2 Rda7 41. Ra3 Nd8 42. Nf3 Bf7 43. d4 Rd7 44. dxc5 Rxd2 45. cxb6 Rxc2 46. Rxa5 Rc1+ 47. Rxc1 Rxa5 48. Rd1 Ra8 49. Nd4 g6 50. Kf2 Kf8 51. c5 Bd5 52. h5 gxh5 53. Nxf5 Ne6 54. Rxd5 cxd5 55. b7 Rb8 56. c6 Nc7 57. Nd4 h4 58. Kg2 Kf7 59. Kh3 Ke7 60. Kxh4 Rg8 61. f5 Na6 62. Ne6 d4 63. Nxd4 Nc7 64. Kh5 h6 65. b4 Kf7 66. b5 Nd5 67. Ne6 Rb8 68. b6 Nxb6 69. c7 Rxb7 70. Nd8+ Ke7 71. f6+ Kd7 72. Nxb7 Nc8 73. Nd6 Nb6 74. f7 Kxc7 75. f8=Q Kc6 76. e6 Nd5 77. e7 Nxe7 78. Qxe7 Kd5 79. Kxh6 Kc6 80. Kg5 Kb6 81. Kf4 Kc6 82. Ke4 Kb6 83. Kd5 Ka5 84. Kc4 Kb6 85. Qb7+ Ka5 86. Qb5#
{White time 2833, Black time 9006}
1-0

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 9:55 am
by Graham Banks
Thanks to you and your team. :)

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 2:12 pm
by Don
Graham Banks wrote:Thanks to you and your team. :)
Don't start testing yet until we get the SSE42 (ABM) version ready. Preliminary results show that it is about 8% faster and these binaries appear to run even better on AMD. But you can evaluate this for yourself when it's ready.

Don

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 3:01 pm
by Albert Silver
Don wrote:
Graham Banks wrote:Thanks to you and your team. :)
Don't start testing yet until we get the SSE42 (ABM) version ready. Preliminary results show that it is about 8% faster and these binaries appear to run even better on AMD. But you can evaluate this for yourself when it's ready.

Don
Will it support multi-core?

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 3:04 pm
by Don
Albert Silver wrote:
Don wrote:
Graham Banks wrote:Thanks to you and your team. :)
Don't start testing yet until we get the SSE42 (ABM) version ready. Preliminary results show that it is about 8% faster and these binaries appear to run even better on AMD. But you can evaluate this for yourself when it's ready.

Don
Will it support multi-core?
Not yet.

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 7:07 pm
by Laskos
The gauntlet with Komodo64 2.03 JA (no SSE42) finished, 3,000 games at 1s + 0.1s (average game length ~15-16 secs).

    Program                                Score      %  Av.Op.   Elo    +    -  Draws

    Komodo64 2.03 JA               : 1248.0/3000   41.6    3200  3141   11   11  29.3 %

  1 Houdini 1.5a x64               :   401.5/601   66.8    3141  3262   25   25  25.5 %
  2 Ivanhoe B47cBx64-1             :   373.0/601   62.1    3141  3226   24   24  29.6 %
  3 Deep Rybka 4.1 x64             :   360.5/601   60.0    3141  3211   23   23  31.1 %
  4 Stockfish 2.1 JA 64bit         :   333.0/600   55.5    3141  3179   23   23  30.7 %
  5 Critter 1.01 64-bit            :   284.0/597   47.6    3141  3124   23   23  29.5 %

+111 +/- 15 Elo points (95% confidence) improvement over Komodo 1.3.

+18 +/- 15 Elo points (95% confidence), +18 +/- 7 Elo points (68% confidence) improvement over Komodo 2.01

Reptile moves swiftly, it seems. Still somewhere between Stockfish and Critter.
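
The Komodo row in the table can be checked against the usual logistic Elo model, in which a score fraction p against opposition of average rating R corresponds to a rating of R - 400*log10(1/p - 1). The Python sketch below is only an illustration: the function names are hypothetical, and the error margin uses a plain normal approximation, which is an assumption and not necessarily how the rating tool behind the table computed its + and - columns.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(score, draw_ratio, games, z=1.96):
    """Approximate +/- Elo margin at z standard errors (normal approximation)."""
    wins = score - draw_ratio / 2.0
    # Variance of a single game result scored 1, 0.5 or 0.
    variance = wins + draw_ratio / 4.0 - score ** 2
    stderr = math.sqrt(variance / games)
    upper = elo_diff(score + z * stderr)
    lower = elo_diff(score - z * stderr)
    return (upper - lower) / 2.0

# Figures taken from the Komodo64 2.03 JA row of the table.
score = 1248.0 / 3000
print(round(3200 + elo_diff(score)))              # rating implied by Av.Op. 3200
print(round(elo_margin(score, 0.293, 3000), 1))   # approximate 95% margin
```

The first value reproduces the 3141 in the Elo column; the second lands close to the +/- 11 reported, within the limits of the approximation.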

Kai

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 7:33 pm
by Laskos
Laskos wrote:The gauntlet with Komodo64 2.03 JA (no SSE42) finished, 3,000 games at 1s + 0.1s (average game length ~15-16 secs).

    Program                                Score      %  Av.Op.   Elo    +    -  Draws

    Komodo64 2.03 JA               : 1248.0/3000   41.6    3200  3141   11   11  29.3 %

  1 Houdini 1.5a x64               :   401.5/601   66.8    3141  3262   25   25  25.5 %
  2 Ivanhoe B47cBx64-1             :   373.0/601   62.1    3141  3226   24   24  29.6 %
  3 Deep Rybka 4.1 x64             :   360.5/601   60.0    3141  3211   23   23  31.1 %
  4 Stockfish 2.1 JA 64bit         :   333.0/600   55.5    3141  3179   23   23  30.7 %
  5 Critter 1.01 64-bit            :   284.0/597   47.6    3141  3124   23   23  29.5 %

+111 +/- 15 Elo points (95% confidence) improvement over Komodo 1.3.

+18 +/- 15 Elo points (95% confidence), +18 +/- 7 Elo points (68% confidence) improvement over Komodo 2.01

Reptile moves swiftly, it seems. Still somewhere between Stockfish and Critter.

Kai
I forgot about the "move overhead milliseconds" issue introduced with version 2.01. It may affect the comparison with version 1.3, hopefully not by much; if it does, my numbers are a lower bound on the improvement over 1.3.

Kai

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 8:07 pm
by lkaufman
You are basically testing 0.1 sec/move, so the overhead default is 20% of the time, which is huge at such fast levels. I think you will see a large increase if you cut the 20 to 1 millisecond, assuming that this does not trigger time forfeits in your GUI. Talking more generally, your testing level is really too fast to compare engines, because you are testing such things as how much overhead the various engines provide to allow for GUI inaccuracy and how much time is spent preparing the program to start searching. If you are trying to compare the engines, you need to use something like 10 seconds plus half a second increment or the equivalent to minimize the importance of these factors that have no effect on play in any normal time limit.
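
To put rough numbers on the overhead argument, here is a small Python sketch. It assumes, as the post does, that at 1s + 0.1s the engine effectively has about the increment to spend on each move; `overhead_fraction` is a hypothetical helper for illustration, not an engine option.

```python
def overhead_fraction(overhead_ms, per_move_ms):
    """Share of the per-move time budget reserved as move overhead."""
    return overhead_ms / per_move_ms

# 1s + 0.1s: roughly 100 ms per move (assumption taken from the post).
default_fast = overhead_fraction(20, 100)   # default 20 ms overhead
reduced_fast = overhead_fraction(1, 100)    # overhead cut to 1 ms

# 10s + 0.5s: at least the 500 ms increment is available per move.
default_slow = overhead_fraction(20, 500)

print(default_fast, reduced_fast, default_slow)
```

Under these assumptions the default reserves 20% of each move's time at the fast control, 1% after the suggested change, and at most 4% at 10s + 0.5s.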

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 10:00 pm
by Laskos
lkaufman wrote:You are basically testing 0.1 sec/move, so the overhead default is 20% of the time, which is huge at such fast levels. I think you will see a large increase if you cut the 20 to 1 millisecond, assuming that this does not trigger time forfeits in your GUI. Talking more generally, your testing level is really too fast to compare engines, because you are testing such things as how much overhead the various engines provide to allow for GUI inaccuracy and how much time is spent preparing the program to start searching. If you are trying to compare the engines, you need to use something like 10 seconds plus half a second increment or the equivalent to minimize the importance of these factors that have no effect on play in any normal time limit.
Don't really know what to say. Don, as I maybe wrongly understood, told me that the difference is not exactly 20ms (or ~25 Elo points in my case), but somewhat smaller and erratic. Using other engines I had no real problems with overheads; my results ended up in line with later results at normal time controls. The difference between engines may be exaggerated because of the smaller number of draws, which was my main problem, but I had no real problems establishing even a tiny difference (say 3 Elo points, ~30,000 games) between two similar versions of the same engine. Even between two completely different engines, the systematic error was less than 10 Elo points. The proposed 10s + 0.5s TC would give games of ~90 seconds; even 3,000 of them would take ~4 days and 30,000 ~40 days, which is beyond my scope.
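
The duration estimate above is simple arithmetic, sketched here in Python. The helper name `run_days` and the `concurrency` parameter are hypothetical; the sketch assumes games are played one after another with no pause between them, so it counts pure playing time only.

```python
def run_days(games, seconds_per_game, concurrency=1):
    """Wall-clock days needed to play the given number of games."""
    return games * seconds_per_game / concurrency / 86400.0

# 10s + 0.5s at roughly 90 seconds per game (figure from the post).
print(run_days(3000, 90))     # 3.125
print(run_days(30000, 90))    # 31.25

# 1s + 0.1s at roughly 15.5 seconds per game, expressed in hours.
print(run_days(3000, 15.5) * 24)
```

Pure playing time comes to about 3 and 31 days, the same order as the ~4 and ~40 days quoted, against roughly half a day for 3,000 games at the fast control.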

Basically, I hope I can cheat on testing.

Kai

Re: Komodo 2.03 release is imminent

Posted: Wed Jun 15, 2011 10:22 pm
by Don
Laskos wrote:
lkaufman wrote:You are basically testing 0.1 sec/move, so the overhead default is 20% of the time, which is huge at such fast levels. I think you will see a large increase if you cut the 20 to 1 millisecond, assuming that this does not trigger time forfeits in your GUI. Talking more generally, your testing level is really too fast to compare engines, because you are testing such things as how much overhead the various engines provide to allow for GUI inaccuracy and how much time is spent preparing the program to start searching. If you are trying to compare the engines, you need to use something like 10 seconds plus half a second increment or the equivalent to minimize the importance of these factors that have no effect on play in any normal time limit.
Don't really know what to say. Don, as I maybe wrongly understood, told me that the difference is not exactly 20ms (or ~25 Elo points in my case), but somewhat smaller and erratic. Using other engines I had no real problems with overheads; my results ended up in line with later results at normal time controls. The difference between engines may be exaggerated because of the smaller number of draws, which was my main problem, but I had no real problems establishing even a tiny difference (say 3 Elo points, ~30,000 games) between two similar versions of the same engine. Even between two completely different engines, the systematic error was less than 10 Elo points. The proposed 10s + 0.5s TC would give games of ~90 seconds; even 3,000 of them would take ~4 days and 30,000 ~40 days, which is beyond my scope.

Basically, I hope I can cheat on testing.

Kai
The overhead was put there as a conservative design decision to prevent losses with slow user interfaces. Your test setup is clearly not slow and neither is ours. We can test with zero overhead and never lose on time, just as you probably can with this tester. But try this with Arena, a very slow interface (if you can even set that kind of time control).

But it's always best if the engine actually knows how much time it has and can depend on that. We estimated that 20 ms might be the average time lost to communication overheads and such, and that at any realistic time control it pretty much amounts to nothing. In 40/2 we don't care if we wasted 1/10 of a second over the entire game.

Of course the 20 ms still stays with the program, as I say, but it is like a slow leak when you are really thirsty: you need it more quickly. The program is better served getting to use it right away. When the game is over, those accumulated 15-20 ms (or whatever is left) form a wasted pool of time.

But I would not worry about it. It's easy to change if you want to, as it's a configurable option, but you don't have to. We just know that you are probably giving up a few Elo by using the default when testing so very fast. No crime has been committed :-)