Testing Stockfish 11-03-13. 480 Games.

Kohflote
Posts: 240
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Kohflote »

Dear Tom,

If SF280614 is measured over 480 games (the same opponents and the same 480 games faced by SF041014 and SF071014), may I ask what its estimated Elo performance is?

Thank you for your hard and good work.

Best regards,
Koh, Kah Huat
pohl4711
Posts: 2819
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Testing Stockfish 11-03-13. 480 Games.

Post by pohl4711 »

Kohflote wrote: Dear Tom,

If SF280614 is measured over 480 games (the same opponents and the same 480 games faced by SF041014 and SF071014), may I ask what its estimated Elo performance is?

Thank you for your hard and good work.

Best regards,
Koh, Kah Huat

Take a look at my website: http://spcc.beepworld.de
There you will find Elo ratings for many Stockfish development versions, based on 5000 games each (!). Note that on my website the dates of Stockfish dev versions are written like a version number, i.e. backwards (140628 = June 28, 2014).
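
Two date conventions are in play here: year-first tags such as 140628 on the website, and day-first tags such as 280614 in this thread's test reports. A minimal sketch for converting between them, assuming every tag is exactly six digits; the function names are illustrative and not part of any tool mentioned here:

```python
from datetime import datetime


def parse_yymmdd(tag: str) -> datetime:
    """Parse a year-first tag such as '140628' (the website's style)."""
    return datetime.strptime(tag, "%y%m%d")


def parse_ddmmyy(tag: str) -> datetime:
    """Parse a day-first tag such as '280614' (the style used in this thread)."""
    return datetime.strptime(tag, "%d%m%y")


# Both tags name the same Stockfish development build date.
print(parse_yymmdd("140628").date())  # 2014-06-28
print(parse_ddmmyy("280614").date())  # 2014-06-28
```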

Stefan
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

Kohflote wrote: Dear Tom,

If SF280614 is measured over 480 games (the same opponents and the same 480 games faced by SF041014 and SF071014), may I ask what its estimated Elo performance is?

Thank you for your hard and good work.

Best regards,
Koh, Kah Huat
Hello my friend,

Thanks for your kind words.

If you take the first test as a reference, the EEP is 3283, but the larger the number of games, the greater the precision of the test. For this reason I suggest you take all 1920 games (4 tests) as the best estimate of SF 280614's performance: 3280.
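
To put a rough number on that gain in precision, here is a sketch that assumes the error bar shrinks with the square root of the number of games. That assumption is consistent with the +/- 23 EEP quoted for 480 games and the +/- 16 EEP quoted for 960 games in this thread, but the exact method behind those error bars is not stated, so treat the 1920-game figure as an extrapolation. The function name is illustrative.

```python
import math


def scaled_error_bar(base_error: float, base_games: int, games: int) -> float:
    """Scale an error bar from base_games to games, assuming 1/sqrt(N) behaviour."""
    return base_error * math.sqrt(base_games / games)


# Start from the +/- 23 EEP reported for a single 480-game test.
for n in (480, 960, 1920):
    print(n, round(scaled_error_bar(23, 480, n), 1))
# 480 23.0
# 960 16.3
# 1920 11.5
```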

Best regards from Barcelona.

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

pohl4711 wrote:
Kohflote wrote: Dear Tom,

If SF280614 is measured over 480 games (the same opponents and the same 480 games faced by SF041014 and SF071014), may I ask what its estimated Elo performance is?

Thank you for your hard and good work.

Best regards,
Koh, Kah Huat

Take a look at my website: http://spcc.beepworld.de
There you will find Elo ratings for many Stockfish development versions, based on 5000 games each (!). Note that on my website the dates of Stockfish dev versions are written like a version number, i.e. backwards (140628 = June 28, 2014).

Stefan
Hi, Stefan.

Your website is an excellent reference for lovers of computer testing. Well done! My tests are, of course, less organized than yours, although I hope they add some value for our colleagues in this forum. The number of downloads is substantial. :-)

The main difference is the length of the games: 3 and a half minutes per game in your tests and almost 9 minutes per game in mine. The second biggest difference is the number of cores used: 4 and 6 in my tests.

Keep up your great work, Stefan!

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH DEVELOPMENT 231014

Timestamp: 1414084211 Bench: 6816504

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014t – Limit 8 moves. Sedat –
No tablebases. No RTB used.
Hash 512
Relative Speed: 29.54
Knodes per second: 14.177

Time Control = 4+0

Stockfish 231014 64 SSE4.2D - Houdini 4 x64_st_X6_CT0 22.0 - 18.0 +9/=26/-5 55.00%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_6_NOB 21.5 - 18.5 +10/=23/-7 53.75%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 25.5 - 14.5 +15/=21/-4 63.75%

Time Control= 2+2

Stockfish 231014 64 SSE4.2D - Houdini 4 x64_st_X6_CT0 20.5 - 19.5 +5/=31/-4 51.25%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_6_NOB 23.0 - 17.0 +10/=26/-4 57.50%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 25.0 - 15.0 +14/=22/-4 62.50%


Score using 6 cores: 137.5 – 102.5= 57.29%
240 Games =
http://www.mediafire.com/view/e9s70ii1d ... stTest.pgn
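
Each result line above follows the same pattern: points for, points against, wins/draws/losses, then the percentage. The sketch below is a small sanity check for such lines, assuming a win counts 1 point and a draw half a point. The regular expression and the function name are illustrative and only tested against the line format used in this thread.

```python
import re

# Matches the tail of a result line, e.g. "22.0 - 18.0 +9/=26/-5 55.00%".
RESULT = re.compile(
    r"(?P<score>\d+\.\d)\s*-\s*(?P<against>\d+\.\d)\s+"
    r"\+(?P<w>\d+)/=(?P<d>\d+)/-(?P<l>\d+)\s+(?P<pct>\d+\.\d+)%"
)


def check_result(line: str) -> bool:
    """Return True if points, W/D/L counts and percentage agree with each other."""
    m = RESULT.search(line)
    if m is None:
        raise ValueError("not a result line")
    wins, draws, losses = int(m["w"]), int(m["d"]), int(m["l"])
    games = wins + draws + losses
    points = wins + draws / 2  # win = 1 point, draw = 0.5 point
    return (
        points == float(m["score"])
        and games - points == float(m["against"])
        and abs(100 * points / games - float(m["pct"])) < 0.01
    )


line = ("Stockfish 231014 64 SSE4.2D - Houdini 4 x64_st_X6_CT0 "
        "22.0 - 18.0 +9/=26/-5 55.00%")
print(check_result(line))  # True
```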

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012t –Limit 8 moves. Sedat-
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control: 4+0

Stockfish 231014 64 SSE4.2D - Houdini 4 Pro x64-A_OK 26.0 - 14.0 +16/=20/-4 65.00%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_4_NOB 22.5 - 17.5 +11/=23/-6 56.25%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 25.5 - 14.5 +16/=19/-5 63.75%

Time Control= 2+2

Stockfish 231014 64 SSE4.2D - Houdini 4 Pro x64-A_OK 20.5 - 19.5 +7/=27/-6 51.25%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_4_NOB 21.5 - 18.5 +13/=17/-10 53.75%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 25.5 - 14.5 +14/=23/-3 63.75%

Score using 4 cores: 141.5 – 98.5= 58.96%
240 Games:
http://www.mediafire.com/view/71av034v9 ... 0games.pgn

Segmenting by Time Control:

Fixed TC = 143.0 – 97.0 = 59.58%
Incremental TC = 136.0 – 104.0 = 56.67%

GLOBAL SCORE: 279.0 – 201.0 = 58.12%

Against : Houdini 4.0 St. Ct0 (3227) = 55.62% ; Komodo 8 (3266) = 55.31%, Gull 3 XP (3199) = 63.44%

Average Estimated Elo Opponents = 3231
Estimated Elo Performance= 3288


Error bars= +/- 23 EEP
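
The method used to obtain the Estimated Elo Performance is not spelled out in this thread. As a hedged sketch, the standard logistic Elo formula applied to the global score and the average opponent rating reproduces the figures above; the function name is illustrative.

```python
import math


def elo_performance(avg_opponent_elo: float, score_fraction: float) -> float:
    """Performance rating under the logistic Elo model (assumed, not confirmed)."""
    return avg_opponent_elo + 400 * math.log10(score_fraction / (1 - score_fraction))


# Opponent ratings as listed in the report: Houdini 4, Komodo 8, Gull 3.
opponents = [3227, 3266, 3199]
avg = round(sum(opponents) / len(opponents))

# Global score of this test: 279.0 points out of 480 games.
print(avg, round(elo_performance(avg, 279.0 / 480)))  # 3231 3288
```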

This engine seems a bit stronger than my reference: SF280614. It has performed 8 Estimated Elo Points better, obviously within error bars. Let's test it again.

Kind regards from Barcelona.

Tom.
Kohflote
Posts: 240
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Kohflote »

Thank you, Tom.

Dear Stefan, I do visit your website. Personally, I prefer longer time controls.

Best regards,
Koh, Kah Huat
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH DEVELOPMENT 231014 2nd. Test (Games 481 to 960)

Timestamp: 1414084211 Bench: 6816504

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014t – Limit 8 moves. Sedat –
No tablebases. No RTB used.
Hash 512
Relative Speed: 29.54
Knodes per second: 14.177

Time Control = 4+0

Stockfish 231014 64 SSE4.2D - Houdini 4 x64_st_X6_CT0 24.5 - 15.5 +15/=19/-6 61.25%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_6_NOB 19.5 - 20.5 +6/=27/-7 48.75%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 23.5 - 16.5 +10/=27/-3 58.75%

Time Control= 2+2

Stockfish 231014 64 SSE4.2D - Houdini 4 x64_st_X6_CT0 21.5 - 18.5 +8/=27/-5 53.75%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_6_NOB 22.5 - 17.5 +11/=23/-6 56.25%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 28.0 - 12.0 +19/=18/-3 70.00%

Score using 6 cores: 139.5 – 100.5= 58.12%

240 Games =
http://www.mediafire.com/view/hd096oc6e ... ndTest.pgn

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012t –Limit 8 moves. Sedat-
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control: 4+0

Stockfish 231014 64 SSE4.2D - Houdini 4 Pro x64-A_OK 22.0 - 18.0 +11/=22/-7 55.00%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_4_NOB 24.0 - 16.0 +14/=20/-6 60.00%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 27.0 - 13.0 +17/=20/-3 67.50%

Time Control= 2+2

Stockfish 231014 64 SSE4.2D - Houdini 4 Pro x64-A_OK 26.0 - 14.0 +13/=26/-1 65.00%
Stockfish 231014 64 SSE4.2D - Komodo 8 64-bit_4_NOB 21.0 - 19.0 +6/=30/-4 52.50%
Stockfish 231014 64 SSE4.2D - Gull 3 x64 XP 22.0 - 18.0 +8/=28/-4 55.00%

Score using 4 cores: 142.0 – 98.0= 59.17%

240 Games:
http://www.mediafire.com/view/a5h1sd8ft ... ndtest.pgn

Segmenting by Time Control:

Fixed TC = 140.5 – 99.5 = 58.54%
Incremental TC = 141.0 – 99.0 = 58.75%

GLOBAL SCORE: 281.5 – 198.5 = 58.65%

Against : Houdini 4.0 St. Ct0 (3227) = 58.75% ; Komodo 8 (3266) = 54.37%, Gull 3 XP (3199) = 62.81%

Average Estimated Elo Opponents = 3231
Estimated Elo Performance= 3292


Error bars= +/- 23 EEP

----------------------------------------

SUMMARY OF TESTING STOCKFISH DEVELOPMENT 231014: 960 GAMES

GLOBAL SCORE: 560.5 – 399.5 = 58.39%

Average Estimated Elo Opponents = 3231
Estimated Elo Performance= 3290

Error bars= +/- 16 EEP


This 960-game test shows an improvement of 10 EEP over my SF Development reference so far, SF280614.
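
The 960-game summary can be rebuilt by pooling the points and games of the two 480-game tests before converting to a rating. A sketch under the same assumption of a logistic Elo model (the thread does not state the formula actually used), with the reference value 3280 taken from the earlier SF280614 estimate:

```python
import math

# (points scored, games played) for the first and second SF231014 tests.
tests = [(279.0, 480), (281.5, 480)]
AVG_OPPONENT = 3231  # average estimated Elo of the three opponents
REFERENCE_EEP = 3280  # SF280614 over 1920 games, as quoted in this thread

points = sum(p for p, _ in tests)
games = sum(g for _, g in tests)
score = points / games

# Logistic Elo model: an assumption that happens to match the reported values.
pooled_eep = round(AVG_OPPONENT + 400 * math.log10(score / (1 - score)))

print(points, games)                # 560.5 960
print(pooled_eep)                   # 3290
print(pooled_eep - REFERENCE_EEP)   # 10
```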

Kind regards from Barcelona.

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH DEVELOPMENT 271014 (First test: Games 1 to 480)

Timestamp: 1414410524 Bench: 6615949

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014t – Limit 8 moves. Sedat –
No tablebases. No RTB used.
Hash 512
Relative Speed: 29.54
Knodes per second: 14.177

Time Control = 4+0

Stockfish 271014 64 SSE4.2_D - Houdini 4 x64_st_X6_CT0 20.5 - 19.5 +8/=25/-7 51.25%
Stockfish 271014 64 SSE4.2_D - Komodo 8 64-bit_6_NOB 21.0 - 19.0 +8/=26/-6 52.50%
Stockfish 271014 64 SSE4.2_D - Gull 3 x64 XP 25.5 - 14.5 +14/=23/-3 63.75%

Time Control= 2+2

Stockfish 271014 64 SSE4.2_D - Houdini 4 x64_st_X6_CT0 21.5 - 18.5 +5/=33/-2 53.75%
Stockfish 271014 64 SSE4.2_D - Komodo 8 64-bit_6_NOB 22.0 - 18.0 +7/=30/-3 55.00%
Stockfish 271014 64 SSE4.2_D - Gull 3 x64 XP 21.5 - 18.5 +6/=31/-3 53.75%

Score using 6 cores: 132.0 – 108.0= 55.00%

240 Games =
http://www.mediafire.com/view/8b9j8mfj2 ... stTest.pgn

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012t –Limit 8 moves. Sedat-
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control: 4+0

Stockfish 271014 64 SSE4.2_4 - Houdini 4 Pro x64-A_OK 23.5 - 16.5 +10/=27/-3 58.75%
Stockfish 271014 64 SSE4.2_4 - Komodo 8 64-bit_4_NOB 22.0 - 18.0 +10/=24/-6 55.00%
Stockfish 271014 64 SSE4.2_4 - Gull 3 x64 XP 23.5 - 16.5 +16/=15/-9 58.75%

Time Control= 2+2

Stockfish 271014 64 SSE4.2_4 - Houdini 4 Pro x64-A_OK 21.0 - 19.0 +7/=28/-5 52.50%
Stockfish 271014 64 SSE4.2_4 - Komodo 8 64-bit_4_NOB 22.0 - 18.0 +8/=28/-4 55.00%
Stockfish 271014 64 SSE4.2_4 - Gull 3 x64 XP 28.0 - 12.0 +18/=20/-2 70.00%

Score using 4 cores: 140.0 – 100.0= 58.33%
240 Games:
http://www.mediafire.com/view/23r0gj8t7 ... 0games.pgn

Segmenting by Time Control:

Fixed TC = 136.0 – 104.0= 56.67%
Incremental TC = 136.0 – 104.0= 56.67%

GLOBAL SCORE: 272.0 – 208.0 = 56.67%

Against : Houdini 4.0 St. Ct0 (3227) = 54.06% ; Komodo 8 (3266) = 54.37%, Gull 3 XP (3199) = 61.56%

Average Estimated Elo Opponents = 3231
Estimated Elo Performance= 3278


Error bars= +/- 23 EEP
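
The subtotals by cores, time control and opponent all come from the same twelve 40-game matches, so they can be cross-checked mechanically. A minimal sketch using the match scores of this test; the tuple layout and the helper name are illustrative choices, not part of any testing tool:

```python
from collections import defaultdict

GAMES_PER_MATCH = 40

# (cores, time control, opponent, points scored by Stockfish 271014)
rows = [
    (6, "4+0", "Houdini", 20.5), (6, "4+0", "Komodo", 21.0), (6, "4+0", "Gull", 25.5),
    (6, "2+2", "Houdini", 21.5), (6, "2+2", "Komodo", 22.0), (6, "2+2", "Gull", 21.5),
    (4, "4+0", "Houdini", 23.5), (4, "4+0", "Komodo", 22.0), (4, "4+0", "Gull", 23.5),
    (4, "2+2", "Houdini", 21.0), (4, "2+2", "Komodo", 22.0), (4, "2+2", "Gull", 28.0),
]


def subtotal(rows, key_index):
    """Sum the points column grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)


print(subtotal(rows, 0))  # by cores: {6: 132.0, 4: 140.0}
print(subtotal(rows, 1))  # by time control: {'4+0': 136.0, '2+2': 136.0}
print(subtotal(rows, 2))  # by opponent: {'Houdini': 86.5, 'Komodo': 87.0, 'Gull': 98.5}
```

Every grouping must add up to the same global score (272.0 here), which makes a stale or mistyped subtotal easy to spot.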

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH DEVELOPMENT 271014 (Second test: Games 481 to 960)

Timestamp: 1414410524 Bench: 6615949

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014t – Limit 8 moves. Sedat –
No tablebases. No RTB used.
Hash 512
Relative Speed: 29.54
Knodes per second: 14.177

Time Control = 4+0

Stockfish 271014 64 SSE4.2_D - Houdini 4 x64_st_X6_CT0 23.0 - 17.0 +12/=22/-6 57.50%
Stockfish 271014 64 SSE4.2_D - Komodo 8 64-bit_6_NOB 22.0 - 18.0 +10/=24/-6 55.00%
Stockfish 271014 64 SSE4.2_D - Gull 3 x64 XP 25.0 - 15.0 +13/=24/-3 62.50%

Time Control= 2+2

Stockfish 271014 64 SSE4.2_D - Houdini 4 x64_st_X6_CT0 23.0 - 17.0 +12/=22/-6 57.50%
Stockfish 271014 64 SSE4.2_D - Komodo 8 64-bit_6_NOB 23.5 - 16.5 +10/=27/-3 58.75%
Stockfish 271014 64 SSE4.2_D - Gull 3 x64 XP 22.5 - 17.5 +8/=29/-3 56.25%

Score using 6 cores: 139.0 – 101.0= 57.92%

240 Games =
http://www.mediafire.com/view/u2gblgu39 ... ndTest.pgn

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012t –Limit 8 moves. Sedat-
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control: 4+0

Stockfish 271014 64 SSE4.2_4 - Houdini 4 Pro x64-A_OK 23.5 - 16.5 +10/=27/-3 58.75%
Stockfish 271014 64 SSE4.2_4 - Komodo 8 64-bit_4_NOB 24.0 - 16.0 +11/=26/-3 60.00%
Stockfish 271014 64 SSE4.2_4 - Gull 3 x64 XP 26.0 - 14.0 +14/=24/-2 65.00%

Time Control= 2+2

Stockfish 271014 64 SSE4.2_4 - Houdini 4 Pro x64-A_OK 23.0 - 17.0 +10/=26/-4 57.50%
Stockfish 271014 64 SSE4.2_4 - Komodo 8 64-bit_4_NOB 17.5 - 22.5 +2/=31/-7 43.75%
Stockfish 271014 64 SSE4.2_4 - Gull 3 x64 XP 28.0 - 12.0 +16/=24/-0 70.00%

Score using 4 cores: 142.0 – 98.0= 59.17%

240 Games:
http://www.mediafire.com/view/gqblq9yuu ... Test__.pgn

Segmenting by Time Control:

Fixed TC = 143.5 – 96.5 = 59.79%
Incremental TC = 137.5 – 102.5= 57.29%

GLOBAL SCORE: 281.0 – 199.0 = 58.54%

Against : Houdini 4.0 St. Ct0 (3227) = 57.81% ; Komodo 8 (3266) = 54.37%, Gull 3 XP (3199) = 63.44%

Average Estimated Elo Opponents = 3231
Estimated Elo Performance= 3291


Error bars= +/- 23 EEP



SUMMARY AFTER 960 GAMES:

GLOBAL SCORE: 553.0 – 407.0 = 57.60%

Against : Houdini 4.0 St. Ct0 (3227) = 55.93% ; Komodo 8 (3266) = 54.37%, Gull 3 XP (3199) = 62.50%

Average Estimated Elo Opponents = 3231
Estimated Elo Performance= 3284

Error bars= +/- 16 EEP


Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH MZ 271014 480 GAMES

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014t – Limit 8 moves. Sedat –
No tablebases. No RTB used.
Hash 512
Relative Speed: 29.54
Knodes per second: 14.177

Time Control = 4+0

SF271014MZ x64x6 - Houdini 4 x64_st_X6_CT0 20.5 - 19.5 +5/=31/-4 51.25%
SF271014MZ x64x6 - Komodo 8 64-bit_6_NOB 22.5 - 17.5 +11/=23/-6 56.25%
SF271014MZ x64x6 - Gull 3 x64 XP 25.5 - 14.5 +14/=23/-3 63.75%

Time Control= 2+2

SF271014MZ x64x6 - Houdini 4 x64_st_X6_CT0 21.5 - 18.5 +9/=25/-6 53.75%
SF271014MZ x64x6 - Komodo 8 64-bit_6_NOB 23.0 - 17.0 +10/=26/-4 57.50%
SF271014MZ x64x6 - Gull 3 x64 XP 26.5 - 13.5 +15/=23/-2 66.25%

Score using 6 cores: 139.5 – 100.5 = 58.12%

240 Games =
http://www.mediafire.com/view/p8gq67xzi ... mes_X6.pgn

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012t –Limit 8 moves. Sedat-
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control: 4+0

SF271014MZ x64x4 - Houdini 4 Pro x64-A_OK 22.0 - 18.0 +9/=26/-5 55.00%
SF271014MZ x64x4 - Komodo 8 64-bit_4_NOB 23.0 - 17.0 +9/=28/-3 57.50%
SF271014MZ x64x4 - Gull 3 x64 XP 23.5 - 16.5 +13/=21/-6 58.75%

Time Control= 2+2

SF271014MZ x64x4 - Houdini 4 Pro x64-A_OK 20.5 - 19.5 +8/=25/-7 51.25%
SF271014MZ x64x4 - Komodo 8 64-bit_4_NOB 22.0 - 18.0 +8/=28/-4 55.00%
SF271014MZ x64x4 - Gull 3 x64 XP 23.5 - 16.5 +11/=25/-4 58.75%

Score using 4 cores: 134.5 – 105.5= 56.04%
240 Games:
http://www.mediafire.com/view/do6cxh5oo ... 0games.pgn

Segmenting by Time Control:

Fixed TC = 137.0 – 103.0= 57.08%
Incremental TC = 137.0 – 103.0= 57.08%

GLOBAL SCORE: 274.0 – 206.0 = 57.08%

Against : Houdini 4.0 St. Ct0 (3227) = 52.81% ; Komodo 8 (3266) = 56.56%, Gull 3 XP (3199) = 61.87%

Average Estimated Elo Opponents = 3231
Estimated Elo Performance= 3281


Error bars= +/- 23 EEP

The score of SF271014MZ in this test (3281) is very similar to that of SF271014 Development (3284).

Regards,

Tom.