Testing Stockfish 11-03-13. 480 Games.

Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Ozymandias »

I think the X6 games from the first test are missing this time:
1st test X4 1.5MB
2nd&3rd test X4 2.9MB
2nd&3rd test X6 2.8MB
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

Hi, Juan.

Thanks again for following my tests so carefully. You are right: I forgot to post games 1 to 240 of the first SF 221214 MZ test on my 6-core computer. Here is the file:

http://www.mediafire.com/download/2mc18 ... amesOK.pgn

Warm regards from Barcelona.

Tom.
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Ozymandias »

Well, at least it wasn't another brain fart on my part :wink:
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH ORCA 281214: 480 Games.

i7 980 3.33 GHz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014t – Limit 8 moves. Sedat –
No tablebases. No RTB used.
Hash 512
Relative Speed: 29.54
Knodes per second: 14.177

Time Control: 4+0

Stockfish 281214 64 POPCNTO - Houdini 4 x64_st_X6_CT0   21.5 - 18.5   +11/=21/-8   53.75%
Stockfish 281214 64 POPCNTO - Komodo 8 64-bit_6_NOB     24.0 - 16.0   +14/=20/-6   60.00%
Stockfish 281214 64 POPCNTO - Gull 3 x64 XP             26.0 - 14.0   +14/=24/-2   65.00%

Time Control: 2+2

Stockfish 281214 64 POPCNTO - Houdini 4 x64_st_X6_CT0   22.5 - 17.5   +11/=23/-6   56.25%
Stockfish 281214 64 POPCNTO - Komodo 8 64-bit_6_NOB     24.0 - 16.0   +15/=18/-7   60.00%
Stockfish 281214 64 POPCNTO - Gull 3 x64 XP             23.0 - 17.0   +11/=24/-5   57.50%

Score using 6 cores: 141.0 – 99.0 = 58.75%
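The 6-core total can be rechecked from the win/draw/loss counts listed above. A minimal Python sketch, with the W/D/L triples copied from the six matches in this post; the function name `match_points` and the shortened opponent labels are only illustrative:

```python
# W/D/L triples for Stockfish 281214 on the 6-core machine, copied from the post.
results_6_cores = {
    ("4+0", "Houdini 4"): (11, 21, 8),
    ("4+0", "Komodo 8"): (14, 20, 6),
    ("4+0", "Gull 3"): (14, 24, 2),
    ("2+2", "Houdini 4"): (11, 23, 6),
    ("2+2", "Komodo 8"): (15, 18, 7),
    ("2+2", "Gull 3"): (11, 24, 5),
}


def match_points(wins, draws, losses):
    """Return (points, games): a win counts 1, a draw 0.5, a loss 0."""
    return wins + 0.5 * draws, wins + draws + losses


total_points = 0.0
total_games = 0
for (tc, opponent), wdl in results_6_cores.items():
    points, games = match_points(*wdl)
    total_points += points
    total_games += games
    print(f"{tc} vs {opponent}: {points:.1f}/{games} = {100 * points / games:.2f}%")

print(f"Total: {total_points:.1f}/{total_games} = {100 * total_points / total_games:.2f}%")
```

The same function applies unchanged to the 4-core results.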

240 Games =
http://www.mediafire.com/download/6g6bm ... 0Games.pgn

i7 975 3.33 GHz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012t –Limit 8 moves. Sedat-
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control: 4+0

Stockfish 281214 64 POPCNTO - Houdini 4 Pro x64-A_OK   23.0 - 17.0   +13/=20/-7   57.50%
Stockfish 281214 64 POPCNTO - Komodo 8 64-bit_4_NOB    23.0 - 17.0   +6/=34/-0    57.50%
Stockfish 281214 64 POPCNTO - Gull 3 x64 XP            26.5 - 13.5   +14/=25/-1   66.25%

Time Control: 2+2

Stockfish 281214 64 POPCNTO - Houdini 4 Pro x64-A_OK   22.0 - 18.0   +6/=32/-2    55.00%
Stockfish 281214 64 POPCNTO - Komodo 8 64-bit_4_NOB    22.5 - 17.5   +10/=25/-5   56.25%
Stockfish 281214 64 POPCNTO - Gull 3 x64 XP            27.0 - 13.0   +19/=16/-5   67.50%

Score using 4 cores: 144.0 – 96.0 = 60.00%
240 Games:
http://www.mediafire.com/download/ui9mx ... 0games.pgn

Segmenting by Time Control:

Fixed TC = 144.0 – 96.0 = 60.00%
Incremental TC = 141.0 – 99.0 = 58.75%

GLOBAL SCORE: 285.0 – 195.0= 59.37%
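The split by time control and the global score can be reproduced by grouping the twelve 40-game matches. This is only a sketch: the tuple layout, the helper `totals_by` and the shortened opponent labels are illustrative choices, and "Houdini 4" here lumps together the two Houdini builds used on the two machines.

```python
from collections import defaultdict

# (machine, time control, opponent, Stockfish points out of 40), copied from the post.
matches = [
    ("6 cores", "4+0", "Houdini 4", 21.5),
    ("6 cores", "4+0", "Komodo 8", 24.0),
    ("6 cores", "4+0", "Gull 3", 26.0),
    ("6 cores", "2+2", "Houdini 4", 22.5),
    ("6 cores", "2+2", "Komodo 8", 24.0),
    ("6 cores", "2+2", "Gull 3", 23.0),
    ("4 cores", "4+0", "Houdini 4", 23.0),
    ("4 cores", "4+0", "Komodo 8", 23.0),
    ("4 cores", "4+0", "Gull 3", 26.5),
    ("4 cores", "2+2", "Houdini 4", 22.0),
    ("4 cores", "2+2", "Komodo 8", 22.5),
    ("4 cores", "2+2", "Gull 3", 27.0),
]
GAMES_PER_MATCH = 40


def totals_by(field_index):
    """Sum points and games, grouped by one column of `matches`."""
    grouped = defaultdict(lambda: [0.0, 0])
    for row in matches:
        grouped[row[field_index]][0] += row[3]
        grouped[row[field_index]][1] += GAMES_PER_MATCH
    return {key: tuple(value) for key, value in grouped.items()}


by_time_control = totals_by(1)
by_opponent = totals_by(2)
global_points = sum(row[3] for row in matches)
global_games = GAMES_PER_MATCH * len(matches)

for label, (points, games) in {**by_time_control, **by_opponent}.items():
    print(f"{label}: {points:.1f}/{games} = {100 * points / games:.2f}%")
print(f"Global: {global_points:.1f}/{global_games}")
```

Passing `0` to `totals_by` gives the per-machine totals instead.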

Against: Houdini 4.0 St. Ct0 (3227) = 55.62%; Komodo 8 (3266) = 58.44%; Gull 3 XP (3199) = 64.06%

Average Estimated Elo of Opponents = 3231
Estimated Elo Performance (EEP) = 3297

Error bars = +/- 23 EEP
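For readers who want to recheck the performance figure, here is a rough sketch using the standard logistic Elo formula. The post does not say how the error bars were obtained, so the margin below rests on assumptions of my own: independent games, a normal approximation and z = 1.96 (roughly 95%). The function names are illustrative.

```python
import math


def elo_diff(score_fraction):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


def performance_and_margin(wins, draws, losses, opponents_elo, z=1.96):
    """Return (performance rating, minus margin, plus margin).

    Assumption: games are treated as independent and the margin is a normal
    approximation at z standard errors (z=1.96 is roughly 95%).
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0 points).
    variance = (wins + 0.25 * draws) / games - score ** 2
    std_error = math.sqrt(variance / games)
    centre = elo_diff(score)
    low = elo_diff(score - z * std_error)
    high = elo_diff(score + z * std_error)
    return opponents_elo + centre, centre - low, high - centre


# Totals over all 480 games, summed from the twelve matches in the post.
rating, minus, plus = performance_and_margin(144, 282, 54, opponents_elo=3231)
print(f"Performance: {rating:.0f} (-{minus:.0f} / +{plus:.0f})")
```

The centre value agrees with the 3297 quoted above. The margin from this sketch need not match the +/- 23 in the post, since a different method or confidence level may have been used there.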

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

THE RANKING OF MY TESTS.

Some computer chess friends asked me to post a ranking of my tests. I have decided to start 2015 with this ranking, obtained under my testing conditions on two computers with 6 and 4 cores.

Testing conditions:

i7 980 3.33 GHz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014t – Limit 8 moves. Sedat –
No tablebases. No RTB used.
Hash 512
Relative Speed: 29.54
Knodes per second: 14.177

Time Control = 4+0 and 2+2

And

i7 975 3.33 GHz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012t –Limit 8 moves. Sedat-
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control: 4+0 and 2+2.

GLOBAL RANKING OF MY TESTS AT 01 JANUARY 2015

1.- Stockfish 221214 Marco Z 3209 (1440 games)
2.- Stockfish Development 091114 3201 (1440 games)
3.- Stock-fish-6 6st 3200 (1440 games)
4.- Stockfish Orca 281214 3197 (480 games)
5.- Stockfish Rockwood 190514 3183 (480 games)
6.- Stockfish 5 3172 (960 games)
7.- Komodo 8 3153 (8480 games)
8.- DON 271114 3139 (480 games)
9.- Houdini 4 3136 (10240 games)
10.- Komodo 7 3107 (1760 games)
11.- Gull 3 XP 3103 (10240 games)
12.- Houdini 3 3072
13.- Amitis 16102013 3052
14.- Fire 4 3047
15.- Stockfish 4 3019
16.- Critter 1.6a 3004
17.- Deep Rybka 4.1 2915

Some comments:

1.- You will notice that I have decided to lower all my ratings by 100 Elo points. I am sure that this estimate is closer to reality and more consistent with other rankings (more ‘professional’ than mine).

2.- The best Stockfish compiles are usually between 10 and 12 Elo points better than the Development versions. They provide a good indicator of the strength of SF Development some weeks later.

3.- Some of the engines have been tested intensively, with over 8,000 games. In these cases, the error bars have been reduced to only about +/- 5.
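The shrinking error bars in comment 3 follow from the margin falling with the square root of the number of games. A rough sketch under assumptions that are not from the post: independent games, a 50% expected score, a 60% draw ratio, and z = 1.96 for roughly 95% confidence. The function name `elo_margin` is illustrative.

```python
import math


def elo_margin(games, draw_ratio=0.6, z=1.96):
    """Approximate +/- Elo margin for an even match of `games` games.

    Assumptions (not from the post): independent games, a 50% expected
    score, a fixed draw ratio, and z=1.96 for roughly 95% confidence.
    """
    variance = (1.0 - draw_ratio) / 4.0  # per-game variance at a 50% score
    std_error = math.sqrt(variance / games)
    elo_per_score = 1600.0 / math.log(10)  # slope of the Elo curve at 50%
    return z * std_error * elo_per_score


for n in (480, 1440, 8480, 10240):
    print(f"{n:>6} games: +/- {elo_margin(n):.1f} Elo")
```

Quadrupling the number of games halves the margin, which is why going from 480 games to several thousand makes such a visible difference.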

Since my ranking is a ‘pure amateur’ one, I would appreciate any comments or suggestions to improve it. Thanks in advance, and my best wishes for an EXCELLENT 2015 to all my computer chess friends.

Regards from Barcelona,

Tom.
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by beram »

Thanks Tom, keep up the good work.
One question: is the Houdini 4 in your list running with contempt=0?
And what is the measured difference between Houdini 4 and Houdini 4 ct=0 in your testing? I remember it was about 17 Elo.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

beram wrote:
Thanks Tom, keep up the good work.
One question: is the Houdini 4 in your list running with contempt=0?
And what is the measured difference between Houdini 4 and Houdini 4 ct=0 in your testing? I remember it was about 17 Elo.
Thanks for your kind words, Bram!

I decided to use two different Houdini 4 builds for my tests. On my 6-core computer I use Houdini 4 Standard with Contempt 0, whereas on my 4-core computer I use Houdini 4 Pro x64-A with the default Contempt.

Your useful question made me compare the scores of both Houdinis. I remember posting that H4 ct0 was better under my testing conditions, with Stockfish as a more frequent rival. Now the figures tell me that I was wrong. With 5,120 games played by each engine under the same testing conditions, the ratings I have found are:

Houdini Standard Contempt 0= 3236
Houdini Pro x64-A Contempt Default= 3237

That is, both settings have scored almost exactly the same in my tests. I assume that this comparison is not 100% precise, because HSC0 has played all its games on my 6-core computer and HP-A-CDefault has played all its games on my 4-core computer. The scalability of Houdini 4 and of its rivals would have to be considered to be a bit more precise. With the information currently available, both Houdini 4 settings have the same strength under my testing conditions.
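One way to see why a 1-point gap counts as "the same strength" is to compare it with the combined uncertainty of the two ratings. In this sketch the two ratings come from the post, but the +/- 6 margins are hypothetical placeholders (the post gives none for these runs), and the errors are assumed to be independent.

```python
import math


def difference_with_margin(rating_a, margin_a, rating_b, margin_b):
    """Return (difference, combined margin) for two independent estimates."""
    return rating_a - rating_b, math.hypot(margin_a, margin_b)


# Ratings from the post; the +/- 6 margins are hypothetical placeholders.
diff, margin = difference_with_margin(3237, 6.0, 3236, 6.0)
distinguishable = abs(diff) > margin
print(f"Difference: {diff:+d} Elo, combined margin +/- {margin:.1f}, "
      f"distinguishable: {distinguishable}")
```

With any realistic margin for a few thousand games, a 1 Elo difference stays well inside the noise.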

Best regards from Barcelona,

Tom.
Kohflote
Posts: 240
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Kohflote »

Thank you, Tom. I never miss your posts and I rely on your test results. Keep up the good work.

Best wishes,
Koh, Kah Huat
carldaman
Posts: 2287
Joined: Sat Jun 02, 2012 2:13 am

Re: Testing Stockfish 11-03-13. 480 Games.

Post by carldaman »

Very nice testing, Tom :)

I think the latest SF dev version is ripe for an official release! :)

Have a great New Year!

Thanks,
CL
Modern Times
Posts: 3755
Joined: Thu Jun 07, 2012 11:02 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Modern Times »

Bram, the final result of my 6CPU match was:

KOMODO 1339 vs STOCKFISH 221214

189.0 - 211.0 ( +55, =268, -77 ) 47.25%


Details here:
http://kirill-kryukov.com/chess/discuss ... f=7&t=7907

I've posted it here rather than continue in the TCEC thread in the General section in case the mods start to get annoyed about posting match results there...
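For the Komodo 1339 vs Stockfish 221214 result above (+55 =268 -77), the Elo difference and the likelihood of superiority (LOS) can be estimated as follows. This is a sketch using the common normal-approximation LOS formula, which looks only at decisive games; the function name is illustrative and none of this comes from the linked match report.

```python
import math


def elo_and_los(wins, draws, losses):
    """Return (Elo difference, likelihood of superiority) for the first engine.

    LOS uses the common normal approximation based only on decisive games;
    draws do not change which side is more likely to be stronger.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    elo = 400.0 * math.log10(score / (1.0 - score))
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return elo, los


# Komodo 1339's result against Stockfish 221214, from the post: +55 =268 -77.
elo, los = elo_and_los(55, 268, 77)
print(f"Komodo: {elo:+.0f} Elo, LOS {100 * los:.1f}%")
```

Read from Stockfish's side the signs flip: about +19 Elo, with an LOS of roughly 97%.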