Testing Stockfish 11-03-13. 480 Games.

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Tomcass »

STOCKFISH 16-05-13: 480 GAMES

I7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012
Time control: 4 min +0 sec/ game
No tablebases

SF160513 64x6 - Critter 1.6 64-bitx6_nob 9-24-7 21.0-19.0 52.50%
SF160513 64x6 - Deep Rybka 4.1 SSE42 x64 (x6) 17-19-4 26.5-13.5 66.25%
SF160513 64x6 - Houdini 3 Pro x64_ 6-25-9 18.5-21.5 46.25%

I7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012
Time control: 2 min +2 sec/ game
No tablebases

SF160513 64x6 - Critter 1.6 64-bitx6_nob 5-33-2 21.5-18.5 53.75%
SF160513 64x6 - Deep Rybka 4.1 SSE42 x64 (x6) 12-19-9 21.5-18.5 53.75%
SF160513 64x6 - Houdini 3 Pro x64_6 6-19-15 15.5-24.5 38.75%

120 games AX6 = http://www.mediafire.com/?3nd6cq2lz6tqt26
120 games B X6 = http://www.mediafire.com/?k22db0pmcm79705

Overall average with 6 cores: (124.5-115.5) = 51.87%

i7 975 3.33 Ghz.
4 real cores
GUI: Fritz 12
Book: Fritz 12
Time control: 4 min + 0 sec/ game
Ponder: Off
No tablebases

SF160513 64 - Critter 1.6a 64-bitnob_4 7-24-9 19.0-21.0 47.50%
SF160513 64 - Deep Rybka 4 x64 (x4 13-19-8 22.5-17.5 56.25%
SF160513 64 - Houdini 3 Pro x64_4 6-23-11 17.5-22.5 43.75%

i7 975 3.33 Ghz.
4 real cores
GUI: Fritz 12
Book: Fritz 12
Time control: 2 min + 2 sec/ game
Ponder: Off
No tablebases

SF160513 64 - Critter 1.6a 64-bitnob_4 13-21-6 23.5-16.5 58.75%
SF160513 64 - Deep Rybka 4 x64 4 6-29-5 20.5-19.5 51.25%
SF160513 64 - Houdini 3 Pro x64_4 3-30-7 18.0-22.0 45.00%

240 games X4 = http://www.mediafire.com/?g8anqkjn6w83xc3

Overall average with 4 cores: (121.0-119.0) = 50.42%

Global Score: 245.5-234.5 = 51.15%

Against Critter 1.6: 53.12% Deep Rybka 4: 56.87% Houdini 3.0: 43.44%
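
For readers who want to recheck the tables, here is a minimal sketch of how a match record turns into points and a percentage. It assumes the three-number column is wins-draws-losses from Stockfish's side (an inference from the arithmetic, not something stated in the post); the function name is only illustrative.

```python
def score_from_wdl(wins, draws, losses):
    """Return (points_for, points_against, percentage) for a W-D-L record."""
    games = wins + draws + losses
    points_for = wins + 0.5 * draws        # a draw is worth half a point
    points_against = losses + 0.5 * draws
    return points_for, points_against, 100.0 * points_for / games


# Records copied from the 6-core, 4 min/game table above.
# Assumption: the order is wins-draws-losses for Stockfish.
records = {
    "Critter 1.6": (9, 24, 7),
    "Deep Rybka 4.1": (17, 19, 4),
    "Houdini 3 Pro": (6, 25, 9),
}

for opponent, wdl in records.items():
    pts_for, pts_against, pct = score_from_wdl(*wdl)
    print(f"{opponent}: {pts_for:.1f}-{pts_against:.1f} {pct:.2f}%")
```

Summing the points over all rows of a machine and dividing by the 240 games played gives the "overall average" lines.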


The second-best result in my tests, behind only the 27/04/13 version.

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Tomcass »

SUMMARY OF STOCKFISH PERFORMANCE.

A couple of friends of mine have asked me to post a summary of my tests with Stockfish versions, to give a global picture of the progress in the development of this engine.

Here is the summary:

In about six weeks, I tested SF on my two computers (4 and 6 real cores), whose details you can find in my previous posts. The opponents are Critter 1.6, Deep Rybka 4 and Houdini 3.0 Pro. Total games played: 6,240. Half of them were played at 4 minutes/game, the other half at 2 minutes + 2 seconds. Average nodes per game (for Stockfish): around 2.5 billion.

Versions and Global score in percentage

April 7 50.83%
Apr 10 48.96%
(Ipman) Apr 12 50.00%
Apr 14 49.69%
Apr 19 49.07%
Apr 27 52.08%
May 2 50.10%
May 3 48.33%
(Ipman) May 5 50.62%
May 9 48.23%
May 11 50.84%
May 15 49.89%
May 16 51.15%

After 6,240 games, starting with the April 7 version:

Average score, all 13 tests: 49.97%
Average score, first 3 tests: 49.93%
Average score, latest 3 tests: 50.63%
Improvement, latest 3 tests over first 3 tests: 0.70%.
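
The averages above can be rechecked with a short sketch like the following. It works from the rounded percentages in the list, so small differences in the last digit are to be expected; the variable names are only illustrative.

```python
# Global scores (%) per tested version, copied from the list above,
# in chronological order (April 7 ... May 16).
scores = [50.83, 48.96, 50.00, 49.69, 49.07, 52.08, 50.10,
          48.33, 50.62, 48.23, 50.84, 49.89, 51.15]


def mean(values):
    """Plain arithmetic mean; every test has the same 480 games."""
    return sum(values) / len(values)


first3 = mean(scores[:3])    # first three versions tested
last3 = mean(scores[-3:])    # three most recent versions
overall = mean(scores)

print(f"first 3: {first3:.2f}%  last 3: {last3:.2f}%  gain: {last3 - first3:.2f}")
print(f"all {len(scores)} tests: {overall:.2f}%")
```

An unweighted mean is appropriate here only because each test has the same number of games. The overall figure computed from the rounded values may differ from the posted one by about 0.01, presumably because the posted value was computed from unrounded scores.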

I will keep testing SF intensively for as long as my 4-core computer permits. Running non-stop, it is already showing some overheating problems. I know the concept of statistical noise well, but as the sample gets larger this noise becomes progressively smaller. I would appreciate it if any of my friends here in the forum could translate this improvement from 49.93% to 50.63% into Elo points, please. Thanks.

Best regards from Barcelona.

Tom.
Ajedrecista
Posts: 2122
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Ajedrecista »

Hello Tom:
Tomcass wrote: I would appreciate if any of my friends here in the forum can translate this improvement from 49.93% to 50.63% to ELO points, please. Thanks.

The improvement you asked about: 400*[log(0.5063/0.4937) - log(0.4993/0.5007)] ~ 4.86 Elo, with log in base 10. Of course I did not calculate error bars; for a difference like this one, the error bar should be the square root of the sum of the squares of the two individual error bars, if I am not wrong.
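
To give a feeling for the size of those error bars, here is a rough sketch, not a measurement. It assumes independent games, a score near 50%, a draw ratio of 0.58 and 3 x 480 games behind each average; the draw ratio and all function names are assumptions made only for illustration.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score in (0, 1) under the logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))


def score_std_error(score, draw_ratio, games):
    """1-sigma error of a mean score when each game yields 1, 0.5 or 0.

    Per-game variance = E[X^2] - E[X]^2 = (score - draw_ratio / 4) - score^2.
    """
    per_game_variance = score - draw_ratio / 4.0 - score ** 2
    return math.sqrt(per_game_variance / games)


def combine_in_quadrature(err_a, err_b):
    """Error bar of a difference of two independent estimates."""
    return math.sqrt(err_a ** 2 + err_b ** 2)


# Assumed inputs (hypothetical, not taken from the game files).
err = score_std_error(score=0.50, draw_ratio=0.58, games=3 * 480)
elo_err = elo_from_score(0.50 + err)               # 1-sigma of one average, in Elo
diff_err = combine_in_quadrature(elo_err, elo_err)  # 1-sigma of the difference

print(f"1-sigma per average: {elo_err:.1f} Elo, on the difference: {diff_err:.1f} Elo")
```

Under these assumptions the one-sigma error on the difference is of the same order as the gain itself, so the 4.86 Elo figure should be read with caution.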

For such small improvements with scores near 50%, you can apply the following rule of thumb: (Elo gain) ~ 7*(score gain in percentage points); or, more exactly: [16/ln(10)]*(score gain in percentage points) ~ 6.9487*(score gain in percentage points). You will see that the results are almost the same.
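
A small sketch comparing the exact formula with the two rules of thumb for the numbers in this thread. The function names are made up for the example; the factor 16/ln(10) is the slope of the Elo curve at a 50% score, expressed per percentage point.

```python
import math


def exact_elo_gain(score_before, score_after):
    """Elo gain between two scores given as fractions in (0, 1)."""
    def elo(score):
        return 400.0 * math.log10(score / (1.0 - score))
    return elo(score_after) - elo(score_before)


def rule_of_thumb_gain(gain_in_percent, factor=16.0 / math.log(10)):
    """Linear approximation, reasonable only for scores close to 50%."""
    return factor * gain_in_percent


exact = exact_elo_gain(0.4993, 0.5063)
approx = rule_of_thumb_gain(0.70)             # factor 16/ln(10) ~ 6.9487
rough = rule_of_thumb_gain(0.70, factor=7.0)  # rounded factor

print(f"exact {exact:.2f}  16/ln(10) rule {approx:.2f}  factor-7 rule {rough:.2f}")
```

The approximation degrades as scores move away from 50%, because the logistic curve is no longer close to its tangent there.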

Thanks for your continuous tests. :)

Regards from Spain.

Ajedrecista.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Tomcass »

Thanks for your detailed explanation, Jesús.

Almost 5 Elo points is a good improvement in only six weeks. At this rate SF will reach Houdini 3.0 Pro level in about... I prefer not to calculate it and to keep enjoying my tests. :wink:

A big hug!

Tomàs.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Tomcass »

STOCKFISH 19-05-13 (version ended in number 13) : 480 GAMES

I7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012
Time control: 4 min +0 sec/ game
No tablebases

SF190513 64x6 - Critter 1.6 64-bitx6_nob 9-24-7 21.0-19.0 52.50%
SF190513 64x6 - Deep Rybka 4.1 SSE42 x64 (x6) 13-24-3 25.0-15.0 62.50%
SF190513 64x6 - Houdini 3 Pro x64_ 7-16-17 15.0-25.0 37.50%

I7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012
Time control: 2 min +2 sec/ game
No tablebases

SF190513 64x6 - Critter 1.6 64-bitx6_nob 7-23-10 18.5-21.5 46.25%
SF190513 64x6 - Deep Rybka 4.1 SSE42 x64 (x6) 9-23-8 20.5-19.5 51.25%
SF190513 64x6 - Houdini 3 Pro x64_6 5-15-20 12.5-27.5 31.25%

X6 190513 240 games: http://www.mediafire.com/?5je1slv4b9usj85

Overall average with 6 cores: (112.5-127.5) = 46.87%

i7 975 3.33 Ghz.
4 real cores
GUI: Fritz 12
Book: Fritz 12
Time control: 4 min + 0 sec/ game
Ponder: Off
No tablebases

SF190513 64 - Critter 1.6a 64-bitnob_ 6-27-7 19.5-20.5 48.75%
SF190513 64 - Deep Rybka 4 x64 (x4 8-26-6 21.0-19.0 52.50%
SF190513 64 - Houdini 3 Pro x64_4 1-30-9 16.0-24.0 40.00%

i7 975 3.33 Ghz.
4 real cores
GUI: Fritz 12
Book: Fritz 12
Time control: 2 min + 2 sec/ game
Ponder: Off
No tablebases

SF190513 64 - Critter 1.6a 64-bitnob_4 10-25-5 22.5-17.5 56.25%
SF190513 64 - Deep Rybka 4 x64 4 8-26-6 21.0-19.0 52.50%
SF190513 64 - Houdini 3 Pro x64_ 5-27-8 18.5-21.5 46.25%

X4 190513 240 games: http://www.mediafire.com/?7efuwfmjy3n3y1n

Overall average with 4 cores: (118.5-121.5) = 49.38%

Global Score: 231.0-249.0 = 48.13%

Against Critter 1.6: 50.94% Deep Rybka 4: 54.69% Houdini 3.0: 38.75%


A poor score this time. Statistical noise for sure. :)

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Tomcass »

Thanks to all my friends on this forum for the 5,000 views of my tests in this thread.

By the way ... the current test seems really promising! :-)

Best regards from Barcelona!

Tom.
Kohflote
Posts: 240
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Kohflote »

Hi Tom,

For my part, it is I who should be thanking you for your testing effort. Thank you!!

"By the way ... the current test seems really promising!"

- Could you please clarify which version of SF "current test" refers to? I thought the latest available version was the May 19, 2013 one, which you have already tested and shared the results for.

Best wishes,
Koh, Kah Huat
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Tomcass »

Hi Koh,

There are 4 versions of SF dated 19-05-2013. I first tested this one, with a poor result:

-------------------------------------
Author: Marco Costalba
Date: Sun May 19 13:28:25 2013 +0200
Timestamp: 1368962905

Mimic an iterator for looping across MoveList

Seems more conventional.

No functional change.
-------------------------------------

And finally I tested the latest one:

-------------------------------------
Author: Marco Costalba
Date: Sun May 19 22:00:49 2013 +0200
Timestamp: 1368993649

Microptimize MoveList loop

Add MOVE_NONE at the tail, this allows to loop
across MoveList checking for *it != MOVE_NONE,
and because *it is used imediately after compiler
is able to reuse it.

With this small patch perft speed increased of 3%

And it is also a semplification !

No functional change.
-----------------------------------

The result I will post in a few minutes therefore corresponds to the latest version available on the Stockfish development site.

Kind regards from Barcelona.

Tom.
Kohflote
Posts: 240
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Kohflote »

Hi Tom,

Thank you for the clarification and testing. Your effort is much appreciated :D

Best wishes,
Koh, Kah Huat
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 15-03-13. 480 Games.

Post by Tomcass »

STOCKFISH 19-05-13 (latest version of 19 May in SF development site) : 480 GAMES

I7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012
Time control: 4 min +0 sec/ game
No tablebases

SF190513 64x6 - Critter 1.6 64-bitx6_nob 11-25-4 23.5-16.5 58.75%
SF190513 64x6 - Deep Rybka 4.1 SSE42 x64 (x6) 10-22-8 21.0-19.0 52.50%
SF190513 64x6 - Houdini 3 Pro x64 7-19-14 16.5-23.5 41.25%

I7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012
Time control: 2 min +2 sec/ game
No tablebases

SF190513 64x6 - Critter 1.6 64-bitx6_nob 9-26-5 22.0-18.0 55.00%
SF190513 64x6 - Deep Rybka 4.1 SSE42 x64 (x6) 8-25-7 20.5-19.5 51.25%
SF190513 64x6 - Houdini 3 Pro x64_ 7-19-14 16.5-23.5 41.25%

240 games x6 = http://www.mediafire.com/?dkcx2m71tw3p8p7

Overall average with 6 cores: (120.0-120.0) = 50.00%

i7 975 3.33 Ghz.
4 real cores
GUI: Fritz 12
Book: Fritz 12
Time control: 4 min + 0 sec/ game
Ponder: Off
No tablebases

SF190513 64 - Critter 1.6a 64-bitnob 9-29-2 23.5-16.5 58.75%
SF190513 64 - Deep Rybka 4 x64 (x4 18-21-1 28.5-11.5 71.25%
SF190513 64 - Houdini 3 Pro x64_4 5-23-12 16.5-23.5 41.25%

i7 975 3.33 Ghz.
4 real cores
GUI: Fritz 12
Book: Fritz 12
Time control: 2 min + 2 sec/ game
Ponder: Off
No tablebases

SF190513 64 - Critter 1.6a 64-bitnob_4 11-28-1 25.0-15.0 62.50%
SF190513 64 - Deep Rybka 4 x64 13-22-5 24.0-16.0 60.00%
SF190513 64 - Houdini 3 Pro x64_ 4-25-11 16.5-23.5 41.25%

240 games X4 = http://www.mediafire.com/?7f0pihbq3035s3t

Overall average with 4 cores: (134.0-106.0) = 55.83%

Global Score: 254.0-226.0 = 52.92%

Against Critter 1.6: 58.75% Deep Rybka 4: 58.75% Houdini 3.0: 41.25%


Although statistical noise has probably worked in Stockfish's favour in this test:

NEW BEST SCORE FOR STOCKFISH IN MY TESTS

Best regards from Barcelona.

Tom.