Testing Stockfish 11-03-13. 480 Games.

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

Kohflote
Posts: 240
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: no more tests?

Post by Kohflote »

Yes, I miss the test....
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: no more tests?

Post by Tomcass »

Thanks for your interest, my friends. I am currently testing the 141013 version of Stockfish with and without Large Pages. I will post the results tonight.

Best regards from Barcelona.

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: no more tests?

Post by Tomcass »

Stockfish 141013 = 480 games.

Bench: 7700683 Timestamp: 1381785869

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control_4+0

Stockfish 141013 64 SSE4.2_ - Critter 1.6a 64-bit_NOB 26.0 - 14.0 +16/=20/-4 65.00%
Stockfish 141013 64 SSE4.2_ - Houdini 3 Pro x64_6 17.5 - 22.5 +6/=23/-11 43.75%
Stockfish 141013 64 SSE4.2_ - Komodo 6 64-bitx6NOB 21.0 - 19.0 +12/=18/-10 52.50%

Time Control_2+2

Stockfish 141013 64 SSE4.2_ - Critter 1.6a 64-bitX6_NOB 30.0 - 10.0 +22/=16/-2 75.00%
Stockfish 141013 64 SSE4.2_ - Houdini 3 Pro x64_6 19.5 - 20.5 +9/=21/-10 48.75%
Stockfish 141013 64 SSE4.2_ - Komodo 6 64-bitx6NOB 20.5 - 19.5 +7/=27/-6 51.25%

240 Games X 6 Cores = http://www.mediafire.com/?435xwe2500w65c0
Score using 6 cores = 134.5-105.5 = 56.04%
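
For anyone who wants to re-check the tables, here is a minimal sketch of how a points total and percentage follow from the +W/=D/-L tally on each result line (a win counts 1 point, a draw 0.5). The helper name `score_from_tally` is hypothetical and not part of any tool used in these tests.

```python
import re

# Hypothetical helper: matches the "+W/=D/-L" tally used in the result lines.
TALLY = re.compile(r"\+(\d+)/=(\d+)/-(\d+)")


def score_from_tally(line):
    """Return (points, games, percent) for the first engine named in the line."""
    match = TALLY.search(line)
    if match is None:
        raise ValueError("no +W/=D/-L tally found in line")
    wins, draws, losses = (int(group) for group in match.groups())
    games = wins + draws + losses
    points = wins + 0.5 * draws  # win = 1 point, draw = half a point
    return points, games, 100.0 * points / games


line = ("Stockfish 141013 64 SSE4.2_ - Critter 1.6a 64-bit_NOB "
        "26.0 - 14.0 +16/=20/-4 65.00%")
points, games, percent = score_from_tally(line)
print(f"{points:.1f}/{games} = {percent:.2f}%")  # prints 26.0/40 = 65.00%
```

Summing the points of the six matches in a block gives the per-machine score quoted for that block.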

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control_4+0

Stockfish 141013 64 SSE4.2_ - Critter 1.6a 64-bitnob_4 26.0 - 14.0 +14/=24/-2 65.00%
Stockfish 141013 64 SSE4.2_ - Houdini 3 Pro x64_4 13.0 - 27.0 +5/=16/-19 32.50%
Stockfish 141013 64 SSE4.2_ - Komodo 6 64-bitx4_NOB 21.5 - 18.5 +9/=25/-6 53.75%

Time Control_2+2

Stockfish 141013 64 SSE4.2_ - Critter 1.6a 64-bitnob_4 28.5 - 11.5 +19/=19/-2 71.25%
Stockfish 141013 64 SSE4.2_ - Houdini 3 Pro x64_4 20.0 - 20.0 +11/=18/-11 50.00%
Stockfish 141013 64 SSE4.2_ - Komodo 6 64-bitx4_NOB 22.0 - 18.0 +9/=26/-5 55.00%


240 games X 4 Cores = http://www.mediafire.com/?scgynm50fk25jgc
Score using 4 Cores = 131.0-109.0 = 54.58%

Segmenting the result by Time Control:

Fixed = 125.0-115.0 = 52.08%
Incremental = 140.5- 99.5 = 58.54%

Global Score: 265.5- 214.5 = 55.31%
Against Critter 1.6a (3093): 69.06% Houdini 3.0 Pro (3172): 43.75% Komodo 6 (3154): 53.12%

Average ELO of opponents = 3140
Estimated ELO Performance = 3177
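
The post does not say how the performance figure is derived. As an assumption, the usual logistic Elo formula applied to the rounded opponent average of 3140 and the global score of 265.5/480 gives the same number. The function name `performance_rating` is hypothetical.

```python
import math


def performance_rating(avg_opponent_elo, score_fraction):
    """Estimate a performance rating with the logistic Elo model (an assumption)."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    # Elo difference implied by the score: 400 * log10(p / (1 - p))
    elo_diff = 400.0 * math.log10(score_fraction / (1.0 - score_fraction))
    return avg_opponent_elo + elo_diff


# Opponent ratings as listed in the post.
opponents = [3093, 3172, 3154]
avg_elo = round(sum(opponents) / len(opponents))  # 3140 after rounding

perf = performance_rating(avg_elo, 265.5 / 480)
print(avg_elo, round(perf))  # prints 3140 3177
```

This is a point estimate only; it says nothing about the error margin of a 480-game sample.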


Again a nice score for Stockfish, about 5 ELO points ahead of Houdini 3.0 Pro.

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: no more tests?

Post by Tomcass »

Stockfish 141013 With Large Pages (Peterpan compile) = 480 games.

Compile + Sources (Prepared by Peterpan) = http://www.mediafire.com/?l4rbw48f6qm6r88

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Large Pages: Allowed.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control_4+0

Stockfish 141013SL 64 SSE4.2L - Critter 1.6a 64-bit_NOB 28.0 - 12.0 +19/=18/-3 70.00%
Stockfish 141013SL 64 SSE4.2L - Houdini 3 Pro x64_6 25.0 - 15.0 +16/=18/-6 62.50%
Stockfish 141013SL 64 SSE4.2L - Komodo 6 64-bitx6NOB 20.0 - 20.0 +9/=22/-9 50.00%

Time Control_2+2

Stockfish 141013SL 64 SSE4.2L - Critter 1.6a 64-bitX6_NOB 25.0 - 15.0 +12/=26/-2 62.50%
Stockfish 141013SL 64 SSE4.2L - Houdini 3 Pro x64_6 22.0 - 18.0 +12/=20/-8 55.00%
Stockfish 141013SL 64 SSE4.2L - Komodo 6 64-bitx6NOB 22.0 - 18.0 +12/=20/-8 55.00%

240 Games X 6 Cores = http://www.mediafire.com/?mtu7wt8y84lpmri
Score using 6 cores = 142.0- 98.0 = 59.17%

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Large Pages Allowed.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control_4+0

Stockfish 141013SL 64 SSE4.2L - Critter 1.6a 64-bitnob_4 21.0 - 19.0 +9/=24/-7 52.50%
Stockfish 141013SL 64 SSE4.2L - Houdini 3 Pro x64_4 22.0 - 18.0 +14/=16/-10 55.00%
Stockfish 141013SL 64 SSE4.2L - Komodo 6 64-bitx4_NOB 22.0 - 18.0 +12/=20/-8 55.00%

Time Control_2+2

Stockfish 141013SL 64 SSE4.2L - Critter 1.6a 64-bitnob_4 22.5 - 17.5 +11/=23/-6 56.25%
Stockfish 141013SL 64 SSE4.2L - Houdini 3 Pro x64_4 23.0 - 17.0 +11/=24/-5 57.50%
Stockfish 141013SL 64 SSE4.2L - Komodo 6 64-bitx4_NOB 25.0 - 15.0 +14/=22/-4 62.50%

240 games X 4 Cores = http://www.mediafire.com/?a33a6g903pa92ax
Score using 4 Cores = 135.5-104.5 = 56.46%

Segmenting the result by Time Control:

Fixed = 138.0-102.0 = 57.50%
Incremental = 139.5 – 100.5 = 58.12%

Global Score: 277.5 – 202.5 = 57.81%
Against Critter 1.6a (3093): 60.31% Houdini 3.0 Pro (3172): 57.50% Komodo 6 (3154): 55.62%
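
The segmented figures can be re-derived from the twelve match scores of the Large Pages run. The sketch below copies the Stockfish points from the tables above and groups them by opponent and by time control; the tuple layout and the `aggregate` helper are illustrative choices, not something taken from the tester's own tooling.

```python
from collections import defaultdict

GAMES_PER_MATCH = 40

# (cores, time control, opponent, Stockfish points), copied from the tables.
MATCHES = [
    (6, "4+0", "Critter", 28.0), (6, "4+0", "Houdini", 25.0), (6, "4+0", "Komodo", 20.0),
    (6, "2+2", "Critter", 25.0), (6, "2+2", "Houdini", 22.0), (6, "2+2", "Komodo", 22.0),
    (4, "4+0", "Critter", 21.0), (4, "4+0", "Houdini", 22.0), (4, "4+0", "Komodo", 22.0),
    (4, "2+2", "Critter", 22.5), (4, "2+2", "Houdini", 23.0), (4, "2+2", "Komodo", 25.0),
]


def aggregate(matches, key_index):
    """Sum Stockfish points and games, grouped by one field of the tuple."""
    totals = defaultdict(lambda: [0.0, 0])
    for match in matches:
        entry = totals[match[key_index]]
        entry[0] += match[3]
        entry[1] += GAMES_PER_MATCH
    return {key: (pts, games) for key, (pts, games) in totals.items()}


by_opponent = aggregate(MATCHES, 2)
by_time_control = aggregate(MATCHES, 1)

for name, (pts, games) in by_opponent.items():
    print(f"{name}: {pts:.1f}/{games} = {100.0 * pts / games:.2f}%")

total_points = sum(match[3] for match in MATCHES)
total_games = GAMES_PER_MATCH * len(MATCHES)
print(f"Global: {total_points:.1f} - {total_games - total_points:.1f} "
      f"= {100.0 * total_points / total_games:.2f}%")
```

Changing `key_index` to 0 groups the same data by core count instead.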

Average ELO of opponents = 3140
Estimated ELO Performance = 3195


New best score ever for Stockfish in my tests. :D :D :D

Only a few comments:

1.- This result is simply amazing. Estimated ELO 23 points ahead of Houdini 3.0 Pro, 41 points ahead of Komodo 6 and 18 points ahead of the same engine without Large Pages.

2.- Please note that, as a simple tester, I don't want to state anything about the use of Large Pages. I don't even know how they work. My only purpose is to offer my tests and the games, so that everyone can form their own opinion.

A big THANKS to the Stockfish Team for their incredible work, and also to my friend Peterpan for this stable compile with Large Pages allowed.

Best regards from Barcelona.

Tom
M ANSARI
Posts: 3726
Joined: Thu Mar 16, 2006 7:10 pm

Re: no more tests?

Post by M ANSARI »

You cannot really test 2 engines against each other while you have LP on with one and not the other. That would be similar to testing 2 engines with one of them on 10% or 15% faster hardware and then assuming that the engine is outperforming the other. If LP are to be turned ON then they should be done for both engines.

For what it's worth, LP are really not worth the trouble, as they quickly fragment your memory. Unless you have a good memory defragmenter and can tell when your memory needs maintenance, it is just too much of a headache. This is especially true if you are engine testing, as you really have no clue when LP are working properly and when they are not. I do see how they can be useful if you absolutely need the highest performance out of your system for long-term analysis ... but even then you could simply improve your cooling and overclock your CPU for easier and better results.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: no more tests?

Post by Tomcass »

Hello M.Ansari.

I partially agree with your comment. The 'face to face' SF-Houdini match should be tested under the same conditions.

But my goal here is to test the development of Stockfish under a 'ceteris paribus' methodology, and to offer my results to the chess community, and obviously to the SF Team, to be evaluated. In other words, for me it is more relevant that SF with LP seems stronger than SF without LP than the exact measure of the difference between SF and Houdini 3.0 Pro. I will continue doing the same in the future.

Thank you very much for your contribution and for your attention to my tests.

Regards,

Tom.
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: no more tests?

Post by beram »

Tomcass wrote: Hello M.Ansari.

I partially agree with your comment. The 'face to face' SF-Houdini match should be tested under the same conditions.

But my goal here is to test the development of Stockfish under a 'ceteris paribus' methodology, and to offer my results to the chess community, and obviously to the SF Team, to be evaluated. In other words, for me it is more relevant that SF with LP seems stronger than SF without LP than the exact measure of the difference between SF and Houdini 3.0 Pro. I will continue doing the same in the future.

Thank you very much for your contribution and for your attention to my tests.

Regards,

Tom.
Hi Tom,

I don't want to offend, but are you sure that you enabled Large Pages in this last test with the Peterpan compile?
Because you also said before that you didn't know how they work.

Nevertheless thx for your testing
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: no more tests?

Post by Tomcass »

Hi Bram!

Not offended at all, of course. I meant that I don't know the internal operation of Large Pages in the computer. But of course I know how to tick "Try Large Pages" in Peterpan's compile. :wink:

Thanks for your constructive comment. Kind regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: no more tests?

Post by Tomcass »

Stockfish 181013 = 480 Games.

Bench: 8440524 Timestamp: 1382114978

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control_4+0

Stockfish 181013 64 SSE4.2_ - Critter 1.6a 64-bitX6_NOB_ 25.0 - 15.0 +14/=22/-4 62.50%
Stockfish 181013 64 SSE4.2_ - Houdini 3 Pro x64_6 21.5 - 18.5 +12/=19/-9 53.75%
Stockfish 181013 64 SSE4.2_ - Komodo 6 64-bitx6NOB 22.5 - 17.5 +10/=25/-5 56.25%

Time Control_2+2

Stockfish 181013 64 SSE4.2_ - Critter 1.6a 64-bitX6_NOB 22.5 - 17.5 +9/=27/-4 56.25%
Stockfish 181013 64 SSE4.2_ - Houdini 3 Pro x64_6 20.0 - 20.0 +11/=18/-11 50.00%
Stockfish 181013 64 SSE4.2_ - Komodo 6 64-bitx6NOB 20.5 - 19.5 +10/=21/-9 51.25%

240 Games X 6 Cores = http://www.mediafire.com/?94axaavgoec0iv1
Score using 6 cores = 132.0- 108.0 = 55.00%

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control_4+0

Stockfish 181013 64 SSE4.2_ - Critter 1.6a 64-bitnob_4 24.0 - 16.0 +13/=22/-5 60.00%
Stockfish 181013 64 SSE4.2_ - Houdini 3 Pro x64_4 20.5 - 19.5 +11/=19/-10 51.25%
Stockfish 181013 64 SSE4.2_ - Komodo 6 64-bitx4_NOB 26.5 - 13.5 +17/=19/-4 66.25%

Time Control_2+2

Stockfish 181013 64 SSE4.2_ - Critter 1.6a 64-bitnob_4 23.0 - 17.0 +12/=22/-6 57.50%
Stockfish 181013 64 SSE4.2_ - Houdini 3 Pro x64_4 24.0 - 16.0 +12/=24/-4 60.00%
Stockfish 181013 64 SSE4.2_ - Komodo 6 64-bitx4_NOB 22.0 - 18.0 +10/=24/-6 55.00%

240 Games X 4 Cores = http://www.mediafire.com/?pnu9tb35bx2pj5c
Score using 4 Cores = 140.0-100.0 = 58.33%

Segmenting the result by Time Control:

Fixed = 140.0-100.0 = 58.33%
Incremental = 132.0-108.0 = 55.00%

Global Score: 272.0 – 208.0 = 56.67%
Against Critter 1.6a (3093): 59.06% Houdini 3.0 Pro (3172): 53.75% Komodo 6 (3154): 57.19%

Average ELO of opponents = 3140
Estimated ELO Performance = 3187


Another brilliant score for Stockfish. The best one without Large Pages and only 8 Estimated Elo Points below the best score reported in my previous test (where Large Pages were used).

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: no more tests?

Post by Tomcass »

M ANSARI wrote: You cannot really test 2 engines against each other while you have LP on with one and not the other. That would be similar to testing 2 engines with one of them on 10% or 15% faster hardware and then assuming that the engine is outperforming the other. If LP are to be turned ON then they should be done for both engines.

For what it's worth, LP are really not worth the trouble, as they quickly fragment your memory. Unless you have a good memory defragmenter and can tell when your memory needs maintenance, it is just too much of a headache. This is especially true if you are engine testing, as you really have no clue when LP are working properly and when they are not. I do see how they can be useful if you absolutely need the highest performance out of your system for long-term analysis ... but even then you could simply improve your cooling and overclock your CPU for easier and better results.
... by the way, M.Ansari, perhaps this link will be useful for you -and for all members that join this excellent forum-:

http://www.cruxis.com/chess/manual/inde ... ersion.htm

You will see that the situation is in fact the opposite of the one you assumed: Houdini 3.0 always uses Large Pages, provided they are available. :wink:

Regards,

Tom.