open source battle IvanHoe v Stockfish at long time-control

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

Carlos Ylich
Posts: 175
Joined: Wed Apr 28, 2010 9:31 pm
Location: Brazil

Re: open source battle IvanHoe v Stockfish at long time-cont

Post by Carlos Ylich »

Great job, Pal. Thanks!! :D
PawnStormZ
Posts: 880
Joined: Mon Feb 15, 2010 6:43 am

open source battle IvanHoe v Stockfish at long time-cont

Post by PawnStormZ »

 
You are welcome, Matthias. Thank you for the ChessGUI that is used to run the matches!

Take care
 
PawnStormZ
Posts: 880
Joined: Mon Feb 15, 2010 6:43 am

open source battle IvanHoe v Stockfish at long time-cont

Post by PawnStormZ »

 
I am not trying to fight with you, George. I would just like you to say which IvanHoe you think is the best, if you have an opinion. Listing 7 "that are better than what I am using" and saying that you could name 4 or 5 more really does not help.

If you have a suggestion, I will take a look for it. If not, I will continue with this one, which Norman Schmidt said was about as good as any other version.

Take care
 
PawnStormZ
Posts: 880
Joined: Mon Feb 15, 2010 6:43 am

open source battle IvanHoe v Stockfish at long time-cont

Post by PawnStormZ »

 
Thank you, Carlos. I am glad that you enjoyed the games.

Take care
 
IGarcia
Posts: 543
Joined: Mon Jul 05, 2010 10:27 pm

Re: open source battle IvanHoe v Stockfish at long time-cont

Post by IGarcia »

Thanks very much for the games and all the efforts.

PawnStormZ wrote:  
               The 200 games are available here...
   
                           http://www.datafilehost.com/download-9efabfcb.html

 
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: open source battle IvanHoe v Stockfish at long time-cont

Post by Don »

Jouni wrote: I have a feeling 200 games was overkill. In the first 100 games SF won 52-48, and in the last 100 it won 50.5-49.5. Did it give us any new information? I doubt it.
I expected Stockfish to win this match because it was played at a longer time control. Ivanhoe would have won a faster match, and it would have won a really fast match by a big margin.
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: open source battle IvanHoe v Stockfish at long time-cont

Post by beram »

Dear Don, predicting match results is not your strong point.
Look at the results for Ivanhoe 9.46h: on the CCRL 40/4 list it scores 46.7% against Stockfish 2.2.2.
On the CCRL 40/40 list it scores 45.1%.
So, conclusively, at the shorter time control Ivanhoe loses to Stockfish much more clearly!

grts Bram
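
Editorial sketch: the two percentages above are easier to compare once converted to rating differences. The snippet below uses the standard logistic Elo formula; the function name `elo_diff` is made up for illustration, and the conversion ignores the error margins of the underlying lists.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Scores quoted in the post above, as percentages for Ivanhoe.
for pct in (46.7, 45.1):
    print(f"{pct}% -> {elo_diff(pct / 100):+.0f} Elo")
```

A negative value means the engine scoring below 50% is rated lower by that many points under this model.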
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: open source battle IvanHoe v Stockfish at long time-cont

Post by Uri Blass »

The CCRL 40/4 results are with 6 CPUs, not 4, so they are not relevant to this discussion, because I believe that Stockfish gains more from additional CPUs.

I believe 40/4 with 6 CPUs is also not what Don meant by a faster match.
He probably had in mind something like 40 moves in 1 minute with 4 CPUs, while a really fast match with 4 CPUs is probably something like 40 moves in 10 seconds.

Uri
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: open source battle IvanHoe v Stockfish at long time-cont

Post by beram »

Well, perhaps this data is more to your liking: Stockfish - Ivanhoe at ultrafast TC: + 81 = 120 - 67
http://www.talkchess.com/forum/viewtopi ... +stockfish


pal larkin
Joined: 15 Feb 2010
Posts: 831

Posted: Sun Jan 22, 2012 5:53 am
Post subject: 800 10-sec games: RobboLito - IvanHoe - Stockfish
         This match was basically to test if the new JA 2.2.2 sse4.2 "Intel" compile would no longer lose on time.  Well, in its 536 games there were 8 losses on time by Stockfish.  I do not know if that (1.5%) would be considered good or bad.  I guess there is still at least a small problem since neither of the others had any time losses. 
  
           The time-control was game in 10 seconds plus 1 second per move on Intel core I7 920 at 2.66GHz using 4 cores HT off; Win 7 64 Home Premium; 6G RAM; pc doing nothing but the match while a game is being played. 

           Each engine used its default settings; no large pages; no end-game tablebases; 512 hash each; 4 cores each; ponder off.  The openings are played using random choices from the "top200.pgn" created by Sedat Canbaz as part of his "perfect_2011" opening book; each opening is played as Black and White by each engine. 
  
           The below results are adjusted to change 5 of the 8 losses (3 v Robbo; 2 v Ivan) from losses to draws based on the evals of the final positions.  The original result remains in the pgn which can be downloaded below. 
  
           The surprise for me was that Stockfish defeated both Ivanhoe and RobboLito in their head-to-head matches, and came a close 2nd overall in these super-fast games; RobboLito beating IvanHoe was not expected either. 
  
     Engine                      Pts      RobboLito 0.10     Stockfish 2.2.2    IvanHoe999946h
1    RobboLito 0.10 SMP x64      279.0                       +71 =123 -74       +51 =191 -26
2    Stockfish 2.2.2 JA SSE42    276.5    +74 =123 -71                          +81 =120 -67
3    IvanHoe999946h x64          248.5    +26 =191 -51       +67 =120 -81

                           Games here...    http://www.datafilehost.com/download-dccf8772.html 
PawnStormZ
Posts: 880
Joined: Mon Feb 15, 2010 6:43 am

open source battle IvanHoe v Stockfish at long time-cont

Post by PawnStormZ »

Don wrote:
Jouni wrote: I have a feeling 200 games was overkill. In the first 100 games SF won 52-48, and in the last 100 it won 50.5-49.5. Did it give us any new information? I doubt it.
I expected Stockfish to win this match because it was played at a longer time control. Ivanhoe would have won a faster match, and it would have won a really fast match by a big margin.
 
You know, Don, lately I have been wondering what, if anything, these "tests" actually tell us.

I had become interested in all the IvanHoe versions and wanted to see which was the strongest. I ran a tourney of 2800 30-second + 1 sec games among 8 of the versions that I had. I took the top 4, re-ran them with 15-second games, and got 2 to "test" further (B52aF and B46fC, if anyone cares).

I decided to play with the many parameter settings to see if I could make the engine stronger. I made a copy of B52aF and started changing settings and playing matches against a default version of the same engine. These were 400-game matches at 5 seconds + 0.25.

Most of the changes resulted in a close loss, by 5 or 6 games. Then I found one that won by 18 games. I thought I was "on to something" and made further changes, which fell back to losing. I wanted to get back to my "winner" and try something else, but something told me to first retry the exact same changes that had won, to see if they would win by a similar margin a second time.

Let me say that all the matches were run under exactly the same conditions, and the openings were even played in the same order. Guess what? NO improvement by my modified version the second time; it even lost by 4 or 5 games! I know you "statistics guys" are probably laughing at me, but I was truly surprised (and disappointed by what this means for test results).

So, if a 400-game match could end so differently under the exact same conditions, then my 200-game match between Stockfish and IvanHoe, which ended only +5 for SF, probably does not mean anything at all! If the same match were run again, it might end up with Ivan winning by 8. Some of the changes that I made to Ivan which "failed" might just as well have shown as "better" if the matches were re-run, so what did I "learn"?

Does anyone have any "facts" on how many games are really needed to get "meaningful" results? I know Bob Hyatt runs 30,000 games to test things, but even if each game lasted only 1 minute, that would take 3 weeks of 24/7 running to test just one change (lacking a cluster here  :) ). Some changes (like split-depth?) probably need longer games to show their full effect.

Is there even any point in publishing results from 400- or 500-game matches, let alone the 20- or 50-game ones that are common here? Do they give us any "information" at all?    :(
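
Editorial sketch: the question of how many games are "enough" can be roughed out with a normal approximation, assuming the games are independent (which repeated openings and time-loss quirks may violate). The helper names `match_stats` and `games_needed` are made up for illustration. The first call uses the + 81 = 120 - 67 Stockfish - IvanHoe result quoted earlier in the thread, since that is the only head-to-head with a full win/draw/loss split given. The second asks how many games it would take to resolve a 51.25% score (the 102.5 / 200 result implied by "+5 for SF"), using a per-game standard deviation of 0.5, which is the no-draws upper bound and therefore a pessimistic assumption.

```python
import math

def match_stats(wins, draws, losses, z=1.96):
    """Return (score, margin) as fractions of the games played.

    margin is the half-width of an approximate 95% interval (z = 1.96).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of a result that is 1, 0.5 or 0 points.
    var = (wins * 1.0 + draws * 0.25) / n - score ** 2
    margin = z * math.sqrt(var / n)
    return score, margin

def games_needed(edge, per_game_sd, z=1.96):
    """Games needed before the margin shrinks to the given edge over 50%."""
    return math.ceil((z * per_game_sd / edge) ** 2)

score, margin = match_stats(81, 120, 67)
print(f"score {score:.3f} +/- {margin:.3f}")
print("games to resolve a 51.25% score:", games_needed(0.0125, 0.5))
```

Draws reduce the per-game spread, so a realistic draw rate lowers the required count, but under these assumptions it stays in the thousands for an edge this small.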