I would be disappointed with less than 40 Elo improvement for Houdini 3 in IPON.
Either way, we're very close to the 50 Elo gain I "officially" announced. You cannot expect any more precision from any rating list or from my own development testing gauntlets (which measured 50 to 55 Elo).
Robert
I believe the Elo gain would be bigger if we didn't count weaker opponents. Ratings for top engines should be calculated only from the 5 best engines. It is the same for people: Anand and Carlsen would never have such high ratings if they played ordinary open tournaments with many weaker players. They simply would not be able to beat everybody... The playing style of the new Houdini 3 looks really amazing.
I have already received a few similar requests. As I am interested in that myself, I will run a short test to see where it would be in the IPON. Nonetheless, I will not include it officially.
Ingo, after this match, would you mind playing a shorter match with H3 tactical vs non-top engines (let's say 100 games against: Deep Rybka 4.1, Deep Fritz 13 32b, Naum 4.2, Chiron 1.1a, HIARCS 14 WCSC 32b and Gull 1.2)?
I didn't see any rating for H3tactical in games yet ...
That thread is very long, so you will need patience. You can read it and try to draw your own conclusions, though that could be difficult given the large amount of material posted there. Good luck!
Hello,
With 2040 of 10000 games now played in my LS-rankinglist-gauntlet of Houdini 3, the score of Houdini 3 is 68.2%. The final result of Houdini 2.0c after 10000 games against the same 10 strong opponents was 62%.
+6.2% is around +43 Elo. Houdini 3 seems to be very strong, but before drawing any conclusions I want to play the complete 10000 games.
Final result here (hopefully) Sunday evening...
Best - Stefan
Thank you very much for your LS lists. I will stay tuned for the final results!
I also noticed yesterday that improving from 62% to 68.2% is a gain of not 43 Elo but a little more (ignoring the error bars). I did not want to post about it, but I see that Robert also noticed (thanks for Houdini 3 and the table).
Stefan, I recommend that you use the definition of Elo differences itself: if you have two scores of 62% - 38% and 68.2% - 31.8%, then the expected difference is 400*[log10(68.2/31.8) - log10(62/38)] ~ 47.5 Elo. This equation can be applied to any two scores you want, except the extreme cases of 0% - 100% and vice versa, of course.
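For anyone who wants to check the arithmetic, here is a minimal Python sketch of the formula above. The function names `elo_from_score` and `elo_gain` are my own, chosen only for illustration, and the sketch ignores error bars entirely.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (e.g. 0.62 for 62%).

    Uses the logistic Elo definition: 400 * log10(score / (1 - score)).
    Hypothetical helper name, not taken from any rating tool.
    """
    if not 0.0 < score < 1.0:
        # 0% and 100% have no finite Elo difference
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def elo_gain(old_score: float, new_score: float) -> float:
    """Difference between the Elo values implied by two scores
    obtained against the same set of opponents."""
    return elo_from_score(new_score) - elo_from_score(old_score)


if __name__ == "__main__":
    # Houdini 2.0c: 62%, Houdini 3 (after 2040 games): 68.2%
    print(round(elo_gain(0.62, 0.682), 1))  # about 47.5
```

The comparison only makes sense when both scores come from the same opponents, as they do in Stefan's gauntlet.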