Our blitz results are showing +6 Elo, but with only 600 games so far you can't rely on that. Still, in our test so far it scored 47% against Stockfish 8 and 49.5% against Houdini 5. I like Andreas's testing because of the large number of games and the consequently low error margins.
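To put rough numbers on why a few hundred games isn't enough, here is a minimal sketch that converts a score percentage into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo formula and a normal approximation; the 300-game count and the 50% draw ratio in the example are placeholders I picked for illustration, not figures from any of the tests mentioned here.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(score, games, draw_ratio=0.0, z=1.96):
    """Return (low, centre, high) Elo for a match result.

    draw_ratio is the fraction of drawn games; 0.0 gives the widest
    (most pessimistic) interval. z=1.96 corresponds to roughly 95%.
    """
    # Per-game variance of the result (win=1, draw=0.5, loss=0).
    variance = score * (1.0 - score) - draw_ratio / 4.0
    margin = z * math.sqrt(variance / games)
    return (elo_from_score(score - margin),
            elo_from_score(score),
            elo_from_score(score + margin))

# Assumed inputs: 47% over 300 games with half the games drawn.
low, centre, high = elo_interval(0.47, games=300, draw_ratio=0.5)
print(f"{centre:+.1f} Elo, roughly {low:+.1f} to {high:+.1f}")
```

Rerunning it with a larger `games` value shows how quickly the interval narrows, which is the point about large test runs.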
I'm running this at chess960 so will see how it goes there.
Nice test, Andreas, as always!
Waiting for the 60'+15" test of K11.2.
I found an average Elo gain of 6 per version since K9.42, according to Fastgm's 10 minutes + 6 seconds rating list:
JJJ wrote:
Komodo 11.01 vs Stockfish 8 : 45.8% 300 games
Komodo 11.2 vs Stockfish 8 : 47.2% 300 games
Komodo 11.01 vs Houdini 5 : 48.7% 300 games
Komodo 11.2 vs Houdini 5 : 49.7% 300 games
I see progress here. I think it needs more games for both versions to know better.
Also, maybe this time Komodo gained more Elo at bullet than at medium or long time controls.
It is possible that we fixed some things that matter more in bullet chess than in long tc chess. But I note that the early CEGT 40/20 results are encouraging, so perhaps there is no problem other than the general rule that rating gains contract with greater TC due to more draws.
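The point about gains contracting as draws increase can be illustrated with a toy model. This is only a sketch under a simplifying assumption: the stronger engine keeps the same share of the decisive games (55% here, a made-up figure) while the draw ratio rises. Real engines won't behave this neatly.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_gap(decisive_win_share, draw_ratio):
    """Elo gap when a fixed share of decisive games is won.

    decisive_win_share: fraction of non-drawn games won (hypothetical input).
    draw_ratio: fraction of all games that end in a draw.
    """
    score = 0.5 * draw_ratio + (1.0 - draw_ratio) * decisive_win_share
    return elo_from_score(score)

# Same 55% share of decisive games, increasing draw ratio.
for draws in (0.0, 0.3, 0.5, 0.7, 0.9):
    print(f"draw ratio {draws:.0%}: {elo_gap(0.55, draws):+.1f} Elo")
```

In this model the measured gap shrinks steadily as the draw ratio goes up, even though the relative strength in decisive games never changes.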
It is also quite possible that the openings are more drawish in some tests than in others.