Dann Corbit wrote: ↑Sun Jul 05, 2020 9:05 pm The Elo calculation said that the same engine had the same Elo all the way down to the last decimal digit. This is, in fact, the right answer.

No, it is an approximate, and therefore wrong, answer. And it is useless for determining which is best if the difference is less than the rounding error.
Dann Corbit wrote: Now, the fractional part is still in question even after half a million trials. But that does not matter, because Elo is (wisely) reported as an integer.
So, the Elo difference of zero is exactly correct for every single measurement. The hypothesis is confirmed. An engine is not stronger than itself.

And now apply that 'wisdom' to an 'engine' where you set the winning threshold at 0.500001. Or 0.500000001. Now it will also (after rounding) say they are exactly equal. Which they are not. So the Elo answer is wrong. In exchange for the single case in which you artificially forced it to be right, it is now wrong in an infinity of other cases. You succeeded in making Elo calculations unsuitable for engines that are different. "Only to be used when beforehand you are absolutely sure the engines are the same, otherwise you will get wrong answers." Congratulations!
Why not report all ratings in kilo-Elo, and round? Wouldn't that save an awful lot of test games? Have you already suggested it to the Stockfish developers?
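To put rough numbers on the threshold argument, here is a minimal sketch. It assumes the usual logistic Elo model, which this thread does not spell out, and the names `elo_diff` and `report` are made up for illustration. It converts an expected score fraction into an Elo difference and then rounds it, first to whole Elo and then to whole kilo-Elo.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score fraction.

    Assumes the logistic model: score = 1 / (1 + 10 ** (-diff / 400)).
    """
    return 400.0 * math.log10(score / (1.0 - score))

def report(score):
    """Return (exact difference, rounded to Elo, rounded to kilo-Elo)."""
    d = elo_diff(score)
    return d, round(d), round(d / 1000.0)

if __name__ == "__main__":
    # 0.5 is the identical-engine case; the others are slightly stronger engines.
    for score in (0.5, 0.500001, 0.500000001, 0.85):
        exact, as_elo, as_kilo_elo = report(score)
        print(f"score={score}: exact={exact:.3e} Elo, "
              f"rounded={as_elo} Elo, rounded={as_kilo_elo} kilo-Elo")
```

Under that assumption, the engines with thresholds 0.500001 and 0.500000001 have a strictly positive difference that still rounds to 0 Elo, and with kilo-Elo rounding even a large score advantage rounds away in the same manner.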
Dann Corbit wrote: I think maybe it is a good time to retire the discussion. I think LOS is a bad and misleading statistic, and you think it is a fine and accurate statistic.
I even agree that with some possible sets of data, it might return a pretty good answer. But even in those cases, Elo would return a better answer (IMO).
I actually feel very bad about the somewhat acrimonious nature of the debate. That is because I consider hgm (for his work on Winboard) and Ronald (for his work on Syzygy tablebase files and Cfish and other things) to be true heroes of computer chess. I think, as Wesley said, "we are at an impasse" and so future debate will prove equally fruitless.

It is as fruitless as you want it to be. You are obviously not willing to listen to reason, and keep chanting the same falsehoods no matter how often they have been refuted. So yes, discussion about this is, for you personally, utterly pointless.
But that doesn't matter. We discuss in public so others can learn. As long as it is glaringly obvious that literally everything you said is completely detached from any reality, as I think by now it must be 10 times over, we should feel good about it.