Nice arguing, but the guys at Playchess argue exactly the other way around: they want a book to be perfect with a particular engine. Using such a book for an engine test is useless ...
Fixed starting positions (especially if unknown), played with colors reversed, give the best results for an engine test. I am deeply convinced of that!
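The paired-game scheme described above (each fixed position played twice with colors swapped) can be sketched as a small test harness. This is a minimal illustration, not any real engine API: `play_game` is a hypothetical callback standing in for however games actually get run.

```python
def run_paired_match(positions, engine_a, engine_b, play_game):
    """Play each starting position twice, swapping colors, so that
    neither engine gains a color or opening advantage.

    `play_game(white, black, pos)` is a hypothetical callback that
    returns the score for White (1.0, 0.5, or 0.0).
    Returns engine_a's overall score in [0, 1]."""
    score_a = 0.0
    for pos in positions:
        score_a += play_game(engine_a, engine_b, pos)        # A has White
        score_a += 1.0 - play_game(engine_b, engine_a, pos)  # colors reversed
    return score_a / (2 * len(positions))
```

With this pairing, any built-in advantage of one color or one opening cancels out: for example, if White always won every game, both engines would still score exactly 50%.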
Bye
Ingo
George Tsavdaris wrote:
IWB wrote:
Graham Banks wrote:
Sedat Canbaz wrote: By not allowing access to the games, probably he (Ingo) is afraid that the book authors will steal/copy his opening lines.
Ingo only operates Shredder in major tournaments. I don't think he writes the book as well.
Absolutely right. I have no idea about books, but I know that books are completely in vain (or might even harm the result) if you want to test an engine!
I don't agree.
I agree that testing without books, with a very large set of different starting positions, probably tells you which engine is the strongest in general, and therefore which is the strongest for analysis purposes. But that is not the whole story.
Let's suppose that under your conditions, that is, with a big set of random positions, Houdini 1.5 is 3010 ELO and Rybka 4 is 2955 ELO. Can we then say that Houdini is a generally stronger program than Rybka by all means?
No! ►Perhaps there is an opening book, let's call it OPB1, that, used in your tournament under the same conditions as your list, makes Rybka 4 stronger than Houdini 1.5: let's say it gives her 3105 ELO against Houdini 1.5's 3010.
Now let's use a different opening book for Houdini, and another one, and another one, until we find the one that fits it best and makes it strongest. Let's do the same with the other engines on your list.
►So perhaps there is a configuration of opening books OPB1, OPB2, OPB3, etc. for Rybka 4, Houdini 1.5, Stockfish 1.9.1, etc. that makes Rybka 4 reach, e.g., an ELO of 3090 and Houdini an ELO of 3020, and no opening book for Houdini can be found (or can exist) that beats Rybka 4's ELO when Rybka uses OPB1.
So our verdict would be that Rybka 4 + OPB1 is a stronger chess-playing entity than Houdini 1.5 (with any opening book that exists).
Houdini 1.5 can't beat Rybka 4 + OPB1 no matter what.
These two ways of testing (with and without opening books) are completely different in kind, but both are useful and interesting.
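For a sense of what Elo gaps like the hypothetical ones above mean in practice, the standard logistic Elo model converts a rating difference into an expected score. A minimal sketch (the 95-point gap is just the 3105 vs. 3010 figure from the example):

```python
def expected_score(elo_diff):
    """Expected score for the higher-rated side under the
    standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400.0))

# A 95-Elo edge (3105 vs. 3010) corresponds to an expected
# score of roughly 63% for the stronger entity.
print(round(expected_score(3105 - 3010), 3))  # → 0.633
```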
You can't. Please have a look at the rules and the general remarks at the end. I know this is degrading the list, but that's the way I like to keep it.
Thanks for your understanding
Ingo
For your engine rating list to be credible, you have to share your games.
That's not necessarily true. If the games are shared, so is the opening book that was used. Once the opening book is revealed there are more possibilities for the authors to take advantage of. In this case the opening book is very small so it would be easily possible to tune a program to do better.
The biggest issue with credibility is trusting the person setting up the tests, not seeing the games. I could rig the results in many ways if I were unethical and still let you see the games. The easiest way is to play each game twice and, if I wanted "Komodo" to win, pick out from each pair the game where Komodo scored best. It's sort of like golf where you give yourself a mulligan on every shot you don't like.
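How much would that "mulligan" trick inflate a score? A toy simulation makes it concrete. The model is an assumption for illustration only: two equally strong engines, each game an independent coin flip, so the honest score should be 50%.

```python
import random

random.seed(42)

def play_pair():
    # Toy model: each game in a pair is an independent coin flip
    # between two equally strong engines (1 = "Komodo" wins).
    return random.choice([0, 1]), random.choice([0, 1])

honest, rigged = 0, 0
n = 10_000
for _ in range(n):
    g1, g2 = play_pair()
    honest += g1 + g2          # report both games of the pair
    rigged += 2 * max(g1, g2)  # keep only Komodo's better game

print(honest / (2 * n))  # ≈ 0.50 — the true score
print(rigged / (2 * n))  # ≈ 0.75 — inflated by cherry-picking
```

Keeping the better game of each pair turns a dead-even 50% into roughly 75%, which is why trust in the tester matters more than access to the PGNs.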
Ingo has no motivation for doing that. He is basically testing Shredder for the benefit of the Shredder team, not for us, but is generous enough to let us see the results. It would be counter-productive for him to misrepresent any program, because that would give the Shredder team erroneous feedback.
If the tests were published for marketing reasons then it would be a totally different story and I would not trust it even if the games were published.
Don wrote:That's not necessarily true. If the games are shared, so is the opening book that was used. Once the opening book is revealed there are more possibilities for the authors to take advantage of. In this case the opening book is very small so it would be easily possible to tune a program to do better.
There is an easy way: simply use a random sample out of the 4000 positions from Bob. If someone can tune his engine for 4000 positions, it means he has tuned it to play better chess overall, and so he should be given the credit.
Many (serious) authors already use them anyway.
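The random-sampling idea above is easy to implement. A minimal sketch, assuming the test positions live in a plain EPD/FEN text file, one position per line; the function name and file layout are illustrative, not anyone's actual tooling:

```python
import random

def sample_openings(epd_path, n, seed=None):
    """Draw a random sample of n starting positions from a large
    EPD/FEN file (e.g. a 4000-position suite). Drawing a fresh
    sample for each test run makes book-specific tuning pointless:
    an engine can only 'tune for' the sample by playing better
    chess from arbitrary openings."""
    with open(epd_path) as f:
        positions = [line.strip() for line in f if line.strip()]
    rng = random.Random(seed)
    return rng.sample(positions, n)
```

Passing a `seed` makes a particular run reproducible, while omitting it gives a different, unpredictable subset every time.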
Don wrote:That's not necessarily true. If the games are shared, so is the opening book that was used. Once the opening book is revealed there are more possibilities for the authors to take advantage of. In this case the opening book is very small so it would be easily possible to tune a program to do better.
There is an easy way: simply use a random sample out of the 4000 positions from Bob. If someone can tune his engine for 4000 positions, it means he has tuned it to play better chess overall, and so he should be given the credit.
Many (serious) authors already use them anyway.
I agree with you on this. If you can play 4000 games, it would indeed be difficult to tune your program against a specific opponent.
But I think Ingo has a very small set of openings and gets volume by having each program play a lot of different opponents.
Don wrote:That's not necessarily true. If the games are shared, so is the opening book that was used. Once the opening book is revealed there are more possibilities for the authors to take advantage of. In this case the opening book is very small so it would be easily possible to tune a program to do better.
There is an easy way: simply use a random sample out of the 4000 positions from Bob. If someone can tune his engine for 4000 positions, it means he has tuned it to play better chess overall, and so he should be given the credit.
Many (serious) authors already use them anyway.
I think it may still be possible to tune with some secret internal opening book,
where the program appears to be thinking but in practice its evaluation prefers specific book moves.