Automatically created test set of very difficult moves

mmt · Post by **mmt** » Wed Apr 22, 2020 2:09 am

Is there something like this out there? It could be made by looking for positions where both SF and LC0, with 7-piece tablebases, take more than 30 minutes on high-end hardware to switch their best move to a single different move, and then also evaluate all other moves as being at least 2.0 worse. Or alternatively, only one of them switches but then "proves" it by playing against the other engine's defense. Of course, the problem with making one is hardware resources.
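A rough sketch of that filter in Python, just to make the criterion concrete. It assumes the engine output has already been collected somehow: for each engine, the best move it preferred early in the search, plus a table of evals (in pawns, from the side to move's point of view) for every legal move after the long search. The function names and the data layout are made up for illustration and are not part of any engine's interface.

```python
def late_unique_best(early_best, late_evals, margin=2.0):
    """Return the late best move if it differs from the early best move
    and beats every other move by at least `margin` pawns, else None.

    late_evals: dict mapping move -> eval in pawns after the long search
    (hypothetical layout, side to move's point of view).
    """
    if len(late_evals) < 2:
        return None
    ranked = sorted(late_evals.items(), key=lambda kv: kv[1], reverse=True)
    (best_move, best_eval), (_, second_eval) = ranked[0], ranked[1]
    if best_move != early_best and best_eval - second_eval >= margin:
        return best_move
    return None


def qualifies(engine_reports, margin=2.0):
    """True if every engine switched late to the same single move.

    engine_reports: list of (early_best, late_evals) pairs, one per engine.
    """
    picks = {late_unique_best(early, late, margin) for early, late in engine_reports}
    return len(picks) == 1 and None not in picks
```

The expensive part is of course producing those inputs; the check itself is trivial once the long searches have been run.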

Test sets are not great for evaluating strength, but it would be interesting to see how new versions and new programs do on it.

Dann Corbit · Post by **Dann Corbit** » Wed Apr 22, 2020 7:58 am

A simple way to do it is to load all the TCEC games into EPD records and then scan for a difference of opinion as to the eval. When there is a sudden rise in eval and the other engine does not see that for a while, that is probably a good test position. An additional check would be to see that the engine that saw the rise in eval went on to win. If both of those conditions are true, emit the position and the key move.
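Something along these lines, as a minimal sketch only. It assumes each game has already been parsed into a list of plies carrying the position, the move played, and the eval reported by the engine that made the move (converted to White's point of view, in pawns). The `Ply` record, the function name, and the `jump`, `gap` and `lag` thresholds are all invented for illustration and would need tuning on real games.

```python
from dataclasses import dataclass


@dataclass
class Ply:
    fen: str           # position before the move, as FEN
    move: str          # move played, in SAN
    mover: str         # "w" or "b"
    eval_white: float  # eval reported by the mover's engine, pawns, White's view


def find_candidates(plies, result, jump=1.0, gap=1.0, lag=1):
    """Return EPD lines for moves where the eventual winner's eval rose
    sharply and the opponent's next `lag` evals did not follow."""
    winner = {"1-0": "w", "0-1": "b"}.get(result)
    if winner is None:  # draws and unfinished games yield nothing
        return []
    out = []
    for i in range(2, len(plies) - 1):
        cur = plies[i]
        if cur.mover != winner:
            continue
        sign = 1.0 if cur.mover == "w" else -1.0
        # Rise relative to the same engine's previous eval (two plies back).
        rise = sign * (cur.eval_white - plies[i - 2].eval_white)
        # The opponent's evals sit at odd offsets after the current ply.
        lagging = all(
            sign * (cur.eval_white - plies[j].eval_white) >= gap
            for j in range(i + 1, min(i + 2 * lag, len(plies)), 2)
        )
        if rise >= jump and lagging:
            # EPD keeps only the first four FEN fields.
            epd_position = " ".join(cur.fen.split()[:4])
            out.append(f"{epd_position} bm {cur.move};")
    return out
```

Raising `lag` requires the opponent to stay blind for more of its own moves, which should cut down on positions where the rise was obvious one move later.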

Since TCEC runs on such high-end hardware, the key move is probably a very good one. And there are no terrible programs in the tournament.

mmt · Post by **mmt** » Thu Apr 23, 2020 8:18 am

That does sound like a good way to do it.
