Similarity tester - 2nd generation - BETA

Rebel · Post by **Rebel** » Tue Aug 13, 2019 1:48 am

Added one last feature: comments in the HTML output. For instance, to explain the long names of NN engines, store your comments in legend.txt.

http://rebel13.nl/html/nn-500ms.html

Final release tomorrow.

Rebel · Post by **Rebel** » Tue Aug 13, 2019 8:27 am

Released.

http://rebel13.nl/misc/simex.html

Rebel · Post by **Rebel** » Thu Aug 15, 2019 10:50 am

Did another experiment: run the engines at depth=1 only, which cancels (the main) search and thus mainly tests the evaluation function for similarity. It isn't perfect, since engines vary in how they handle checks in QS, and I have also seen engines extend root moves. But in general it just works.
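As a rough illustration of the depth=1 experiment, here is a minimal Python sketch. It assumes similarity is scored as the percentage of positions on which two engines pick the same move; the thread does not spell out the exact formula the tester uses. The function names `depth1_commands` and `similarity` are hypothetical, while `position fen` and `go depth 1` are standard UCI commands.

```python
def depth1_commands(fen):
    """UCI commands that ask an engine for its depth-1 move in one position."""
    return [f"position fen {fen}", "go depth 1"]


def similarity(moves_a, moves_b):
    """Percentage of positions on which two engines chose the same move.

    Assumption: both lists hold one move per position, in the same order.
    """
    if len(moves_a) != len(moves_b):
        raise ValueError("move lists must cover the same positions")
    if not moves_a:
        raise ValueError("need at least one position")
    matches = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * matches / len(moves_a)


if __name__ == "__main__":
    # Made-up moves for four positions, for illustration only.
    engine_a = ["e2e4", "d2d4", "g1f3", "c2c4"]
    engine_b = ["e2e4", "d2d4", "b1c3", "c2c4"]
    print(similarity(engine_a, engine_b))
```

With the search cut down to one ply, agreement between two move lists mostly reflects agreement between the evaluation functions, subject to the QS and root-extension caveats above.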

Example 1 - http://rebel13.nl/html/depth1.html

And now a blast from the past - http://rebel13.nl/html/fruit.html

The original Rybka 1.0 looks okay, but it is known to understate its reported depth by 2 plies, so it actually searched to depth=3. The patched version has this fixed, and then the similarity with Fruit 2.1 is sky-high.

Rebel · Post by **Rebel** » Fri Aug 23, 2019 12:34 pm

Chris Whittington created SIMEX EPD sets of 10,000 positions each. Each set contains a specific piece distribution. In total there are 100 sets, representing the most common board positions in use.

Distribution list - http://rebel13.nl/html/epd.txt

Example with SIMEX - http://rebel13.nl/html/cw2.html

Download of the 100 EPD sets - http://rebel13.nl/dl.html?file=dl/simex-epd.7z

Extract the download into the EPD folder.

chrisw · Post by **chrisw** » Fri Aug 23, 2019 4:06 pm

Rebel wrote: ↑Fri Aug 23, 2019 12:34 pm Chris Whittington created SIMEX EPD sets of each 10,000 positions. Each set contains a specific piece distribution. In total 100 sets representing the most common board positions in use.

Distribution list - http://rebel13.nl/html/epd.txt

Example with SIMEX - http://rebel13.nl/html/cw2.html

Download of the 100 EPD sets - http://rebel13.nl/dl.html?file=dl/simex-epd.7z

Extract the download in the EPD folder.

I might refine these if there’s any demand. At the moment, it selects all possible piece configurations purely by the existence of KQRBNPkqrbnp: for each side, pieces of a given type either exist or they don’t.
It might be better to refine also by pawn count, so the files would be named on the basis KQRBNP(x)kqrbnp(y), where x and y each represent one of four pawn categories: say, 0 pawns, 1-3 pawns, 4-6 pawns, or 7-8 pawns. That adds another factor of 16 to the possible combinations.

Then I rank the combinations by frequency and save out the 100 (can be more) combinations most often occurring in games.
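The classification described above could be sketched roughly as follows. This is not the actual SIMEX generator, only a minimal Python illustration under stated assumptions: positions arrive as FEN strings, and the names `signature`, `pawn_bucket` and `top_signatures` are hypothetical. The bucket indices 0 to 3 stand for the four suggested pawn categories.

```python
from collections import Counter

PIECE_ORDER = "KQRBNP"


def pawn_bucket(count):
    """Map a pawn count to a bucket: 0 -> 0, 1-3 -> 1, 4-6 -> 2, 7-8 -> 3."""
    if count == 0:
        return 0
    if count <= 3:
        return 1
    if count <= 6:
        return 2
    return 3


def signature(fen, with_pawn_buckets=False):
    """Build a key such as 'KQRBNPkqrbnp' from the piece types present.

    With with_pawn_buckets=True the key becomes e.g. 'KQRBNP(3)kqrbnp(3)'.
    """
    board = fen.split()[0]  # piece placement is the first FEN field
    sig = ""
    for side in (str.upper, str.lower):  # white pieces first, then black
        for piece in PIECE_ORDER:
            symbol = side(piece)
            if symbol in board:
                sig += symbol
        if with_pawn_buckets:
            sig += f"({pawn_bucket(board.count(side('P')))})"
    return sig


def top_signatures(fens, n=100, with_pawn_buckets=False):
    """Rank signatures by how often they occur and keep the n most common."""
    counts = Counter(signature(f, with_pawn_buckets) for f in fens)
    return counts.most_common(n)


if __name__ == "__main__":
    start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
    print(signature(start))
    print(signature(start, with_pawn_buckets=True))
```

Each signature returned by `top_signatures` would then name one EPD file, filled with positions that share that signature.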

CMCanavessi · Post by **CMCanavessi** » Sun Aug 25, 2019 11:12 pm

It would be interesting to compare Leela (with different networks) to all the other usual AB engines.

Branko Radovanovic · Post by **Branko Radovanovic** » Mon Aug 26, 2019 2:11 am

CMCanavessi wrote: ↑Sun Aug 25, 2019 11:12 pm Would be interesting to compare leela (with different networks) to all the other usual AB engines

My impression - not sure if that's true or not, so I'd like to see it tested - is that SF10's play is more similar to LC0 than e.g. SF9 was. That would make sense because if LC0's style is the "chess truth", devoid of preconceptions, one would expect AB engines to gradually approach it. And therein lies the chance for top AB engines to compete against NNs: do (almost) the same, only more efficiently.

Also, for the same reason I'd expect AB engines to converge. I'm pretty sure e.g. Komodo and Stockfish are more similar now than they used to be 3 or 4 years ago. Unfortunately, when similarity tests are run, it's almost always one version per engine.

chrisw · Post by **chrisw** » Mon Aug 26, 2019 8:52 am

Branko Radovanovic wrote: ↑Mon Aug 26, 2019 2:11 am
CMCanavessi wrote: ↑Sun Aug 25, 2019 11:12 pm Would be interesting to compare leela (with different networks) to all the other usual AB engines
My impression - not sure if that's true or not, so I'd like to see it tested - is that SF10's play is more similar to LC0 than e.g. SF9 was. That would make sense because if LC0's style is the "chess truth", devoid of preconceptions, one would expect AB engines to gradually approach it.

Not necessarily, if chess is a draw, or basically a stable game, as people claim. There are many ways to get to a draw.

And therein lies the chance for top AB engines to compete against NNs: do (almost) the same, only more efficiently.

Also, for the same reason I'd expect AB engines to converge. I'm pretty sure e.g. Komodo and Stockfish are more similar now than they used to be 3 or 4 years ago. Unfortunately, when similarity tests are run, it's almost always one version per engine.

Rebel · Post by **Rebel** » Mon Aug 26, 2019 10:53 am

Branko Radovanovic wrote: ↑Mon Aug 26, 2019 2:11 am
CMCanavessi wrote: ↑Sun Aug 25, 2019 11:12 pm Would be interesting to compare leela (with different networks) to all the other usual AB engines
My impression - not sure if that's true or not, so I'd like to see it tested - is that SF10's play is more similar to LC0 than e.g. SF9 was. That would make sense because if LC0's style is the "chess truth", devoid of preconceptions, one would expect AB engines to gradually approach it. And therein lies the chance for top AB engines to compete against NNs: do (almost) the same, only more efficiently.

Also, for the same reason I'd expect AB engines to converge. I'm pretty sure e.g. Komodo and Stockfish are more similar now than they used to be 3 or 4 years ago. Unfortunately, when similarity tests are run, it's almost always one version per engine.

Did some Lc0 testing - http://rebel13.nl/html/kai.html

SF vs Lc0: very low similarity.

Branko Radovanovic · Post by **Branko Radovanovic** » Mon Aug 26, 2019 6:56 pm

Rebel wrote: ↑Mon Aug 26, 2019 10:53 am Did some Lc0 testing - http://rebel13.nl/html/kai.html

SF vs Lc0 very low similarity.

Indeed, and that's not surprising. There is no real difference between SF8 and SF10 vs NNs, though, which suggests my impression was not correct. And, finally, of all AB engines SF is the most similar to Lc0 (while still remaining very far from it), which again would make sense to me.
