New sets of positions for NICE/MEA (60k & 274k)

majkelnowaq · Post by **majkelnowaq** » Wed May 06, 2020 10:36 am

I prepared two new test sets for NICE/MEA. With few cores and little time I obviously couldn't analyze thousands of positions at good quality, so I used a trick instead.

The positions come mainly from the opening and early middlegame; no MultiPV, just one best move each. I used Brainfish with the latest Cerebellum book (with the best-move option checked). Only positions where Cerebellum had an answer are included, which I believe gives decent quality.

The smaller set has 60 585 positions and the bigger one over 274 000. The bigger set includes the positions from the first set. Positions were picked randomly, so the big set can be split to any desired size (for example with Excel).
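For readers who prefer a script to a spreadsheet, the random split mentioned above could be sketched in Python as follows. This is only an illustration, assuming the set is a plain-text EPD file with one position per line; the function names and the file names in the usage note are hypothetical.

```python
import random


def sample_positions(lines, k, seed=0):
    """Return k randomly chosen non-empty lines, kept in their original order."""
    positions = [ln.strip() for ln in lines if ln.strip()]
    if k >= len(positions):
        return positions
    rng = random.Random(seed)  # fixed seed so the same subset can be reproduced
    chosen = sorted(rng.sample(range(len(positions)), k))
    return [positions[i] for i in chosen]


def split_file(src_path, dst_path, k, seed=0):
    """Read an EPD file, sample k positions and write them to a new file."""
    with open(src_path, encoding="utf-8") as src:
        subset = sample_positions(src, k, seed)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(subset) + "\n")
    return len(subset)
```

A call such as `split_file("big-set.epd", "subset-20k.epd", 20000)` would then write a 20 000-position subset; both file names are placeholders.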

https://www19.zippyshare.com/v/SSSO6O7p/file.html
http://www.mediafire.com/file/rnlwr8mcw ... h.zip/file

Dann Corbit · Post by **Dann Corbit** » Wed May 06, 2020 5:04 pm

Why not just match engine-generated answers with Cerebellum book answers? Then you don't have to generate any positions.

majkelnowaq · Post by **majkelnowaq** » Wed May 06, 2020 5:44 pm

Dann Corbit wrote: ↑Wed May 06, 2020 5:04 pm Why not just match engine-generated answers with Cerebellum book answers? Then you don't have to generate any positions.

The main reason is that I, and probably many others, wouldn't know how to do that matching at large scale and then compare scores. My sets are compatible with NICE and MEA, so there is an easy way to test with them (by time or depth).

I made these sets because sets with MultiPV are often confusing. I wrote it before and I'll write it once again: Stockfish with MultiPV is bad for making such sets. Usually the PV lines end up at different depths (after analysis), which can produce many positions where the second or later moves have a higher score than the first PV. Another fact is that in MultiPV SF behaves oddly; for example, from the start position it can suggest e3 as the first move (or among the first three) while analyzing, but in normal games it would never play it, or maybe once every thousand games.
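The depth and ordering problem described above can be checked mechanically on captured engine output. The following Python sketch assumes standard UCI `info` lines containing `depth`, `multipv`, `score cp` and `pv` fields; lines with mate scores are simply skipped, and the function names are made up for illustration.

```python
import re

# Matches e.g. "info depth 30 seldepth 41 multipv 2 score cp 35 ... pv d2d4 d7d5"
INFO_RE = re.compile(
    r"\bdepth (\d+)\b.*?\bmultipv (\d+)\b.*?\bscore cp (-?\d+)\b.*?\bpv (\S+)"
)


def last_multipv_lines(output):
    """Keep the last reported (depth, score, first move) for each MultiPV index."""
    latest = {}
    for line in output.splitlines():
        m = INFO_RE.search(line)
        if m:
            depth, idx, cp, move = m.groups()
            latest[int(idx)] = (int(depth), int(cp), move)
    return latest


def check_consistency(latest):
    """Return (depths_differ, inverted).

    depths_differ: the final PV lines were reported at different depths.
    inverted: some later PV line has a higher score than PV 1.
    """
    depths = {depth for depth, _, _ in latest.values()}
    scores = [latest[i][1] for i in sorted(latest)]
    inverted = any(score > scores[0] for score in scores[1:])
    return len(depths) > 1, inverted
```

Positions for which either flag is true could then be dropped or re-analyzed before building a MultiPV test set.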

Anyway, I'm quite interested in how to perform the process you described. Is there any existing software that can do such matching of "engine generated answers with cerebellum book answers"?

Damir · Post by **Damir** » Wed May 06, 2020 6:18 pm

Hi Majkel

I am looking forward to a new HekaStockfish with many tunable parameters.

Dann Corbit · Post by **Dann Corbit** » Wed May 06, 2020 6:36 pm

majkelnowaq wrote: ↑Wed May 06, 2020 5:44 pm
Dann Corbit wrote: ↑Wed May 06, 2020 5:04 pm Why not just match engine-generated answers with Cerebellum book answers? Then you don't have to generate any positions.
The main reason is that I, and probably many others, wouldn't know how to do that matching at large scale and then compare scores. My sets are compatible with NICE and MEA, so there is an easy way to test with them (by time or depth).

I made these sets because sets with MultiPV are often confusing. I wrote it before and I'll write it once again: Stockfish with MultiPV is bad for making such sets. Usually the PV lines end up at different depths (after analysis), which can produce many positions where the second or later moves have a higher score than the first PV. Another fact is that in MultiPV SF behaves oddly; for example, from the start position it can suggest e3 as the first move (or among the first three) while analyzing, but in normal games it would never play it, or maybe once every thousand games.

Anyway, I'm quite interested in how to perform the process you described. Is there any existing software that can do such matching of "engine generated answers with cerebellum book answers"?

The source code for probing the cerebellum book comes with the brainfish software.

majkelnowaq · Post by **majkelnowaq** » Wed May 06, 2020 8:56 pm

Damir wrote: ↑Wed May 06, 2020 6:18 pm Hi Majkel

I am looking forward to a new HekaStockfish with many tunable parameters.

It's not a problem to make a version with many parameters, but:
- there is no guarantee that any of them brings Elo (and to really check it one needs literally hundreds of cores);
- the more parameters, the slower the search;
- most users don't care about tuning on their own; they just want a strong engine, ideally at least slightly stronger than SF dev.

Anyway, I have some ideas, like fluid MultiPV, and maybe someday I'll release HekaStockfish 1. I want it too.

Dann Corbit wrote: ↑Wed May 06, 2020 6:36 pm The source code for probing the cerebellum book comes with the brainfish software.

I know this code, but it's hard to call it 'existing software'. Maybe it could be done with some commands or a Python script, but I still can't imagine how to do it without a set of positions and without an external tool or code. OK, you may have your own way to do it, but I suppose most average users without programming knowledge wouldn't be able to match engine-generated answers with Cerebellum book answers in a simple, comparable way...
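To make the idea of "matching" concrete, here is a minimal Python sketch of only the comparison step. It assumes the engine answers and the book answers have already been collected into two dictionaries keyed by position (for example FEN strings), with moves in the same notation; how those dictionaries are filled (probing the book, running the engine) is left out, and the function name is hypothetical.

```python
def match_answers(engine_moves, book_moves):
    """Compare engine answers with book answers on positions present in both.

    Both arguments map a position key (e.g. a FEN string) to a move string.
    Returns (hits, compared, hit_rate).
    """
    common = [fen for fen in engine_moves if fen in book_moves]
    hits = sum(1 for fen in common if engine_moves[fen] == book_moves[fen])
    rate = hits / len(common) if common else 0.0
    return hits, len(common), rate
```

The hit rate from such a comparison plays the same role as a single-best-move test-set score: one point per agreement with the book.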

Dann Corbit · Post by **Dann Corbit** » Wed May 06, 2020 9:00 pm

I would probably write a TCP/IP server that reads the Cerebellum book and returns the move choice.

The same could be done with noob's database (and I guess it would be better, for that matter, and the server is already written).
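A rough idea of what such a move server could look like, sketched in Python with the standard `socketserver` module. This is not code from Brainfish or from any existing server: the book is a hypothetical in-memory dictionary standing in for a real book probe, and the one-line-per-request protocol (client sends a FEN, server answers with a move or `none`) is an assumption made for the example.

```python
import socket
import socketserver
import threading

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

# Hypothetical stand-in for a real book probe: FEN -> move in UCI notation.
BOOK = {START_FEN: "e2e4"}


class BookHandler(socketserver.StreamRequestHandler):
    """Answer every received line (a FEN) with the book move or 'none'."""

    def handle(self):
        for raw in self.rfile:
            fen = raw.decode("utf-8").strip()
            move = BOOK.get(fen, "none")
            self.wfile.write((move + "\n").encode("utf-8"))


def make_server(host="127.0.0.1", port=0):
    """Create the server; port 0 lets the operating system pick a free port."""
    server = socketserver.ThreadingTCPServer((host, port), BookHandler)
    server.daemon_threads = True  # handler threads do not block shutdown
    return server


def serve_in_background(server):
    """Run the server loop in a daemon thread and return that thread."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread


def ask(host, port, fen):
    """Client side: send one FEN and return the server's one-line answer."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall((fen + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()
```

A test harness would call `ask(...)` once per position and compare the reply with the engine's own move.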

Rebel · Post by **Rebel** » Thu May 07, 2020 12:54 pm

majkelnowaq wrote: ↑Wed May 06, 2020 5:44 pm
Dann Corbit wrote: ↑Wed May 06, 2020 5:04 pm Why not just match engine-generated answers with Cerebellum book answers? Then you don't have to generate any positions.
The main reason is that I, and probably many others, wouldn't know how to do that matching at large scale and then compare scores. My sets are compatible with NICE and MEA, so there is an easy way to test with them (by time or depth).

I made these sets because sets with MultiPV are often confusing. I wrote it before and I'll write it once again: Stockfish with MultiPV is bad for making such sets. Usually the PV lines end up at different depths (after analysis), which can produce many positions where the second or later moves have a higher score than the first PV.

This is correct, except that the many positions you speak of are actually only a fraction of the total, and because of the volume they are irrelevant for the final result. Feel free to disagree.

Like you, I started with one best move = 10 points, and it doesn't work. I ran your cerebe-lite.epd with 10 engines and, despite the excellent volume, it cannot even order the engines by their CCRL Elo, let alone be used for engine tuning.

http://rebel13.nl/mea/cerebe-lite.html

MultiPV with its flaws is essential.
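The difference between the two scoring schemes can be illustrated with a small Python sketch. It assumes each EPD line carries a points table in a `c0 "move=points, ..."` field; that layout is an assumption made for this example and should be checked against the actual test-set files, and the function names are hypothetical.

```python
import re


def parse_points(epd_line):
    """Extract the move -> points table from a c0 "move=points, ..." field."""
    m = re.search(r'c0\s+"([^"]*)"', epd_line)
    if not m:
        return {}
    table = {}
    for item in m.group(1).split(","):
        move, _, pts = item.strip().partition("=")
        if move and pts.strip().isdigit():
            table[move.strip()] = int(pts)
    return table


def score_engine(epd_lines, engine_moves):
    """Sum the points for the engine's move in each position (unlisted = 0)."""
    total = 0
    for line, move in zip(epd_lines, engine_moves):
        total += parse_points(line).get(move, 0)
    return total
```

With a single-best-move table an engine that picks the second-best move gets nothing, while a MultiPV-derived table still gives it partial credit, which is what lets the totals separate engines of similar strength.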

Damir · Post by **Damir** » Thu May 07, 2020 3:28 pm

majkelnowaq wrote: ↑Wed May 06, 2020 8:56 pm
Damir wrote: ↑Wed May 06, 2020 6:18 pm Hi Majkel

I am looking forward to a new HekaStockfish with many tunable parameters.
It's not a problem to make a version with many parameters, but:
- there is no guarantee that any of them brings Elo (and to really check it one needs literally hundreds of cores);
- the more parameters, the slower the search;
- most users don't care about tuning on their own; they just want a strong engine, ideally at least slightly stronger than SF dev.

I am not like most users... I like to tune things on my own... In Heka's case, the more parameters it has to tune the better, as long as it does not affect the playing strength.

majkelnowaq · Post by **majkelnowaq** » Thu May 07, 2020 6:49 pm

Rebel wrote: ↑Thu May 07, 2020 12:54 pm This is correct, except that the many positions you speak of are actually only a fraction of the total, and because of the volume they are irrelevant for the final result. Feel free to disagree.

Like you, I started with one best move = 10 points, and it doesn't work. I ran your cerebe-lite.epd with 10 engines and, despite the excellent volume, it cannot even order the engines by their CCRL Elo, let alone be used for engine tuning.

http://rebel13.nl/mea/cerebe-lite.html

MultiPV with its flaws is essential.

I suppose there are ways to force SF to print MultiPV moves at the same depth. In the past I created a version that was able to do it, but only with limits.depth, not time (and it had some bugs, I guess). Komodo seems not to use a MultiPV loop like SF; it prints all moves at the same depth at once, but it is significantly weaker than SF, so it isn't the best option for building a set. I don't disagree with you at all. Like you, I think that multipv=1 isn't enough, and I bet that SF with proper move order (same depth), an even higher MultiPV than 4 (10 or a bit more would be great), decent depth (at least above 30, but 40 sounds better) and maybe somewhat less reduction/pruning would be the best test-set creator.

Btw, I managed to use NICE with nodestime too; I just changed SF's uci.cpp so that it treats go depth like go nodes.

Damir wrote: ↑Thu May 07, 2020 3:28 pm I am not like most users... I like to tune things on my own... In Heka's case, the more parameters it has to tune the better, as long as it does not affect the playing strength.

Yes, a big choice of parameters is a good thing. Sometimes I test hard positions with an engine and just want the option to change its evaluation or search somehow to find out whether it will perform the task better. Another example: when I try to beat Brainfish or another strong book line with my engine and even very high depths don't help, I change some parameter and my engine finds new moves to surprise the opponent. One thing is for sure: if/when the next Heka becomes real, it will have such freedom of choice, but I also want those parameters to be really useful and bug-free, and it all takes time.
