OpenBench question
Moderators: hgm, Rebel, chrisw
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
OpenBench question
I downloaded the 4moves_noob.pgn opening set; it contains 1.8 million [!] positions. What makes me wonder: when you start a match with this opening suite, are the distributed positions chosen randomly or sequentially?
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
Re: OpenBench question
Random seems to me the best option indeed, as I noticed that sequential chunks come from the same game. OTOH, since matches are then not equal regarding the openings, has anyone tried running the same match of (say) 20,000 games between Ethereal and Rubichess 4-5 times to check that the results come out about equal?
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 1753
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
Re: OpenBench question
Opening selection is random, which does mean lots of duplicate openings get played in the long run.
I have never tested what you proposed. Really you need two sets of tests:
1. First play a set of N games between two engines using the same openings, 5 times.
2. Then play a set of N randomly selected openings each time, 5 times.
I would hope that you get similar results; that's sorta the basis for how all testing is done right now. I'm still surprised SPRT does what it does.
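The duplicate-opening effect from random selection is easy to sanity-check. Here is a small simulation sketch (the book size matches the 1.8 million positions mentioned above; the sampling-with-replacement model is an assumption about how random selection works, not OpenBench's actual code):

```python
# Simulate random opening selection with replacement and count how many
# games reuse an opening that was already played.
import random
from collections import Counter

def count_duplicates(book_size: int, games: int, seed: int = 0) -> int:
    """Return how many games drew an opening already used in this run."""
    rng = random.Random(seed)
    picks = [rng.randrange(book_size) for _ in range(games)]
    counts = Counter(picks)
    return sum(c - 1 for c in counts.values() if c > 1)

if __name__ == "__main__":
    # By the birthday bound, expected collisions are roughly
    # games^2 / (2 * book_size) = 100,000^2 / 3.6M, i.e. about 2,800
    # duplicates even though the book holds 1.8M openings.
    print(count_duplicates(1_800_000, 100_000))
```

So duplicates show up long before the book is exhausted, which matches the "lots of duplicate openings in the long run" observation.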
--
As an aside, you've reminded me that I need to convert those PGNs to EPDs. The Cutechess version I was using at the time did not have support for EPD. After getting custom builds to fix a castling bug, EPDs are supported and I can ensure that Clients can run the EPDs. They are smaller, and would have saved me from compressing the opening books to bypass file-size limits on GitHub.
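A PGN-to-EPD conversion like the one described can be sketched with the python-chess library; the file names here are placeholders, not the actual OpenBench book paths:

```python
# Convert each game in a PGN book to one EPD line: replay the book
# moves and emit the final position. Requires the python-chess package.
import chess.pgn

def pgn_to_epd(pgn_path: str, epd_path: str) -> None:
    """Write the position after each game's last book move as an EPD line."""
    with open(pgn_path) as pgn, open(epd_path, "w") as out:
        while (game := chess.pgn.read_game(pgn)) is not None:
            board = game.end().board()  # position after the last move
            out.write(board.epd() + "\n")
```

One EPD line per opening is considerably smaller than the PGN game it replaces, since the headers and movetext are dropped.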
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
Re: OpenBench question
AndrewGrant wrote: ↑Wed Apr 14, 2021 12:16 am Opening selection is randomly. Which does mean lots of duplicate openings get played in the long run.
I have never tested what you proposed. Really you need two sets of tests:
1. First play a set of N games between two engines using the same openings, 5 times.
2. Then play a set of N randomly selected openings each time, 5 times.
I would hope that you get similar results. Sorta the basis for how all testing right now is done. I'm still surprised SPRT does what it does.
Yes. Are you going to do it on OpenBench? Otherwise I will when my PCs are free.
I shuffled the 1.8 million positions and split them into 18 parts of 100,000. Download: http://rebel13.nl/dump/noob.7z
Pick a set, test it sequentially. Pick another set, test it sequentially, and compare whether the results match better.
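The shuffle-and-split step described above can be sketched in a few lines of Python (file names are examples, not the actual layout of the rebel13.nl download):

```python
# Shuffle an EPD book and cut it into fixed-size chunks, e.g. 1.8M
# positions into 18 files of 100,000 lines each.
import random

def shuffle_and_split(epd_path: str, chunk_size: int = 100_000,
                      seed: int = 1) -> int:
    """Shuffle all EPD lines, write noob.001.epd, noob.002.epd, ...
    Return the number of chunk files written."""
    with open(epd_path) as f:
        lines = f.readlines()
    random.Random(seed).shuffle(lines)
    chunks = 0
    for start in range(0, len(lines), chunk_size):
        chunks += 1
        with open(f"noob.{chunks:03d}.epd", "w") as out:
            out.writelines(lines[start:start + chunk_size])
    return chunks
```

Because the full book is shuffled before splitting, walking any one chunk sequentially still gives openings in random order, without the same-game runs of the original file.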
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
Re: OpenBench question
Rebel wrote: ↑Wed Apr 14, 2021 8:26 am I shuffled the 1.8 million positions and split them into 18 parts of 100,000. Download - http://rebel13.nl/dump/noob.7z
Pick a set, test it sequential. Pick another set, test it sequential and compare if results match better.
I did both. First: 10,000 games random, 10 times divided over 2 PCs. TC=40/10, EPD=noob.epd (1.8 million positions)
Code:
RANDOM
        Intel i7 3.6 GHz                 Intel i7 3.2 GHz
Round  Ethereal  Rubichess       Round  Ethereal  Rubichess
  1    *49.3%*    50.7%            1     50.4%     49.6%
  2     49.9%     50.1%            2     50.6%     49.4%
  3     49.8%     50.2%            3     50.1%     49.9%
  4     49.9%     50.1%            4     50.5%     49.5%
  5    *50.3%*    49.7%            5     50.8%     49.2%
Second: 10,000 games sequential, 10 times divided over 2 PCs. TC=40/10, EPD=noob.001.epd (100,000 shuffled positions)
Code:
SEQUENTIAL
        Intel i7 3.6 GHz                 Intel i7 3.2 GHz
Round  Ethereal  Rubichess       Round  Ethereal  Rubichess
  1     50.0%     50.0%            1     50.5%     49.5%
  2     50.3%     49.7%            2    *51.1%*    48.9%
  3     49.4%     50.6%            3    *50.1%*    49.9%
  4                                4
  5                                5
What have I proven? That 10,000 games is not enough for a reliable test, but that is old news.
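For a rough sense of why 10,000 games cannot separate engines this close, here is a back-of-the-envelope standard-error calculation; the 40% draw rate below is an assumed figure for illustration, not measured from these matches:

```python
# Standard error of a match score percentage. A game scores 0, 0.5 or 1,
# so the per-game variance depends on the draw rate, and the error of
# the average shrinks only as 1/sqrt(games).
import math

def score_stderr(games: int, draw_rate: float,
                 mean_score: float = 0.5) -> float:
    """One-sigma standard error of the average score over `games` games."""
    p_win = mean_score - draw_rate / 2        # win rate implied by the mean
    var = p_win + draw_rate / 4 - mean_score ** 2
    return math.sqrt(var / games)

if __name__ == "__main__":
    # Assuming ~40% draws: roughly +/- 0.4% one-sigma on 10,000 games,
    # so round-to-round swings like 49.3%..50.8% are ordinary noise.
    print(f"{100 * score_stderr(10_000, 0.40):.2f}%")
```

At two sigma that is nearly a full percentage point either way, which is about the spread the tables above show between rounds.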
BTW, I used the latest versions of Ethereal and Rubi.
90% of coding is debugging, the other 10% is writing bugs.