
OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 6:28 am
by Rebel
From another thread -
Rebel wrote: Mon Jun 03, 2019 3:29 pm I am disappointed. I was wondering whether SF could build an opening book by itself using Multi-PV. The first results looked promising: 48.3% against the ProDeo book at 100ms analysis time and 49.4% at 1000ms. I am now trying 2000ms, but looking at the moves it sometimes produces I think I am wasting my time. I was hoping for a bug that could be fixed, but judging by the responses that's unlikely.
And -
Rebel wrote: Fri Jun 07, 2019 12:32 pm One idea would be to create an STS-type opening set and rate engines on it with Ferdy's MEA.

Such as:
1.d4 and 1.e4 10 points
1.c4 and 1.Nf3 8 points
1.f4 6 points

etc.
And so I did, in search of the engine that plays the opening best. I created an opening set from 80,000+ games between (human) players rated 2600+ Elo, covering the first 4 moves, which resulted in 2836 EPD positions, and ran them with Ferdy's tool MEA to create a sorted list. So far I have tested 26 engines and there are quite a few surprises.
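For readers who want to see how such a points-per-move scoring works, here is a minimal Python sketch. It is not MEA itself: it assumes each EPD line carries an STS-style `c0 "move=points, ..."` opcode, and the function names `parse_move_points` and `score_suite` are made up for illustration. The point values are the example ones quoted above.

```python
import re

def parse_move_points(epd_line):
    """Return {move: points} from an STS-style c0 "move=points, ..." opcode (assumed layout)."""
    match = re.search(r'c0\s+"([^"]*)"', epd_line)
    if match is None:
        return {}
    table = {}
    for item in match.group(1).split(","):
        move, sep, points = item.strip().partition("=")
        if sep:
            table[move] = int(points)
    return table

def score_suite(epd_lines, engine_moves):
    """Sum the points for the move the engine chose in each position (0 if unlisted)."""
    total = 0
    for line, move in zip(epd_lines, engine_moves):
        total += parse_move_points(line).get(move, 0)
    return total

# Start position with the illustrative point values from the quoted post.
START = ('rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - '
         'c0 "d4=10, e4=10, c4=8, Nf3=8, f4=6";')

# Three hypothetical engine answers: e4 (10), f4 (6), a3 (not listed, 0).
print(score_suite([START, START, START], ["e4", "f4", "a3"]))
```

An engine's total is just the sum over all positions, so the sorted list is a ranking by that sum.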

http://rebel13.nl/4-moves.html

#1. While Lc0 tops the list, note that this is the slow CPU version and that 1 sec scores better than 5 secs :)

#2. Rodent III is another surprise.

#3. There isn't much difference between SF5 and SF10.

#4. Komodo 10 at 5 secs is almost equal to ProDeo 2.6.

I am looking for candidates like Rodent.

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 9:57 am
by xr_a_y
Great stuff,

Can I run the test on my engine by myself?

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 10:27 am
by Joerg Oster
Interesting.
It looks like there is a LOT of room for improvement, in general.

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 11:18 am
by Rebel
xr_a_y wrote: Mon Jun 10, 2019 9:57 am Great stuff,

Can I run the test on my engine by myself?
Yes, within a few days.

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 12:30 pm
by xr_a_y
Rebel wrote: Mon Jun 10, 2019 11:18 am
xr_a_y wrote: Mon Jun 10, 2019 9:57 am Great stuff,

Can I run the test on my engine by myself?
Yes, within a few days.
Great, can you give us the positions and the how-to please.

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 6:21 pm
by Rebel
mea-epd.zip
xr_a_y wrote: Mon Jun 10, 2019 12:30 pm
Rebel wrote: Mon Jun 10, 2019 11:18 am
xr_a_y wrote: Mon Jun 10, 2019 9:57 am Great stuff,

Can I run the test on my engine by myself?
Yes, within a few days.
Great, can you give us the positions and the how-to please.
Positions for 2, 3 and 4 moves are attached; after that, download MEA. Otherwise, have a bit of patience.

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 6:59 pm
by xr_a_y
Rebel wrote: Mon Jun 10, 2019 6:21 pm
xr_a_y wrote: Mon Jun 10, 2019 12:30 pm
Rebel wrote: Mon Jun 10, 2019 11:18 am
xr_a_y wrote: Mon Jun 10, 2019 9:57 am Great stuff,

Can I run the test on my engine by myself?
Yes, within a few days.
Great, can you give us the positions and the how-to please.
Positions for 2, 3 and 4 moves are attached; after that, download MEA. Otherwise, have a bit of patience.
Thanks, I didn't wanna rush you, sorry.

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 7:34 pm
by Rebel
http://rebel13.nl/2-moves.html

Same 30 engines, now using 2-moves.epd, only 404 positions.

#1. Look at Gideon from 1993, in 5th place ahead of SF10. That was a time before the internet, when null-move had not been invented, nor reductions, nor futility pruning, etc. A time when a ply was a ply and you could count the number of moves of a combination and calculate the iteration at which an engine had to find the right move. A time also when all the pruning and reductions did not affect the evaluation function in a negative way.

#2. The once almighty Rybka at the bottom.

Next: 3-moves.epd (1221 positions) now running.

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 7:49 pm
by Laskos
How did you pick the 2800 positions from 80,000 games? Were they in some way special? Did you use some analysis tool to pick them?
The results I see are strange. Positionally, Lc0 is far ahead of anything in the openings, so it managed to get to the top in your test too. But the rest seems a bit scrambled. What's with that Rodent result, and many others? Also, why does Lc0 get worse from 1s to 5s?
I estimate 1 standard deviation of the difference between engines to be about 800-1000 points in your scoring.

I have a custom, hand-made test suite of 200 opening positions collected from a large database of human games (Elo above 2200, about 2 million games). I had to pick the interesting positions by hand according to the statistics of outcomes, using engine analysis only marginally to trim out tactical complications, so it took some time to build. Repeating it 5 times to diminish the noise gives 1000 opening positions in this positional test suite, with these much more normal results:

Code:

4 i7 cores at 3.80GHz
RTX 2070 GPU

Openings1000.epd test-suite 
Sticks to the solution from 1s to 2s of thinking:

Lc0 v21.2 ID42524  757/1000
Lc0 v21.2 ID32930  727/1000
Lc0 v21.2 ID11261  723/1000
Stockfish_dev      574/1000
Houdini 6.03       558/1000
Komodo 13.02       556/1000
Xiphos 0.5         513/1000
Booot 6.3.1        494/1000
Andscacs 0.95      484/1000
Laser 1.7          480/1000
Ethereal 11.25     467/1000
Texel 1.07         419/1000
Rodent III         376/1000
Fruit 2.1          348/1000
BikJump 2.01       276/1000
Predateur 2.2.1    265/1000
I estimate my standard deviation to be about 10 points here. I frankly think that you have a lot of garbage in your suite.
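To put the 10-point figure in context, here is a rough Python sketch of how far apart two scores in the list are, measured in standard deviations of their difference. It assumes the 10 points apply to each engine's score separately and that the two runs are independent, so the variances add; neither assumption is stated in the post, and `difference_in_sigmas` is a made-up helper name.

```python
import math

SIGMA_SCORE = 10.0  # assumption: per-engine standard deviation, in solved positions

def difference_in_sigmas(score_a, score_b, sigma=SIGMA_SCORE):
    """Gap between two scores in units of the standard deviation of their difference."""
    sigma_diff = math.sqrt(2.0) * sigma  # independent runs: variances add
    return (score_a - score_b) / sigma_diff

# Scores taken from the list above (out of 1000).
print(round(difference_in_sigmas(574, 558), 2))  # Stockfish_dev vs Houdini 6.03
print(round(difference_in_sigmas(757, 574), 2))  # Lc0 ID42524 vs Stockfish_dev
```

Under these assumptions the Stockfish_dev/Houdini gap is only a little over one standard deviation, while the gap between the top Lc0 net and Stockfish_dev is roughly thirteen.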

Re: OKE - Opening Knowledge Engines

Posted: Mon Jun 10, 2019 8:15 pm
by xr_a_y
Rebel wrote: Mon Jun 10, 2019 7:34 pm http://rebel13.nl/2-moves.html

Same 30 engines, now using 2-moves.epd, only 404 positions.

#1. Look at Gideon from 1993, in 5th place ahead of SF10. That was a time before the internet, when null-move had not been invented, nor reductions, nor futility pruning, etc. A time when a ply was a ply and you could count the number of moves of a combination and calculate the iteration at which an engine had to find the right move. A time also when all the pruning and reductions did not affect the evaluation function in a negative way.

#2. The once almighty Rybka at the bottom.

Next: 3-moves.epd (1221 positions) now running.

Code:

Minic @100ms  per position is scoring 13240 for the 2-moves test.
Minic @1000ms per position is scoring 15076 for the 2-moves test.
This seems a little too strong for Minic, no?