OKE - Opening Knowledge Engines

xr_a_y · Post by **xr_a_y** » Mon Jun 10, 2019 9:06 pm

xr_a_y wrote: ↑Mon Jun 10, 2019 8:15 pm
Rebel wrote: ↑Mon Jun 10, 2019 7:34 pm http://rebel13.nl/2-moves.html

Same 30 engines, now using 2-moves.epd, only 404 positions.

#1. Look at Gideon from 1993, 5th place before SF10. A time before the internet, null-move was not invented, neither reductions, neither futility pruning etc. A time when a ply was a ply and you could count the number of moves of a combination and calculate the iteration an engine had to find the right move. A time also that all the pruning and reductions did not affect the evaluation function in a negative way.

#2. The once almighty Rybka at the bottom.

Next: 3-moves.epd (1221 positions) now running.
Code: Select all
Minic @100ms  per position is scoring 13240 for the 2-moves test.
Minic @1000ms per position is scoring 15076 for the 2-moves test.
This seems a little too strong for Minic no ?

and

Code: Select all

Minic @1000ms per position is scoring 118933 for the 4-moves test

Rebel · Post by **Rebel** » Mon Jun 10, 2019 9:24 pm

Laskos wrote: ↑Mon Jun 10, 2019 7:49 pm How did you pick the 2800 positions from 80,000 games? Were they in some way special? Did you use some analysis tool to pick them?

With a tool, to be released soon. In a nutshell:
- Pick a PGN. In this case, 80,000 games between 2600+ players, so unlikely to be garbage.
- Select ply-from and ply-till; in the 4-moves case, ply-from=0 and ply-till=8.
- It will create an EPD with the first 4½ moves.
- Remove the doubles.
- Load a good book. I used the ProDeo book because it's hand-typed by Jeroen Noomen, with no comp games.
- Any EPD found in the ProDeo book that is playable is selected and stored in the final EPD for OKE.
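
The last steps of that procedure (remove the doubles, then keep only positions found in the book) could be sketched roughly as below. This is only an illustration, not the tool itself, which is not yet released. It assumes the positions have already been exported as EPD strings and that the book can be represented as a set of position keys; `position_key`, `dedupe`, and `select_book_positions` are hypothetical names, and the one-position book is made up for the example.

```python
def position_key(epd: str) -> str:
    """Return the four position fields of an EPD line (board, side, castling, e.p.)."""
    return " ".join(epd.split()[:4])


def dedupe(epds):
    """Drop repeated positions, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for epd in epds:
        key = position_key(epd)
        if key not in seen:
            seen.add(key)
            unique.append(epd)
    return unique


def select_book_positions(epds, book_keys):
    """Keep only de-duplicated positions whose key is present in the book."""
    return [epd for epd in dedupe(epds) if position_key(epd) in book_keys]


candidates = [
    "r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq -",
    "r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq -",  # duplicate
    "r1bqkbnr/pp1ppppp/2n5/2p5/2P5/2N5/PP1PPPPP/R1BQKBNR w KQkq -",
]
book = {position_key(candidates[0])}  # hypothetical book holding one position
selected = select_book_positions(candidates, book)
print(len(selected), "position(s) kept")
```

Keying on all four fields means two otherwise identical positions with different e.p. squares count as distinct; dropping the fourth field from the key would merge them.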

Laskos wrote: ↑Mon Jun 10, 2019 7:49 pm The results I see are strange. Lc0 positionally in the openings is far ahead of anything, so it managed to get to the top in your test too.

I agree with that.

Laskos wrote: ↑Mon Jun 10, 2019 7:49 pm But the rest seems a bit scrambled. What's that Rodent result?

Ask the author.

Laskos wrote: ↑Mon Jun 10, 2019 7:49 pm And many others. Also, what's that Lc0 worsens from 1s to 5s?

No idea.

Laskos wrote: ↑Mon Jun 10, 2019 7:49 pm I estimate 1 standard deviation of the difference between engines to be about 800-1000 points in your scoring.

Which should not surprise you too much; the opening is an area that doesn't get the attention it deserves. You are knowledgeable enough to figure out why that is, and I plead guilty as well.

Laskos wrote: ↑Mon Jun 10, 2019 7:49 pm I have my custom made (hand-made) test-suite of 200 opening positions collected from a large database of human games (Elo above 2200, about 2 million games), but I had to pick by hand interesting positions according to the statistic of outcomes (and only marginally using engines' analysis to trim out tactical complications), it took some time to build it. Repeated 5 times to diminish the noise, I am using 1000 opening positions in this positional test suite with these much more normal results:
Code: Select all
4 i7 cores at 3.80GHz
RTX 2070 GPU

Openings1000.epd test-suite 
Sticks to the solution from 1s to 2s of thinking:

Lc0 v21.2 ID42524  757/1000
Lc0 v21.2 ID32930  727/1000
Lc0 v21.2 ID11261  723/1000
Stockfish_dev      574/1000
Houdini 6.03       558/1000
Komodo 13.02       556/1000
Xiphos 0.5         513/1000
Booot 6.3.1        494/1000
Andscacs 0.95      484/1000
Laser 1.7          480/1000
Ethereal 11.25     467/1000
Texel 1.07         419/1000
Rodent III         376/1000
Fruit 2.1          348/1000
BikJump 2.01       276/1000
Predateur 2.2.1    265/1000

Care to share your suite so I can test it?

Laskos wrote: ↑Mon Jun 10, 2019 7:49 pm I estimate my standard deviation to be about 10 points here. I frankly think that you have a lot of garbage in your suite.

Keep the garbage to yourself.

xr_a_y · Post by **xr_a_y** » Mon Jun 10, 2019 9:26 pm

xr_a_y wrote: ↑Mon Jun 10, 2019 9:06 pm
xr_a_y wrote: ↑Mon Jun 10, 2019 8:15 pm
Rebel wrote: ↑Mon Jun 10, 2019 7:34 pm http://rebel13.nl/2-moves.html

Same 30 engines, now using 2-moves.epd, only 404 positions.

#1. Look at Gideon from 1993, 5th place before SF10. A time before the internet, null-move was not invented, neither reductions, neither futility pruning etc. A time when a ply was a ply and you could count the number of moves of a combination and calculate the iteration an engine had to find the right move. A time also that all the pruning and reductions did not affect the evaluation function in a negative way.

#2. The once almighty Rybka at the bottom.

Next: 3-moves.epd (1221 positions) now running.
Code: Select all
Minic @100ms  per position is scoring 13240 for the 2-moves test.
Minic @1000ms per position is scoring 15076 for the 2-moves test.
This seems a little too strong for Minic no ?
and
Code: Select all
Minic @1000ms per position is scoring 118933 for the 4-moves test

and

Code: Select all

Minic @1000ms per position is scoring 50197 for the 3-moves test

Guenther · Post by **Guenther** » Mon Jun 10, 2019 9:28 pm

Rebel wrote: ↑Mon Jun 10, 2019 7:34 pm http://rebel13.nl/2-moves.html

Same 30 engines, now using 2-moves.epd, only 404 positions.

#1. Look at Gideon from 1993, 5th place before SF10. A time before the internet, null-move was not invented, neither reductions, neither futility pruning etc. A time when a ply was a ply and you could count the number of moves of a combination and calculate the iteration an engine had to find the right move. A time also that all the pruning and reductions did not affect the evaluation function in a negative way.

#2. The once almighty Rybka at the bottom.

Next: 3-moves.epd (1221 positions) now running.

Your system of adding points is completely bogus, sorry.
I have downloaded your 2moves epd and already the first two positions show
that you have not thought much about it, or were too biased.

[d]r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq

Code: Select all

r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq - c0 "Nc3=53, Nf3=46";

3. d5 / 3. g3 are surely as good, yet still 0 points? What about 3. Bg5 / 3. Bf4?

[d]r1bqkbnr/pp1ppppp/2n5/2p5/2P5/2N5/PP1PPPPP/R1BQKBNR w KQkq

Code: Select all

r1bqkbnr/pp1ppppp/2n5/2p5/2P5/2N5/PP1PPPPP/R1BQKBNR w KQkq - c0 "Nf3=100";

Again, 3. g3 and 3. e4 are as good, yet still 0 points.
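
To make the objection concrete, here is a minimal sketch of how such a `c0` line could be scored. It assumes that a listed move earns its stated points and that any move missing from the list earns 0, which is how the two lines above read; the real scoring tool may differ, and `parse_c0_scores` and `score_move` are hypothetical names.

```python
import re


def parse_c0_scores(epd: str) -> dict:
    """Extract the move=points pairs from the c0 "..." comment of an EPD line."""
    match = re.search(r'c0\s+"([^"]*)"', epd)
    if match is None:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        move, _, points = item.strip().partition("=")
        if move and points:
            scores[move] = int(points)
    return scores


def score_move(epd: str, move: str) -> int:
    """Points for a played move; a move not listed in c0 scores 0 (assumption)."""
    return parse_c0_scores(epd).get(move, 0)


line = 'r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq - c0 "Nc3=53, Nf3=46";'
for candidate in ("Nc3", "Nf3", "d5", "g3"):
    print(candidate, score_move(line, candidate))
```

Under that assumption, an engine choosing 3. d5 or 3. g3 in the first position is scored exactly like one that blunders.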

After those first two lines I stopped looking further.
It is impossible to create an opening test out of this in a few hours, even for the 2-moves EPD. It would actually
need several days or weeks, a good player, an opening encyclopedia, and the understanding
that at an early stage there are often no best moves or only moves, and it is just a matter of taste to choose
between good and equal moves.

It could be seen already from your first statement, when you favoured 1.e4/1.d4 (10) over 1.c4/1.Nf3 (8). Nothing
in chess nowadays suggests those are inferior. You just need to know how to continue or what setup to reach,
and this is far beyond 4 moves, of course.

All of this is an extreme case of being too simplistic.

xr_a_y · Post by **xr_a_y** » Mon Jun 10, 2019 9:30 pm

xr_a_y wrote: ↑Mon Jun 10, 2019 9:26 pm
xr_a_y wrote: ↑Mon Jun 10, 2019 9:06 pm
xr_a_y wrote: ↑Mon Jun 10, 2019 8:15 pm
Rebel wrote: ↑Mon Jun 10, 2019 7:34 pm http://rebel13.nl/2-moves.html

Same 30 engines, now using 2-moves.epd, only 404 positions.

#1. Look at Gideon from 1993, 5th place before SF10. A time before the internet, null-move was not invented, neither reductions, neither futility pruning etc. A time when a ply was a ply and you could count the number of moves of a combination and calculate the iteration an engine had to find the right move. A time also that all the pruning and reductions did not affect the evaluation function in a negative way.

#2. The once almighty Rybka at the bottom.

Next: 3-moves.epd (1221 positions) now running.
Code: Select all
Minic @100ms  per position is scoring 13240 for the 2-moves test.
Minic @1000ms per position is scoring 15076 for the 2-moves test.
This seems a little too strong for Minic no ?
and
Code: Select all
Minic @1000ms per position is scoring 118933 for the 4-moves test
and
Code: Select all
Minic @1000ms per position is scoring 50197 for the 3-moves test

One issue: if I activate Minic's internal book, the 2-moves score falls to only 9674.
So what does that mean? Is my book totally cooked?

Rebel · Post by **Rebel** » Mon Jun 10, 2019 11:16 pm

Guenther wrote: ↑Mon Jun 10, 2019 9:28 pm Your system of adding points is completely bogus, sorry.
I have downloaded your 2moves epd and already the first two positions show
that you have not thought much about it, or were too biased.

Apparently 81,900 games are not enough, I will look into it.

Biased and bogus regards.

Dann Corbit · Post by **Dann Corbit** » Tue Jun 11, 2019 1:12 am

Here is a link to Ed's EPD data after the redundant positions have been removed (mostly due to frivolous e.p.):
https://drive.google.com/open?id=1SYvUc ... 1_Nz_Q6in2
There are 2488 distinct positions.
Of those, 822 have analyzed depth of less than 36 in my database, and are hence unsuitable for study at this point.
I will analyze all of those to 36 plies or greater.
I will also run multi-pv analysis on the positions after the 36 ply analysis has finished.
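
A depth filter like the one described could look roughly like this. It is a sketch only: it assumes each line carries the standard EPD `acd` (analysis count depth) opcode, and `analysis_depth` and `needs_more_analysis` are hypothetical names rather than anything from my database tooling. The second sample line is invented for the example.

```python
import re

MIN_DEPTH = 36  # threshold mentioned above


def analysis_depth(epd: str):
    """Return the acd value of an EPD line, or None if the opcode is absent."""
    match = re.search(r"\bacd\s+(\d+)\s*;", epd)
    return int(match.group(1)) if match else None


def needs_more_analysis(epd: str) -> bool:
    """True when the position has no recorded depth or a depth below MIN_DEPTH."""
    depth = analysis_depth(epd)
    return depth is None or depth < MIN_DEPTH


deep = "r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq - acd 37; bm Nf3;"
shallow = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - acd 20;"  # made-up depth
todo = [epd for epd in (deep, shallow) if needs_more_analysis(epd)]
print(len(todo), "position(s) still below depth", MIN_DEPTH)
```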

In the archive the link points to, there are:
1. the 2488 bare EPD positions
2. the bare EPD positions with bm supplied (though about 1/3 of the data is useless at this point)
3. the 2488 positions with analysis and notes (non-standard EPD, load into your chess program only while wearing a helmet).

Dann Corbit · Post by **Dann Corbit** » Tue Jun 11, 2019 1:21 am

For these two, I have this:
[d]r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq - acd 37; acs 2723; bm Nf3; cce 49; ce 37; pm Nf3 {3016} Nc3 {583} d5 {108} e3 {7} g3 {5} Bf4 {2} f4 {2} a3 {1} f3 {1}; pv Nf3 e6 a3 d5 e3 Be7 Nbd2 Nb8 Bd3 Nbd7 O-O c5 Ne5 O-O b3 b6 Nc6 Qe8 Nxe7+ Qxe7 a4 Bb7 Bb2 a5 Rc1 Rfd8 Nf3 Ne4 Re1 Rac8 Qc2 h6 h3 Qd6 Qd1 Qc7 cxd5 Bxd5 Re2 Bb7 Bb5;

[d]r1bqkbnr/pp1ppppp/2n5/2p5/2P5/2N5/PP1PPPPP/R1BQKBNR w KQkq - acd 39; acs 3601; bm e3; cce 24; ce 8; pm g3 {6758} Nf3 {2008} e3 {318} e4 {202} d3 {62} b3 {43} f4 {5} Rb1 {2} a3 {2} Qc2 {1}; pv e3 Nf6 Nf3 e6 Be2 Be7 d4 d5 O-O cxd4 exd4 dxc4 Bxc4 O-O a3 a6 Bf4 Bd6 Be5 b5 Bd3 Be7 a4 Bb7 axb5 axb5 Qe2 Rxa1 Rxa1 b4 Nb5 Nxe5 dxe5 Nh5 Be4 Nf4 Qe3 Bxe4 Qxe4;

eyeballing my alternate evals table, it looks to me like Nf3 is just about as good as e3 for the second position.
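
For anyone who wants to read the `pm` field programmatically, a rough sketch follows. It assumes the number in braces after each move is an occurrence count (the lines above do not define it), and it turns those counts into shares; `parse_pm_counts` and `move_shares` are hypothetical names. The sample line is the first position above with the `pv` field left out for brevity.

```python
import re


def parse_pm_counts(epd: str) -> dict:
    """Parse 'pm Move {n} Move {n} ...;' into {move: n}."""
    field = re.search(r"\bpm\s+([^;]*);", epd)
    if field is None:
        return {}
    pairs = re.findall(r"(\S+)\s*\{(\d+)\}", field.group(1))
    return {move: int(n) for move, n in pairs}


def move_shares(counts: dict) -> dict:
    """Convert raw counts into fractions of the total."""
    total = sum(counts.values())
    return {move: n / total for move, n in counts.items()} if total else {}


line = (
    "r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq - acd 37; acs 2723; "
    "bm Nf3; cce 49; ce 37; pm Nf3 {3016} Nc3 {583} d5 {108} e3 {7} g3 {5} "
    "Bf4 {2} f4 {2} a3 {1} f3 {1};"
)
counts = parse_pm_counts(line)
shares = move_shares(counts)
for move, share in sorted(shares.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{move}: {share:.1%}")
```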

Dann Corbit · Post by **Dann Corbit** » Tue Jun 11, 2019 1:23 am

Here with the opening names also:
r1bqkb1r/pppppppp/2n2n2/8/2PP4/8/PP2PPPP/RNBQKBNR w KQkq - acd 37; acs 2723; bm Nf3; c3 "Nf3"; cce 49; ce 37; pm Nf3 {3016} Nc3 {583} d5 {108} e3 {7} g3 {5} Bf4 {2} f4 {2} a3 {1} f3 {1}; pv Nf3 e6 a3 d5 e3 Be7 Nbd2 Nb8 Bd3 Nbd7 O-O c5 Ne5 O-O b3 b6 Nc6 Qe8 Nxe7+ Qxe7 a4 Bb7 Bb2 a5 Rc1 Rfd8 Nf3 Ne4 Re1 Rac8 Qc2 h6 h3 Qd6 Qd1 Qc7 cxd5 Bxd5 Re2 Bb7 Bb5; Opening Mexican Defense: General. 1.d4 Nf6 2.c4 Nc6; CaxtonID: 1279; ECO: A50;

r1bqkbnr/pp1ppppp/2n5/2p5/2P5/2N5/PP1PPPPP/R1BQKBNR w KQkq - acd 39; acs 3601; bm e3; c3 "g3"; cce 24; ce 8; pm g3 {6758} Nf3 {2008} e3 {318} e4 {202} d3 {62} b3 {43} f4 {5} Rb1 {2} a3 {2} Qc2 {1}; pv e3 Nf6 Nf3 e6 Be2 Be7 d4 d5 O-O cxd4 exd4 dxc4 Bxc4 O-O a3 a6 Bf4 Bd6 Be5 b5 Bd3 Be7 a4 Bb7 axb5 axb5 Qe2 Rxa1 Rxa1 b4 Nb5 Nxe5 dxe5 Nh5 Be4 Nf4 Qe3 Bxe4 Qxe4; Opening English Opening: Symmetrical Variation. Two Knights Variation 1.c4 c5 2.Nc3 Nc6; CaxtonID: 528; ECO: A35;

leavenfish · Post by **leavenfish** » Tue Jun 11, 2019 2:59 am

Dann Corbit wrote: ↑Tue Jun 11, 2019 1:21 am
eyeballing my alternate evals table, it looks to me like Nf3 is just about as good as e3 for the second position.

Ummm....I think these diagrams are FAR too few moves into a game to 'test' any engine on. I would not think any result would be worthwhile.
