OKE - one year after


Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: OKE - one year after

Post by Rebel »

Terje wrote: Tue Sep 08, 2020 5:33 pm
As expected Weiss is no opening master :D

Would be interesting to see normal minic on there for comparison.
Done.

The 6-moves.html list is fun: the NNUE version is in third place from the top, the normal version in third place from the bottom.
90% of coding is debugging, the other 10% is writing bugs.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: OKE - one year after

Post by Alayan »

What's the reference used to judge opening move quality ? I see "2, 3 and 4-moves.epd are created from 2600+ elo rated human players.", is this related to what moves are considered good/bad ?

The score range of engines is very small compared to the minimum/maximum possible scores.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: OKE - one year after

Post by Rebel »

Alayan wrote: Tue Sep 08, 2020 9:29 pm What's the reference used to judge opening move quality ? I see "2, 3 and 4-moves.epd are created from 2600+ elo rated human players.", is this related to what moves are considered good/bad ?
The assumption is that 2600+ Elo rated players seldom blunder in the first 6 moves, let alone in the first 2-3 moves.
Alayan wrote: Tue Sep 08, 2020 9:29 pm The score range of engines is very small compared to the minimum/maximum possible scores.
True.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: OKE - one year after

Post by Alayan »

Rebel wrote: Tue Sep 08, 2020 9:50 pm
Alayan wrote: Tue Sep 08, 2020 9:29 pm What's the reference used to judge opening move quality ? I see "2, 3 and 4-moves.epd are created from 2600+ elo rated human players.", is this related to what moves are considered good/bad ?
The assumption is that 2600+ Elo rated players seldom blunder in the first 6 moves, let alone in the first 2-3 moves.
They don't blunder, but then neither do engines.

In the end, the challenge is to find moves that lead into positions where the most complications exist for your opponent while conceding the least possible, so as to maximize the expected score.

Human GMs are rather good at making choices that maximize their expected score, but there is a strong specialization bias: they try to go into lines they know very well and avoid lines their opponent knows very well.

The expected score of a line might very well be different for 3000+ engines than for 2600+ humans. The expected score of a line is also a function of the rating difference between the two playing entities - the optimal line vs a stronger opponent is drawish, while it is more forceful and imbalanced vs a weaker opponent. In this regard, there is often no absolute best move in balanced drawn positions.

Openings that are objectively worse might also get selected by a human GM because they offer a higher chance to get a win that's needed to get a good tournament placement.

In short, I'm skeptical about the methodology.
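To put rough numbers on how the expected score depends on the rating difference, here is a minimal sketch using the standard Elo logistic formula. This is an assumption for illustration only: the thread does not say which model anyone has in mind, and the formula says nothing about how a particular opening line shifts the result. The function name `expected_score` is made up for this example.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score (between 0 and 1) for a player who is
    `rating_diff` Elo points stronger than the opponent,
    under the standard Elo logistic model (an assumption here)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# Print the expected score for a few illustrative rating gaps.
for diff in (-200, -100, 0, 100, 200):
    print(f"{diff:+5d} Elo -> expected score {expected_score(diff):.3f}")
```

Under this model the baseline expectation already moves with the rating gap, so the same line can be a sensible choice against one opponent and a poor one against another.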
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: OKE - one year after

Post by Dann Corbit »

Alayan wrote: Tue Sep 08, 2020 10:07 pm In short, I'm skeptical about the methodology.
It is a good question, and perhaps a mix would be better.

I guess that the bottom line is, "The proof of the pudding is in the eating."

Which is to say: when we try it, does it work or not?
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: OKE - one year after

Post by Rebel »

Alayan wrote: Tue Sep 08, 2020 10:07 pm
Rebel wrote: Tue Sep 08, 2020 9:50 pm
Alayan wrote: Tue Sep 08, 2020 9:29 pm What's the reference used to judge opening move quality ? I see "2, 3 and 4-moves.epd are created from 2600+ elo rated human players.", is this related to what moves are considered good/bad ?
The assumption is that 2600+ Elo rated players seldom blunder in the first 6 moves, let alone in the first 2-3 moves.
They don't blunder, but then neither do engines.
They do. They don't give away a piece, but they don't understand the strategy or concept behind (certain) openings and don't play the right moves; humans at that level do.
Alayan wrote: Tue Sep 08, 2020 10:07 pm In the end, the challenge is to find moves that lead into positions where the most complications exist for your opponent while conceding the least possible, so as to maximize the expected score.

Human GMs are rather good at making choices that maximize their expected score, but there is a strong specialization bias: they try to go into lines they know very well and avoid lines their opponent knows very well.

The expected score of a line might very well be different for 3000+ engines than for 2600+ humans. The expected score of a line is also a function of the rating difference between the two playing entities - the optimal line vs a stronger opponent is drawish, while it is more forceful and imbalanced vs a weaker opponent. In this regard, there is often no absolute best move in balanced drawn positions.

Openings that are objectively worse might also get selected by a human GM because they offer a higher chance to get a win that's needed to get a good tournament placement.

In short, I'm skeptical about the methodology.
The only thing I am skeptical about is the lack of volume. Besides, there is no alternative: you can't use computer games to test how well computers play the early opening; one needs human games. Besides (2): certain top engines score badly on 2-3-4 movers, Komodo for instance. Take an engine rated 50-100 Elo lower that is placed higher than Komodo in the list and let it play under cutechess with 2-3-4 movers; Komodo might lose. Try Ethereal (CCRL 3374) vs Stockfish 10 (CCRL 3491); I bet you Ethereal will do well on 2-3 movers. I did some tests in the past and that is what they showed.
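For anyone who wants to try such a match, here is a rough sketch that only assembles a cutechess-cli command line and prints it; it does not run anything. The engine paths, the time control, the number of rounds and the output file name are placeholders, and `build_cutechess_command` is a made-up helper. The options are the ones I remember from the cutechess-cli help text, so check them against `cutechess-cli -help` for your version.

```python
import shlex


def build_cutechess_command(engine_a, engine_b, epd_file, rounds=100, tc="10+0.1"):
    """Assemble (but do not run) a cutechess-cli command for a two-engine
    match that starts every game from a position in `epd_file`.
    Each engine is a (name, path) pair; all values are placeholders."""
    cmd = ["cutechess-cli"]
    for name, path in (engine_a, engine_b):
        cmd += ["-engine", f"name={name}", f"cmd={path}"]
    # Settings shared by both engines: UCI protocol and the same time control.
    cmd += ["-each", "proto=uci", f"tc={tc}"]
    # Start positions come from the EPD file, picked in random order.
    cmd += ["-openings", f"file={epd_file}", "format=epd", "order=random"]
    # Two games per round with -repeat, so each opening is played with both colours.
    cmd += ["-games", "2", "-rounds", str(rounds), "-repeat"]
    cmd += ["-pgnout", "result.pgn"]
    return cmd


command = build_cutechess_command(
    ("Ethereal", "./ethereal"),        # hypothetical binary path
    ("Stockfish10", "./stockfish10"),  # hypothetical binary path
    "2-moves.epd",
)
print(shlex.join(command))
```

Playing each opening with both colours keeps one side from benefiting from a lopsided start position.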
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: OKE - one year after

Post by Rebel »

Added Booot 6.4, Chiron 4 and Igel NNUE.

NNUE seems to be the way to improve the early opening.

http://rebel13.nl/html/2-moves.html
http://rebel13.nl/html/3-moves.html
http://rebel13.nl/html/4-moves.html
http://rebel13.nl/html/5-moves.html
http://rebel13.nl/html/6-moves.html

Next, Crafty and Shredder 13.

Meanwhile waiting for more NNUE releases :)
Jouni
Posts: 3278
Joined: Wed Mar 08, 2006 8:15 pm

Re: OKE - one year after

Post by Jouni »

Soon EVERY engine will use the Vieri net and start at 3400 Elo...
Jouni
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: OKE - one year after

Post by xr_a_y »

Jouni wrote: Wed Sep 09, 2020 12:57 pm Soon EVERY engine will use the Vieri net and start at 3400 Elo...
Well, Minic and Igel were already 3100-3200 engines. So the "every" is maybe a little too optimistic.
But indeed, as shown by the MinicNNUE+sv net versus SF12 similarity test (=>86% versus 36% for the standard Minic eval), evaluation is a very large part of strength here.
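For readers unfamiliar with similarity tests, the basic idea, as I understand it, is to let two engines pick a move in the same set of positions and count how often they agree. A minimal sketch of that count follows; the move lists are invented for illustration, `move_agreement` is a made-up name, and the real tool behind the 86% and 36% figures may work differently in detail.

```python
def move_agreement(moves_a, moves_b):
    """Percentage of positions in which two engines chose the same move.
    Both lists must hold one move per test position, in the same order."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both engines must be tested on the same positions")
    if not moves_a:
        raise ValueError("need at least one position")
    same = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * same / len(moves_a)


# Invented example data: the move each engine chose in five test positions.
engine_a = ["e2e4", "g1f3", "d2d4", "c2c4", "e1g1"]
engine_b = ["e2e4", "g1f3", "d2d4", "b1c3", "e1g1"]
print(f"{move_agreement(engine_a, engine_b):.0f}% identical moves")
```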
voffka
Posts: 288
Joined: Sat Jun 30, 2018 10:58 pm
Location: Ukraine
Full name: Volodymyr Shcherbyna

Re: OKE - one year after

Post by voffka »

Jouni wrote: Wed Sep 09, 2020 12:57 pm Soon EVERY engine will use the Vieri net and start at 3400 Elo...
Igel 2.7.0 does not use SV nets, so it is weaker even with NNUE. I am curious btw about the simtest for 2.7.0 and for 2.6.0 (against each other) as well as the simtest between SF12/Igel and Lc0/Igel :)