Asking people who believe Leela NN is a book: what do they think about SF NN now?


cucumber
Posts: 144
Joined: Sun Oct 14, 2018 8:21 pm
Full name: JSmith

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by cucumber »

jhellis3 wrote: Mon Sep 14, 2020 9:46 pm
cucumber wrote:SF12 knows opening theory better than Leela does.
On 1 node (or equal node counts) or via search? Because I have a hard time believing the former....
I'm not sure you can get Stockfish to output one node, and even if you could, it wouldn't really make much sense. But at the same node counts, Stockfish 12 consistently outputs more reasonable PVs for me than network 64988 even past 10,000 nodes. I got tired of reading PVs past 11k. While I'm sure there are times at which Leela's PV might surpass SF's in quality and vice versa, SF does remarkably well while using a small fraction of the compute.

There's no way around it: SF's search is hilariously well tuned on openings. Its opening evaluation quality is incredibly disproportionate to its midgame and endgame knowledge. People conflate Leela's "opening book" skills with (what was at one point) generally stronger midgame play.
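If anyone wants to try the same equal-node-count comparison, here is a minimal sketch using python-chess; the engine paths are placeholders rather than my exact setup, and you would substitute whatever binaries and nets you test with:

```python
# Minimal sketch: ask several UCI engines for a PV at the same fixed node count.
# The engine paths below are placeholders, not the exact binaries/nets discussed above.
import chess
import chess.engine

ENGINES = {"Stockfish": "stockfish", "Lc0": "lc0"}  # hypothetical paths
NODES = 10_000

board = chess.Board()  # or push a 3-move opening first
for name, path in ENGINES.items():
    engine = chess.engine.SimpleEngine.popen_uci(path)
    try:
        info = engine.analyse(board, chess.engine.Limit(nodes=NODES))
        print(f"{name}: {board.variation_san(info.get('pv', []))}")
    finally:
        engine.quit()
```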
jhellis3
Posts: 546
Joined: Sat Aug 17, 2013 12:36 am

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by jhellis3 »

cucumber wrote:There's no way around it: SF's search is hilariously well tuned on openings.
What? There is no phase information used in or available to search. This statement makes me rather suspicious of your other assertions.

Furthermore, a robust (wide) 3-move book is used for testing, which is rather the opposite of temperature-based selection.
cucumber
Posts: 144
Joined: Sun Oct 14, 2018 8:21 pm
Full name: JSmith

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by cucumber »

jhellis3 wrote: Mon Sep 14, 2020 10:44 pm
cucumber wrote:There's no way around it: SF's search is hilariously well tuned on openings.
What? There is no phase information used in or available to search. This statement makes me rather suspicious of your other assertions.

Furthermore, a robust (wide) 3-move book is used for testing, which is rather the opposite of temperature-based selection.
"There is no phase information used in or available to search. This statement makes me rather suspicious of your other assertations."
There doesn't need to be any phase information used in or available to search for eval tuning (eval does have phase information) and search tuning to push SF toward and against certain opening choices. In fact, the nice thing about SPSA is that it only learns on match results.

If certain search changes just so happen to improve the move that SF plays in a certain position, and many 3-move openings transpose to this position, then SPSA will optimize toward it (with the obvious caveat of search changes that regress too much elsewhere). This is by design: SPSA finds settings that perform better empirically, even if those settings lack any obvious grounding in chess theory such as game phases. Seeing as how opening moves have an outsized impact on the direction a game takes (i.e., the largest signal comes from changes that affect openings), it would be really surprising if SPSA didn't optimize toward and against particular openings and opening characteristics.
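To make "only learns on match results" concrete, here is a toy SPSA step in Python. It is purely illustrative: fake_match_score() is a hypothetical stand-in for playing a mini-match from the 3-move book, not anything fishtest actually does.

```python
import random

def spsa_step(theta, match_score, c=2.0, a=0.1):
    """One SPSA iteration: perturb every parameter at once, play two
    mini-matches, and step along the estimated gradient of the score."""
    delta = [random.choice((-1, 1)) for _ in theta]
    theta_plus = [t + c * d for t, d in zip(theta, delta)]
    theta_minus = [t - c * d for t, d in zip(theta, delta)]
    # The only signal is the match-result difference: no phase information,
    # no chess knowledge beyond whatever correlates with winning book games.
    diff = match_score(theta_plus) - match_score(theta_minus)
    grad = [diff / (2.0 * c * d) for d in delta]
    return [t + a * g for t, g in zip(theta, grad)]

# Hypothetical stand-in for "score of a mini-match from the 3-move book".
def fake_match_score(theta):
    return -sum((t - 50.0) ** 2 for t in theta)

params = [40.0, 60.0, 55.0]
for _ in range(200):
    params = spsa_step(params, fake_match_score)
print(params)  # drifts toward whatever maximizes the (fake) match score
```

Whatever correlates with scoring better from the book positions gets baked into generic-looking parameters, openings included.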

It is very hard to find a three-move line for which SF doesn't have shockingly reasonable PVs at low depth. This is not true of most other engines. This is not nearly as true of SF in most middlegames or endgames.

"Furthermore, a rather robust (wide) 3-moves book is use for testing, which is rather the opposite of temp based selections."
Which is why it's both incredible and unsurprising that SF is the only engine that can produce a PV that maintains quality past three moves with under 10,000 nodes. It's pretty much exactly what I'd expect from an engine tuned with a bunch of parameters and a tool designed to push it toward playing the least-bad moves from three moves into the game onward.
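For anyone who wants to quantify "follows theory for N plies at under 10k nodes" themselves, here is a rough sketch (python-chess again; the Najdorf line and the engine path are just placeholders for whatever theory line and binary you prefer):

```python
# Rough sketch: count how many PV plies from a fixed-node search continue a
# reference "theory" line. The line and engine path are placeholders.
import chess
import chess.engine

THEORY = ["e4", "c5", "Nf3", "d6", "d4", "cxd4", "Nxd4", "Nf6", "Nc3", "a6"]

def plies_following_theory(engine_path, start_plies=6, nodes=10_000):
    board = chess.Board()
    for san in THEORY[:start_plies]:       # play the 3-move opening
        board.push_san(san)
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    try:
        info = engine.analyse(board, chess.engine.Limit(nodes=nodes))
        pv = info.get("pv", [])
    finally:
        engine.quit()
    count = 0
    for move, san in zip(pv, THEORY[start_plies:]):
        if board.san(move) != san:
            break
        board.push(move)
        count += 1
    return count

print(plies_following_theory("stockfish"))  # placeholder binary name
```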
jhellis3
Posts: 546
Joined: Sat Aug 17, 2013 12:36 am

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by jhellis3 »

cucumber wrote: In fact, the nice thing about SPSA is that it only learns on match results.
And yet, I believe you will find the static eval of SF (especially classical) is hundreds of Elo weaker than Lc0 networks.
cucumber wrote:It is very hard to find a three-move line for which SF doesn't have shockingly reasonable PVs at low depth. This is not true of most other engines. This is not nearly as true of SF in most middlegames or endgames.
I'm sorry, but this statement is just complete bullshit.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by Laskos »

cucumber wrote: Mon Sep 14, 2020 9:28 pm
Alayan wrote: Mon Sep 14, 2020 2:07 pm
cucumber wrote: Mon Sep 14, 2020 6:09 am People who don't think SPSA and SF's parameter-set have tuned SF toward/against certain opening moves are also delusional.
With a pure eval, and without something like Leela's policy head that is explicitly trained to suggest moves that were successful in training/tuning games (instead of keeping eval parameters that happen to produce successful moves overall), memorizing is harder.

Another element is that the parameter space of a classical eval is much smaller. The memorization capacity of NNs is also related to the huge size of their parameter space, which in many instances largely exceeds the size of the training dataset. The study linked by Andrew earlier shows that CNNs can be made to fit arbitrary datasets.

Nonetheless, you'd have a point if SPSA tuning were done from the start position. That would definitely cause some memorization, however minor and hidden in normal-looking parameters it might be.

SF's SPSA tuning is done from a book containing tens of thousands of positions, however. That is much bigger than SF's eval parameter space.
I'm not convinced.

SPSA has done a ridiculous amount to teach SF theory. And SF's parameter space can capture more than enough for an optimizer to make it function as a highly compressed book.

Stockfish 070620 at depth 12: The PV in its entirety follows theory for 12 straight plies with 72,023 nodes.
NNUE at depth 12: PV follows theory for 9 straight plies with a mere 9,781 nodes.
Leela, with the latest T60 net (64988), follows theory for 4 plies with 10,381 nodes before playing weird moves.
Ethereal, with 161,564 nodes, is able to follow theory for four plies. Clearly, Ethereal demonstrates highly advanced opening-book tendencies just like Leela.

NNUE and classical eval both follow theory perfectly well at laughably small node counts, where other engines (Leela included) would otherwise struggle tremendously. Even classical eval follows some level of theory at nearly any depth, regardless of node count.

Stockfish is the best book out there.
According to my observations and tests, positionally in openings Leela > SF NNUE > SF classical.
cucumber
Posts: 144
Joined: Sun Oct 14, 2018 8:21 pm
Full name: JSmith

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by cucumber »

jhellis3 wrote: Tue Sep 15, 2020 12:21 am
cucumber wrote: In fact, the nice thing about SPSA is that it only learns on match results.
And yet, I believe you will find the static eval of SF (especially classical) is hundreds of Elo weaker than Lc0 networks.
cucumber wrote:It is very hard to find a three-move line for which SF doesn't have shockingly reasonable PVs at low depth. This is not true of most other engines. This is not nearly as true of SF in most middlegames or endgames.
I'm sorry, but this statement is just complete bullshit.
It's a good thing that we're not comparing the static evaluations, but the engines themselves.
cucumber
Posts: 144
Joined: Sun Oct 14, 2018 8:21 pm
Full name: JSmith

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by cucumber »

Laskos wrote: Tue Sep 15, 2020 12:23 am
cucumber wrote: Mon Sep 14, 2020 9:28 pm
Alayan wrote: Mon Sep 14, 2020 2:07 pm
cucumber wrote: Mon Sep 14, 2020 6:09 am People who don't think SPSA and SF's parameter-set have tuned SF toward/against certain opening moves are also delusional.
With a pure eval, and without something like Leela's policy head that is explicitly trained to suggest moves that were successful in training/tuning games (instead of keeping eval parameters that happen to produce successful moves overall), memorizing is harder.

Another element is that the parameter space of a classical eval is much smaller. The memorization capacity of NNs is also related to the huge size of their parameter space, which in many instances largely exceeds the size of the training dataset. The study linked by Andrew earlier shows that CNNs can be made to fit arbitrary datasets.

Nonetheless, you'd have a point if SPSA tuning were done from the start position. That would definitely cause some memorization, however minor and hidden in normal-looking parameters it might be.

SF's SPSA tuning is done from a book containing tens of thousands of positions, however. That is much bigger than SF's eval parameter space.
I'm not convinced.

SPSA has done a ridiculous amount to teach SF theory. And SF's parameter space can capture more than enough for an optimizer to make it function as a highly compressed book.

Stockfish 070620 at depth 12: The PV in its entirety follows theory for 12 straight plies with 72,023 nodes.
NNUE at depth 12: PV follows theory for 9 straight plies with a mere 9,781 nodes.
Leela, with the latest T60 net (64988), follows theory for 4 plies with 10,381 nodes before playing weird moves.
Ethereal, with 161,564 nodes, is able to follow theory for four plies. Clearly, Ethereal demonstrates highly advanced opening-book tendencies just like Leela.

NNUE and classical eval both follow theory perfectly well at laughably small node counts, where other engines (Leela included) would otherwise struggle tremendously. Even classical eval follows some level of theory at nearly any depth, regardless of node count.

Stockfish is the best book out there.
According to my observations and tests, positionally in openings Leela > SF NNUE > SF classical.
Can you elaborate?
jhellis3
Posts: 546
Joined: Sat Aug 17, 2013 12:36 am

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by jhellis3 »

cucumber wrote:It's a good thing that we're not comparing the static evaluations, but the engines themselves.
Lol, what exactly do you think is meant by the NN? The OP is not referring to the Monte Carlo tree search.

And I hope, for your sake, you are referring to the static eval, because anything else is laughable.
cucumber
Posts: 144
Joined: Sun Oct 14, 2018 8:21 pm
Full name: JSmith

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by cucumber »

jhellis3 wrote: Tue Sep 15, 2020 1:09 am
cucumber wrote:It's a good thing that we're not comparing the static evaluations, but the engines themselves.
Lol, what exactly do you think is meant by the NN? The OP is not referring to the Monte Carlo tree search.

And I hope, for your sake, you are referring to the static eval, because anything else is laughable.
When I say "Stockfish NNUE", I'm not referring to Stockfish's efficiently updatable neural network in isolation; I'm referring to the engine, despite the literal interpretation suggesting otherwise. I'm assuming that by "the NN" they're referring to the NN as it is used by the engine.

A static evaluation literally cannot be an opening book without search. Policy recommendations are not static evaluations, either. In fact, they're not even evaluations; they're expected visit distributions. Comparing just the static evaluations would be laughable.
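To illustrate the distinction: in AlphaZero-style search (Leela uses a variant of this), the policy prior only feeds the exploration term that shapes where visits go, while evaluations enter separately through the backed-up Q values. A simplified PUCT selection step, with made-up numbers and not Leela's exact formula:

```python
import math

# Simplified PUCT child selection (AlphaZero-style; Leela uses a variant).
# P is the policy prior, Q = W/N is the backed-up evaluation: separate things.
def select_child(children, c_puct=1.5):
    total_visits = sum(child["N"] for child in children)
    def puct(child):
        q = child["W"] / child["N"] if child["N"] > 0 else 0.0
        u = c_puct * child["P"] * math.sqrt(total_visits) / (1 + child["N"])
        return q + u
    return max(children, key=puct)

# Made-up numbers: the prior steers visits, but it is never itself an eval.
children = [
    {"move": "e4", "P": 0.45, "N": 8, "W": 4.3},
    {"move": "d4", "P": 0.40, "N": 7, "W": 3.8},
    {"move": "a4", "P": 0.01, "N": 1, "W": 0.4},
]
print(select_child(children)["move"])  # the child with the best Q+U mix gets the next visit
```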
jhellis3
Posts: 546
Joined: Sat Aug 17, 2013 12:36 am

Re: Asking people who believe Leela NN is a book: what do they think about SF NN now?

Post by jhellis3 »

You did not answer the question. And nobody asked what you meant. As has already been explained, search is meaningless to the OP.
cucumber wrote:A static evaluation literally cannot be an opening book without search. Policy recommendations are not static evaluations, either. In fact, they're not even evaluations; they're expected visit distributions. Comparing just the static evaluations would be laughable.

So you are completely unreasonable. The statement in red is so bananas I can't even... OK then, no further conversation required...