Leela is more Chess-domain specific than regular Chess engines

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

jorose
Posts: 358
Joined: Thu Jan 22, 2015 3:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Re: Leela is more Chess-domain specific than regular Chess engines

Post by jorose »

Laskos wrote: Sat Aug 11, 2018 2:23 am 1/ Leela underperforms in odd positions
This is expected; positions with a style rarely or never encountered in training are harder for Leela than those it has experience with.
Laskos wrote: Sat Aug 11, 2018 2:23 am 2/ It under-performs significantly more with lots of gliders
Perhaps, but is this really something you can conclude based on your test?
Laskos wrote: Sat Aug 11, 2018 2:23 am It seems, the 3x3 patterns in consecutive layers used are not adapted extremely well to Chess.
This is where you are making a leap, imo. With 4 CNN layers, a feature in Leela's net can take into account information from the entire board. Leela has 10 times that many CNN layers.

There are tons of reasons Leela could be under-performing in those variants. Perhaps AB engines are just exceptionally strong in those variants? Perhaps large portions of Leela's nets deal with interactions between major and minor pieces? Maybe search is very important in those variants and Leela is suffering from lack of NPS? Why do you conclude that it is the 3x3 filters that are making her under-perform there?
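A quick numerical check of that receptive-field argument (just a sketch with numpy/scipy, not Leela's actual code): put a lone "signal" on the h1 corner and count how many 3x3 convolutions it takes before the information reaches the opposite corner of the 8x8 board.

import numpy as np
from scipy.signal import convolve2d

board = np.zeros((8, 8))
board[7, 7] = 1.0                 # a lone piece on h1 (row 7, column 7)
kernel = np.ones((3, 3))          # one 3x3 convolution "layer" with all-ones weights

layers = 0
while board[0, 0] == 0:           # until the a8 corner "sees" the h1 signal
    board = convolve2d(board, kernel, mode="same")
    layers += 1

print(layers)                     # 7: corner to corner needs 7 stacked 3x3 layers

Seven layers already connect the two most distant squares, and Leela has roughly ten times the four layers mentioned above, so the 3x3 kernel size by itself should not prevent long-range information from flowing through the net.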
-Jonathan
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Leela is more Chess-domain specific than regular Chess engines

Post by corres »

jorose wrote: Sat Aug 11, 2018 8:37 am
Laskos wrote: Sat Aug 11, 2018 2:23 am 1/ Leela underperforms in odd positions
This is expected; positions with a style rarely or never encountered in training are harder for Leela than those it has experience with.
Laskos wrote: Sat Aug 11, 2018 2:23 am 2/ It under-performs significantly more with lots of gliders
Perhaps, but is this really something you can conclude based on your test?
I also think this is the main cause of the phenomenon.
Maybe other factors play a role in it, but to decide about them the developers of Leela would have to change a lot of things in Leela's source step by step and examine the effect of each change.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Leela is more Chess-domain specific than regular Chess engines

Post by Laskos »

jorose wrote: Sat Aug 11, 2018 8:37 am
Laskos wrote: Sat Aug 11, 2018 2:23 am 1/ Leela underperforms in odd positions
This is expected; positions with a style rarely or never encountered in training are harder for Leela than those it has experience with.
Why is this expected? I expected that the NN topology (3x3 kernels) used in image recognition would have problems with gliders in Chess. But that a DCNN-based engine is more specialized in Chess than engines using hand-crafted, Chess-specific knowledge is not necessarily expected. The net is not a book, and using the weights of the black box gives SOME level of abstraction and generalization, but this level seems low, which is not necessarily expected. In fact, having lower power of inference and generalization than traditional engines is counter-intuitive to me; that is why I was disappointed. Although the "zero approach" seems generalistic, its final results are some sort of expert systems, which reach their efficiency only on some well-defined, very specialized problems.
Laskos wrote: Sat Aug 11, 2018 2:23 am 2/ It under-performs significantly more with lots of gliders
Perhaps, but is this really something you can conclude based on your test?
I don't understand. That Leela performs significantly worse against traditional engines in positions with many heavy gliders is a fact, or maybe, like Leela, you need a million examples to learn this?
Laskos wrote: Sat Aug 11, 2018 2:23 am It seems, the 3x3 patterns in consecutive layers used are not adapted extremely well to Chess.
This is where you are making a leap, imo. With 4 CNN layers, a feature in Leela's net can take into account information from the entire board. Leela has 10 times that many CNN layers.

There are tons of reasons Leela could be under-performing in those variants. Perhaps AB engines are just exceptionally strong in those variants? Perhaps large portions of Leela's nets deal with interactions between major and minor pieces? Maybe search is very important in those variants and Leela is suffering from lack of NPS? Why do you conclude that it is the 3x3 filters that are making her under-perform there?
I think the weakness with discovered threats and pins is well known with Leela. They involve gliders again, and as shown here, with lots of gliders Leela generally seems to underperform badly against traditional engines. It indicates that 3x3 filters might not be very efficient with 8x1 rays. And if traditional Chess engines, with hand-crafted Chess-specific eval, are "exceptionally strong in those variants", then it's a pretty bad omen for the DNN approach as a way to a generalized AI. Also, the weakness with gliders and the "single pixel attack" might be the main reasons for Leela's very weak tactical abilities.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Leela is more Chess-domain specific than regular Chess engines

Post by Laskos »

With one of the newest nets, I got the following results for several kinds of odd starting positions and variants, with and without gliders.

Leela doesn't play Chess960, but I concocted a 960-like position (with no castling rights) and built a small opening book for it:
[d]nnbbrrkq/pppppppp/8/8/8/8/PPPPPPPP/NNBBRRKQ w - - 0 1
In this mildly odd 960-like variant having the same Chess pieces, Leela underperforms by 60 +/- 30 Elo points against regular engines.

Now, using a variant with the same Chess pieces, but a much odder one:
[d]3rqknr/4bpp1/4bpp1/1PP1npp1/1PPN1pp1/1PPB4/1PPB4/RNKQR3 w - - 0 1
Leela under-performs by 200 +/- 30 Elo points.

From this odd position with no gliders:
[d]2nknnn1/2pppppp/8/8/8/8/PPPPPP2/1NNNKN2 w - - 0 1
Leela under-performs by 150 +/- 30 Elo points.

From this odd position with lots of heavy gliders:
[d]2qkq3/2rrr3/1pppp3/8/8/3PPPP1/3RRR2/3QKQ2 w - - 0 1
Leela under-performs by a whopping 350 +/- 50 Elo points.
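For reference, this is roughly how such "+/-" figures are obtained (a generic sketch, not necessarily the exact tool used for these matches): convert the match score into a logistic Elo difference and attach an approximate error bar.

import math

def elo_from_score(score):
    # Logistic Elo difference implied by a score fraction (0 < score < 1)
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_error_bar(score, games, z=1.96):
    # Approximate 95% half-width; treats games as independent trials, which
    # slightly overestimates the error when there are many draws
    sigma = math.sqrt(score * (1.0 - score) / games)
    return (elo_from_score(score + z * sigma) - elo_from_score(score - z * sigma)) / 2.0

# Made-up example: 400 games scored at 30.5%
print(round(elo_from_score(0.305)))       # about -143 Elo
print(round(elo_error_bar(0.305, 400)))   # about +/- 37 Elo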
George Tsavdaris
Posts: 1627
Joined: Thu Mar 09, 2006 12:35 pm

Re: Leela is more Chess-domain specific than regular Chess engines

Post by George Tsavdaris »

Leela without history is weaker than with history (even with only 1-2 plies of history), so it is better to run your tests with a PGN that starts from the initial position and ends at the desired position or, probably equivalently for Leela, with a PGN whose FEN is 2 plies before the desired position and which plays the 2 moves to reach it, e.g.:
[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "New game"]
[Black "?"]
[Result "*"]
[SetUp "1"]
[FEN "2nknn2/2pppppp/7n/8/8/6N1/PPPPPP2/1NNNK3 w - - 0 1"]
[PlyCount "2"]

1. Nf1 Ng8 *
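If someone wants to generate such two-ply "history" openings in bulk, here is a minimal sketch using the python-chess library (assumed installed); the FEN and the two waiting moves are simply the ones from the example above.

import chess
import chess.pgn

# FEN two plies before the desired test position (taken from the example above)
fen = "2nknn2/2pppppp/7n/8/8/6N1/PPPPPP2/1NNNK3 w - - 0 1"

game = chess.pgn.Game()
game.setup(chess.Board(fen))      # fills in the SetUp and FEN headers
game.headers["Result"] = "*"

node = game
for san in ["Nf1", "Ng8"]:        # the two waiting moves that give Leela some history
    move = node.board().parse_san(san)
    node = node.add_variation(move)

print(game)                       # PGN ready to drop into an opening file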
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Leela is more Chess-domain specific than regular Chess engines

Post by Laskos »

George Tsavdaris wrote: Sat Aug 11, 2018 11:28 am Leela without history is weaker than with history(even with 1-2 plies history) so better to run your tests with a PGN from the start position ending to the desired position or probably equivalent for Leela, from a PGN with a FEN 2 plies before the desired position and playing the 2 moves to reach the desired position, e.g:
[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "New game"]
[Black "?"]
[Result "*"]
[SetUp "1"]
[FEN "2nknn2/2pppppp/7n/8/8/6N1/PPPPPP2/1NNNK3 w - - 0 1"]
[PlyCount "2"]

1. Nf1 Ng8 *
Yes, but I have the same issue when playing from the standard Chess opening position: I use 3-mover EPD openings there, similar to what I used in these variants. So the shortcoming is fairly similar on both sides when comparing over- and under-performance relative to standard Chess.
oreopoulos
Posts: 110
Joined: Fri Apr 25, 2008 10:56 pm

Re: Leela is more Chess-domain specific than regular Chess engines

Post by oreopoulos »

The RL approach builds a giant evaluation function based on pattern recognition. The positions you present here are obviously irregular patterns and LC0 can make no real use of the pattern recognition.

On the other hand, regular engines "break-down" the evaluation function into logical pieces of information and can easily adapt to more strange positions.

That is expected.

If you train LC0 with more irregular patterns it would excel there too.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Leela is more Chess-domain specific than regular Chess engines

Post by corres »

Laskos wrote: Sat Aug 11, 2018 10:38 am ...
The net is not a book
...
The net is not a book, but the Neural Network is a kind of book.
The difference between a NN and a huge opening (plus middlegame and some endgame) book is that from an opening book only very few moves can be read, without any winning chance attached to them (at least for the engine), while from a NN every possible move can be read, together with its winning chance. Naturally, the reading software and hardware are very different.
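To make the comparison concrete, here is a conceptual sketch (the numbers are made up; only the structure matches what Lc0's network actually outputs): an opening book entry gives a handful of moves with no evaluation, while the net returns a policy prior for every legal move plus a single winning chance for the side to move.

# What can be "read" from each source for a given position (illustrative numbers only)

book_entry = ["e4", "d4", "c4", "Nf3"]    # a book: a few moves, no winning chances attached

net_output = {
    "policy": {"e4": 0.31, "d4": 0.28, "Nf3": 0.14, "c4": 0.11, "g3": 0.04},  # prior for every legal move (truncated here)
    "value": 0.54,                        # expected score for the side to move
}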
Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: Leela is more Chess-domain specific than regular Chess engines

Post by Robert Pope »

oreopoulos wrote: Sat Aug 11, 2018 3:58 pm The RL approach builds a giant evaluation function based on pattern recognition. The positions you present here are obviously irregular patterns and LC0 can make no real use of the pattern recognition.

On the other hand, regular engines "break-down" the evaluation function into logical pieces of information and can easily adapt to more strange positions.

That is expected.

If you train LC0 with more irregular patterns it would excel there too.
Well put. In the example with 4 knights, LC0 has basically no experience in positions with a 4th knight. Unlike Stockfish, it may not be giving any score advantage to having a 4th knight, since it has never been trained on that type of position and may not realize that 4 knights are probably better than 3, while Stockfish happily adds 300cp for every knight you stick on the board.
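The 300cp-per-knight point can be made concrete with a toy sketch (not any real engine's code): a hand-crafted material term generalizes mechanically to odd material setups like four knights, because it is just a sum over pieces.

PIECE_VALUES = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}

def material_eval(fen):
    # Material balance in centipawns from White's point of view
    score = 0
    for ch in fen.split()[0]:
        if ch.upper() in PIECE_VALUES:
            value = PIECE_VALUES[ch.upper()]
            score += value if ch.isupper() else -value
    return score

# The four-knight position from earlier in the thread: balanced material, so 0
print(material_eval("2nknnn1/2pppppp/8/8/8/8/PPPPPP2/1NNNKN2 w - - 0 1"))

Add a fourth white knight anywhere and the score jumps by 300cp with no training data required; a network that has never seen such material has no corresponding guarantee.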
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Leela is more Chess-domain specific than regular Chess engines

Post by Laskos »

Robert Pope wrote: Sat Aug 11, 2018 6:26 pm
oreopoulos wrote: Sat Aug 11, 2018 3:58 pm The RL approach builds a giant evaluation function based on pattern recognition. The positions you present here are obviously irregular patterns and LC0 can make no real use of the pattern recognition.

On the other hand, regular engines "break-down" the evaluation function into logical pieces of information and can easily adapt to more strange positions.

That is expected.

If you train LC0 with more irregular patterns it would excel there too.
Well put. In the example with 4 knights, LC0 has basically no experience in positions with a 4th knight. Unlike Stockfish, it may not be giving any score advantage to having a 4th knight, since it has never been trained on that type of position and may not realize that 4 knights are probably better than 3, while Stockfish happily adds 300cp for every knight you stick on the board.
Well, it seems I was a bit uninformedly optimistic about NNs' flexibility and abstraction power, and I am disappointed by this very specialized solver of very specific, well-defined tasks. For those 4 presented positions, A0 or Lc0 would probably need to be heavily trained on humongous amounts of specialized data for each of them to be competitive with one Chess-specific regular engine. So the "zero" approach, which might seem an ab-initio generalistic problem solver, now looks to me more like hype as a road to a generalized AI. In fact, maybe "Zillions of Games", a general game playing system developed by Mark Lefler and Jeff Mallett in 1998, which uses just some "rules files" (computationally almost zero input), impresses me more as generalized AI than these big-data, big-hardware expert systems with fancy DCNN black boxes. Maybe the machine learning advances came exactly because processing power increased dramatically, since the main ideas (like back-propagation) are from the 1980s. The advances in machine learning are still impressive, as the approach achieves superlative results on many specialized, well-defined tasks.