My take on the whole "End of an era" thing.

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: My take on the whole "End of an era" thing.

Post by chrisw »

corres wrote: Wed May 01, 2019 3:15 pm
hgm wrote: Wed May 01, 2019 2:11 pm More important is that the NN-based engines aren't really very good at Chess in general: they seem to be quite inferior for analyzing arbitrary positions. The one thing they are good at is playing games from the opening position, because that allows them to avoid the large fraction of positions where they would suck. But who wants that?
Obviously the inferiority of NN engines for analysis (that is, in arbitrary positions) depends on the size and structure of the NN. In the case of an NN engine with a bigger and better-structured NN this issue is smaller.
We can hope that the development of NN engines will reduce this issue.
1. Obviously. Haha.
2. Obviously. Haha.
3. Hopium.

Adding more layers to the tower, faster processors, and a bigger and better structure (your words) will get you closer to knowledge in much the same way as adding to the Tower of Babel will get you closer to heaven.
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: My take on the whole "End of an era" thing.

Post by Laskos »

chrisw wrote: Wed May 01, 2019 4:41 pm
Laskos wrote: Wed May 01, 2019 4:12 pm
hgm wrote: Wed May 01, 2019 3:54 pm
chrisw wrote: Wed May 01, 2019 2:31 pm
hgm wrote: Wed May 01, 2019 2:11 pm More important is that the NN-based engines aren't really very good at Chess in general: they seem to be quite inferior for analyzing arbitrary positions. The one thing they are good at is playing games from the opening position, because that allows them to avoid the large fraction of positions where they would suck. But who wants that?
Er, people who play chess from the usual opening position?
The point is that people tend to play their own moves from that position, unless they happen to be the operator of an NN engine in a computer tournament. And then they would like to get expert advice from an engine on the positions that result. LC0 is not very good at that; it will often come up with tactical blunders.
Not only that. On weird, unusual, "constructed" positions, Lc0 can be pretty bad both positionally and tactically, and I am not sure that can be corrected in the zero or quasi-zero framework. Large imbalances and endgames are other cases where Lc0 often sucks, but maybe that can be corrected.
A/B engines with hand-crafted evaluations working on general principles are very good at applying their principles to all kinds of chess positions. But, at the same time, there are huge holes in hand-crafted evaluations, based around a) fortresses and the like and b) marrying the necessarily material-based part of the evaluation with other aspects relating to the king and similar dynamic concepts. The result is sometimes complete blindness and stupidity.

NN engines have other problems, but they are quite good at fortresses and king aspects.

I mean, what do you guys want? Endlessly expressing dissatisfaction.
No, certainly not dissatisfaction. NN engines are extremely strong positionally in openings and middlegames in real games - to such a degree that in several hours I will post something that makes me rethink quiet moves in Chess generally. But we must admit that the strength is not general. For example, many people here (not me in particular) enjoy involved tactical puzzles and lists of them in test suites; they can discuss one such puzzle for a whole thread. This tradition is old: many sorts of "boom-bast" puzzles ("studies") have been regularly published in popular newspapers since the early 20th century. For these involved tactical puzzles, Lc0 is most often pretty useless compared to a top standard engine. Lc0 is also fairly weak in Chess variants (in which I am more interested) that deviate strongly from Chess. And so on. I think for a long time we will need one top NN engine and one top standard engine to benefit Chess-wise to the highest degree from our hardware.
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: My take on the whole "End of an era" thing.

Post by chrisw »

Laskos wrote: Wed May 01, 2019 5:05 pm
chrisw wrote: Wed May 01, 2019 4:41 pm
Laskos wrote: Wed May 01, 2019 4:12 pm
hgm wrote: Wed May 01, 2019 3:54 pm
chrisw wrote: Wed May 01, 2019 2:31 pm
hgm wrote: Wed May 01, 2019 2:11 pm More important is that the NN-based engines aren't really very good at Chess in general: they seem to be quite inferior for analyzing arbitrary positions. The one thing they are good at is playing games from the opening position, because that allows them to avoid the large fraction of positions where they would suck. But who wants that?
Er, people who play chess from the usual opening position?
The point is that people tend to play their own moves from that position, unless they happen to be the operator of an NN engine in a computer tournament. And then they would like to get expert advice from an engine on the positions that result. LC0 is not very good at that; it will often come up with tactical blunders.
Not only that. On weird, unusual, "constructed" positions, Lc0 can be pretty bad both positionally and tactically, and I am not sure that can be corrected in the zero or quasi-zero framework. Large imbalances and endgames are other cases where Lc0 often sucks, but maybe that can be corrected.
A/B engines with hand-crafted evaluations working on general principles are very good at applying their principles to all kinds of chess positions. But, at the same time, there are huge holes in hand-crafted evaluations, based around a) fortresses and the like and b) marrying the necessarily material-based part of the evaluation with other aspects relating to the king and similar dynamic concepts. The result is sometimes complete blindness and stupidity.

NN engines have other problems, but they are quite good at fortresses and king aspects.

I mean, what do you guys want? Endlessly expressing dissatisfaction.
No, certainly not dissatisfaction. NN engines are extremely strong positionally in openings and middlegames in real games - to such a degree that in several hours I will post something that makes me rethink quiet moves in Chess generally. But we must admit that the strength is not general. For example, many people here (not me in particular) enjoy involved tactical puzzles and lists of them in test suites; they can discuss one such puzzle for a whole thread. This tradition is old: many sorts of "boom-bast" puzzles ("studies") have been regularly published in popular newspapers since the early 20th century. For these involved tactical puzzles, Lc0 is most often pretty useless compared to a top standard engine. Lc0 is also fairly weak in Chess variants (in which I am more interested) that deviate strongly from Chess. And so on. I think for a long time we will need one top NN engine and one top standard engine to benefit Chess-wise to the highest degree from our hardware.
I think the error “we” all made, including me, was assuming that NNs learn patterns - that if a network learnt a certain “pattern” in one place, it could recognise the pattern in another place. It doesn't work like that. It does not learn generalisable concepts; in fact we have zero idea what it does learn. Just like we have no idea why similar networks recognise cats. They take an image or a chess board and then .... we have no idea, but we can be pretty sure there are no bits of the network that say “bishop pair” or “whiskers”, or any bits of network that say anything relatable.
All that can be said is that they “work”.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: My take on the whole "End of an era" thing.

Post by jp »

chrisw wrote: Wed May 01, 2019 5:16 pm I think the error “we” all made, including me, was assuming that NNs learn patterns - that if a network learnt a certain “pattern” in one place, it could recognise the pattern in another place. It doesn't work like that. It does not learn generalisable concepts; in fact we have zero idea what it does learn. Just like we have no idea why similar networks recognise cats. They take an image or a chess board and then .... we have no idea, but we can be pretty sure there are no bits of the network that say “bishop pair” or “whiskers”, or any bits of network that say anything relatable.
All that can be said is that they “work”.
I'd put it like this: The error people make is thinking NNs "learn", "understand", are "intelligent", etc. Then people start talking about NNs the same way they do about humans & human intelligence. Those who benefit from such hype just encourage it.
Gary Internet
Posts: 60
Joined: Thu Jan 04, 2018 7:09 pm

Re: My take on the whole "End of an era" thing.

Post by Gary Internet »

Graham Banks wrote: Wed May 01, 2019 12:19 pm
tpoppins wrote: Wed May 01, 2019 12:13 pm Ethereal, Laser and Xiphos dead because they failed to produce an official release in the past month? This is a staggering notion, bordering on ludicrous.
:lol:
That's not what I said. Certainly not what I meant. Let me explain.

Last official releases:

Ethereal 11.25 on 21 January 2019 - See here

Laser 1.7 on 7 February 2019 - See here

Xiphos 0.5 on 11 March 2019 - See here

I get that these are the official releases and that they happen twice a year at most, and sometimes as infrequently as every 2 or 3 years. The next will be Ethereal 11.50, Laser 1.8, and Xiphos 0.6. I understand what an official release is.

What I meant when I said things like:
Gary Internet wrote: Wed May 01, 2019 9:28 am Laser hasn't had a patch since 24 March 2019 and OpenBench has fallen silent showing that there is nothing new in the works for Laser.

Xiphos hasn't had a patch since 30 March 2019, which, if you've followed its development to any extent on GitHub, you'll know is a long time for Xiphos to go without a patch.
was that these sections of GitHub for Laser and Xiphos show that the engines haven't had a single patch added to them in over a month. I wasn't talking about an official release. I was talking about a SINGLE PATCH. Not even a non-Elo-gaining simplification. Nothing.

You might say that it doesn't mean anything. Fair enough.

Look at the front page of OpenBench and you'll see Andrew Grant's words at the top explaining that he's stopped developing Ethereal. It may not be dead, but I feel that it will be dormant for long enough that it might as well be.

The Laser developer, Jeffrey An, who I believe was friends/friendly with Andrew Grant, also used OpenBench to develop Laser.

If he were actively using OpenBench to develop Laser, it would be full of patches, either successful greens or unsuccessful reds and ambers. But it has nothing in it. The last entry from anyone was from Andrew Grant on 30 March 2019. If you look on page 2 you can see that the last failed patch Jeffrey An attempted through OpenBench was on 21 March 2019. There is nothing in there from Jeffrey since.

Either Jeffrey has stopped working on Laser, or OpenBench has shut down because Andrew Grant has ceased development of Ethereal, and Jeffrey is continuing to work on Laser using only one or two computers rather than a massive network of hundreds of cores. That would explain the slowdown in development. I have a feeling, however, that it's not a slowdown. It's a stop. Time will tell.

Likewise, the same thing has happened with Xiphos. There are no new patches on GitHub, and I can't point to OpenBench in this instance because Milos Tatarevic, to my knowledge, has never had access to it to assist him in developing Xiphos.

What I can do, however, is point you to this page, which shows all the patches that Xiphos has on GitHub. As you scroll down from the top, starting on 27 February 2018, Xiphos has had multiple patches land in every single calendar month. It didn't miss a single month for 13 consecutive months, and then suddenly, on 30 March 2019, the patches appear to stop, and there has been nothing since.

Following the official release of 0.5 on 11 March 2019, three patches landed on 16, 20 and 30 March, taking us through 0.5.1 and 0.5.2 and finally to 0.5.3, which is the version you can see competing in TCEC and CCC right now. But that's where things end.

I genuinely hope that Milos is still actively developing Xiphos behind the scenes, because it's a great engine, but something about a one-month gap (and growing) doesn't fill me with hope. Perhaps he's gone on a month-long vacation? It's plausible, but based on the evidence we can see on GitHub, it seems like an uncharacteristically long gap to me.

I'll leave it there for now, but I will return to this once we get to Division 1 of TCEC Season 16, and let you know if 0.5.3 is still playing, or if we're looking at something like 0.5.18. I'm praying for the latter.
Last edited by Gary Internet on Wed May 01, 2019 6:57 pm, edited 1 time in total.
crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: My take on the whole "End of an era" thing.

Post by crem »

I don't think it's right to generalize Lc0 weaknesses to weaknesses of all NN engines.

Yes, Lc0 has weak spots, but I don't think they are inherent to neural networks; they are just something to fix (either in Leela, or possibly other engines will be the first to address them).

For different kinds of endgame positions there are "techniques" that a human can learn. If there is a "technique" (something that can be learnt rather than something you just have to calculate), an NN will be able to learn it.

E.g. possible reasons why Lc0 is weaker in endgames:
- All positions are sampled equally during training, and naturally opening positions occur more frequently (move 1 occurs in every game, but move 30 does not). That is even more severe because of resignation. (A rough illustration follows below this list.)
- We still have temperature in endgames, which can lead Leela to expect an opponent blunder in fortress positions.
- We don't handle transpositions yet, and there are lots of transpositions in the endgame, which means the search redoes the same work multiple times.
- That also has negative consequences for time management: Leela sees two moves (actually a transposition) with almost the same eval, so it decides to think more to find which is best, wasting time.
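The rough illustration mentioned in the first point: this is not Lc0 code and the game lengths are made up, it only shows why, under uniform sampling of all positions from all games, early plies dominate the training data (resignation shortens games and makes this even more lopsided).

import random

random.seed(0)
# hypothetical self-play game lengths, in plies
game_lengths = [random.randint(40, 160) for _ in range(100000)]

for ply in (1, 30, 60, 90, 120, 150):
    games_reaching = sum(1 for n in game_lengths if n >= ply)
    print("ply %3d reached in %5.1f%% of games"
          % (ply, 100.0 * games_reaching / len(game_lengths)))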

So what I want to say is that Lc0's weaker endgame play is more like a bug that has to be fixed in training or search than some inherent limitation of NNs.

Same with analyzing less typical chess positions. If it's trained for that, it will be fine. E.g. train on 50% startpos, 30% 2-move opening book, 15% Chess960 start positions, and 5% completely weird positions (e.g. totally random with equal material) -- I'm sure that will fix it.
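To make that mix concrete, here is a minimal sketch (plain Python, not actual Lc0 training code; the FEN lists passed in are hypothetical placeholders) of picking a start position for each self-play game with those weights.

import random

STANDARD_START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def pick_start_fen(book_fens, chess960_fens, weird_fens):
    """Choose where one self-play game starts, using the 50/30/15/5 mix above."""
    r = random.random()
    if r < 0.50:
        return STANDARD_START                # 50%: normal start position
    elif r < 0.80:
        return random.choice(book_fens)      # 30%: position after a 2-move book line
    elif r < 0.95:
        return random.choice(chess960_fens)  # 15%: Chess960 start position
    else:
        return random.choice(weird_fens)     # 5%: "weird" position with equal material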
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: My take on the whole "End of an era" thing.

Post by jp »

crem wrote: Wed May 01, 2019 6:57 pm I don't think it's right to generalize Lc0 weaknesses to weaknesses of all NN engines.

Yes, Lc0 has weak spots, but I don't think they are inherent to neural networks; they are just something to fix
But for now Lc is all we have. Do you have any reason to believe that A0 does not have the same weaknesses Lc has?
crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: My take on the whole "End of an era" thing.

Post by crem »

jp wrote: Wed May 01, 2019 7:26 pm
crem wrote: Wed May 01, 2019 6:57 pm I don't think it's right to generalize Lc0 weaknesses to weaknesses of all NN engines.

Yes, Lc0 has weak spots, but I don't think they are inherent to neural networks; they are just something to fix
But for now Lc is all we have. Do you have any reason to believe that A0 does not have the same weaknesses Lc has?
Lc0 and A0 are very similar, so "bugs" (e.g. due to not handling transpositions or opening-biased training position picking) may be common to both.
But it's also possible that Lc0 has some bugs that A0 didn't have. E.g. during the WCC in London, Demis Hassabis said to one of the Lc0 devs that A0 didn't have endgame problems.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: My take on the whole "End of an era" thing.

Post by corres »

jp wrote: Wed May 01, 2019 4:36 pm
corres wrote: Wed May 01, 2019 3:15 pm Obviously the inferiority of NN engines for analysis (that is, in arbitrary positions) depends on the size and structure of the NN. In the case of an NN engine with a bigger and better-structured NN this issue is smaller.
We can hope that the development of NN engines will reduce this issue.
Whether there's a better structure for an NN is an interesting question, but just making the NN bigger and bigger isn't a better solution than just making a traditional engine search deeper and deeper.
A bigger NN can store more information. But to gain a benefit we would have to grow the NN exponentially. Naturally, the power of the hardware used would also have to grow exponentially - this is the real issue.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: My take on the whole "End of an era" thing.

Post by corres »

crem wrote: Wed May 01, 2019 6:57 pm I don't think it's right to generalize Lc0 weaknesses to weaknesses of all NN engines.
Yes, Lc0 has weak spots, but I don't think they are inherent to neural networks; they are just something to fix (either in Leela, or possibly other engines will be the first to address them).
...
I think this question will be solved in time. When there are as many NN engines as there are AB engines, we will know the right answer.

Regarding the weaknesses of Leela, I think that if Leela used a kind of Stockfish evaluation for its policy head, its weaknesses might be reduced significantly in every area of the chess game.
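One rough way to read that suggestion (purely a sketch of the idea, not how Lc0 or Stockfish actually work; the priors and evals below are hypothetical inputs): blend the NN's policy priors with a softmax over a hand-crafted evaluation of the position after each candidate move.

import math

def blended_priors(nn_priors, static_evals_pawns, weight=0.5):
    """Mix NN policy priors with priors derived from a hand-crafted eval
    (in pawns) of the position after each candidate move."""
    exps = [math.exp(e) for e in static_evals_pawns]
    total = sum(exps)
    eval_priors = [x / total for x in exps]          # softmax over static evals
    mixed = [(1 - weight) * p + weight * q
             for p, q in zip(nn_priors, eval_priors)]
    norm = sum(mixed)
    return [m / norm for m in mixed]                 # renormalize to a distribution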