Further development of Lc0: Lc1

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: Further development of Lc0: Lc1

Post by noobpwnftw »

I believe there is a misconception that being totally "zero" would introduce no bias at all. For one thing, statistically sound priors may not match the true distribution of outcomes under perfect play, and those errors in turn get magnified in the RL process. In that sense the final strength is limited one way or the other: getting to ground truth and keeping generalization eventually pull in opposite directions.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Further development of Lc0: Lc1

Post by jp »

chrisw wrote: Sun Feb 24, 2019 3:23 pm
jp wrote: Sun Feb 24, 2019 1:53 pm
crem wrote: Sun Feb 24, 2019 12:25 pm I just want to note that contrary to popular belief, being "zero" is not something that Lc0 tries to rigorously follow.
As you said, having "zero" in the name is misleading.

I'd have liked there to be at least one ongoing true zero version, because then we'd see how far zero can go on its own. What do we know now about the strongest possible zero NN machine? Not much.

Zero would mean no TB rescoring, etc. (but chrisw suggested in a post that even that may not be truly zero).
Did I say that? It's more likely I said it could be argued that EGTB is non-zero, and also argued that it isn't. It's very unlikely I made an assertion about something with arguments on either side.
If forced to choose: zero, because it is generated working back from the game result.
I'll try to find your post again. I asked you to explain what you meant.

What you were suggesting was not about EGTB. Anyone who claims EGTB is still zero is twisting whatever meaning "zero" has.
Last edited by jp on Tue Feb 26, 2019 7:33 am, edited 2 times in total.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Further development of Lc0: Lc1

Post by jp »

Here it is, Chris.
chrisw wrote: Wed Dec 19, 2018 11:38 am
jp wrote: Wed Dec 19, 2018 7:11 am
chrisw wrote: Tue Dec 18, 2018 9:10 pm
trulses wrote: Tue Dec 18, 2018 8:58 pm
chrisw wrote: Tue Dec 18, 2018 8:27 pm the legal moves list is an attack map, and, because of the way it is encoded, a weighted attack map, though only for one side.
Unless you're talking about the policy label, you're not discriminating "bad" vs "good" moves by just providing the legal moves, so I'm not sure what you mean by weighted. Shouldn't all legal moves have the same weight in your input encoding? Just so we're clear, I'm not suggesting that anyone actually try this, because it would be expensive in the number of input planes and I doubt it would add much strength.

You're already taking advantage of the legal move information in your search both in which nodes you add to the tree and how you calculate your prior probabilities, so I don't see how it violates any rules.
Weighted = weighted by attacker. Sorry for the ambiguity; it meant the weight of the attacker type on each target square.
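
A minimal sketch of such a map, assuming a hypothetical legal_moves list of (piece_letter, from_square, to_square) tuples; the names are illustrative, not Lc0's actual code:

Code: Select all

# Sketch only: derive a one-sided, attacker-weighted map from legal moves.
def weighted_attack_map(legal_moves):
    """For each target square, collect the piece types that can move
    there; the legal-move list alone reconstructs this map."""
    attacks = {sq: [] for sq in range(64)}
    for piece, frm, to in legal_moves:
        attacks[to].append(piece)  # "weight" = type of the attacker
    return attacks

# e.g. weighted_attack_map([("N", 6, 21), ("P", 12, 28)])[21] == ["N"]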

If the attack map/moves were being explicitly given in order to provide second-order information to the network inputs, over and above the one-hot piece encodings, that, to me anyway, would fall under the non-zero knowledge category. One-hot is simple position data; attacks are second-order for sure, being what movement the one-hot bit can do. Which, I think, is probably why the pure AZ didn't do it, and went for backwards movement knowledge instead (but again via static one-hot position encodings).
The attack maps are not explicitly being given as inputs, but the information has crept in through the back door via the outputs.
Yeah, I was going to ask yesterday for clarification about this...
Can you very explicitly explain the non-zeroness?
Well, from what I intuited, the tabula rasa approach says that you present your knowledge engine with a visual look at the chess board, as if it were a complete beginner: it sees the pieces and the squares they are on. There's no information about how they move, nor how valuable each is, nor that the king is special. You then show this engine chess positions, in random order, show it the game output (win/loss), and train it on that output. Eventually, without any knowledge of even how the pieces move, this engine will evaluate chess positions well. Totally zero.
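
As a rough sketch of that setup (the encoding and names here are illustrative assumptions, not Lc0's actual input format):

Code: Select all

import numpy as np

PIECES = "PNBRQKpnbrqk"  # 6 piece types per side -> 12 one-hot planes

def encode(board):
    """board: dict square(0..63) -> piece letter. The input records only
    where the pieces stand; nothing about moves, values, or the king."""
    planes = np.zeros((12, 8, 8), dtype=np.float32)
    for sq, piece in board.items():
        planes[PIECES.index(piece), sq // 8, sq % 8] = 1.0
    return planes

# Training pairs are then (encode(position), game_result), with
# game_result in {+1, 0, -1}; the net must infer everything else.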

Life is made a little more complex by introducing policy. Here you have the same inputs, but a separate 64x64 output map of all moves, possible or not. At its simplest, you take the move played from the position, light up the corresponding bit in the map, and train the engine on that lit bit (sorry, logit). In practice, you light up all the legal moves in proportion to the prior search's visits, and flag the remainder of the 64x64 map with zero. This gives, of course, a pattern at the outputs, and during back-propagation this pattern is transmogrified and passed back up through the layered weights, affecting them. Essentially, even though you didn't pass any move/attack/mobility information into the engine's inputs, you did pass it in via this pattern at the outputs. Rule N of ML: "watch out that you don't tell it, in the forward data, what you want it to find", and there are many curiously weird and wonderful and unexpected mechanisms for breaching that rule.
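
A minimal sketch of that policy target, assuming a hypothetical visits dict of per-move MCTS visit counts (illustrative, not Lc0's actual code):

Code: Select all

import numpy as np

def policy_target(visits):
    """visits: dict (from_sq, to_sq) -> search visit count over the
    legal moves. Legal moves get their share of the visits; every other
    from/to pair of the 64x64 map stays exactly zero, so the legality
    pattern itself becomes part of the training signal."""
    target = np.zeros(64 * 64, dtype=np.float32)
    total = sum(visits.values())
    for (frm, to), n in visits.items():
        target[frm * 64 + to] = n / total
    return target

# At its simplest (train on the move played) the target is one-hot:
# policy_target({(12, 28): 1}) lights up only the e2-e4 logit.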

trulses argues this is fine because it is permitted under "rules of chess" information only, and the search algorithm that generates training games and the search algorithm that plays end-user games have to know how the pieces move.
Yes, but: the NN isn't supposed to know that, else we could input the moves, attacks, mobility and all the other second-order parameters under the guise of "rules of chess".

Strictly speaking, I would say they, and AZ too, have breached tabula rasa unintentionally and without realising it.
Do you maybe want to say some more about that?
chrisw
Posts: 4315
Joined: Tue Apr 03, 2012 4:28 pm

Re: Further development of Lc0: Lc1

Post by chrisw »

jp wrote: Tue Feb 26, 2019 7:31 am Here it is, Chris.
[…]
Do you maybe want to say some more about that?
I was initially answering (briefly) your comment:

“Zero would mean no TB rescoring, etc. (but chrisw suggested in a post that even that may not be truly zero).”

which appears to relate to something about TB use. I guess you meant EGTB. I only answered because I didn't recollect saying anything assertive about EGTB being non-zero or zero; there are arguments either way.

But what you’ve just replied with isn’t about EGTB use at all.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Further development of Lc0: Lc1

Post by jp »

Yeah, correct. "Even that" meant "no TB rescoring".

I meant using EGTB rescoring is obviously non-zero (according to me). But even if they don't use TB rescoring, if what you said before (unrelated to EGTB use) is right, it may still not strictly be zero. (I maybe don't mind calling that "zero", but I do mind calling TB-rescored nets "zero".)
chrisw
Posts: 4315
Joined: Tue Apr 03, 2012 4:28 pm

Re: Further development of Lc0: Lc1

Post by chrisw »

jp wrote: Tue Feb 26, 2019 12:11 pm Yeah, correct. "Even that" meant "no TB rescoring".

I meant using EGTB rescoring is obviously non-zero (according to me). But even if they don't use TB rescoring, if what you said before (unrelated to EGTB use) is right, it may still not strictly be zero. (I maybe don't mind calling that "zero", but I do mind calling TB-rescored nets "zero".)
I'm not too fussed either way. I do like the general principle of zero and hope they stick to it, for two reasons: a) it's philosophically satisfying, and b) it gives a fixed benchmark or target for other projects. If they switch their process to non-zero by applying pre-computes or hand-crafted input terms or foreign-entity training targets and so on, that would be a shame. But I don't think they are. Personally, I doubt there is much to be gained from the relatively limited number of strong human games anyway. Nothing to stop others forking, or building anew, though.

Leela Chess is really a peer review and proof of the AZ concept. Can they repeat what the AZ team said they did? It's a verification check on AZ; they're getting there, but they're not there yet. Others can use the knowledge/experience to create other engines. They're also a verification check on other chess neural-net developments; like, er, we know how difficult this is, and how much data is needed, bla, bla, so how did you manage to get to stage X all by yourself, hem-hem?
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Further development of Lc0: Lc1

Post by hgm »

I am not sure using EGT is non-zero. The generation of EGT also doesn't involve anything but the game rules, and a retrograde search (applied in bulk). Why would a retrograde search be 'less zero' than a forward MCTS?
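
For comparison, a minimal sketch of such a retrograde solve over an abstract game graph; positions and moves are placeholders here, whereas a real EGT generator enumerates every position for a given material set:

Code: Select all

from collections import defaultdict

def retrograde_solve(successors, terminal_loss):
    """successors: dict position -> positions reachable in one move.
    terminal_loss: positions already lost for the side to move (mates).
    Working backwards, label positions 'win' or 'loss' for the side to
    move; anything left unlabelled is a draw under optimal play."""
    predecessors = defaultdict(list)
    for pos, succs in successors.items():
        for s in succs:
            predecessors[s].append(pos)

    result = {pos: "loss" for pos in terminal_loss}
    unresolved = {pos: len(succs) for pos, succs in successors.items()}
    frontier = list(terminal_loss)

    while frontier:
        pos = frontier.pop()
        for pred in predecessors[pos]:
            if pred in result:
                continue
            if result[pos] == "loss":
                result[pred] = "win"   # one move reaches a lost position
                frontier.append(pred)
            else:
                unresolved[pred] -= 1  # another reply is won for the opponent
                if unresolved[pred] == 0:
                    result[pred] = "loss"  # every move loses
                    frontier.append(pred)
    return result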
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Further development of Lc0: Lc1

Post by jp »

chrisw wrote: Tue Feb 26, 2019 1:04 pm Leela Chess is really a peer review and proof of the AZ concept. Can they repeat what the AZ team said they did? It's a verification check on AZ
This is a very important role Lc has.
Damir
Posts: 2801
Joined: Mon Feb 11, 2008 3:53 pm
Location: Denmark
Full name: Damir Desevac

Re: Further development of Lc0: Lc1

Post by Damir »

Enough with this bullshit about Leela Chess repeating what Alpha Zero did in the paper. Leela Chess was doing very well before it decided to copy the Alpha Zero paper; its progress has been going downhill ever since...
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Further development of Lc0: Lc1

Post by jp »

It always tried to follow what AZ did. It's just that the preprint misled them.

What makes you think it's been going downhill? Stalling maybe, but that happened before too.