My failed attempt to change TCEC NN clone rules

Ovyron · Post by **Ovyron** » Sun Sep 15, 2019 1:25 am

I'm not going to reveal much, but something interesting I've seen in the chess scene is “1 man committees”, so I wouldn't be surprised if Anton was the “rules committee”. Be very suspicious when you can't find the names of the people on it, because nobody ever publishes a page that lists the members of a “committee” and shows only one name.

dkappe · Post by **dkappe** » Sun Sep 15, 2019 3:41 am

chrisw wrote: ↑Sun Sep 15, 2019 12:58 am Basically, right now, if you are piggybacking on LCZero engine, your entity is to all intents and purposes a 100% clone.

Trying to persuade TCEC is another thing in itself. They’ve built on the idea there is some kind of competition going on with different entities in it, especially with these “exciting” “different” NN engines. That it’s a bunch of LCZero clones doesn’t fit too well with the marketing model.

I’m not sure if you realize this, but lczero and lc0 are two distinct engines. The first, lczero, was the initial chess engine used by the leela chess project and was derived from a combination of the leela go engine and stockfish chess board logic. The second, lc0, was a rewrite from scratch.

I’m not sure from your post, but I think you’re referring to lc0 as “lczero?” They are distinct code bases. An engine derived from lczero could not be a clone of lc0 and vice versa.

dkappe · Post by **dkappe** » Sun Sep 15, 2019 3:58 am

chrisw wrote: ↑Sun Sep 15, 2019 12:58 am
Part 2. Are the inputs to your neural network unique? Answer. Only if you are Scorpio.

You forgot Stoofvlees.

As the person who trained Maddex (that’s the net Scorpio is running), I know exactly what the inputs to it were. You can read up on it here: https://github.com/dkappe/leela-chess-w ... iki/Maddex

I’m not sure what makes it more or less unique than any other network.

chrisw · Post by **chrisw** » Sun Sep 15, 2019 6:36 pm

crem wrote: ↑Sat Sep 14, 2019 10:05 am
Modern Times wrote: ↑Sat Sep 14, 2019 9:59 am The whole clone and derivatives issue is a minefield. So many different opinions and viewpoints and "definitions". I would advise Anton to replace any clone "rules" with a simple statement "engines are entered at our sole discretion" then there are no arguments about rules being followed or not. Job done.
I suggested that too as the first suggestion. He responded that rules have to be strict, unambiguous and cover all cases without need of any human interpretation.

As soon as any entity connected to computer chess gives itself a name, which then gets expressed as a mnemonic of three or more letters, you can absolutely forget about being able to negotiate with the person/people who run it using any kind of reasoned logic, or appeals to their better selves, or appeals to humanity or reason. Threats don’t work either. The only way to get change is to wait for them to die. You’ll meet people as stubborn as a mule, not comprehending or even trying to comprehend what you are talking about. It’s a power issue. The people concerned don’t do democracy or community, they do ego. Best way to deal with them is with ontological terrorism. Remove their legitimacy with truth. Kind of what you decided here, actually.

crem · Post by **crem** » Sun Sep 15, 2019 9:20 pm

Now, I’m continuing to retell my March documents to TCEC, regarding the other two parts of the TCEC rule, engine and neural network.
Here the intention was quite good, but the interpretation is quite bent.

Let’s go to our favorite example, Allie+Stein.
They use Lc0’s NN architecture (together with the entire Lc0 NN backends).
NN architecture is something distinctive to an engine (see below); it should be considered either a part of the neural network, or of the engine (or both). In TCEC’s interpretation, it’s neither!

- Neural network
Somehow NN architecture is not part of a Neural Network.
So (I guess Albert Silver started this with "DeusX", and now continues with “Fat Fritz” by the way) people somehow use “neural network weights” and “neural network” interchangeably.
In my opinion, as they share exactly the same architecture in detail, all those DeusX, Einstein, etc, are not different neural networks.

They are the same neural network, but trained differently. It is much as if you retuned all SF constants to make SF play stronger/sharper/more-like-human/whatever: you wouldn’t say that you created a unique eval function, you’d say you tuned the existing one. Different attempts to write an NN-based engine will lead to differences in architecture, both internal and in inputs/outputs. Yes, currently all attempts are similar to AlphaZero, but they still differ in the details of the architecture, and they have completely different implementations.

Yet, for TCEC, filling the same network with different numbers somehow makes it a different neural net.
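
A minimal sketch of the distinction being argued here, with the architecture and the weights kept as separate things. The class names, field names, and numbers are all hypothetical and are not taken from Lc0 or any other engine; `train` is only a stand-in that fills the same shape with different numbers.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Architecture:
    """Hypothetical description of a network's shape, not its trained values."""
    input_planes: int
    residual_blocks: int
    filters: int


@dataclass
class Network:
    architecture: Architecture
    weights: list  # the numbers that fill the architecture


def train(architecture, seed):
    """Stand-in for training: same shape, different numbers per seed."""
    rng = random.Random(seed)
    return Network(architecture, [rng.random() for _ in range(8)])


# Illustrative values only.
shared = Architecture(input_planes=100, residual_blocks=20, filters=256)
net_a = train(shared, seed=1)
net_b = train(shared, seed=2)

same_architecture = net_a.architecture == net_b.architecture
same_weights = net_a.weights == net_b.weights
print(same_architecture, same_weights)
```

Whether `net_a` and `net_b` count as "different neural networks" depends on which of the two comparisons a rule looks at, which is the point in dispute.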

(random thought:
Maybe originally TCEC intended to mean “NN architecture” under their “Neural Network” uniqueness rule, and “training method” instead of “training script”, and later screwed up applying their intention? It would make some sense then. Then Allie+Stein would be “same NN architecture, (arguably) different training method, different engine [which with NN architecture being a separate entry, may be translated to ‘different search’]”, 2 of 3)

(one more note:
also regarding network weights file format: currently for Lc0 it’s not some generic format, we change format every time we change NN architecture. So currently the same file format automatically means it’s the same architecture, that’s why it’s often brought up as an indication of non-uniqueness)
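
A rough sketch of why a shared file format can imply a shared architecture when the format changes with every architecture change. The header layout, version numbers, and architecture labels below are invented for illustration and are not the real Lc0 weights format.

```python
import struct

# Assumption for this sketch: each format version maps to exactly one
# architecture, because the format is bumped whenever the architecture changes.
FORMAT_TO_ARCHITECTURE = {1: "arch-v1", 2: "arch-v2"}


def write_weights(version, values):
    """Serialize a hypothetical weights file: version, count, then floats."""
    header = struct.pack("<II", version, len(values))
    body = struct.pack(f"<{len(values)}f", *values)
    return header + body


def architecture_of(blob):
    """Recover the architecture label from the format version alone."""
    version, _count = struct.unpack_from("<II", blob, 0)
    return FORMAT_TO_ARCHITECTURE[version]


file_a = write_weights(2, [0.1, 0.2, 0.3])
file_b = write_weights(2, [0.9, 0.8, 0.7])

# Different numbers, same format version, hence the same architecture.
print(architecture_of(file_a), architecture_of(file_b))
```

With a generic container format this inference would not hold, since one version could carry many architectures.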

- Engine
So, according to TCEC, NN architecture is not part of NN. Maybe it’s part of an engine then?
Nope, from TCEC point of view, nothing NN-related is part of an engine it seems.

So Allie doesn’t have a single line of code which deals with neural networks. Allie’s author didn’t have to think about how to encode a chess position for the NN, which outputs to get, where to have batch normalization layers, whether to use int8 or fp16 arithmetic, nothing like that.
Why care about all that, when there is this “standard library for NN evals” called "Lc0 backends"?

So, Allie doesn’t have a single line of its own NN-related code.
Is it still a “unique NN-based engine”? According to TCEC, yes!

Some background on why competing vs engines reusing Lc0 NN backends feels unfair, even if it formally would be allowed:

I’d say 90% of all code changes leading to strength growth of Lc0 during the last 12 months were in neural network backend code (either NN architecture changes or performance optimizations). It has been the main vector of improvement for Lc0 for a year, and projects reusing it just get the same improvements for free. Competing with such projects means competing with yourself.

The Lc0 NN backend is not a “generic library to do NN evals”, and is not “a standard NN library which is de-facto a standard for all open source NN engines”. It’s an important distinctive part of an engine.

dkappe · Post by **dkappe** » Sun Sep 15, 2019 9:58 pm

crem wrote: ↑Sun Sep 15, 2019 9:20 pm

They are the same neural network, but trained differently, similarly like if you retune all SF constants to make SF play stronger/sharper/more-like-human/whatever, you won’t say that you created a unique eval function, you’ll say you tuned the existing one. Different attempts to write NN-based engine will lead to differences in architecture, both internal and in inputs/outputs. Yes, currently all attempts are similar to AlphaZero, but still they differ in details in architecture, and they have completely different implementations.

You’re making up your own terminology here.

Let’s stick with accepted terminology:

1. A neural network architecture is something like ResNet50. It’s an abstract concept that can be applied to different domains (different image sizes, sound inputs, etc.). All of the architectures currently in common use in computer chess are derived from Deepmind’s Alpha Zero work.
2. A model is a trained neural network usually implemented in a framework like Pytorch or Tensorflow. In deep learning circles, models that are trained in different ways (data, optimization, etc.) and perform differently are considered to have a different identity.

If you want to insist that #1 be the measure of uniqueness, I guess we’ll all have to clear the field for a0.

If we accept #2, a trained model, as the measure of uniqueness, as is generally accepted, then we are done here.

chrisw · Post by **chrisw** » Sun Sep 15, 2019 10:01 pm

crem wrote: ↑Sun Sep 15, 2019 9:20 pm Now, I’m continuing to retell my March documents to TCEC, regarding the other two parts of the TCEC rule, engine and neural network.
Here the intention was quite good, but the interpretation is quite bent.

Let’s go to our favorite example, Allie+Stein.
They use Lc0’s NN architecture (together with the entire Lc0 NN backends).
NN architecture is something distinctive to an engine (see below), it should be considered either a part of neural network, or an engine (or both). In TCEC interpretation, it’s neither!

- Neural network
Somehow NN architecture is not part of a Neural Network.
So (I guess Albert Silver started this with "DeusX", and now continues with “Fat Fritz” by the way) people somehow use “neural network weights” and “neural network” interchangeably.
In my opinion, as they share exactly the same architecture in detail, all those DeusX, Einstein, etc, are not different neural networks.

They are the same neural network, but trained differently, similarly like if you retune all SF constants to make SF play stronger/sharper/more-like-human/whatever, you won’t say that you created a unique eval function, you’ll say you tuned the existing one. Different attempts to write NN-based engine will lead to differences in architecture, both internal and in inputs/outputs. Yes, currently all attempts are similar to AlphaZero, but still they differ in details in architecture, and they have completely different implementations.

Yet, for TCEC, filling the same network with different numbers somehow makes it a different neural net.

(random thought:
Maybe originally TCEC intended to mean “NN architecture” under their “Neural Network” uniqueness rule, and “training method” instead of “training script”, and later screwed up applying their intention? It would make some sense then. Then Allie+Stein would be “same NN architecture, (arguably) different training method, different engine [which with NN architecture being a separate entry, may be translated to ‘different search’]”, 2 of 3)

(one more note:
also regarding network weights file format: currently for Lc0 it’s not some generic format, we change format every time we change NN architecture. So currently the same file format automatically means it’s the same architecture, that’s why it’s often brought up as an indication of non-uniqueness)

- Engine
So, according to TCEC, NN architecture is not part of NN. Maybe it’s part of an engine then?
Nope, from TCEC point of view, nothing NN-related is part of an engine it seems.

So Allie doesn’t have a single line of code which deals with neural networks. Allie’s author didn’t think how to encode chess position for NN, which outputs to get, where to have batch normalization layers, whether to use int8 or fp16 arithmetics, nothing like that.
Why care about all that, because there is this “standard library for NN evals” called "Lc0 backends".

So, Allie doesn’t have a single line of own NN-related code.
Is it still an “unique NN-based engine”? — According to TCEC, yes!

Some background on why competing vs engines reusing Lc0 NN backends feels unfair, even if it formally would be allowed:

I’d say 90% of all code changes leading to strength growth of Lc0 during last 12 months were in neural network backend code (either NN architecture changes or performance optimizations). It is the main vector of improvement for Lc0 for a year, and projects reusing it just get the same improvements for free. Competing with such projects means competing with yourself.

Lc0 NN backend is not a “generic library to do NN evals”, and is not “a standard NN library which is de-facto a standard for all open source NN engines”, it’s an important distinctive part of an engine.

Exactly. If it wasn’t for the lc0 backend, I’d personally have released, several months ago, my own entirely unique self-written neural network chess engine, with its own weights trained on its own humungous selection of epds, its own architecture, its own inputs translated from the chess position with chess “knowledge elements” factored in to the nn inputs, and its own search with its own search parameters. All entirely written from scratch, researched on other people’s ideas of course, although the only bit I would not try an originality claim on is the actual network architecture, which is essentially a mix of ideas from visual recognition systems and the AZ ideas and recent papers, which I guess we are all using one way or another.

Okay, so what stopped me? Unique engine, working, searching, not actually that difficult. Well, what stopped me is that I could not hope to produce something special and different and unique that could generate a nodes-per-second rate even remotely comparable to lc0 and thus be remotely competitive. The challenge of working with ever-changing hardware, handling the specialities of GPU/CPU, doing it in C (everything is possible in Python of course, up to a point), dealing with parallelisation, all that stuff is not in the realm of possibility for one person who is working on all the other stuff too. It would be forever chasing a moving target with way more development strength, no thanks very much. If I do something, it has to be somehow special.

It’s the almost impossible task of catching up with all the work already done, and still being done, on lc0. It’s the reason the other NNs are essentially just piggybacks on LC0: their developers simply can’t do the mass of work involved. It would be like trying to catch Stockfish with something based on TSCP, all by yourself, in limited time, and with a moving target, and then claiming originality. Can’t be done. The lc0 engine is the deal. Everything else is simplicity in comparison. Piggybackers are clones. Plus they like to spin BS stories. Don’t fall for it.

hgm · Post by **hgm** » Sun Sep 15, 2019 10:05 pm

Why care about TCEC? It is a commercial venture for entertainment, and apparently enough people think it is entertaining to see clones play each other, for lack of original engines. Even if you would convince them that what they want to do violates their own rules, they would just change the rules and do what they think is best to amuse the audience anyway.

sovaz1997 · Post by **sovaz1997** » Sun Sep 15, 2019 10:43 pm

I'm sure that the audience watching the same games of Leela - Allie and Allie - Leela will not be very happy, to say the least. Whether the organizers need that is an open question.

crem · Post by **crem** » Sun Sep 15, 2019 10:52 pm

dkappe wrote: ↑Sun Sep 15, 2019 9:58 pm You’re making up your own terminology here.

Let’s stick with accepted terminology;

1. A neural network architecture is something like ResNet50. It’s an abstract concept that can be applied to different domains (different image sizes, sound inputs, etc.). All of the architectures currently in common use in computer chess are derived from Deepmind’s Alpha Zero work.
2. A model is a trained neural network usually implemented in a framework like Pytorch or Tensorflow. In deep learning circles, models that are trained in different ways (data, optimization, etc.) and perform differently are considered to have a different identity.

If you want to insist that #1 be the measure of uniqueness, I guess we’ll all have to clear the field for a0.

If we accept #2, a trained model, as the measure of uniqueness, as is generally accepted, then we are done here.

Oh yeah, if we stick to the "2 of 3" rule, I agree that Lc0's NN is surely not "100% unique" relative to the A0 net; it is either a clone or close to being a clone (surely in early Leela it was a clone). Or at least it should bring a strong notion of "non-uniqueness" to either the engine or the NN part of the "2 of 3" rule.

Given that most NN-based attempts largely reimplement DeepMind's A0 (although Stoofvlees is probably an exception), where to draw the line between "unique" and "non-unique" is highly opinion-dependent.

For me,
- If an engine implemented its NN from the a0 papers and intends to move it away from the initial architecture as the engine develops, it's "unique enough" (although that's already shaky ground; it's more like an exception, only because everyone implements the same thing from the a0 paper).

- If an engine took code from another engine and intends to follow changes by syncing, it is not "unique".

The line is somewhere in the middle, and it's probably not possible to define it in TCEC rules.
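
For reference, the "2 of 3" reading used in this thread can be sketched as a simple count. This assumes the rule means that at least two of engine, neural network, and training must be judged unique. The `Entry` type and its field names are hypothetical, and the booleans are human judgments supplied as inputs, not something the code can decide.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    name: str
    unique_engine: bool
    unique_network: bool
    unique_training: bool


def unique_parts(entry):
    """Count how many of the three parts were judged unique."""
    return sum([entry.unique_engine, entry.unique_network, entry.unique_training])


def passes_two_of_three(entry):
    return unique_parts(entry) >= 2


# The "random thought" reading from earlier in the thread: same NN
# architecture, (arguably) different training method, different engine.
allie_stein = Entry("Allie+Stein", unique_engine=True,
                    unique_network=False, unique_training=True)

# Hypothetical stricter reading: if reused NN backend code is counted
# against the engine part, the same entry no longer reaches two.
stricter = replace(allie_stein, unique_engine=False)

print(passes_two_of_three(allie_stein), passes_two_of_three(stricter))
```

The count itself is trivial; the outcome changes entirely with how each part is judged, which is why the line is hard to write into rules.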
