Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 3:15 pm
by JJJ
AllieStein is a clone. I won't watch the superfinal. I'm only interested in a final between Stockfish and Lczero.

Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 5:44 pm
by MikeB
JJJ wrote: Sat Sep 14, 2019 3:15 pm AllieStein is a clone. I won't watch the superfinal. I'm only interested in a final between Stockfish and Lczero.
TCEC does get nice hardware; unfortunately, the administration of TCEC does not match the hardware.

Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 5:46 pm
by MikeB
crem wrote: Sat Sep 14, 2019 9:32 am I wanted to bring up this topic several times already, but my drafts were too long to post.
Now, as Allie has good chances to go into finals, I think it’s time to bring this topic back.

Also, I frequently get asked why I don’t bring this topic up if I don’t agree.

So, I tried to bring this topic up with TCEC administration, with no success.

Timeline:

March 10th (TCEC 15 just started, and Allie suddenly appeared there)

I contacted Anton Mihailov, the tournament director, telling him that:

1. Current TCEC clone rules are not applied correctly:
Allie+Stein is a clone according to TCEC rules (Allie is not unique relative to Lc0, and Stein is not unique compared to Leela’s weights; see also viewtopic.php?p=792755).
2. Rules themselves are poor and have to be changed.
3. I proposed various changes and ideas to the rules (under most of them Allie+Stein would be able to participate, and under some “DeusX” could too).
4. TCEC15 is already messed up, but let’s do it right for TCEC16; we have plenty of time (3 months until the new season starts).

Anton responded:
1. This is very important, please keep writing.
2. Please keep it TOP SECRET! No one should know!
(I tried to convince them that such discussions should be public, but only got irrelevant answers, such as that I don’t know how to manage large communities.)

I wrote lots of material, in different forms: one-line summary, one paragraph summary, diagrams, very detailed description etc.

In the end it was clear that no one from TCEC had read even the one-line summary.

March 29th
(not very relevant but for completeness)
We created a Discord server, and Anton invited an undisclosed guest expert to the discussion, with whom we had an interesting but short discussion that was not very relevant to TCEC rules (because TCEC rules are up to the TCEC team, really).

March 31st
I received the last message from Anton stating how important this is, and that I should continue writing my proposals.

April
I was pinging them periodically, with no reaction.

May 1st
I deleted the conversations (a Google Doc and a Discord server). No one seemed to notice. No one contacted me after that.

In the end I wasted ~15 hours of my time drawing all the diagrams and writing explanations at different levels of detail, all to save the TCEC admins’ time.
From chats with them it was clear though that they didn’t even spend 5 minutes on that, they didn’t even read the 1-line summary.

Then TCEC-16 started with no changes at all.

At this point I decided that trying to convey any message to TCEC isn't worth the effort.



Some screenshots from the document:
[four screenshots, images not included]
I'm not surprised, it's nothing more than a basement tournament run on steroids.

Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 6:06 pm
by jorose
I like that you differentiated more than the original rules, I think they could be improved.

What do you mean by "weights format"? If you are referring to the actual format of the external weights file then I don't feel this should be included as I feel it should be encouraged to reuse that. Having a consistent format across NN engines means we can mix and match and compare binaries or networks directly with one another should we so choose.

As Allie's author has said several times himself, a rule clarification is needed with regard to the NN backend as well as Fathom. To my knowledge there has been no official comment on either of them. Tons of engines rely on Fathom, none of which would be deemed unique if Fathom counted towards the uniqueness criteria. Perhaps Fathom should be added to the diagram with a new color for SF...

Does Stein use the same training code as Leela? Does it generate self play games?

A point could be made for differentiating neural nets based on them actually being different, i.e. having something like SIMEX, but specialized for NN policy and eval output. The main issue here is that two devs with completely unique approaches could theoretically converge to the same thing. I think having such a tool would nevertheless be interesting.
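A rough sketch of what such a policy comparison could look like, purely as an illustration. The function names and the choice of total variation distance are assumptions for this example, not how SIMEX or any existing tool works; the policies are assumed to be given as move-to-probability maps collected from each network on the same positions.

```python
from typing import Dict, List

# Hypothetical representation: move in UCI notation -> policy probability.
Policy = Dict[str, float]


def total_variation(p: Policy, q: Policy) -> float:
    """Total variation distance between two policy distributions (0 = identical, 1 = disjoint)."""
    moves = set(p) | set(q)
    return 0.5 * sum(abs(p.get(m, 0.0) - q.get(m, 0.0)) for m in moves)


def network_distance(policies_a: List[Policy], policies_b: List[Policy]) -> float:
    """Average policy distance over a shared list of test positions."""
    if len(policies_a) != len(policies_b) or not policies_a:
        raise ValueError("need the same non-empty set of positions for both networks")
    pairs = zip(policies_a, policies_b)
    return sum(total_variation(p, q) for p, q in pairs) / len(policies_a)
```

A value near 0 would suggest the two nets produce nearly the same policy on the test set; as noted above, that alone would not prove one was derived from the other.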

Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 8:27 pm
by Branko Radovanovic
Quote from the TCEC rules:
Definition: A neural network is a computer system modeled on the human brain and nervous system. For the purpose of TCEC a participant is considered a neural network (NN) engine if it generally requires the use of GPU and consists of at least the following 3 parts:
  1. The code for training the neural network
  2. The neural network (and weights file) itself
  3. The engine that executes this network
It is the parts 2 and 3 that will actually be a playing combination at TCEC. Part 1 is used in preparation.
Uniqueness: For an NN engine to be unique in the TCEC context, at least two of the three defining parts mentioned above have to be unique.
I don't like the way these three criteria are presented as mutually independent, as if it's possible to satisfy both #1 and #3 without satisfying #2. That put aside, two engines having the exact same neural network (weights file), but different training code and game-playing code, would be considered distinct. So, it would have been quite possible to have two engines that play exactly the same, yet both be considered original, while at the same time there could be two engines with wildly different styles which would, by the same rules, be considered clones.
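To make the consequence concrete, here is a minimal sketch of the "two of three" rule as quoted. The `Entry` class and the string labels are hypothetical; real uniqueness of a part is of course not a simple string comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical description of an NN engine by its three parts."""
    training_code: str
    network: str
    engine: str


def unique_parts(candidate: Entry, existing: Entry) -> int:
    """Count how many of the three parts differ from the existing entry."""
    return sum([
        candidate.training_code != existing.training_code,
        candidate.network != existing.network,
        candidate.engine != existing.engine,
    ])


def is_unique(candidate: Entry, existing: Entry) -> bool:
    """The quoted rule: at least two of the three parts must be unique."""
    return unique_parts(candidate, existing) >= 2


# Same weights file, different training and playing code: counts as unique.
same_net = is_unique(Entry("trainA", "net1", "engineA"), Entry("trainB", "net1", "engineB"))
# Different weights only, same training and playing code: counts as a clone.
new_net = is_unique(Entry("trainA", "net2", "engineA"), Entry("trainA", "net1", "engineA"))
```

Here `same_net` comes out `True` and `new_net` comes out `False`, which is exactly the odd outcome described above.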

Indeed, it would have made more sense to decide on a case-by-case basis.

My gripe about TCEC is its voluntaristic attitude towards rules and their application. (The current Stockfish crashing affair is a case in point.) As TCEC grew in importance, I expected a change, but actually it's gotten worse.

Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 9:25 pm
by Rebel
I recently tested Allie vs Lc0 on similarity, same NN's thus testing the code base.

http://rebel13.nl/html/nn-500ms.html

http://rebel13.nl/html/nn-1000ms.html

75% is pretty high.

But then NN is a whole different subject in comparison when it's about similarity.
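For readers unfamiliar with this kind of test, a minimal sketch of a move-agreement percentage, assuming similarity means the share of test positions on which both engines pick the same move. This is an illustration only; the function name is made up and the actual tool behind the linked pages may compute it differently.

```python
from typing import Sequence


def move_agreement(moves_a: Sequence[str], moves_b: Sequence[str]) -> float:
    """Percentage of positions where both engines chose the same move."""
    if len(moves_a) != len(moves_b) or not moves_a:
        raise ValueError("need one move per position from each engine")
    same = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * same / len(moves_a)


# Toy example: the engines agree on 3 of 4 positions.
score = move_agreement(
    ["e2e4", "d2d4", "g1f3", "c2c4"],
    ["e2e4", "d2d4", "g1f3", "e2e4"],
)
```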

Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 11:36 pm
by dkappe
Rebel wrote: Sat Sep 14, 2019 9:25 pm I recently tested Allie vs Lc0 on similarity, same NN's thus testing the code base.

http://rebel13.nl/html/nn-500ms.html

http://rebel13.nl/html/nn-1000ms.html

75% is pretty high.

But then NN is a whole different subject in comparison when it's about similarity.
I’m sorry, you tested these engines with the same network? I would expect something close to 100% similarity. The fact that it’s so low is shocking.

Re: My failed attempt to change TCEC NN clone rules

Posted: Sat Sep 14, 2019 11:42 pm
by dkappe
dkappe wrote: Sat Sep 14, 2019 11:36 pm
Rebel wrote: Sat Sep 14, 2019 9:25 pm I recently tested Allie vs Lc0 on similarity, same NN's thus testing the code base.

http://rebel13.nl/html/nn-500ms.html

http://rebel13.nl/html/nn-1000ms.html

75% is pretty high.

But then NN is a whole different subject in comparison when it's about similarity.
I’m sorry, you tested these engines with the same network? I would expect something close to 100% similarity. The fact that it’s so low is shocking.
Ah, the run times are pretty low. You might find some variation at such low node counts. Really, plugging the same network into a PUCT algorithm is akin to plugging the same eval function into a negamax algorithm: you should get close to 100% similarity (though Allie uses a slightly different backup strategy than lc0, if I recall).
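A minimal sketch of the AlphaZero-style PUCT selection step, to show why the network dominates the choice: the prior and the backed-up values both come from the net. The `c_puct` value, the zero value for unvisited children, and the class layout are assumptions for illustration, not taken from Lc0 or Allie.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Child:
    move: str
    prior: float             # P(s, a) from the network's policy head
    visits: int = 0          # N(s, a)
    value_sum: float = 0.0   # sum of values backed up through this child

    @property
    def q(self) -> float:
        # Mean backed-up value; 0.0 for unvisited children (an assumption).
        return self.value_sum / self.visits if self.visits else 0.0


def puct_select(children: List[Child], c_puct: float = 1.5) -> Child:
    """Pick the child maximising Q + c_puct * P * sqrt(N_parent) / (1 + N_child)."""
    parent_visits = sum(c.visits for c in children)
    sqrt_parent = math.sqrt(max(parent_visits, 1))

    def score(c: Child) -> float:
        return c.q + c_puct * c.prior * sqrt_parent / (1 + c.visits)

    return max(children, key=score)
```

With the same priors and values, two implementations of this step select the same child, so differences would have to come from things like the backup strategy, exploration constants, or threading and timing effects at low node counts.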

Re: My failed attempt to change TCEC NN clone rules

Posted: Sun Sep 15, 2019 12:08 am
by crem
Just to clarify,

I’m not accusing the authors (of Allie+Stein) of submitting an engine which I don’t consider original enough. Authors can make an engine and reuse any portion of code; that’s GPL after all. CCCC admins for example openly say that they allow clones, they pick engines based on entertainment value, and that’s totally fine.

And I’m also not trying to disqualify Alliestein from TCEC16, or affect the current season in any way (I agree I picked a bad timing for this thread). I'm not directly anti-Allie for the next season either. I want better rules, and rules applied better.


I do accuse TCEC admins though, not so much for not following their own rules and having poor rules, but for a lack of ANY interest in improving them.

Today after my post Anton wrote me that my concern is not forgotten and there’s tremendous work going on to fix NN clone rules (for more than 6 months already!).

Even if I believed that (which I don’t), I would still think it was being done very wrongly. I did write a lot of input, I really spent 4 hours almost every evening for more than a week coming up with pros and cons. Don’t I deserve any feedback? Shouldn’t I be included in the discussion? Shouldn’t the discussion be public after all?

No, it’s “be sure that rules committee will review your input thoroughly and will take into account in the new version of rules, and please keep it TOP SECRET”. And new rules/competitors are usually published on the last day and very publicly, when it’s too late to change anything.

I don’t know who the “rules committee” is, but what they came up with last time was just silly.
I suspect last time they also got “input to review” from me, although I’m not sure.
It was during the DeusX incident. I chatted about that with Anton and told him that what ASilver did was take scripts from the Lc0 project and train the net, and that the engine itself is Lc0 too.

What was the resulting rule? You all know:

NN-based engine consists of 3 parts:
1. Neural network.
2. Engine.
3. Training script.
You need 2 of 3 to be unique to be unique.

It’s so detached from reality! What is “training script” doing here? I was really perplexed. Who came up with those “2 of 3”? A training script is something that can be written in 1-2 hours (and I’m surprised no one did that to work around TCEC rules); it’s really a very minor piece of work compared to the others (which are very ambiguous too, I’ll post about them a bit later).


If the “rules committee” keeps working the same way (reviewing inputs and generating rules without public discussion), the result will be just as bad again.

Re: My failed attempt to change TCEC NN clone rules

Posted: Sun Sep 15, 2019 12:58 am
by chrisw
crem wrote: Sun Sep 15, 2019 12:08 am Just to clarify,

I’m not accusing the authors (of Allie+Stein) of submitting an engine which I don’t consider original enough. Authors can make an engine and reuse any portion of code; that’s GPL after all. CCCC admins for example openly say that they allow clones, they pick engines based on entertainment value, and that’s totally fine.

And I’m also not trying to disqualify Alliestein from TCEC16, or affect the current season in any way (I agree I picked a bad timing for this thread). I'm not directly anti-Allie for the next season either. I want better rules, and rules applied better.


I do accuse TCEC admins though, not so much for not following their own rules and having poor rules, but for a lack of ANY interest in improving them.

Today after my post Anton wrote me that my concern is not forgotten and there’s tremendous work going on to fix NN clone rules (for more than 6 months already!).

Even if I believed that (which I don’t), I would still think it was being done very wrongly. I did write a lot of input, I really spent 4 hours almost every evening for more than a week coming up with pros and cons. Don’t I deserve any feedback? Shouldn’t I be included in the discussion? Shouldn’t the discussion be public after all?

No, it’s “be sure that rules committee will review your input thoroughly and will take into account in the new version of rules, and please keep it TOP SECRET”. And new rules/competitors are usually published on the last day and very publicly, when it’s too late to change anything.

I don’t know who the “rules committee” is, but what they came up with last time was just silly.
I suspect last time they also got “input to review” from me, although I’m not sure.
It was during the DeusX incident. I chatted about that with Anton and told him that what ASilver did was take scripts from the Lc0 project and train the net, and that the engine itself is Lc0 too.

What was the resulting rule? You all know:

NN-based engine consists of 3 parts:
1. Neural network.
2. Engine.
3. Training script.
You need 2 of 3 to be unique to be unique.

It’s so detached from reality! What is “training script” doing here? I was really perplexed. Who came up with those “2 of 3”? A training script is something that can be written in 1-2 hours (and I’m surprised no one did that to work around TCEC rules); it’s really a very minor piece of work compared to the others (which are very ambiguous too, I’ll post about them a bit later).


If the “rules committee” keeps working the same way (reviewing inputs and generating rules without public discussion), the result will be just as bad again.
Any entity that reuses the Lczero engine with another set of weights, retrained, trained, RL or SL, whatever, is a straight copy of lczero with zero originality and anyone claiming original engine status doing that is simply delivering BS.

Reuse of LCZero engine forces identical inputs to the net. There is no scope at all to design a network that takes, say attack tables, or material values or any other chess knowledge variant, for the simple reason that LCZero engine is expecting chess inputs to be exactly in the form LCZero defines, use their engine, and you are stuck with their input format. Want to use another input format? You have either to write your own engine, or do some serious hacking into LCZero code.
LCZero engine code IS the difficult bit. Nobody except Daniel Shawul has tried to write their own engine code. Dealing with parallel PUCT search efficiently using GPU and CPU in a fast language like a C variant is an extremely tough programming task, plus you’re in a competitive arms race with some specialists and optimisers who have already worked a lot of magic tricks in LCZero code, with a fast moving hardware environment that’s a kind of undocumented challenge in itself. Using LCZero engine is just piggy backing on a lot of hard creative work which the piggy backer couldn’t hope to do himself. Which is why they don’t. Plus the piggy backers benefit from the ongoing difficult improvement development work being done on LCZero internals. It’s a lazy cheat. By all means make another set of weights (still stuck with same inputs) but please don’t cheat everybody by giving it another name and pretending it is another engine. To do that, requires doing a Daniel Shawul and writing the engine code yourself.

TCEC, or anybody else, needs to ask the question part 2 about the weights.
Part 1. Are your weights unique? Answer. Obviously, everybody’s weights are unique.
Part 2. Are the inputs to your neural network unique? Answer. Only if you are Scorpio.

Basically, right now, if you are piggybacking on LCZero engine, your entity is to all intents and purposes a 100% clone.

Trying to persuade TCEC is another thing in itself. They’ve built on the idea that there is some kind of competition going on with different entities in it, especially with these “exciting” “different” NN engines. That it’s a bunch of LCZero clones doesn’t fit too well with the marketing model.