will Tcec allow Stockfish with a Leela net to play?

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: will Tcec allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

kranium wrote: Thu Jun 17, 2021 9:22 pm ...

The only thing that looks like a 'statement' is
Houdini "DQ'd for covertly containing copied code"

certainly doesn't explain much, but it's pretty clear what happened, and is continuing to happen today:
not-so-subtle innuendo against Komodo, rage against FF, Fire, and others,
enormous outrage, criticism, and pressure on the testers,
etc.

Apparently it's down to Ethereal, SF, and Seer! And Komodo is listed as a (suspicious/maybe).

Mayhem should be freed from this unreasonable oppression!
Test Mayhem now! :D
Hopefully, reconnecting this to the original discussion:

You're misinterpreting/misrepresenting Andrew's list. Ethereal, Stockfish (prior to training on Lc0 data), Seer and Komodo are all engines with NNUE-inspired evaluation functions that don't rely (or, in the case of Komodo, are unlikely to rely) on any code directly copied from another engine for inference, training, data generation (self play, etc.) or any other component of the training pipeline. Whether or not this matters at all is pretty subjective, to be fair.



Here's an overview of how engines relying on NNUE-based evaluation functions currently compare (feel free to correct the descriptions below if there are any inaccuracies):

Stockfish:
- Training: The nnue-pytorch project (https://github.com/glinscott/nnue-pytorch), developed primarily by Sopel, Gary and Vondele, written in Python and using the PyTorch library. Early Stockfish networks (such as those trained by Sergio Vieri) relied upon C++ training code initially written for computer Shogi and adapted to chess by Nodchip.

- Inference: Initially contributed to Stockfish by Nodchip; code of this lineage is used ubiquitously in modern Shogi engines, which largely rely on Stockfish's search code. The inference speed and flexibility of the code have been notably improved by Sopel and others. (A sketch of the incremental accumulator idea common to all of these engines appears at the end of this overview.)

- Architecture: A variant of the original architecture with tweaked input features (HalfKA-V2) and some other changes (most notable is the addition of a skip connection from the input features to the output, enabling piece values to be learned more explicitly; see the sketch after this engine's entry).

- Data Generation: Initially, Stockfish's training data was generated by heavily modified, computer-Shogi-derived code for generating self play games ("gensfen"). The initial labels for the self play positions were supplied by a mixture of Stockfish's classical evaluation function (later the bootstrapped NNUE evaluation function) and the self play game outcome (commonly an interpolation of the form target = λ·eval + (1−λ)·result, with λ a tunable mixing weight). The latest Stockfish networks are trained on data derived from the Lc0 project.
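
To make the "feature transformer" and skip connection ideas concrete, here's a minimal PyTorch sketch of an NNUE-style network in this spirit. To be clear, the dimensions, names and exact feature set below are illustrative assumptions, not the actual nnue-pytorch code:

Code: Select all

import torch
import torch.nn as nn

NUM_FEATURES = 64 * 768  # HalfKA-like: king square x (piece, square); illustrative
FT_OUT = 256             # feature transformer width per perspective; illustrative

class NnueSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.ft = nn.Linear(NUM_FEATURES, FT_OUT)  # the "feature transformer"
        self.l1 = nn.Linear(2 * FT_OUT, 32)
        self.l2 = nn.Linear(32, 32)
        self.out = nn.Linear(32, 1)
        # Skip connection: one scalar per input feature, added straight to the
        # output, so piece/square values can be learned explicitly (PSQT-like).
        self.psqt = nn.Linear(NUM_FEATURES, 1, bias=False)

    def forward(self, white_features, black_features):
        # Inputs: multi-hot {0, 1} vectors of active features per perspective.
        w = torch.clamp(self.ft(white_features), 0.0, 1.0)  # clipped ReLU
        b = torch.clamp(self.ft(black_features), 0.0, 1.0)
        x = torch.cat([w, b], dim=1)
        x = torch.clamp(self.l1(x), 0.0, 1.0)
        x = torch.clamp(self.l2(x), 0.0, 1.0)
        return self.out(x) + self.psqt(white_features) - self.psqt(black_features)

The point of the skip connection is that the PSQT-like term gives the network an explicit linear piece-value path alongside the small nonlinear tail.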



Komodo Dragon:
- Training: Unknown, though possibly originating from some modification of the nnue-pytorch project, from the original NNUE training code ported by Nodchip, or from something entirely original. It has been stated that the architecture differs somewhat, necessitating some modification irrespective of its origin.

- Inference: Original. It has been mentioned (speculated?) that not all of the layers are quantized and that the quantization scheme differs somewhat from Stockfish's and from that of engines relying upon inference code derived from Stockfish.

- Architecture: The first layer is known to be a 2x128 feature transformer (as compared to the 2x256 feature transformer initially used in Stockfish and used almost exclusively in engines relying upon Stockfish-derived inference code). Whether there are other, more interesting modifications is unknown. Input features are presumably HalfKA/HalfKP-esque.

- Data Generation: The Dragon network is presumably trained on positions from Komodo self play games labeled using a mixture of Komodo's unique classical evaluation function and the self play game outcomes. Specifics are obviously unknown here.


Ethereal:
- Training: Ethereal's training code (NNTrainer) is private, written in C and not derived from any existing project. Halogen relies on the same project for its training code.

- Inference: Ethereal's networks use a quantization scheme that differs from Stockfish's, and the later layers of the network are not quantized (see the sketch after this engine's entry). The inference code is publicly available and can be found on GitHub.

- Architecture: Ethereal uses the standard HalfKP-2x256-32-32-1 architecture (HalfKP input features feeding a 2x256 feature transformer) as initially ported to chess by Nodchip. Being original at the code level, it is likely to have some other subtle differences.

- Data Generation: Ethereal self play games with labels originating from the evaluations provided by Ethereal's unique classical evaluation function.
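
Since quantization schemes come up repeatedly here, a minimal sketch of what post-training quantization of one layer can look like. The scale and integer widths below are hypothetical; Ethereal, Stockfish and the rest all differ in exactly these choices and in which layers they quantize:

Code: Select all

import numpy as np

def quantize_layer(weights, biases, scale=64.0):
    """Float weights/biases -> fixed point (int16 weights, int32 biases)."""
    qw = np.clip(np.round(weights * scale), -32768, 32767).astype(np.int16)
    qb = np.round(biases * scale).astype(np.int32)
    return qw, qb

def forward_quantized(qw, qb, x_int, scale=64.0):
    """qw: (out, in), x_int: (in,) integer inputs.
    Accumulate in int32 to avoid overflow, then rescale back down."""
    acc = qw.astype(np.int32) @ x_int.astype(np.int32) + qb
    return acc // int(scale)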

Seer:
- Training: Seer’s training code is written in Python and makes use of the PyTorch library. It predates the nnue-pytorch project. Seer’s training code is thoroughly integrated with the engine and relies on pybind11 to expose engine components to the PyTorch training code. It is publicly available and can be found here: https://github.com/connormcmonigle/seer-training

- Inference: Original. Seer does not use quantization and, instead, relies upon minimal use of SIMD intrinsics for reasonable performance.

- Architecture: Seer uses HalfKA input features with an asymmetric 2x160 feature transformer. The remaining layers are densely connected (each layer's input is concatenated with the output of its learned affine transform, enabling superior gradient flow) and use ReLU (instead of clipped ReLU) activations. Additionally, the network predicts WDL probabilities (3 values), which among current engines only Seer, Winter and Lc0 do; see the sketch after this engine's entry.

- Data generation: Seer uses a retrograde learning process to iteratively back up EGTB WDL values to positions sampled from human games on Lichess.
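
Because the WDL output is unusual among NNUE engines, here's a minimal, self-contained PyTorch sketch of what such a head looks like (illustrative sizes and names, not Seer's actual code):

Code: Select all

import torch
import torch.nn as nn

wdl_head = nn.Linear(32, 3)  # last hidden layer -> (win, draw, loss) logits

def wdl_loss(hidden, outcome):
    """hidden: (batch, 32) activations; outcome: (batch,) class indices 0/1/2."""
    return nn.functional.cross_entropy(wdl_head(hidden), outcome)

def expected_score(hidden):
    """Collapse WDL probabilities into one expected score in [0, 1]."""
    p = torch.softmax(wdl_head(hidden), dim=1)
    return p[:, 0] + 0.5 * p[:, 1]  # P(win) + P(draw) / 2

Training against three-way outcomes gives the net calibrated win/draw/loss probabilities, which can then be collapsed into a single score for search.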

Minic:
- Training: Minic’s training code is written in Python and makes use of the PyTorch library. It is derived from both the nnue-pytorch project and Seer’s training code. The author has made a number of modifications to adapt the training code for Minic.

- Inference: Minic’s inference code is loosely derived from Seer’s inference code with some modifications and improvements. Notably, the author has implemented a minimal quantization scheme to improve performance.

- Architecture: Minic uses HalfKA input features à la Seer, with an asymmetric 2x128 feature transformer. The remaining layers are densely connected (each layer's input is concatenated with the output of its affine transform, enabling superior gradient flow; see the sketch after this engine's entry) and use clipped ReLU activations. Minic's networks predict a single scalar corresponding to the score.

- Data generation: Minic is trained on positions from Minic self play games, with labels originating from Minic's classical evaluation function with some post-processing. Later networks are trained on labels originating from previously trained Minic networks. Minic makes use of adapted "gensfen" code from Stockfish.
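
Here's an illustrative sketch of the "densely connected" layer idea shared by Seer and Minic (not either engine's actual code; the sizes are Minic-like assumptions):

Code: Select all

import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_dim, growth):
        super().__init__()
        self.affine = nn.Linear(in_dim, growth)

    def forward(self, x):
        # Concatenating input and output preserves a direct gradient path.
        # Output width = in_dim + growth; clipped ReLU as in Minic.
        return torch.cat([x, torch.clamp(self.affine(x), 0.0, 1.0)], dim=-1)

# Stacking blocks grows the width; the final score layer sees everything.
tail = nn.Sequential(DenseBlock(256, 32), DenseBlock(288, 32), nn.Linear(320, 1))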


Marvin:
- Training: Marvin’s training code is derived from the nnue-pytorch project with a number of modifications.

- Inference: Marvin’s inference code seems to be somewhat derived from CFish, but is mostly original. Marvin makes use of the same quantization scheme used in Stockfish.

- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.

- Data generation: Marvin is trained on positions from Marvin self play games with evaluations supplied by Marvin’s evaluation function.


Halogen:
- Training: Halogen relies on the NNTrainer project originating from Ethereal.

- Inference: Halogen’s inference code is quite simple due to the tiny network it relies upon. The network is fully quantized and does not make use of any SIMD intrinsics.

- Architecture: Halogen uses KP features (the standard 768 piece-square inputs). The network is fully connected with ReLU activations and has layer sizes KP-512-1. It predicts absolute scores as opposed to relative scores, relying upon a fixed tempo adjustment. (See the sketch after this engine's entry for the input encoding.)

- Data Generation: Ethereal self play games with labels originating from the evaluations provided by Ethereal's unique classical evaluation function.
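
For reference, the 768 KP/piece-square inputs mentioned above have a completely standard encoding. A minimal sketch (illustrative, not Halogen's actual code):

Code: Select all

def feature_index(piece_type, colour, square):
    """piece_type 0..5 (P, N, B, R, Q, K), colour 0..1, square 0..63 -> 0..767."""
    return (colour * 6 + piece_type) * 64 + square

def encode(pieces):
    """pieces: iterable of (piece_type, colour, square) -> 768-dim multi-hot list."""
    x = [0.0] * 768
    for piece_type, colour, square in pieces:
        x[feature_index(piece_type, colour, square)] = 1.0
    return x

# Example: white king on e1 (square 4), black king on e8 (square 60).
sample = encode([(5, 0, 4), (5, 1, 60)])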


Igel:
- Training: nnue-pytorch (see Stockfish)
- Inference: See Stockfish
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: Igel self play games using a modified version of the "gensfen" code adapted for chess by Nodchip. (labels from its classical evaluation function + previously trained Igel networks)


RubiChess:
- Training: C++ training code contributed to Stockfish by Nodchip and ported from computer Shogi (see Stockfish).
- Inference: See Stockfish
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: RubiChess self play games using a modified version of the "gensfen" code adapted for chess by Nodchip. (labels from its classical evaluation function + previously trained RubiChess networks)


Nemorino:
- Training: C++ training code contributed to Stockfish by Nodchip and ported from computer Shogi (see Stockfish).
- Inference: See Stockfish
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: Nemorino self play games using a modified version of the "gensfen" code adapted for chess by Nodchip. (labels from its classical evaluation function + previously trained Nemorino networks)


BBC:
- Training: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
- Inference: Daniel Shawul's probe library which is adapted from Ronald's CFish C port of Stockfish's original inference code contributed by Nodchip.
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).

Mayhem:
- Training: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
- Inference: Daniel Shawul's probe library which is adapted from Ronald's CFish C port of Stockfish's original inference code contributed by Nodchip.
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).

Fire:
- Training: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
- Inference: Daniel Shawul's probe library which is adapted from Ronald's CFish C port of Stockfish's original inference code contributed by Nodchip.
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
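
One piece of context that applies to every "Inference" entry above: what makes all of these networks "efficiently updatable" is that the feature transformer's output is cached as an accumulator and patched incrementally as moves add and remove a handful of input features. A minimal numpy sketch of the idea (illustrative sizes; no engine's actual code):

Code: Select all

import numpy as np

FT_IN, FT_OUT = 768, 256  # illustrative sizes (e.g. KP features -> 256 units)
rng = np.random.default_rng(0)
W = rng.standard_normal((FT_IN, FT_OUT)).astype(np.float32)  # transformer weights
b = np.zeros(FT_OUT, dtype=np.float32)

def refresh(active_features):
    """Full recomputation; only needed occasionally (e.g. king moves in HalfK* schemes)."""
    acc = b.copy()
    for f in active_features:
        acc += W[f]
    return acc

def update(acc, added, removed):
    """Incremental update after a move: cost is O(features changed), not O(board)."""
    for f in added:
        acc += W[f]
    for f in removed:
        acc -= W[f]
    return acc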
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: will Tcec allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

Correction: multiple times I wrote HalfKP-256-32-32-1 when I intended to write HalfKP-2x256-32-32-1.
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: will Tcec allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

Correction: Igel's self play game generation code is not based on the "gensfen" code from Stockfish. Initially, a Python script was used for self play game generation. Later, that code was replaced with a pure C++ version of the self play game generation code, which emits text files in "plain" format. The plain-format text is then converted to Sopel's optimized binpack format for use with the nnue-pytorch training code.

More information about how networks are trained for Igel can be found here https://github.com/vshcherbyna/igel#ige ... tworks-ign

As far as I know, Marvin and Ethereal use custom Python scripts for self play data generation, while Minic, Nemorino and RubiChess all rely on adapted "gensfen" code from Stockfish.
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: will Tcec allow Stockfish with a Leela net to play?

Post by kranium »

Thanks Connor,
I do appreciate all this work...and your attention to detail;
unfortunately, the exact details of the differences between engines are just not important to me personally.

I do have an understanding of NNUE, being one of the first to help Nodchip last summer, when NNUE first appeared.
I went so far in my enthusiasm for NNUE as to publish a GUI that helps automate the steps:
https://github.com/FireFather/nnue-gui
and produced many of the most stable and fastest binaries:
https://github.com/FireFather/sf-nnue

Concerning NNUE:
I really don't think everyone should have to reinvent the wheel...it's counterproductive, not the way to go.
Personally, I'm concerned about one thing...discovering the techniques and methods that make the engine stronger...closer to discovering the 'truth' in chess (I suppose that means moving towards solving it).

I don't believe that an arbitrary measurement of originality (for example, using the document you've provided as some sort of criterion) is important when deciding whether or not to test engines for a rating list, or to include them in a prestigious tournament.

All the engines you list are made by real people...spending days, weeks, months, years on releases.
IMO it's abhorrent for someone to dismiss these efforts for any reason, especially to carry on a year-long campaign whose sole purpose is to garner more attention for one's own efforts. And you know what I'm talking about.

Personally, I devote an enormous amount of time to this, and work at it every free moment I have.
I'm now very close to a strong working MCTS-UCT feature. That's what I want to work on!
I'm not interested in changing/manipulating Nodchip's training source code with a few changes just to be able to list that as something unique or special. Ridiculous.

I could very easily adapt that code (which I know very well), make some important changes, and pronounce that now I'm in the same club as Ethereal and Seer and SF! LOL

Training only from Fire games is trivial...you want that? I can have it for you in just a few days, but I'm not at all interested in the fine points of reinforcement learning.

I don't want to do any of that, nor should I be forced to in order for my engine to be accepted.

I can pretty much guarantee you that all the engines in your document play at a very high level.
I can also guarantee they will play different moves to a reasonable degree. Isn't that enough?

Connor, instead of a long list of challenging technical jargon, testers/tournament directors could simply focus on "Does the engine play unique moves?"...and there's a tool that measures that! That more than suffices, IMO.

Otherwise it's just programmers trying to be innovative while offering a much weaker engine...arbitrarily seeking recognition at the expense of others they can discredit...an originality contest. I don't see the sense of that. I realize that some want originality so the engine plays in a more human manner...but at Elo 3600-3700, any human won't even recognize that.

Open source is meant for sharing knowledge, but it's getting to the point where you can't even utilize 'ideas' from SF without getting criticized.
The chess engine programming environment here should be much more open, permissive, and inclusive...and far less oppressive and restrictive.
That is best for progress, not the strict, repressive environment that currently exists.
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: will Tcec allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

kranium wrote: Fri Jun 18, 2021 1:33 am Thanks Connor
..... snipped ....
I agree that it's all quite subjective, though I find it somewhat amusing that you suggest unique moves as a criterion for originality after decrying Ed's similarity tests as inadequate for establishing originality (when it was pointed out that Fire had high similarity to Houdini). Personally, the details matter to me, and I believe that this document, to the extent that it is accurate, provides some criteria for understanding the extent to which an engine's NNUE-related evaluation is derivative. At the very least, hopefully it is sufficiently accessible to prove useful to testers, enabling them to make informed decisions about which engines they are interested in testing.
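
To be clear, the "unique moves" criterion is easy to operationalize; a tool in the spirit of Ed's computes something like the following (an illustrative sketch, not the actual similarity tester):

Code: Select all

def similarity(moves_a, moves_b):
    """moves_a, moves_b: best moves from two engines on the same position set."""
    assert len(moves_a) == len(moves_b)
    same = sum(1 for a, b in zip(moves_a, moves_b) if a == b)
    return 100.0 * same / len(moves_a)

# e.g. similarity(["e2e4", "g1f3", "d2d4"], ["e2e4", "b1c3", "d2d4"]) ~= 66.7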

Seer's training/inference/architecture/data generation code isn't just adapted code from Stockfish...it's different in far too many ways to list here, though my post gives something of an overview of the broad technical differences and novel ideas Seer incorporates. Seer didn't start its life derived from Stockfish, so every change I've made has gained Elo. Seer's not different just to be different. However, I can never possibly hope to catch Stockfish, and that was never my goal when I started this project. I'm just interested in exploring new ideas and variants on existing ideas (HalfKA, for example, was first implemented in Seer and is now used in Stockfish, though it is quite trivial).

This isn't about reinventing the wheel so much as escaping the Dunning-Kruger effect and proving to yourself that you actually understand all the details by starting from scratch. It's easy to copy thousands of lines of code and lie to yourself that you understand how they work. I'm almost certain there are a lot of technical details of Stockfish's NNUE code (which you are using) that you don't understand. It's difficult to improve something you don't understand. You write that training a network from scratch for Fire is easy, but at the same time you're uninterested in the technical details. If you were to actually try, you might find that training a strong network from original data produced by Fire is not as trivial as you seem to assume. I'd challenge you to try.
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: will Tcec allow Stockfish with a Leela net to play?

Post by kranium »

connor_mcmonigle wrote: Fri Jun 18, 2021 2:15 am I agree that it's all quite subjective, though I find it somewhat amusing that you suggest unique moves as a criterion for originality after decrying Ed's similarity tests as inadequate for establishing originality (when it was pointed out that Fire had high similarity to Houdini).
You know very well that I (like others) was objecting to using it with depth=1, and I listed my reasoning for that.
connor_mcmonigle wrote: Fri Jun 18, 2021 2:15 am You write that training a network from scratch for Fire is easy, but at the same time you're uninterested in the technical details. If you were to actually try, you might find that training a strong network from original data produced by Fire is not as trivial as you seem to assume. I'd challenge you to try.
Connor, creating an NNUE using Nodchip's functions is trivial...the steps are outlined here:
https://github.com/FireFather/sf-nnue/b ... readme.txt

using nnue-gui
https://github.com/FireFather/nnue-gui

makes it even easier...
http://talkchess.com/forum3/viewtopic.p ... 1&p=885127

of course I've done it...you think I wrote nnue-gui without ever creating an NNUE?
don't be ridiculous


It's also very easy using Deeds' tools (from PGN):
"Toolkit to train a net without gensfen nor selfplay"
https://outskirts.altervista.org/forum/ ... =41&t=2009
many non-technical chess engine aficionados are doing it without much issue at that site

but I understand your need to make it seem like rocket science
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: will Tcec allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

kranium wrote: Fri Jun 18, 2021 3:16 am
..... snipped ....
I don't think using the existing tools to create a strong network with Stockfish or Lc0 data is all that difficult. Quite easy, in fact. However, creating a strong network from data solely from a far weaker engine, as Igel's author has done, isn't so trivial. I know the authors of both Rubi and Igel have invested a great deal of effort to produce strong networks solely from their engines' respective evaluation functions. Neither has come close to beating the network Sergio Vieri trained using Stockfish that you've bundled with Fire, even after all the effort they've invested...
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: will Tcec allow Stockfish with a Leela net to play?

Post by noobpwnftw »

Who is this "Sergio Viera"?
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: will Tcec allow Stockfish with a Leela net to play?

Post by Daniel Shawul »

connor_mcmonigle wrote: Fri Jun 18, 2021 12:08 am
..... snipped ....
All this gloating about how Ethereal's NNUE trainer is "original" or whatever while Fire's isn't, etc....for what, exactly?
Reinventing the wheel doesn't make one original or smarter. If one enjoys doing that, that is fine, but bashing others because
they don't do it your way is bullshit.

I want to advertise my efforts that have been ignored, especially when I think I have "theee best and original NNUE trainer in the whole universe" :)

So here you go for Scorpio

- Training: Common TensorFlow/Keras code that I use for training ResNets as well. The NNUE is just one model, and you can train it from any set of scored EPD positions (see the sketch after this list).
https://github.com/dshawul/nn-train
Has any of you tried training a "zero" NNUE net? I have...
Has any of you tried a different input architecture? I have... e.g. I use only 16 king indices.
Has any of you tried using NNUE on the GPU? I have.

- Inference: I have my own inference code, as original as any other mentioned here.
This is not the one many use, which I adapted from CFish, but a different one that can run NNUE nets in my own format.
Anyone is free to use it, including Fire or whatever, with no attachment to the GNU license; this is not something to boast about IMO.

https://github.com/dshawul/nncpu-probe/ ... /nncpu.cpp

- Architecture: I have tried larger nets after the Fat Fritz news, i.e. besides my experiments with the feature transformer, adding PSQT and different factorizers. The factorizers are added not in the manner Stockfish does it, but as additional inputs that are removed later.

- Data generation: I have trained the current net from Scorpio's larger 20b ResNet.
I've also tried many sources, including zero, CCRL, and Stockfish itself.
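
To make the "train it from any set of scored EPD positions" point concrete, here's a minimal Keras sketch of the idea (illustrative only, not the actual nn-train code; the random arrays are hypothetical stand-ins for positions and scores parsed from an EPD file):

Code: Select all

import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(768,))  # standard piece-square encoding
x = tf.keras.layers.Dense(256, activation="relu")(inputs)
x = tf.keras.layers.Dense(32, activation="relu")(x)
outputs = tf.keras.layers.Dense(1)(x)  # scalar score
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")

# Hypothetical stand-ins for encoded positions and their EPD scores.
X = np.random.rand(1024, 768).astype("float32")
y = np.random.rand(1024, 1).astype("float32")
model.fit(X, y, batch_size=256, epochs=1)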

So I claim my "NNUE training d*ck" is longer than yours, unless proven otherwise :):)
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: will Tcec allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

Daniel Shawul wrote: Fri Jun 18, 2021 4:29 am
..... snipped ....
Your NNUE d*ck might indeed be longer, Daniel, lol. I somehow forgot to include Scorpio in my list. Many of your experiments are certainly interesting, and your implementation and ideas are quite original (FWIW, I've experimented with different input features, zero learning, etc. as well :):)). I'll probably create a separate thread with a cleaned-up list soon so it doesn't get buried here. The intent of the list was not a d*ck measuring contest, but, hopefully, a helpful overview.