Will TCEC allow Stockfish with a Leela net to play?

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by kranium »

connor_mcmonigle wrote: Fri Jun 18, 2021 2:15 am I agree that it's all quite subjective, though I find it somewhat amusing that you suggest unique moves as a criterion for originality after decrying Ed's similarity tests as inadequate for establishing originality (when it was pointed out that Fire had high similarity to Houdini).
You know very well I (like others) was objecting to using it with depth=1, and I listed my reasoning for that.
connor_mcmonigle wrote: Fri Jun 18, 2021 2:15 am You write that training a network from scratch for Fire is easy, but at the same time you're uninterested in the technical details. If you were to actually try, you might find that training a strong network from original data produced by Fire is not as trivial as you seem to assume. I'd challenge you to try.
Connor, creating an NNUE using Nodchip's functions is trivial... the steps are outlined here:
https://github.com/FireFather/sf-nnue/b ... readme.txt

Using nnue-gui
https://github.com/FireFather/nnue-gui

makes it even easier...
http://talkchess.com/forum3/viewtopic.p ... 1&p=885127
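For the curious, the whole workflow boils down to two commands fed to the training-enabled engine: "gensfen" to generate scored self-play positions and "learn" to train a net from them. Here is a minimal sketch, assuming a local sf-nnue binary; the exact option names vary by build, so check the readme linked above:

Code: Select all

import subprocess

def run_trainer(commands, binary="./sf-nnue"):
    """Feed UCI-style commands to a training-enabled sf-nnue build."""
    proc = subprocess.Popen([binary], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate("\n".join(commands + ["quit"]) + "\n")
    return out

# 1) Generate scored self-play positions in the packed .bin format.
run_trainer([
    "uci",
    "setoption name Threads value 4",
    "isready",
    "gensfen depth 8 loop 1000000 output_file_name train.bin",
])

# 2) Train a network from the generated data (option names vary by build).
run_trainer([
    "uci",
    "isready",
    "learn targetdir . loop 100 eta 1.0 lambda 0.5",
])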

Of course I've done it... do you think I wrote nnue-gui without ever creating an NNUE?
Don't be ridiculous.


It's also very easy using Deeds' tools (from PGN):
"Toolkit to train a net without gensfen nor selfplay"
https://outskirts.altervista.org/forum/ ... =41&t=2009
Many non-technical chess engine aficionados are doing it without much issue at that site.
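The idea behind those PGN-based tools is simple enough to sketch: walk annotated games and collect (position, eval) pairs as training labels. A rough illustration of the concept using python-chess, not the actual toolkit:

Code: Select all

import chess
import chess.pgn

def scored_positions(pgn_path):
    """Yield (FEN, centipawn score) pairs from [%eval ...] annotations."""
    with open(pgn_path) as pgn:
        while (game := chess.pgn.read_game(pgn)) is not None:
            for node in game.mainline():
                score = node.eval()  # parsed from [%eval ...] comments
                if score is not None:
                    cp = score.pov(chess.WHITE).score(mate_score=32000)
                    yield node.board().fen(), cp

for fen, cp in scored_positions("games.pgn"):
    print(cp, fen)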

But I understand your need to make it seem like rocket science.
connor_mcmonigle
Posts: 543
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

kranium wrote: Fri Jun 18, 2021 3:16 am
connor_mcmonigle wrote: Fri Jun 18, 2021 2:15 am I agree that it's all quite subjective, though I find it somewhat amusing that you suggest unique moves as a criterion for originality after decrying Ed's similarity tests as inadequate for establishing originality (when it was pointed out that Fire had high similarity to Houdini).
You know very well I (like others) was objecting to using it with depth=1, and I listed my reasoning for that.
connor_mcmonigle wrote: Fri Jun 18, 2021 2:15 am You write that training a network from scratch for Fire is easy, but at the same time you're uninterested in the technical details. If you were to actually try, you might find that training a strong network from original data produced by Fire is not as trivial as you seem to assume. I'd challenge you to try.
Connor, creating an NNUE using Nodchip's functions is trivial... the steps are outlined here:
https://github.com/FireFather/sf-nnue/b ... readme.txt

Using nnue-gui
https://github.com/FireFather/nnue-gui

makes it even easier...
http://talkchess.com/forum3/viewtopic.p ... 1&p=885127

Of course I've done it... do you think I wrote nnue-gui without ever creating an NNUE?
Don't be ridiculous.


It's also very easy using Deeds' tools (from PGN):
"Toolkit to train a net without gensfen nor selfplay"
https://outskirts.altervista.org/forum/ ... =41&t=2009

But I understand your need to make it seem like rocket science.
I don't think using the existing tools to create a strong network with Stockfish or Lc0 data is all that difficult. Quite easy, in fact. However, creating a strong network solely from the data of a far weaker engine, as Igel's author has done, isn't so trivial. I know the authors of both RubiChess and Igel have invested a great deal of effort to produce strong networks solely from their engines' respective evaluation functions. Neither has come close to beating the network Sergio Vieri trained using Stockfish that you've bundled with Fire, even after all the effort they've invested...
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by noobpwnftw »

Who is this "Sergio Viera"?
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by Daniel Shawul »

connor_mcmonigle wrote: Fri Jun 18, 2021 12:08 am
kranium wrote: Thu Jun 17, 2021 9:22 pm ...

The only thing that looks like a 'statement' is
Houdini "DQ'd for covertly containing copied code"

That certainly doesn't explain much, but it's pretty clear what happened, and it's continuing to happen today:
not-so-subtle innuendo against Komodo, rage against FF, Fire, and others,
enormous outrage, criticism, and pressure on the testers,
etc.

Apparently it's down to Ethereal, SF, and Seer! And Komodo is listed as a (suspicious/maybe).

Mayhem should be freed from this unreasonable oppression!
Test Mayhem now! :D
Hopefully, reconnecting this to the original discussion:

You're misinterpreting/misrepresenting Andrew's list. Ethereal, Stockfish (prior to training on Lc0 data), Seer and Komodo are all engines with NNUE-inspired evaluation functions that don't rely (or, in Komodo's case, are unlikely to rely) on any code directly copied from another engine for inference, training, data generation (self play, etc.), or any other component of the training pipeline. Whether or not this matters at all is pretty subjective, to be fair.



Here's an overview of how engines relying on NNUE based evaluation functions currently compare (feel free to correct the below descriptions if there are any inaccuracies):

..... snipped ....

Stockfish:
- Training: nnue-pytorch project (https://github.com/glinscott/nnue-pytorch) developed primarily by Sopel, Gary and Vondele, written in Python and using the PyTorch library. Early Stockfish networks (such as those trained by Sergio Vieri) relied upon C++ training code initially written for computer Shogi and adapted to chess by Nodchip.

- Inference: initially contributed to Stockfish by Nodchip and used ubiquitously in modern Shogi engines largely relying on Stockfish's search code. The inference speed and flexibility of the code has been notably improved by Sopel and others.

- Architecture: A variant of the original architecture with tweaked input features (HalfKA-V2) and some other tweaks (notable is the addition of a skip connection from the input features to the output enabling piece values to be more explicitly learned).

- Data Generation: Initially, Stockfish's training data was generated by heavily modified computer Shogi derived code for generating self play games ("gensfen"). The initial labels for the self play positions were supplied by a mixture of Stockfish's classical evaluation function (later the bootstrapped NNUE evaluation function) and the self play game outcome. The latest Stockfish networks are trained on data derived from the Lc0 project.

... snipped ...
All this gloating about how Ethereal's NNUE trainer is "original" or whatever while Fire's isn't, etc. For what, exactly?
Reinventing the wheel doesn't make one original or smarter. If one enjoys doing that, that is fine, but bashing others because
they don't do it your way is bullshit.

I want to advertise my efforts, which have been ignored, especially when I think I have "theee best and most original NNUE trainer in the whole universe" :)

So here you go for Scorpio:

- Training: common TensorFlow/Keras code that I also use for training ResNets. The NNUE is just one model, and you
can train it from any set of scored EPD positions.
https://github.com/dshawul/nn-train
Has any of you tried training a "zero" NNUE net? I have...
Has any of you tried a different input architecture? I have... e.g. I use only 16 king indices.
Has any of you tried using NNUE on the GPU? I have.

- Inference: I have my own inference code, as original as any other mentioned here.
This is not the one many use, which I adapted from CFish, but a different one that can run my own NNUE net format.
Anyone is free to use it, including Fire or whoever, with no attachment to the GNU license; this is not something to boast about IMO.

https://github.com/dshawul/nncpu-probe/ ... /nncpu.cpp

- Architecture: I have tried larger nets after the FatFritz news, besides my experiments with the feature transformer:
adding PSQT and different factorizers. The factorizers are added not in the manner Stockfish does it, but as additional inputs that are removed later.

- Data generation: I have trained the current net from Scorpio's larger 20b ResNet.
I have also tried many sources, including zero, CCRL, and Stockfish itself.

So I claim my "NNUE training d*ck" is longer than yours, unless proven otherwise :):)
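Daniel's training setup above (the NNUE as one Keras model fed scored EPD positions) is easy to picture in miniature. Here is a sketch of the shape of such a model, assuming HalfKP-sized inputs; this is illustrative, not the nn-train code:

Code: Select all

import tensorflow as tf
from tensorflow import keras

N_FEATURES = 41024  # HalfKP inputs per perspective (assumed size)

def build_nnue():
    # Real trainers feed sparse indices; dense inputs keep the sketch short.
    white = keras.Input(shape=(N_FEATURES,), name="white_half")
    black = keras.Input(shape=(N_FEATURES,), name="black_half")
    ft = keras.layers.Dense(256, name="feature_transformer")  # shared weights
    x = keras.layers.Concatenate()([ft(white), ft(black)])
    x = keras.layers.ReLU(max_value=1.0)(x)  # clipped ReLU
    x = keras.layers.Dense(32, activation="relu")(x)
    x = keras.layers.Dense(32, activation="relu")(x)
    out = keras.layers.Dense(1, name="score")(x)
    return keras.Model([white, black], out)

model = build_nnue()
model.compile(optimizer="adam", loss="mse")
# model.fit([white_feats, black_feats], scores)  # features parsed from EPD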
connor_mcmonigle
Posts: 543
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by connor_mcmonigle »

Daniel Shawul wrote: Fri Jun 18, 2021 4:29 am
connor_mcmonigle wrote: Fri Jun 18, 2021 12:08 am
kranium wrote: Thu Jun 17, 2021 9:22 pm ...

The only thing that looks like a 'statement' is
Houdini "DQ'd for covertly containing copied code"

That certainly doesn't explain much, but it's pretty clear what happened, and it's continuing to happen today:
not-so-subtle innuendo against Komodo, rage against FF, Fire, and others,
enormous outrage, criticism, and pressure on the testers,
etc.

Apparently it's down to Ethereal, SF, and Seer! And Komodo is listed as a (suspicious/maybe).

Mayhem should be freed from this unreasonable oppression!
Test Mayhem now! :D
Hopefully, reconnecting this to the original discussion:

You're misinterpreting/misrepresenting Andrew's list. Ethereal, Stockfish (prior to training on Lc0 data), Seer and Komodo are all engines with NNUE-inspired evaluation functions that don't rely (or, in Komodo's case, are unlikely to rely) on any code directly copied from another engine for inference, training, data generation (self play, etc.), or any other component of the training pipeline. Whether or not this matters at all is pretty subjective, to be fair.



Here's an overview of how engines relying on NNUE based evaluation functions currently compare (feel free to correct the below descriptions if there are any inaccuracies):

..... snipped ....

Stockfish:
- Training: nnue-pytorch project (https://github.com/glinscott/nnue-pytorch) developed primarily by Sopel, Gary and Vondele, written in Python and using the PyTorch library. Early Stockfish networks (such as those trained by Sergio Vieri) relied upon C++ training code initially written for computer Shogi and adapted to chess by Nodchip.

- Inference: initially contributed to Stockfish by Nodchip and used ubiquitously in modern Shogi engines largely relying on Stockfish's search code. The inference speed and flexibility of the code has been notably improved by Sopel and others.

- Architecture: A variant of the original architecture with tweaked input features (HalfKA-V2) and some other tweaks (notable is the addition of a skip connection from the input features to the output enabling piece values to be more explicitly learned).

- Data Generation: Initially, Stockfish's training data was generated by heavily modified computer Shogi derived code for generating self play games ("gensfen"). The initial labels for the self play positions were supplied by a mixture of Stockfish's classical evaluation function (later the bootstrapped NNUE evaluation function) and the self play game outcome. The latest Stockfish networks are trained on data derived from the Lc0 project.

... snipped ...
All this gloating about how Ethereal's NNUE trainer is "original" or whatever while Fire's isn't, etc. For what, exactly?
Reinventing the wheel doesn't make one original or smarter. If one enjoys doing that, that is fine, but bashing others because
they don't do it your way is bullshit.

I want to advertise my efforts, which have been ignored, especially when I think I have "theee best and most original NNUE trainer in the whole universe" :)

So here you go for Scorpio:

- Training: common TensorFlow/Keras code that I also use for training ResNets. The NNUE is just one model, and you
can train it from any set of scored EPD positions.
https://github.com/dshawul/nn-train
Has any of you tried training a "zero" NNUE net? I have...
Has any of you tried a different input architecture? I have... e.g. I use only 16 king indices.
Has any of you tried using NNUE on the GPU? I have.

- Inference: I have my own inference code, as original as any other mentioned here.
This is not the one many use, which I adapted from CFish, but a different one that can run my own NNUE net format.
Anyone is free to use it, including Fire or whoever, with no attachment to the GNU license; this is not something to boast about IMO.

https://github.com/dshawul/nncpu-probe/ ... /nncpu.cpp

- Architecture: I have tried larger nets after the FatFritz news, besides my experiments with the feature transformer:
adding PSQT and different factorizers. The factorizers are added not in the manner Stockfish does it, but as additional inputs that are removed later.

- Data generation: I have trained the current net from Scorpio's larger 20b ResNet.
I have also tried many sources, including zero, CCRL, and Stockfish itself.

So I claim my "NNUE training d*ck" is longer than yours, unless proven otherwise :):)
Your NNUE d*ck might indeed be longer, Daniel, lol. I somehow forgot to include Scorpio in my list. Many of your experiments are certainly interesting and your implementation and ideas are quite original (FWIW, I've experimented with different input features, zero learning, etc. as well :):)). I'll probably create a separate thread with a cleaned-up list soon so it doesn't get buried here. The intent of the list was not some d*ck-measuring contest but, hopefully, a helpful overview.
AndrewGrant
Posts: 1781
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by AndrewGrant »

Daniel Shawul wrote: Fri Jun 18, 2021 4:29 am Has any of you tried training a "zero" NNUE net? I have...
Has any of you tried a different input architecture? I have... e.g. I use only 16 king indices.
Has any of you tried using NNUE on the GPU? I have.
Yes, Yes, No.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
gaard
Posts: 450
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by gaard »

Daniel Shawul wrote: Fri Jun 18, 2021 4:29 am Has any of you tried using NNUE on the GPU? I have.
I know this is getting off-topic, but what were your results? Can you link to them?
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by dkappe »

Daniel Shawul wrote: Fri Jun 18, 2021 4:29 am
All this gloating about how Ethereal's NNUE trainer is "original" or whatever while Fire's isn't, etc. For what, exactly?
Reinventing the wheel doesn't make one original or smarter. If one enjoys doing that, that is fine, but bashing others because
they don't do it your way is bullshit.

I want to advertise my efforts, which have been ignored, especially when I think I have "theee best and most original NNUE trainer in the whole universe" :)

So here you go for Scorpio:

- Training: common TensorFlow/Keras code that I also use for training ResNets. The NNUE is just one model, and you
can train it from any set of scored EPD positions.
https://github.com/dshawul/nn-train
Has any of you tried training a "zero" NNUE net? I have...
Has any of you tried a different input architecture? I have... e.g. I use only 16 king indices.
Has any of you tried using NNUE on the GPU? I have.

- Inference: I have my own inference code, as original as any other mentioned here.
This is not the one many use, which I adapted from CFish, but a different one that can run my own NNUE net format.
Anyone is free to use it, including Fire or whoever, with no attachment to the GNU license; this is not something to boast about IMO.

https://github.com/dshawul/nncpu-probe/ ... /nncpu.cpp

- Architecture: I have tried larger nets after the FatFritz news, besides my experiments with the feature transformer:
adding PSQT and different factorizers. The factorizers are added not in the manner Stockfish does it, but as additional inputs that are removed later.

- Data generation: I have trained the current net from Scorpio's larger 20b ResNet.
I have also tried many sources, including zero, CCRL, and Stockfish itself.

So I claim my "NNUE training d*ck" is longer than yours, unless proven otherwise :):)
Hats off. Daniel has done more innovative experiments than the next three engines put together.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by xr_a_y »

connor_mcmonigle wrote: Fri Jun 18, 2021 12:08 am
kranium wrote: Thu Jun 17, 2021 9:22 pm ...

The only thing that looks like a 'statement' is
Houdini "DQ'd for covertly containing copied code"

That certainly doesn't explain much, but it's pretty clear what happened, and it's continuing to happen today:
not-so-subtle innuendo against Komodo, rage against FF, Fire, and others,
enormous outrage, criticism, and pressure on the testers,
etc.

Apparently it's down to Ethereal, SF, and Seer! And Komodo is listed as a (suspicious/maybe).

Mayhem should be freed from this unreasonable oppression!
Test Mayhem now! :D
Hopefully, reconnecting this to the original discussion:

You're misinterpreting/misrepresenting Andrew's list. Ethereal, Stockfish (prior to training on Lc0 data), Seer and Komodo are all engines with NNUE-inspired evaluation functions that don't rely (or, in Komodo's case, are unlikely to rely) on any code directly copied from another engine for inference, training, data generation (self play, etc.), or any other component of the training pipeline. Whether or not this matters at all is pretty subjective, to be fair.



Here's an overview of how engines relying on NNUE based evaluation functions currently compare (feel free to correct the below descriptions if there are any inaccuracies):

Stockfish:
- Training: nnue-pytorch project (https://github.com/glinscott/nnue-pytorch) developed primarily by Sopel, Gary and Vondele, written in Python and using the PyTorch library. Early Stockfish networks (such as those trained by Sergio Vieri) relied upon C++ training code initially written for computer Shogi and adapted to chess by Nodchip.

- Inference: initially contributed to Stockfish by Nodchip and used ubiquitously in modern Shogi engines largely relying on Stockfish's search code. The inference speed and flexibility of the code has been notably improved by Sopel and others.

- Architecture: A variant of the original architecture with tweaked input features (HalfKA-V2) and some other tweaks (notable is the addition of a skip connection from the input features to the output enabling piece values to be more explicitly learned).

- Data Generation: Initially, Stockfish's training data was generated by heavily modified computer Shogi derived code for generating self play games ("gensfen"). The initial labels for the self play positions were supplied by a mixture of Stockfish's classical evaluation function (later the bootstrapped NNUE evaluation function) and the self play game outcome. The latest Stockfish networks are trained on data derived from the Lc0 project.
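To make that "mixture" concrete: NNUE trainers typically interpolate between the search eval (squashed to an expected score) and the game result with a lambda parameter. A minimal sketch of the idea; the scaling constant and lambda value here are illustrative, not Stockfish's exact numbers:

Code: Select all

import math

SCALE = 400.0  # centipawn-to-expected-score scaling (varies by project)

def to_expected_score(cp):
    """Map a centipawn eval to an expected score in [0, 1]."""
    return 1.0 / (1.0 + math.exp(-cp / SCALE))

def training_target(search_cp, game_result, lam=0.75):
    """game_result: 1.0 win, 0.5 draw, 0.0 loss (from the side to move).
    lam=1.0 trains purely on eval, lam=0.0 purely on the outcome."""
    return lam * to_expected_score(search_cp) + (1.0 - lam) * game_result

# Example: position evaluated at +120cp, game eventually drawn.
print(training_target(120, 0.5))  # ~0.56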



Komodo Dragon:
- Training: Unknown, though possibly originating from some modification of the nnue-pytorch project, the original NNUE training code ported by Nodchip, or something entirely original. It has been stated that the architecture differs somewhat, necessitating some modification irrespective of its origin.

- Inference: Original. It has been mentioned (speculated?) that not all the layers are quantized and the quantization scheme differs somewhat as compared to Stockfish and those engines relying upon inference code derived from Stockfish.

- Architecture: The first layer is known to be a 128x2 feature transformer (as compared to the 2x256 feature transformer initially used in Stockfish and pretty much exclusively used in engines relying upon Stockfish-derived inference code). Whether there are other, more interesting modifications is unknown. Input features are presumably either HalfKA- or HalfKP-esque.

- Data Generation: The Dragon network is presumably trained on positions from Komodo self play games labeled using a mixture of Komodo's unique classical evaluation function and the self play game outcomes. Specifics are obviously unknown here.


Ethereal:
- Training: Ethereal's training code (NNTrainer) is private, written in C, and not derived from any existing project. Halogen relies on the same project for its training code.

- Inference: Ethereal's networks use a differing quantization scheme as compared to Stockfish and later layers in the network are not quantized. The inference code is publicly available and can be found on GitHub.

- Architecture: Ethereal uses the standard HalfKP-2x256-32-32-1 architecture (HalfKP input features with a 2x256 feature transformer) as initially ported to chess by Nodchip. Being original at the code level, it is likely to have some other subtle differences.

- Data Generation: Ethereal self play games with labels originating from the evaluations provided by Ethereal's unique classical evaluation function.
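Since HalfKP comes up in most entries here: each (king square, piece, square) triple maps to one of 64 x 641 = 41024 inputs per perspective. A sketch of the usual indexing, loosely following the nnue-pytorch convention; details such as orientation differ between implementations:

Code: Select all

import chess

def orient(color, sq):
    # Rotate the board 180 degrees for black's perspective (HalfKP convention).
    return sq if color == chess.WHITE else sq ^ 63

def halfkp_index(color, king_sq, sq, piece):
    # 10 non-king piece kinds: (piece type, owner relative to perspective).
    p_idx = (piece.piece_type - 1) * 2 + int(piece.color != color)
    return 1 + orient(color, sq) + p_idx * 64 + orient(color, king_sq) * 641

def active_features(board, color):
    king_sq = board.king(color)
    return sorted(
        halfkp_index(color, king_sq, sq, pc)
        for sq, pc in board.piece_map().items()
        if pc.piece_type != chess.KING
    )

print(len(active_features(chess.Board(), chess.WHITE)))  # 30 active features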

Seer:
- Training: Seer’s training code is written in Python and makes use of the PyTorch library. It predates the nnue-pytorch project. Seer’s training code is thoroughly integrated with the engine and relies on pybind11 to expose engine components to the PyTorch training code. It is publicly available and can be found here: https://github.com/connormcmonigle/seer-training

- Inference: Original. Seer does not use quantization and, instead, relies upon minimal use of SIMD intrinsics for reasonable performance.

- Architecture: Seer uses HalfKA input features with an asymmetric 2x160 feature transformer. The remaining layers are densely connected (each input is concatenated with the corresponding, learned, affine transform, enabling superior gradient flow) and use ReLU (instead of clipped ReLU) activations. Additionally, the network predicts WDL probabilities (3 values), an approach shared only with Winter and Lc0.

- Data generation: Seer uses a retrograde learning process to iteratively back up EGTB WDL values to positions sampled from human games on Lichess.
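The "densely connected" wording is easiest to see in code: each block concatenates its input with its affine output before passing it on, and the final layer emits three WDL logits. A rough PyTorch sketch with illustrative sizes, not Seer's actual code:

Code: Select all

import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Concatenate the input with its affine transform (skip by concat)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.affine = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return torch.cat([x, torch.relu(self.affine(x))], dim=-1)

class TinyWDLHead(nn.Module):
    def __init__(self, ft_dim=2 * 160):
        super().__init__()
        self.b1 = DenseBlock(ft_dim, 32)       # -> ft_dim + 32
        self.b2 = DenseBlock(ft_dim + 32, 32)  # -> ft_dim + 64
        self.out = nn.Linear(ft_dim + 64, 3)   # win/draw/loss logits

    def forward(self, x):
        return self.out(self.b2(self.b1(x)))

head = TinyWDLHead()
wdl = torch.softmax(head(torch.randn(1, 320)), dim=-1)
print(wdl)  # three probabilities summing to 1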

Minic:
- Training: Minic’s training code is written in Python and makes use of the PyTorch library. It is derived from both the nnue-pytorch project and Seer’s training code. The author has made a number of modifications to adapt the training code for Minic.

- Inference: Minic’s inference code is loosely derived from Seer’s inference code with some modifications and improvements. Notably, the author has implemented a minimal quantization scheme to improve performance.

- Architecture: Minic uses HalfKA input features a la Seer with an asymmetric 2x128 feature transformer. The remaining layers are densely connected (each input is concatenated with the affine transforms enabling superior gradient flow) and use clipped ReLU activations. Minic's networks predict a single scalar corresponding to the score.

- Data generation: Minic is trained on positions from Minic self play games with labels originating from Minic's classical evaluation function with some post processing. Later networks are trained on labels originating from the previously trained Minic networks. Minic makes use of adapted “gensfen” code from Stockfish.
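Several entries mention quantization schemes, so here is the core idea in miniature: scale float weights and activations into small integers, accumulate in a wider integer type, and undo the scales at the end. A generic sketch, not any particular engine's scheme:

Code: Select all

import numpy as np

def quantize(values, scale, dtype):
    """Round scaled floats into the integer range of dtype."""
    info = np.iinfo(dtype)
    return np.clip(np.round(values * scale), info.min, info.max).astype(dtype)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(32, 32)).astype(np.float32)
x = rng.random(32).astype(np.float32)  # activations in [0, 1) after clipping

w_q = quantize(w, scale=64.0, dtype=np.int8)
x_q = quantize(x, scale=127.0, dtype=np.int8)

# Integer matmul with a wide accumulator, then undo both scales.
y_int = w_q.astype(np.int32) @ x_q.astype(np.int32)
y = y_int / (64.0 * 127.0)

print(np.abs(y - w @ x).max())  # small quantization error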


Marvin:
- Training: Marvin’s training code is derived from the nnue-pytorch project with a number of modifications.

- Inference: Marvin’s inference code seems to be somewhat derived from CFish, but mostly original. Marvin makes use of the same quantization scheme used in Stockfish.

- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.

- Data generation: Marvin is trained on positions from Marvin self play games with evaluations supplied by Marvin’s evaluation function.


Halogen:
- Training: Halogen relies on the NNTrainer project originating from Ethereal.

- Inference: Halogen’s inference code is quite simple due to the tiny network it relies upon. The network is fully quantized and does not make use of any SIMD intrinsics.

- Architecture: Halogen uses KP features (the standard 768 PSQT features). The network is fully connected with ReLU activations and has layer sizes KP-512-1. It predicts absolute scores as opposed to relative scores, relying upon a fixed tempo adjustment.

- Data Generation: Ethereal self play games with labels originating from the evaluations provided by Ethereal's unique classical evaluation function.
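The KP input encoding is small enough to show whole: one input per (colour, piece type, square), i.e. 12 x 64 = 768 planes, feeding a single 512-unit hidden layer. An illustrative sketch, not Halogen's actual code:

Code: Select all

import chess
import numpy as np

def encode_768(board):
    """One-hot encode (colour, piece type, square) into 768 inputs."""
    x = np.zeros(768, dtype=np.float32)
    for sq, pc in board.piece_map().items():
        plane = (pc.piece_type - 1) + (0 if pc.color == chess.WHITE else 6)
        x[plane * 64 + sq] = 1.0
    return x

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(512, 768)) * 0.01, np.zeros(512)
W2, b2 = rng.normal(size=(1, 512)) * 0.01, np.zeros(1)

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)  # ReLU hidden layer
    return (W2 @ h + b2)[0]           # absolute score; tempo handled elsewhere

print(forward(encode_768(chess.Board())))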


Igel:
- Training: nnue-pytorch (see Stockfish)
- Inference: See Stockfish
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: Igel self play games using a modified version of the "gensfen" code adapted for chess by Nodchip. (labels from its classical evaluation function + previously trained Igel networks)


RubiChess:
- Training: C++ training code contributed to Stockfish by Nodchip and ported from computer Shogi (see Stockfish).
- Inference: See Stockfish
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: RubiChess self play games using a modified version of the "gensfen" code adapted for chess by Nodchip. (labels from its classical evaluation function + previously trained RubiChess networks)


Nemorino:
- Training: C++ training code contributed to Stockfish by Nodchip and ported from computer Shogi (see Stockfish).
- Inference: See Stockfish
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: Nemorino self play games using a modified version of the "gensfen" code adapted for chess by Nodchip. (labels from its classical evaluation function + previously trained Nemorino networks)


BBC:
- Training: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
- Inference: Daniel Shawul's probe library which is adapted from Ronald's CFish C port of Stockfish's original inference code contributed by Nodchip.
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).

Mayhem:
- Training: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
- Inference: Daniel Shawul's probe library which is adapted from Ronald's CFish C port of Stockfish's original inference code contributed by Nodchip.
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).

Fire:
- Training: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
- Inference: Daniel Shawul's probe library which is adapted from Ronald's CFish C port of Stockfish's original inference code contributed by Nodchip.
- Architecture: The standard HalfKP-256-32-32-1 originally adapted by Nodchip to chess.
- Data Generation: N/A ~ Using a network trained by SV for Stockfish (see Stockfish).
Very good post, Connor! Thanks a lot. I just have to be more precise about Minic on some points:

Minic:
- Training: Minic’s training code is written in Python and makes use of the PyTorch library. It is derived from both the nnue-pytorch project and Seer’s training code. The author has made a number of modifications to adapt the training code for Minic. The training process also makes use of an adapted "run_game" script by @Vondele, some plotting stuff from the nnue-pytorch project, and some other things written by myself. I also recently adopted Connor's factorizer and dropout implementation.

- Inference: Minic’s inference code is loosely derived from Seer’s inference code with some modifications and improvements. Notably, the author has implemented a minimal quantization scheme to improve performance. As of today the quantization scheme and clipped ReLU are turned off, but retrying them is on my todo list.

- Architecture: Minic uses HalfKA input features a la Seer with an asymmetric 2x128 feature transformer. The remaining layers are densely connected (each input is concatenated with the affine transforms enabling superior gradient flow) and use (clipped) ReLU activations. Minic's networks predict a single scalar corresponding to the score. In the last 9 months, I have tried many architectures (some smaller, some bigger, with or without skip connections) but have not found a better one so far... Still trying...

- Data generation: Official Minic nets are trained on positions from Minic self play games, with labels originating from Minic's classical evaluation function with some post-processing (nets based on SF or "LC0" data are clearly not official ones, not for competitions nor rating lists). Later networks are trained on labels originating from the previously trained Minic networks. Minic does not use Stockfish gensfen code anymore, except for converting between formats (plain, bin, binpack, pgn); I'll make a clean-up commit to make this last point clearer.


Again, thanks for the analysis.
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Will TCEC allow Stockfish with a Leela net to play?

Post by Madeleine Birchfield »

Regarding the original topic, the Stockfish team trained their own net. None of Fire, BBC, and Mayhem trained their own nets, which brings me to my next point: why is jjoshua allowed to insert his Stein net into Allie and submit it to TCEC, while dkappe isn't allowed to insert his Night Nurse net into Igel and submit it to TCEC? Both nets should either be allowed in TCEC or both should be banned. There is enough variety in neural network architectures that restricting guidelines to NNUE only is not enough for TCEC.