Speculations about NNUE development

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

AndrewGrant
Posts: 1753
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: New engine releases 2020

Post by AndrewGrant »

connor_mcmonigle wrote: Thu Nov 12, 2020 5:09 pm Both Halogen and Seer are comparatively all original. Both just happen to rely on the "efficiently updatable" idea. They probably shouldn't be lumped into the same category as Komodo+NNUE.
+1, Agreed.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: New engine releases 2020

Post by Madeleine Birchfield »

AndrewGrant wrote: Thu Nov 12, 2020 10:17 pm
connor_mcmonigle wrote: Thu Nov 12, 2020 5:09 pm Both Halogen and Seer are comparatively all original. Both just happen to rely on the "efficiently updatable" idea. They probably shouldn't be lumped into the same category as Komodo+NNUE.
+1, Agreed.
But maybe we'll be able to add Ethereal to the list soon.
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: New engine releases 2020

Post by dkappe »

connor_mcmonigle wrote: Thu Nov 12, 2020 5:09 pm
Madeleine Birchfield wrote: Thu Nov 12, 2020 10:04 am ...which I interpreted to be referring to the search used in training Dragon (Stockfish's trainer uses Stockfish search), but it could just refer to the fact that Dragon uses the same search as Komodo.
Yes. They claim to be, and very likely are, using Komodo's games as training data, but this doesn't mean they implemented new training code + made improvements/changes to the network architecture. That is exceedingly improbable imho.

Likely, what they did for training is the same as what DKappe has been doing for a while, which involves converting data obtained from the self-play games of a different engine into the packed fen format used by the SF trainer. It seems rather likely they didn't even bother swapping out the SF qsearch code used by the trainer.

To then actually run the networks produced by this process in their engine, they presumably got someone to exactly rewrite just the inference code so they could circumvent the GPL restrictions. If this is the case, I would personally like to see the computer Shogi developers who invested a lot of effort into writing the incredibly optimized and clever training code added to the Dragon authors list. They are responsible for the large majority of the work involved in the increase in strength.

Both Halogen and Seer are comparatively all original. Both just happen to rely on the "efficiently updatable" idea. They probably shouldn't be lumped into the same category as Komodo+NNUE.

(Also see Vajolet's NNUE branch)
Your wild speculations are amusing, stating as fact or high likelihood things you wish to be true. It’s good that things in developer land are generally more friendly. I’ve been encouraging the SF devs to port their trainer to pytorch for a while and been giving them small suggestions in a few areas now that they are on the way. I was afraid they were going to run into a development roadblock without this port, but they are making good progress. I am happy about this.

Just as a note, I’ve been training distilled, endgame and specialist nets in tensorflow and pytorch (and have started to use julia/flux) for several years. These aren’t new concepts to me. It’s my hobby. Don’t assume because you are helpless and out of your depth with regard to training neural nets that others are too.

P.S. On a more useful note, I’ve started using Tord Romstad’s excellent Chess.jl library (https://github.com/romstad/Chess.jl), though it has one major castling bug that I’m working to fix. Pretty speedy for stuff like qsearch. :D
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
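As a rough illustration of the conversion step described in the quote above: the SF trainer consumes a stream of fixed-size binary records, each holding a packed position plus a score, ply and game result. The real format bit-packs the board; the sketch below is a simplified stand-in, and `encode_position` and the exact field layout here are hypothetical, not the trainer's actual format:

```python
import struct

def encode_position(fen):
    """Hypothetical stand-in for the trainer's bit-packed board encoding;
    here the FEN is simply padded into a fixed 80-byte field."""
    raw = fen.encode("ascii")
    assert len(raw) <= 80, "FEN too long for this toy encoding"
    return raw.ljust(80, b"\0")

def write_record(out, fen, score_cp, ply, result):
    """One fixed-size record per position: packed board, the teacher
    engine's eval in centipawns, the game ply, and the game result
    (+1/0/-1 from the side to move's perspective)."""
    out.write(encode_position(fen))
    out.write(struct.pack("<hhb", score_cp, ply, result))

# Usage: walk another engine's self-play games and dump every
# (position, eval, outcome) triple into one training file.
with open("training.bin", "wb") as out:
    start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
    write_record(out, start, 20, 1, 0)
```

The point of the conversion is exactly what the post describes: the data can come from any engine's games, as long as it is reshaped into the record format the trainer already knows how to read.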
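Since qsearch comes up twice in this post (the SF qsearch used by the trainer, and Chess.jl being "pretty speedy for stuff like qsearch"), here is the shape of a minimal quiescence search. It is sketched in Python with the python-chess library rather than Julia/Chess.jl, and the material evaluator is a toy assumption standing in for a real evaluation function:

```python
import chess

# Toy material evaluation in centipawns, from the side to move's view.
PIECE_VALUES = {chess.PAWN: 100, chess.KNIGHT: 320, chess.BISHOP: 330,
                chess.ROOK: 500, chess.QUEEN: 900}

def evaluate(board):
    score = 0
    for piece_type, value in PIECE_VALUES.items():
        score += value * len(board.pieces(piece_type, chess.WHITE))
        score -= value * len(board.pieces(piece_type, chess.BLACK))
    return score if board.turn == chess.WHITE else -score

def qsearch(board, alpha, beta):
    """Minimal quiescence search: keep resolving captures so the leaf
    evaluation is never taken in the middle of an exchange."""
    stand_pat = evaluate(board)         # static eval as a lower bound
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for move in board.legal_moves:
        if not board.is_capture(move):  # only search captures
            continue
        board.push(move)
        score = -qsearch(board, -beta, -alpha)
        board.pop()
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

print(qsearch(chess.Board(), -10_000, 10_000))  # 0 from the start position
```

Trainers lean on qsearch for the same reason engines do: scoring a position in the middle of a capture sequence gives a noisy label, so captures are resolved before the evaluation is recorded.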
AndrewGrant
Posts: 1753
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: New engine releases 2020

Post by AndrewGrant »

dkappe wrote: Sat Nov 21, 2020 7:34 am
connor_mcmonigle wrote: Thu Nov 12, 2020 5:09 pm
Madeleine Birchfield wrote: Thu Nov 12, 2020 10:04 am ...which I interpreted to be referring to the search used in training Dragon (Stockfish's trainer uses Stockfish search), but it could just refer to the fact that Dragon uses the same search as Komodo.
Yes. They claim to be, and very likely are, using Komodo's games as training data, but this doesn't mean they implemented new training code + made improvements/changes to the network architecture. That is exceedingly improbable imho.

Likely, what they did for training is the same as what DKappe has been doing for a while, which involves converting data obtained from the self-play games of a different engine into the packed fen format used by the SF trainer. It seems rather likely they didn't even bother swapping out the SF qsearch code used by the trainer.

To then actually run the networks produced by this process in their engine, they presumably got someone to exactly rewrite just the inference code so they could circumvent the GPL restrictions. If this is the case, I would personally like to see the computer Shogi developers who invested a lot of effort into writing the incredibly optimized and clever training code added to the Dragon authors list. They are responsible for the large majority of the work involved in the increase in strength.

Both Halogen and Seer are comparatively all original. Both just happen to rely on the "efficiently updatable" idea. They probably shouldn't be lumped into the same category as Komodo+NNUE.

(Also see Vajolet's NNUE branch)
Your wild speculations are amusing, stating as fact or high likelihood things you wish to be true. It’s good that things in developer land are generally more friendly. I’ve been encouraging the SF devs to port their trainer to pytorch for a while and been giving them small suggestions in a few areas now that they are on the way. I was afraid they were going to run into a development roadblock without this port, but they are making good progress. I am happy about this.

Just as a note, I’ve been training distilled, endgame and specialist nets in tensorflow and pytorch (and have started to use julia/flux) for several years. These aren’t new concepts to me. It’s my hobby. Don’t assume because you are helpless and out of your depth with regard to training neural nets that others are too.

P.S. On a more useful note, I’ve started using Tord Romstad’s excellent Chess.jl library (https://github.com/romstad/Chess.jl), though it has one major castling bug that I’m working to fix. Pretty speedy for stuff like qsearch. :D
I'll note that you failed to deny the claims.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: New engine releases 2020

Post by dkappe »

AndrewGrant wrote: Sat Nov 21, 2020 7:49 am I'll note that you failed to deny the claims.
You mean the baseless speculations? Note what you like Andrew, but your rage posts are somewhat tiring.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
connor_mcmonigle
Posts: 530
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: New engine releases 2020

Post by connor_mcmonigle »

dkappe wrote: Sat Nov 21, 2020 7:34 am
connor_mcmonigle wrote: Thu Nov 12, 2020 5:09 pm
Madeleine Birchfield wrote: Thu Nov 12, 2020 10:04 am ...which I interpreted to be referring to the search used in training Dragon (Stockfish's trainer uses Stockfish search), but it could just refer to the fact that Dragon uses the same search as Komodo.
Yes. They claim to be, and very likely are, using Komodo's games as training data, but this doesn't mean they implemented new training code + made improvements/changes to the network architecture. That is exceedingly improbable imho.

Likely, what they did for training is the same as what DKappe has been doing for a while, which involves converting data obtained from the self-play games of a different engine into the packed fen format used by the SF trainer. It seems rather likely they didn't even bother swapping out the SF qsearch code used by the trainer.

To then actually run the networks produced by this process in their engine, they presumably got someone to exactly rewrite just the inference code so they could circumvent the GPL restrictions. If this is the case, I would personally like to see the computer Shogi developers who invested a lot of effort into writing the incredibly optimized and clever training code added to the Dragon authors list. They are responsible for the large majority of the work involved in the increase in strength.

Both Halogen and Seer are comparatively all original. Both just happen to rely on the "efficiently updatable" idea. They probably shouldn't be lumped into the same category as Komodo+NNUE.

(Also see Vajolet's NNUE branch)
Your wild speculations are amusing, stating as fact or high likelihood things you wish to be true. It’s good that things in developer land are generally more friendly. I’ve been encouraging the SF devs to port their trainer to pytorch for a while and been giving them small suggestions in a few areas now that they are on the way. I was afraid they were going to run into a development roadblock without this port, but they are making good progress. I am happy about this.

Just as a note, I’ve been training distilled, endgame and specialist nets in tensorflow and pytorch (and have started to use julia/flux) for several years. These aren’t new concepts to me. It’s my hobby. Don’t assume because you are helpless and out of your depth with regard to training neural nets that others are too.

P.S. On a more useful note, I’ve started using Tord Romstad’s excellent Chess.jl library (https://github.com/romstad/Chess.jl), though it has one major castling bug that I’m working to fix. Pretty speedy for stuff like qsearch. :D
Whoa. I believe you totally misinterpreted my words (perhaps a little too much speculation on my part in all fairness) :D I wish precisely the opposite! I'm very hopeful that the networks used in Komodo NNUE were trained using original training code and a unique network architecture. In fact, a separate forum post describing the training process/unique features of the implementations as well as lessons learned from implementing the training code and inference code from the ground up (is this what you're claiming?) would be much appreciated...


That said, what's with the personal attacks? Perhaps you interpreted what I had written as an attack? This was certainly not my intention. I'm not "hopelessly out of my depth" when it comes to training neural networks. In fact, the PyTorch NNUE training code you refer to largely originated from the PyTorch training code I wrote for training networks for my personal project (you'll find it referenced in Gary's repository's readme).
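For context on what a PyTorch port of the trainer buys you: the network itself is small enough to state in a few lines, and once it is expressed this way, batching, GPU support and modern optimizers come for free. A sketch of an NNUE-shaped model, with layer sizes that are illustrative assumptions rather than any engine's actual architecture:

```python
import torch
import torch.nn as nn

class NnueSketch(nn.Module):
    """NNUE-shaped network: one wide input layer (the feature
    transformer) feeding a few tiny dense layers. Sizes are assumptions."""
    def __init__(self, num_features=41024, acc=256):
        super().__init__()
        self.ft = nn.Linear(num_features, acc)  # the "accumulator" layer
        self.l1 = nn.Linear(2 * acc, 32)        # both perspectives concatenated
        self.l2 = nn.Linear(32, 32)
        self.out = nn.Linear(32, 1)

    def forward(self, white_features, black_features):
        # Clipped ReLU is the activation NNUE nets typically use,
        # chosen so the trained weights quantize cleanly to integers.
        w = torch.clamp(self.ft(white_features), 0.0, 1.0)
        b = torch.clamp(self.ft(black_features), 0.0, 1.0)
        x = torch.clamp(self.l1(torch.cat([w, b], dim=1)), 0.0, 1.0)
        x = torch.clamp(self.l2(x), 0.0, 1.0)
        return self.out(x)

model = NnueSketch()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # regress toward the teacher engine's eval
```

A real trainer would feed the feature transformer sparse inputs and quantize the trained weights for the engine's integer inference path, but the model definition itself really is this small.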
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: New engine releases 2020

Post by dkappe »

connor_mcmonigle wrote: Sat Nov 21, 2020 7:55 am
Whoa. I believe you totally misinterpreted my words (perhaps a little too much speculation on my part in all fairness) :D I wish precisely the opposite! I'm very hopeful that the networks used in Komodo NNUE were trained using original training code and a unique network architecture. In fact, a separate forum post describing the training process/unique features of the implementations as well as lessons learned from implementing the training code and inference code from the ground up (is this what you're claiming?) would be much appreciated...


That said, what's with the personal attacks? Perhaps you interpreted what I had written as an attack? This was certainly not my intention. I'm not "hopelessly out of my depth" when it comes to training neural networks. In fact, the PyTorch NNUE training code you refer to largely originated from the PyTorch training code I wrote for training networks for my personal project (you'll find it referenced in Gary's repository's readme).
My apologies. As you no doubt know, there are many very angry people in these forums. :D I read your post and jumped to an unwarranted conclusion.

I’d love to share my experiences with Dragon, but that’s not mine to share. When I turn my focus back to my personal projects (a0lite, Bad Gyal and Night Nurse) and the tools and techniques I’ve developed, I will as usual make them all public.

Thanks for your work with Seer (I assume that’s yours). Without it, work on Stockfish was going to start to get rough.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
AndrewGrant
Posts: 1753
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: New engine releases 2020

Post by AndrewGrant »

dkappe wrote: Sat Nov 21, 2020 7:52 am
AndrewGrant wrote: Sat Nov 21, 2020 7:49 am I'll note that you failed to deny the claims.
You mean the baseless speculations? Note what you like Andrew, but your rage posts are somewhat tiring.
Rage? What? Also, interesting phrase, "baseless speculations". "Baseless accusations" is a thing, but baseless speculations? That is new.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
connor_mcmonigle
Posts: 530
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: New engine releases 2020

Post by connor_mcmonigle »

dkappe wrote: Sat Nov 21, 2020 8:06 am Thanks for your work with seer (I assume that’s yours).
Correct. No worries. I understand that you might not be able to disclose details about the training process. In any case, I've always found your other projects pretty neat and hope to see you make more interesting contributions to computer chess going forwards.

However, hopefully it's reasonable to suggest that if the Shogi developers' training code was directly used to produce a network for Dragon (not that this is wrong by any means), the Shogi developers be added to the author list. Again, completely ignore this if it's not the case :D
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: New engine releases 2020

Post by Guenther »

It seems this thread was hijacked for speculations about NNUE
(especially Komodo ones, which were never a topic in this thread before and shouldn't be,
as I never announce commercial releases).
I suggest splitting that part off from the original thread.

Somehow it started with 'Madeleine' dropping in.

Guenther
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy