Fat Fritz question

Leo
Posts: 1014
Joined: Fri Sep 16, 2016 4:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: Fat Fritz question

Post by Leo » Mon Jul 19, 2021 3:37 pm

Tibono wrote:
Mon Jul 19, 2021 6:18 am
Leo wrote:
Sun Jul 18, 2021 8:36 pm
OK. Thanks a lot. What does IMO stand for?
In my opinion, I guess.
Thanks.
Advanced Micro Devices fan.

Albert Silver
Posts: 2991
Joined: Wed Mar 08, 2006 8:57 pm
Location: Rio de Janeiro, Brazil

Re: Fat Fritz question

Post by Albert Silver » Mon Jul 19, 2021 6:31 pm

dkappe wrote:
Sun Jul 18, 2021 9:02 pm
brianr wrote:
Sun Jul 18, 2021 7:09 pm
I would hardly characterize it as trivial, even for the vastly simpler SF-NNUE type nets (as I mentioned).
Naturally, for both types of nets after you have done a few it does become pretty straightforward.
As Sopel said, if you take the public data, etc., and just run the scripts, you basically get the master net, plus or minus some random variation. Zzzzzz.

If you use the data from other engines and build a somewhat different training framework (I’ve got a few lying about from non-chess PyTorch projects) it’s a bit more challenging, especially if you have an order of magnitude less data than what the SF project brings to bear.
Yes, generating 16+ million Lc0 games for a 256x20 net at 900 nodes per move is a bit more time-consuming and challenging, even with the best GPU on the market when I did this (the 2080 Ti). And that is not to mention training and testing each and every possible NNUE structure, such as:

192x2x16x16
192x2x24x24
192x2x32x32
192x2x48x48
192x2x64x64
192x3x32x32
256x2x16x16
256x2x24x24
256x2x32x32
256x2x48x48
256x2x64x64
256x3x32x32
etc.

Each test took an average of 4-5 days of computer time on a 32-thread machine, and of course there were the many hours spent testing each of them. Needless to say, this was all done using the Nodchip training code, not the newer PyTorch pipeline.
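
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes one training run per distinct structure named above, done one after another, and takes the 4-5 day average at face value. Since the list ends in "etc.", the result is only a lower bound, and the names used here are illustrative, not part of any training tool.

```python
# Distinct NNUE structures named in the post; the list ends with "etc.",
# so the real number of runs was higher (this is an assumed lower bound).
structures = [
    "192x2x16x16", "192x2x24x24", "192x2x32x32",
    "192x2x48x48", "192x2x64x64", "192x3x32x32",
    "256x2x16x16", "256x2x24x24", "256x2x32x32",
    "256x2x48x48", "256x2x64x64", "256x3x32x32",
]

# Average cost per run quoted in the post, in days on a 32-thread machine.
DAYS_PER_RUN_LOW, DAYS_PER_RUN_HIGH = 4, 5


def total_days(num_runs, days_per_run):
    """Total machine-days if the runs are done sequentially (an assumption)."""
    return num_runs * days_per_run


low = total_days(len(structures), DAYS_PER_RUN_LOW)
high = total_days(len(structures), DAYS_PER_RUN_HIGH)
print(f"{len(structures)} structures: {low} to {high} machine-days")
```

Under these assumptions, the twelve named structures alone come to roughly 48 to 60 days of machine time, before any of the testing.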

Some of the results I shared with Dkappe for the benefit of the Komodo project.
I’ve never been interested in training with Stockfish data; there are already hundreds or thousands of such nets.
Yes, this was true even then.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
