Fat Fritz 2

gaard · Post by **gaard** » Sun Feb 14, 2021 2:00 am

dkappe wrote: ↑Sun Feb 14, 2021 1:56 am
gaard wrote: ↑Sun Feb 14, 2021 1:49 am You don't have a copy of the net either?
I don’t do Windows and most of the chessbase software is Windows based. (I may have an ancient copy of chessbase for Windows NT somewhere.) I’ve restricted my comments to the GPL as I can’t speak to how FF2 plays or whether it would be worth X $ to someone. I somehow assumed that you had a copy, given your confident statements about how strong or weak FF2 was.

I am just going on what CB has advertised wrt FF2's supposed strength, compared to my own testing, which is available under CCC: Tournaments and Matches. I guess we will both have to wait for someone else to answer the call.

dkappe · Post by **dkappe** » Sun Feb 14, 2021 2:05 am

gaard wrote: ↑Sun Feb 14, 2021 2:00 am
dkappe wrote: ↑Sun Feb 14, 2021 1:56 am
gaard wrote: ↑Sun Feb 14, 2021 1:49 am You don't have a copy of the net either?
I don’t do Windows and most of the chessbase software is Windows based. (I may have an ancient copy of chessbase for Windows NT somewhere.) I’ve restricted my comments to the GPL as I can’t speak to how FF2 plays or whether it would be worth X $ to someone. I somehow assumed that you had a copy, given your confident statements about how strong or weak FF2 was.
I am just going on what CB has advertised wrt FF2's supposed strength, compared to my own testing, which is available under CCC: Tournaments and Matches. I guess we will both have to wait for someone else to answer the call.

I see some 1 thread sf12 vs sfdev tests there. Are those the ones to which you are referring?

gaard · Post by **gaard** » Sun Feb 14, 2021 2:12 am

dkappe wrote: ↑Sun Feb 14, 2021 2:05 am
gaard wrote: ↑Sun Feb 14, 2021 2:00 am
dkappe wrote: ↑Sun Feb 14, 2021 1:56 am
gaard wrote: ↑Sun Feb 14, 2021 1:49 am You don't have a copy of the net either?
I don’t do Windows and most of the chessbase software is Windows based. (I may have an ancient copy of chessbase for Windows NT somewhere.) I’ve restricted my comments to the GPL as I can’t speak to how FF2 plays or whether it would be worth X $ to someone. I somehow assumed that you had a copy, given your confident statements about how strong or weak FF2 was.
I am just going on what CB has advertised wrt FF2's supposed strength, compared to my own testing, which is available under CCC: Tournaments and Matches. I guess we will both have to wait for someone else to answer the call.
I see some 1 thread sf12 vs sfdev tests there. Are those the ones to which you are referring?

I've run tournaments under conditions very similar to those CB used against SF12: http://talkchess.com/forum3/viewtopic.php?f=6&t=76062

FF2 is approximately 2.6 Elo above SF-dev of Dec. 12, 2020.

These are CB's results for FF2:

Score of Fat Fritz 2 vs Stockfish 12:

    286 wins / 99 losses / 1167 draws

    Elo difference: 42.1 +/- 8.5, LOS: 100.0 %, DrawRatio: 75.2 %
    1552 of 1552 games finished.
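
For anyone who wants to check the arithmetic behind a result line like this, here is a minimal sketch. It assumes the usual logistic Elo model and a normal approximation for the error margin, which is what match tools typically print; the function names are made up for illustration and are not taken from any particular tool.

```python
import math

def elo_diff(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_error(wins, losses, draws, z=1.959964):
    """Return (elo, margin) for a W/L/D record (95% margin by default)."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1 - mu) ** 2 + draws * (0.5 - mu) ** 2 + losses * mu ** 2) / n
    half_width = z * math.sqrt(var / n)
    # Map the score interval to Elo and take half of its width as the margin.
    margin = (elo_diff(mu + half_width) - elo_diff(mu - half_width)) / 2
    return elo_diff(mu), margin

elo, margin = elo_with_error(286, 99, 1167)
print(f"Elo difference: {elo:.1f} +/- {margin:.1f}")
```

With the 286 / 99 / 1167 record above, this should land on the same 42.1 +/- 8.5 that CB reported.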

These are my incomplete results of SF12 vs SF-dev:

Score of Stockfish 210211 vs Stockfish 200902: 90 - 19 - 264 [0.595]
...      Stockfish 210211 playing White: 3 - 18 - 165  [0.460] 186
...      Stockfish 210211 playing Black: 87 - 1 - 99  [0.730] 187
...      White vs Black: 4 - 105 - 264  [0.365] 373
Elo difference: 67.0 +/- 18.5, LOS: 100.0 %, DrawRatio: 70.8 %
373 of 3600 games finished.

So FF2 is about 25 Elo below SF-dev, but more results to come.
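
The subtraction behind that figure can be sketched as follows. This assumes that Elo differences measured against a common opponent (SF12) simply add up, and that the two matches are independent so their error margins combine in quadrature; neither assumption is guaranteed when the matches use different hardware, openings, or time controls. The `combine` helper is hypothetical.

```python
import math

def combine(elo_a, err_a, elo_b, err_b):
    """Gap between B and A when both were measured against the same opponent."""
    gap = elo_b - elo_a
    # Independent error margins add in quadrature.
    err = math.hypot(err_a, err_b)
    return gap, err

# FF2 vs SF12 (CB's numbers) and SF-dev vs SF12 (the incomplete run above).
gap, err = combine(42.1, 8.5, 67.0, 18.5)
print(f"SF-dev minus FF2: {gap:.1f} +/- {err:.1f} Elo")
```

Under those assumptions, the combined margin is roughly as large as the gap itself, so the gap is still very uncertain at this stage of the run.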

Edit: Changed from about to above

gaard · Post by **gaard** » Sun Feb 14, 2021 3:13 am

dkappe wrote: ↑Sun Feb 14, 2021 2:05 am
gaard wrote: ↑Sun Feb 14, 2021 2:00 am
dkappe wrote: ↑Sun Feb 14, 2021 1:56 am
gaard wrote: ↑Sun Feb 14, 2021 1:49 am You don't have a copy of the net either?
I don’t do Windows and most of the chessbase software is Windows based. (I may have an ancient copy of chessbase for Windows NT somewhere.) I’ve restricted my comments to the GPL as I can’t speak to how FF2 plays or whether it would be worth X $ to someone. I somehow assumed that you had a copy, given your confident statements about how strong or weak FF2 was.
I am just going on what CB has advertised wrt FF2's supposed strength, compared to my own testing, which is available under CCC: Tournaments and Matches. I guess we will both have to wait for someone else to answer the call.
I see some 1 thread sf12 vs sfdev tests there. Are those the ones to which you are referring?

Score of Stockfish 210211 vs Stockfish 200902: 103 - 23 - 314 [0.591]
...      Stockfish 210211 playing White: 5 - 22 - 193  [0.461] 220
...      Stockfish 210211 playing Black: 98 - 1 - 121  [0.720] 220
...      White vs Black: 6 - 120 - 314  [0.370] 440
Elo difference: 63.9 +/- 16.9, LOS: 100.0 %, DrawRatio: 71.4 %
440 of 3600 games finished.

Still only 12.2% complete, but SF-dev is still 20+ Elo above FF2 relative to SF12.

Modern Times · Post by **Modern Times** » Sun Feb 14, 2021 3:15 am

As always an engine's performance depends on testing conditions. I mentioned a few pages back that I ran a small test, which had FF2 and SF 2021-01-11 about the same:

Silver Suite Large from 2011, 793 positions, at CCRL blitz time control. Similar to the one used for that chessbase article presumably. The score was as below

Score of Fat Fritz 2 vs Stockfish 2021-01-11 : 106 - 99 - 1381 [0.502]
Elo difference: 1.5 +/- 6.1, LOS: 68.8 %, DrawRatio: 87.1 %

1586 of 1586 games finished.
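
The LOS and DrawRatio figures in that output can be reproduced with a short sketch. It assumes the common likelihood-of-superiority formula, which ignores draws and uses a normal approximation on wins minus losses; the function names are illustrative only.

```python
import math

def los(wins, losses):
    """Likelihood of superiority from decisive games only."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

def draw_ratio(wins, losses, draws):
    """Fraction of all games that ended in a draw."""
    return draws / (wins + losses + draws)

w, l, d = 106, 99, 1381
print(f"LOS: {100 * los(w, l):.1f} %, DrawRatio: {100 * draw_ratio(w, l, d):.1f} %")
```

A LOS this close to 50 % is another way of saying that a 106 - 99 split in decisive games does not separate the two engines.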

So what is the point of it, people might ask? Very good question. Maybe that double-size network is particularly strong in analysis, or maybe not; maybe it is a very strong base to build on, more so than SF's network, or maybe not. I don't have the knowledge to answer that question.

dkappe · Post by **dkappe** » Sun Feb 14, 2021 3:37 am

gaard wrote: ↑Sun Feb 14, 2021 3:13 am
dkappe wrote: ↑Sun Feb 14, 2021 2:05 am
gaard wrote: ↑Sun Feb 14, 2021 2:00 am
dkappe wrote: ↑Sun Feb 14, 2021 1:56 am
gaard wrote: ↑Sun Feb 14, 2021 1:49 am You don't have a copy of the net either?
I don’t do Windows and most of the chessbase software is Windows based. (I may have an ancient copy of chessbase for Windows NT somewhere.) I’ve restricted my comments to the GPL as I can’t speak to how FF2 plays or whether it would be worth X $ to someone. I somehow assumed that you had a copy, given your confident statements about how strong or weak FF2 was.
I am just going on what CB has advertised wrt FF2's supposed strength, compared to my own testing, which is available under CCC: Tournaments and Matches. I guess we will both have to wait for someone else to answer the call.
I see some 1 thread sf12 vs sfdev tests there. Are those the ones to which you are referring?
Score of Stockfish 210211 vs Stockfish 200902: 103 - 23 - 314 [0.591]
...      Stockfish 210211 playing White: 5 - 22 - 193  [0.461] 220
...      Stockfish 210211 playing Black: 98 - 1 - 121  [0.720] 220
...      White vs Black: 6 - 120 - 314  [0.370] 440
Elo difference: 63.9 +/- 16.9, LOS: 100.0 %, DrawRatio: 71.4 %
440 of 3600 games finished.
Still only 12.2% complete, but SF-dev is still 20+ Elo above FF2 relative to SF12.

It’s the triangle inequality at work here. That means that the rating gap is no larger than 20 elo.

Alayan · Post by **Alayan** » Sun Feb 14, 2021 3:42 am

dkappe wrote: ↑Sun Feb 14, 2021 1:38 am
twobeer wrote: ↑Sun Feb 14, 2021 1:13 am
h1a8 wrote: ↑Sun Feb 14, 2021 1:04 am If it's clear then why do others disagree? Fair question.
Who disagrees, apart from those persons with a commercial interest in this scam from CB??

It should at least clearly be called Stockfish and not Fat Fritz on the list.. the engine is Stockfish and NOTHING else.. Just a specific network..
For those that say that FF2 is “just” a net and doesn’t add anything, can you provide some chess examples of where that’s true?

Are we supposed to shell out 100$ to the ChessBase profiteers to do a test that would satisfy you ?

FatFritz2 is only a different net and the binary is pure Stockfish dev. That's a known fact. Does the net provide more general strength? Public tests say no. Does the net provide significantly different move suggestions/ordering, thereby offering chess analysis value despite not being stronger? The onus of proof is on ChessBase and FF2 supporters. They could use Ed Schröder's sim tool on short and long searches and show FF2 not closely clustering with SF-dev... The default assumption is that it closely clusters with SF-dev.

Your own nets are much weaker than Stockfish's default net, so the fact that they do not cluster closely with SF-dev's is not proof that FF2 must behave the same way.

CCRL feeding the ChessBase marketing deceptions by presenting FF2 as different from the Stockfish family (and FF1 as different from the Leela family for that matter, I didn't like that either) is shameful.

Some people might think Stockfish and FF2 are different engine families, but they are not. CCRL should not encourage mistaken beliefs in people using it as reference. But then Madeleine makes a good point that CCRL still doesn't give indication of Fire and Houdini's nature.

dkappe · Post by **dkappe** » Sun Feb 14, 2021 3:51 am

Alayan wrote: ↑Sun Feb 14, 2021 3:42 am Are we supposed to shell out 100$ to the ChessBase profiteers to do a test that would satisfy you ?

FatFritz2 is only a different net and the binary is pure Stockfish dev. That's a known fact. Does the net provide more general strength? Public tests say no. Does the net provide significantly different move suggestions/ordering, thereby offering chess analysis value despite not being stronger? The onus of proof is on ChessBase and FF2 supporters. They could use Ed Schröder's sim tool on short and long searches and show FF2 not closely clustering with SF-dev... The default assumption is that it closely clusters with SF-dev.

CCRL feeding the ChessBase marketing deceptions by presenting FF2 as different from the Stockfish family (and FF1 as different from the Leela family for that matter, I didn't like that either) is shameful.

Some people might think Stockfish and FF2 are different engine families, but they are not. CCRL should not encourage mistaken beliefs in people using it as reference. But then Madeleine makes a good point that CCRL still doesn't give indication of Fire and Houdini's nature.

I’m not asking you to buy anything. Given that I’ve trained Toga, Frosty (ICE), Night Nurse, Dark Horse, Harmon and Dragon nets, all from different data sources, I think I can claim some expertise in this domain. In my experience, the net matters more than the engine.

I simply assumed with all the confident statements about FF2, someone here must have actually looked at the analysis of it and compared it with SFDev.

twobeer · Post by **twobeer** » Sun Feb 14, 2021 4:42 am

dkappe wrote: ↑Sun Feb 14, 2021 2:50 am
Right. I’ve built quite a number of nnue nets from non-stockfish data. They all have different styles of play. I particularly like the Harmon net — trained from FIDE 2300+ human games — and night nurse — trained from data of the mcts/nn bad gyal nets. You can see some of the dramatic differences in play in the similarity test posts here on talkchess and in some of the position suite tests on some of the German forums.

I’ve done some fixed depth testing with Toga III and the same net on Sfdev. Not surprisingly, the net matters more than the engine.

So, to answer your rude question, I am not a moron. I understand very well that it’s a new or different net. Hopefully you understand that the net matters a great deal. Maybe you have some concrete chess observations to add about this new net?

CB is not marketing a net; they are marketing it as a "chess entity".. This is similar to packaging an engine with other default values and tweaks for contempt etc., or tweaks to eval, search etc. for that matter.. Of course play will be different if you tweak the eval with different weights.. It will be even more different if you replace the engine / search functions etc. You simply don't seem to get that, hence my comments about your seeming inability to think this through.

FAT FRITZ 2.0 = Stockfish engine
FAT FRITZ 2.0 = Stockfish engine
FAT FRITZ 2.0 = Stockfish engine
FAT FRITZ 2.0 = Stockfish engine

Get it?

A net is simply a large parameter file for the engine's eval and much less important than the Stockfish engine itself.. (the engine works well without a net; a net does not work at all without an engine).

If this network really is the "major" innovation as you seem to think, and the engine is of lesser importance.. Why not just upgrade the old Fat Fritz 1.0 engine with this network

.. Or bring a truly unique engine to the table as Fat Fritz, instead of just ripping off the best Open Source one... Of course even you realize the real value here lies in the Stockfish NNUE code, not in the tweaks A. Silver (and you?) made in a few weeks of training a new net ....

I challenge you to write an engine yourself to use these weight files that would beat std Stockfish even without using any nets at all for eval.. You and this A. Silver guy cannot of course do that, so what do you do? You rip off SF and then try to run a grab-the-money-and-run scheme before people realise the emperor has no clothes...

No shame these days in crooked businessmen....

Eduard · Post by **Eduard** » Sun Feb 14, 2021 5:04 am

ChessBase doesn't have to do anything anymore. Every 3 months they can take the current Stockfish dev version, release it as an update to Fat Fritz 2.x, and claim that Fat Fritz 2.x (and not Stockfish) got 15 Elo better! Great prospects at the expense of the Stockfish developers.
