SPCC: Testrun of Fat Fritz 1.0 finished

Discussion of computer chess matches and engine tournaments.

Moderators: bob, hgm, Harvey Williamson

pohl4711
Posts: 1119
Joined: Sat Sep 03, 2011 5:25 am
Location: Berlin, Germany
Contact:

SPCC: Testrun of Fat Fritz 1.0 finished

Post by pohl4711 » Mon Nov 18, 2019 5:16 am

Testrun finished: Fat Fritz 1.0

https://www.sp-cc.de

(You may have to clear your browser cache or reload the website.)

pohl4711
Posts: 1119
Joined: Sat Sep 03, 2011 5:25 am
Location: Berlin, Germany
Contact:

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by pohl4711 » Mon Nov 18, 2019 12:11 pm

pohl4711 wrote:
Mon Nov 18, 2019 5:16 am
Testrun finished: Fat Fritz 1.0
I decided to replay the head-to-head between Fat Fritz and Stockfish 190622 (500 games, HERT_250 openings, 50''+500ms), which was part of the 3000-game testrun of Fat Fritz, with a 4x longer thinking time of 200''+2000ms. This raises the average game duration from 3 minutes to 12-13 minutes. I am doing this because I want to end the discussion about whether Fat Fritz (or lc0) benefits so much more from longer thinking time than Stockfish does.

Result of the 50''+500ms testrun was:
Fat Fritz 1.0 vs. Stockfish 190622 bmi2 : 500 (+ 74,=318,-108), 46.6 % (Draws: 63.6%)
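
For readers who want to check the percentages in the result line, here is a minimal sketch, assuming the usual scoring of 1 point per win and half a point per draw. The function name match_summary is only illustrative and is not part of any testing tool.

```python
def match_summary(wins, draws, losses):
    """Return (score_percent, draw_percent), counting a draw as half a point."""
    games = wins + draws + losses
    points = wins + 0.5 * draws
    return 100.0 * points / games, 100.0 * draws / games


# Fat Fritz 1.0 vs. Stockfish 190622 at 50''+500ms: +74 =318 -108
score, draw_rate = match_summary(74, 318, 108)
print(f"score {score:.1f} %, draws {draw_rate:.1f} %")
```

With the numbers above this gives the 46.6 % score and 63.6 % draw rate quoted in the result line.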

What should we expect from the rematch with 4x more time (and a 4x bigger Hash/NNCacheSize)? The draw rate will increase, which could push the result towards 50%-50%. That creates the illusion that Fat Fritz is getting stronger, even though only the higher number of draws is responsible (this effect has been well known in computer chess for more than 30 years!). If Fat Fritz really gets stronger with more thinking time, it should climb above the 50% level, that is, it should beat Stockfish 190622. I doubt that, but in 4-5 days we will have the result.
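
The draw effect described above can be sketched with a small model. This is only an illustration under one loud assumption: the split of the decisive (non-drawn) games between the two engines stays the same while the draw rate rises. The function expected_score and the draw rates 0.75 and 0.90 are hypothetical; only 0.636 and the 74-108 split come from the 50''+500ms run.

```python
def expected_score(draw_rate, decisive_share):
    """Score fraction when draws count 0.5 and the decisive-game split is held fixed.

    draw_rate: fraction of all games that are drawn (0..1)
    decisive_share: fraction of the decisive games won by the engine (0..1)
    """
    return 0.5 * draw_rate + (1.0 - draw_rate) * decisive_share


# Share of decisive games won by Fat Fritz at 50''+500ms: 74 wins vs 108 losses.
share = 74 / (74 + 108)

# Only 0.636 is a measured draw rate; 0.75 and 0.90 are made up for illustration.
for draw_rate in (0.636, 0.75, 0.90):
    print(f"draw rate {draw_rate:.1%}: score {expected_score(draw_rate, share):.1%}")
```

In this model the score moves towards 50% as the draw rate rises, but it can never cross 50% while the decisive-game share stays below one half.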

schack
Posts: 140
Joined: Thu May 27, 2010 1:32 am
Contact:

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by schack » Mon Nov 18, 2019 4:18 pm

(Ignore. Misread post.)

sovaz1997
Posts: 235
Joined: Sun Nov 13, 2016 9:37 am

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by sovaz1997 » Mon Nov 18, 2019 5:43 pm

schack wrote:
Mon Nov 18, 2019 4:18 pm
(Ignore. Misread post.)
Wdym?
Zevra 2 - my chess engine.
Binary, source and description here: https://github.com/sovaz1997/Zevra2

schack
Posts: 140
Joined: Thu May 27, 2010 1:32 am
Contact:

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by schack » Mon Nov 18, 2019 6:05 pm

I asked a question, but it was based on a misreading of the above. So I edited to remove the question. If I could delete it, I would.

JJJ
Posts: 1287
Joined: Sat Apr 19, 2014 11:47 am

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by JJJ » Tue Nov 19, 2019 5:34 am

So it is weaker than Stockfish and lczero!

pohl4711
Posts: 1119
Joined: Sat Sep 03, 2011 5:25 am
Location: Berlin, Germany
Contact:

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by pohl4711 » Thu Nov 21, 2019 12:24 pm

Because I need my machines for other work, I aborted the testrun with 4x more time after 250 of 500 games. The result up to that point is exactly what I expected:
200''+2000ms:
Fat Fritz 1.0 vs. Stockfish 190622 bmi2 : 250 (+ 35,=177,- 38), 49.4 % (Draws: 70.8%)
A measurably higher draw rate, which pushes the result closer to a 50%-50% score (a draw rate about 7 percentage points higher should push the result around 3-3.5 percentage points closer to 50%-50%. That is exactly what we see.)

QED
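
Since the longer run was stopped after 250 games, it may help to put a rough error margin on both results. The sketch below assumes independent games and a normal approximation with z = 1.96 (about 95%). That is a simplification, because openings played with reversed colours produce correlated game pairs. The function score_with_margin is illustrative only.

```python
import math


def score_with_margin(wins, draws, losses, z=1.96):
    """Return (score, margin) as fractions; margin is z standard errors of the mean game score."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Each game scores 1, 0.5 or 0; take the variance of those results around the mean.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    return score, z * math.sqrt(variance / games)


runs = (("50''+500ms", (74, 318, 108)), ("200''+2000ms", (35, 177, 38)))
for label, result in runs:
    score, margin = score_with_margin(*result)
    print(f"{label}: {score:.1%} +/- {margin:.1%}")
```

Under these assumptions the margins come out at roughly 2.6 and 3.3 percentage points, so the two intervals overlap.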

lkaufman
Posts: 3757
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA
Contact:

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by lkaufman » Thu Nov 21, 2019 5:09 pm

I would draw the opposite conclusion from your results. If we discard the draws, the score went from 74-108 to 35-38, a very marked improvement (though the error margins are large). Probably one more doubling in TC would put Fat Fritz ahead of SF under your test conditions. I don't have a dog in that dogfight, but I will say that I do rather like Fat Fritz and I have the subjective impression that it needs more time than Lc0 to shine.
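
The "discard the draws" comparison can be written out as a short sketch. It also projects what the 200''+2000ms score would have been if only the draw rate had changed and the decisive-game split had stayed at its 50''+500ms value. Both helper functions are illustrative, and the projection is a simple model, not a statistical test.

```python
def decisive_share(wins, losses):
    """Fraction of decisive (non-drawn) games that were wins."""
    return wins / (wins + losses)


def score_from(draw_rate, share):
    """Overall score fraction given a draw rate and a share of decisive games won."""
    return 0.5 * draw_rate + (1.0 - draw_rate) * share


short_share = decisive_share(74, 108)  # 50''+500ms run
long_share = decisive_share(35, 38)    # 200''+2000ms run, 250 games
print(f"decisive games won: {short_share:.1%} -> {long_share:.1%}")

# Score of the long run if only the draw rate had changed (177 draws in 250 games).
projected = score_from(177 / 250, short_share)
observed = (35 + 0.5 * 177) / 250
print(f"projected {projected:.1%} vs observed {observed:.1%}")
```

The projected value is lower than the observed 49.4%, which is the gap this argument rests on. With only 73 decisive games in the longer run, that gap carries a large uncertainty.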
Komodo rules!

Raphexon
Posts: 115
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by Raphexon » Thu Nov 21, 2019 7:19 pm

I think NNs scale very asymptotically.
At low nodes per move they can suffer badly from a tactical horizon effect.
But once they get past that, they don't really change their mind anymore.

Dann Corbit
Posts: 10204
Joined: Wed Mar 08, 2006 7:57 pm
Location: Redmond, WA USA
Contact:

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by Dann Corbit » Thu Nov 21, 2019 8:18 pm

I get a big difference between 12 minutes (720 seconds) per position and 3000 seconds per position, analyzing opening positions.
So I think that they are like other engines and do better with more time.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
