SPCC: Testrun of Fat Fritz 1.0 finished


pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

SPCC: Testrun of Fat Fritz 1.0 finished

Post by pohl4711 »

Testrun finished: Fat Fritz 1.0

https://www.sp-cc.de

(You may have to clear your browser cache or reload the website.)
pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by pohl4711 »

pohl4711 wrote: Mon Nov 18, 2019 6:16 am Testrun finished: Fat Fritz 1.0
I decided to replay the head-to-head between Fat Fritz and Stockfish 190622 (500 games, HERT_250 openings, 50''+500ms), which was part of the 3000-game testrun of Fat Fritz, with 4x longer thinking time: 200''+2''. This raises the average game duration from 3 minutes to 12-13 minutes. I want to end the discussion about the claim that Fat Fritz (or lc0) benefits so much more from longer thinking time than Stockfish does.

Result of the 50''+500ms testrun was:
Fat Fritz 1.0 vs. Stockfish 190622 bmi2 : 500 (+ 74,=318,-108), 46.6 % (Draws: 63.6%)

What to expect from the rematch with 4x more time (and 4x bigger Hash/NNCacheSize)? The draw rate will increase, which could push the result towards 50%-50%. That creates the illusion that Fat Fritz is getting stronger, even though only the higher number of draws is responsible (an effect that has been well known in computer chess for more than 30 years!). If Fat Fritz really gets stronger with more thinking time, it should climb above the 50% level, that is, it should beat Stockfish 190622. I doubt that, but in 4-5 days we will have the result.
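
For readers who want to check the percentages quoted above, here is a minimal Python sketch that derives the score and draw rate from a win/draw/loss record. The function name `summarize` is made up for illustration, and the Elo conversion uses the standard logistic rating formula, which is an assumption on my part and not something stated in the post.

```python
import math

def summarize(wins, draws, losses):
    """Return (score, draw_rate, elo_diff) for a W/D/L record.

    score counts a win as 1 point and a draw as 0.5 points.
    elo_diff uses the usual logistic formula (an assumption here);
    it is undefined for a score of exactly 0 or 1.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    draw_rate = draws / games
    elo_diff = -400.0 * math.log10(1.0 / score - 1.0)
    return score, draw_rate, elo_diff

# Fat Fritz 1.0 vs. Stockfish 190622 at 50''+500ms, from Fat Fritz's side
score, draw_rate, elo_diff = summarize(74, 318, 108)
print(f"score {score:.1%}, draws {draw_rate:.1%}, Elo diff {elo_diff:+.1f}")
```

The same function can be applied to any other W/D/L line from the rating list.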
schack
Posts: 172
Joined: Thu May 27, 2010 3:32 am

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by schack »

(Ignore. Misread post.)
sovaz1997
Posts: 261
Joined: Sun Nov 13, 2016 10:37 am

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by sovaz1997 »

schack wrote: Mon Nov 18, 2019 5:18 pm (Ignore. Misread post.)
Wdum?
Zevra 2 is my chess engine. Binary, source and description here: https://github.com/sovaz1997/Zevra2
Zevra v2.5 is last version of Zevra: https://github.com/sovaz1997/Zevra2/releases
schack
Posts: 172
Joined: Thu May 27, 2010 3:32 am

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by schack »

I asked a question, but it was based on a misreading of the above. So I edited to remove the question. If I could delete it, I would.
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by JJJ »

So it is weaker than Stockfish and lczero!
pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by pohl4711 »

pohl4711 wrote: Mon Nov 18, 2019 1:11 pm
Because I need my machines for other work, I aborted the testrun with 4x more time after 250 of 500 games. The result so far is exactly what I expected:
200''+2000ms:
Fat Fritz 1.0 vs. Stockfish 190622 bmi2 : 250 (+ 35,=177,- 38), 49.4 % (Draws: 70.8%)
A measurably higher draw rate, which pushes the result closer to the 50%-50% score (a 7% higher draw rate should push the result around 3-3.5% closer to 50%-50%. That is exactly what we see.)

QED
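
One way to see how much a higher draw rate alone moves the score is to hold the split of decisive games fixed and vary only the draw rate. The sketch below does that under this single simplifying assumption; the helper name `score_at_draw_rate` is hypothetical, and this is not necessarily the calculation behind the 3-3.5% estimate above.

```python
def score_at_draw_rate(draw_rate, decisive_win_share):
    """Score if draws give 0.5 points and the remaining (decisive)
    games are won with probability decisive_win_share."""
    return 0.5 * draw_rate + (1.0 - draw_rate) * decisive_win_share

# Share of decisive games won by Fat Fritz at 50''+500ms: 74 of 74+108
short_tc_share = 74 / (74 + 108)

# Assumption: same decisive split, but the draw rate of the 200''+2000ms run
projected = score_at_draw_rate(177 / 250, short_tc_share)

# Score actually observed in the 250 games at 200''+2000ms
observed = (35 + 0.5 * 177) / 250

print(f"projected {projected:.1%}, observed {observed:.1%}")
```

Comparing the projected and observed values shows how much of the shift this simple model attributes to the draw rate; with 250 games the statistical uncertainty is still large.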
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by lkaufman »

pohl4711 wrote: Thu Nov 21, 2019 1:24 pm
I would draw the opposite conclusion from your results. If we discard the draws, the score went from 74-108 to 35-38, a very marked improvement (though the error margins are large). Probably one more doubling in TC would put Fat Fritz ahead of SF under your test conditions. I don't have a dog in that fight, but I will say that I do rather like Fat Fritz and I have the subjective impression that it needs more time than Lc0 to shine.
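
The draws-discarded comparison and its error margins can be sketched roughly as follows. This treats each decisive game as an independent coin flip and uses a plain binomial standard error, which is a simplifying assumption (paired openings and colours are ignored); the function name `decisive_share` is made up for illustration.

```python
import math

def decisive_share(wins, losses):
    """Return (share, standard_error) of wins among decisive games,
    using a simple binomial model (an assumption, see above)."""
    n = wins + losses
    p = wins / n
    se = math.sqrt(p * (1.0 - p) / n)
    return p, se

# Fat Fritz wins vs. losses at each time control, draws discarded
for label, wins, losses in [("50''+500ms", 74, 108), ("200''+2000ms", 35, 38)]:
    p, se = decisive_share(wins, losses)
    print(f"{label}: {p:.1%} of decisive games, +/- {se:.1%} (1 standard error)")
```

With only 73 decisive games in the longer run, the standard error is several percentage points, which is why the margins are described as large.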
Komodo rules!
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by Raphexon »

lkaufman wrote: Thu Nov 21, 2019 6:09 pm
I think NNs have very asymptotic scaling.
At low nodes per move they can badly suffer from a tactical horizon effect.
But once you get past that they don't really change their mind anymore.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: SPCC: Testrun of Fat Fritz 1.0 finished

Post by Dann Corbit »

I get a big difference between 12 minutes (720 seconds) per position and 3000 seconds per position, analyzing opening positions.
So I think that they are like other engines and do better with more time.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.