I’m not so much in contact as more of a thorn in their side.
There’s a lot of confirmation bias in the tests (most of which are STC). Regardless of what’s being tested, the nets are always magically 20-60 Elo ahead of sfdev. Periodic checks of the PGNs show that they quite often contain a substantial number of duplicate games.
Not sure that will change until there’s a decisive defeat.
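For anyone who wants to check a PGN for duplicates themselves, here is a minimal sketch using only the Python standard library. It assumes a plainly formatted PGN (tag lines starting with "[", comments in braces) and treats two games as duplicates when their move sequences match after stripping comments, move numbers and the result. The function names and the sample games are made up for illustration.

```python
import re
from collections import Counter

def split_games(pgn_text):
    """Return one movetext string per game in the PGN text."""
    games, moves = [], []
    for line in pgn_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # A tag line after movetext means a new game has begun.
            if moves:
                games.append(" ".join(moves))
                moves = []
        elif line:
            moves.append(line)
    if moves:
        games.append(" ".join(moves))
    return games

def normalize(movetext):
    """Strip comments, move numbers and the result so only the moves remain."""
    movetext = re.sub(r"\{[^}]*\}", " ", movetext)        # engine comments
    movetext = re.sub(r"\d+\.(\.\.)?", " ", movetext)     # "1." and "1..."
    movetext = re.sub(r"(1-0|0-1|1/2-1/2|\*)\s*$", " ", movetext)
    return " ".join(movetext.split())

def count_duplicates(pgn_text):
    """Number of games whose move sequence already appeared earlier."""
    counts = Counter(normalize(g) for g in split_games(pgn_text))
    return sum(c - 1 for c in counts.values() if c > 1)

# Made-up sample: the first two games differ only in their comments.
SAMPLE = """[Event "a"]
[Result "1-0"]

1. e4 {+0.30/10} e5 2. Nf3 1-0

[Event "b"]
[Result "1-0"]

1. e4 e5 2. Nf3 1-0

[Event "c"]
[Result "1/2-1/2"]

1. d4 d5 1/2-1/2
"""

print(count_duplicates(SAMPLE))
```

Because the result is stripped before comparing, two games with identical moves but different outcomes (a time forfeit, say) also count as duplicates here.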
Why do the latest Sergio Vieri 384x30b networks scale so badly?
-
- Posts: 1631
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Another badly scaling net from SV, 384x30-t60-4206.pb.gz. All the nets since 3290 LR 0.02 have the same bad scaling. It might explain the current performance in TCEC SuFi, although only 25 or so games were played.
The pentanomial error margins are 25% smaller than those shown.
The pentanomial error margins are 35% smaller than those shown.
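To illustrate why pentanomial margins come out smaller, here is a rough sketch that compares the standard error of the mean score computed per game (trinomial) with the one computed per game pair (pentanomial). The pair results below are invented for illustration only; they are not taken from the matches in this thread, and the function names are my own.

```python
from math import sqrt

def trinomial_se(pairs):
    """Standard error of the mean score, treating every game as independent."""
    games = [result for pair in pairs for result in pair]
    n = len(games)
    mean = sum(games) / n
    var = sum((r - mean) ** 2 for r in games) / n
    return sqrt(var / n)

def pentanomial_se(pairs):
    """Standard error of the mean score, treating each game pair as one sample.

    The per-pair average can take five values (0, 0.25, 0.5, 0.75, 1),
    hence "pentanomial".
    """
    halves = [(a + b) / 2 for a, b in pairs]
    n = len(halves)
    mean = sum(halves) / n
    var = sum((h - mean) ** 2 for h in halves) / n
    return sqrt(var / n)

# Hypothetical pairs: (score with White, score with Black) on the same opening.
# Many pairs are "White wins both games", which cancel out within the pair.
pairs = (
    [(1.0, 0.0)] * 40
    + [(0.5, 0.5)] * 45
    + [(1.0, 0.5)] * 10
    + [(0.5, 0.0)] * 5
)

print("trinomial SE:  ", trinomial_se(pairs))
print("pentanomial SE:", pentanomial_se(pairs))
```

When the two games of a pair are strongly anti-correlated (the opening decides the result for whoever has White), the per-pair variance is much lower than the per-game variance, so the error margin shrinks.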
6s + 0.1s
Score of SV_384x30_4206 vs SV_384x30_3010: 461 - 352 - 187 [0.554] 1000
... SV_384x30_4206 playing White: 297 - 108 - 95 [0.689] 500
... SV_384x30_4206 playing Black: 164 - 244 - 92 [0.420] 500
... White vs Black: 541 - 272 - 187 [0.634] 1000
Elo difference: 38.0 +/- 19.5, LOS: 100.0 %, DrawRatio: 18.7 %
Finished match
60s + 1s
Score of SV_384x30_4206 vs SV_384x30_3010: 42 - 51 - 107 [0.477] 200
... SV_384x30_4206 playing White: 42 - 0 - 58 [0.710] 100
... SV_384x30_4206 playing Black: 0 - 51 - 49 [0.245] 100
... White vs Black: 93 - 0 - 107 [0.733] 200
Elo difference: -15.6 +/- 32.9, LOS: 17.5 %, DrawRatio: 53.5 %
Finished match
The bad scaling is outside error margins and reasonable doubt. I guess it might scale badly to longer time controls too. OC-ed RTX 2070 used.
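As a sanity check on the numbers, the Elo difference, error margin and LOS can be recomputed from the win/loss/draw counts alone. This is a sketch under the assumption that the match tool uses the usual logistic Elo model, a trinomial variance with a 95% interval, and the common erf-based LOS formula; the function names are my own.

```python
from math import erf, log10, sqrt

def elo(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return -400.0 * log10(1.0 / score - 1.0)

def match_stats(wins, losses, draws):
    """Return (Elo difference, 95% error margin in Elo, LOS as a fraction)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score under the trinomial model.
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    half_width = 1.96 * sqrt(var / n)
    # Convert the score interval to Elo and report half of its width.
    margin = (elo(score + half_width) - elo(score - half_width)) / 2.0
    # Likelihood of superiority: draws carry no information here.
    los = 0.5 * (1.0 + erf((wins - losses) / sqrt(2.0 * (wins + losses))))
    return elo(score), margin, los

# Win - loss - draw counts copied from the two match results above.
matches = {"6s + 0.1s": (461, 352, 187), "60s + 1s": (42, 51, 107)}
for tc, (w, l, d) in matches.items():
    e, m, los = match_stats(w, l, d)
    print(f"{tc}: Elo {e:+.1f} +/- {m:.1f}, LOS {100 * los:.1f} %")
```

The results should land close to the Elo, margin and LOS lines in the match output. These are the trinomial margins, so the pentanomial ones mentioned above would be tighter still.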
-
- Posts: 1631
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
Which version of lc0 and which mlh settings are you using?
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
-
- Posts: 1631
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
--show-hidden
-
- Posts: 1631
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
I suspect you don’t have MLH turned on. Here are some params:
Parameter                 !mlh   mlh1    mlh2   mlh3   mlh4
MovesLeftMaxEffect        0.15   0.2179  0.2    0.2    0.15
MovesLeftThreshold        0.3    0.0     0.0    0.0    0.3
MovesLeftSlope            0.01   0.0346  0.007  0.007  0.015
MovesLeftQuadraticFactor  0.85   1.0     0.85   0.0    1.0
MovesLeftScaledFactor     0.0    0.0     0.15   1.0    0.0
MovesLeftConstantFactor   0.15   0.0     0.0    0.0    0.0
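One way to try a column of these values is to send them as UCI setoption commands from whatever drives the engine. The sketch below only builds the command strings. It assumes the row names in the table are the UCI option names the engine exposes, and the choice of the "mlh1" column is arbitrary; the helper name is my own.

```python
# Values copied from the parameter table above, kept as strings so they are
# sent exactly as written.
COLUMNS = ["!mlh", "mlh1", "mlh2", "mlh3", "mlh4"]
ROWS = {
    "MovesLeftMaxEffect":       ["0.15", "0.2179", "0.2", "0.2", "0.15"],
    "MovesLeftThreshold":       ["0.3", "0.0", "0.0", "0.0", "0.3"],
    "MovesLeftSlope":           ["0.01", "0.0346", "0.007", "0.007", "0.015"],
    "MovesLeftQuadraticFactor": ["0.85", "1.0", "0.85", "0.0", "1.0"],
    "MovesLeftScaledFactor":    ["0.0", "0.0", "0.15", "1.0", "0.0"],
    "MovesLeftConstantFactor":  ["0.15", "0.0", "0.0", "0.0", "0.0"],
}

def uci_commands(column):
    """Build UCI setoption lines for one column of the table."""
    idx = COLUMNS.index(column)  # raises ValueError for an unknown column
    return [f"setoption name {name} value {values[idx]}"
            for name, values in ROWS.items()]

# Arbitrary example column; pick whichever preset you want to test.
for command in uci_commands("mlh1"):
    print(command)
```

Most match runners can also pass per-engine UCI options directly, so the same name/value pairs could go into the runner's engine configuration instead.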
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
dkappe wrote: ↑Tue Jun 23, 2020 10:01 pm
I suspect you don’t have MLH turned on. Here are some params:
Parameter                 !mlh   mlh1    mlh2   mlh3   mlh4
MovesLeftMaxEffect        0.15   0.2179  0.2    0.2    0.15
MovesLeftThreshold        0.3    0.0     0.0    0.0    0.3
MovesLeftSlope            0.01   0.0346  0.007  0.007  0.015
MovesLeftQuadraticFactor  0.85   1.0     0.85   0.0    1.0
MovesLeftScaledFactor     0.0    0.0     0.15   1.0    0.0
MovesLeftConstantFactor   0.15   0.0     0.0    0.0    0.0

Thanks! I guess MLH shouldn't change Elo if Leela was only trolling in the endgames without actually missing conversions? But does MLH change the goal from score optimization to something like score plus moves-left optimization? That would change Elo.
-
- Posts: 1631
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
It’s an extra head, so a little bit of a slowdown. If MLH isn’t on, you probably search fewer nodes without a benefit.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
I used the first mlh for both nets.
Here is the result for 6s+0.1s:
Score of SV_384x30_4206_mlh vs SV_384x30_3010_mlh: 465 - 340 - 195 [0.563] 1000
... SV_384x30_4206_mlh playing White: 305 - 97 - 98 [0.708] 500
... SV_384x30_4206_mlh playing Black: 160 - 243 - 97 [0.417] 500
... White vs Black: 548 - 257 - 195 [0.645] 1000
Elo difference: 43.7 +/- 19.4, LOS: 100.0 %, DrawRatio: 19.5 %
Finished match
I am now running the same at 60s+1s and then will check the net 4206_mlh versus 4206 no mlh.
-
- Posts: 204
- Joined: Tue Oct 15, 2013 2:34 am
- Location: US
- Full name: Mike Babigian
Quote: "Here is the result for 6s+0.1s"

How many time forfeits did you get running that time control? I tried reproducing your results under similar conditions and got a ridiculous number of time forfeits, more than 10%. I ran 5s+0.08s because I have a faster GPU and wanted to be close to your nodes/move. Perhaps that's just too fast for reliable results.
Did you adjust any of the time settings in LC0?
I was terribly disappointed with cutechess as I see no suspend and resume capability like Arena. Running a test that finishes overnight is doable, but tests that take days are not. I'm not giving up my machine for a week to run silly engine matches. I need a match player that can be killed and restarted where it left off. Is there something I'm missing with not-so-cutechess?
Mike
“Censorship is telling a man he can't have a steak just because a baby can't chew it.” ― Mark Twain