Why do the latest Sergio Vieri 384x30b networks scale so badly?

dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Why do the latest Sergio Vieri 384x30b networks scale so badly?

Post by dkappe »

I’m not so much in contact as more of a thorn in their side.

There’s a lot of confirmation bias in the tests (most of which are run at short time control, STC). Regardless of what’s being tested, the nets always come out magically 20-60 Elo ahead of sfdev. Periodic checks of the PGNs show that they quite often contain a substantial number of duplicate games.
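
For anyone who wants to run that kind of duplicate check themselves, here is a minimal sketch. It is not the script used for the checks mentioned above. It assumes games in one PGN file all start from the same position unless the moves differ, treats two games as duplicates when their bare move sequences are identical, and uses hypothetical helper names (`split_games`, `move_key`, `count_duplicates`).

```python
import hashlib
import re
from collections import Counter


def split_games(pgn_text):
    """Split PGN text into (headers, movetext) pairs, one per game."""
    games = []
    headers, moves = [], []
    for line in pgn_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # A tag line that follows movetext starts a new game.
            if moves:
                games.append((headers, " ".join(moves)))
                headers, moves = [], []
            headers.append(line)
        elif line:
            moves.append(line)
    if headers or moves:
        games.append((headers, " ".join(moves)))
    return games


def move_key(movetext):
    """Hash the bare moves, ignoring comments, move numbers and the result."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)   # strip {engine comments}
    text = re.sub(r"\d+\.(\.\.)?", " ", text)    # strip "12." and "12..."
    text = re.sub(r"(1-0|0-1|1/2-1/2|\*)\s*$", " ", text.strip())
    return hashlib.sha1(" ".join(text.split()).encode()).hexdigest()


def count_duplicates(pgn_text):
    """Number of games that repeat an earlier game move for move."""
    counts = Counter(move_key(moves) for _, moves in split_games(pgn_text))
    return sum(n - 1 for n in counts.values() if n > 1)
```

If the match uses an opening book with FEN start positions, the FEN header would have to be included in the key as well; the sketch leaves that out.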

Not sure that will change until there’s a decisive defeat.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Post by Laskos »

Another badly scaling net from SV, 384x30-t60-4206.pb.gz. All the nets since 3290 LR 0.02 have the same bad scaling. It might explain the current performance in TCEC SuFi, although only 25 or so games were played.

6s + 0.1s
Score of SV_384x30_4206 vs SV_384x30_3010: 461 - 352 - 187  [0.554] 1000
...      SV_384x30_4206 playing White: 297 - 108 - 95  [0.689] 500
...      SV_384x30_4206 playing Black: 164 - 244 - 92  [0.420] 500
...      White vs Black: 541 - 272 - 187  [0.634] 1000
Elo difference: 38.0 +/- 19.5, LOS: 100.0 %, DrawRatio: 18.7 %
Finished match
The pentanomial error margins are 25% smaller than those shown.
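
For reference, the Elo figure and its margin can be recomputed from the win/loss/draw counts. The sketch below uses the usual logistic Elo formula and a plain per-game (trinomial) variance with a 95% interval. It reproduces the numbers printed above, but whether the match tool computes them in exactly this way is an assumption, and the function names are hypothetical. It does not compute the pentanomial margin, which needs the per-pair results.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, losses, draws, z=1.96):
    """Elo difference and approximate 95% margin from W/L/D counts."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    low = elo_from_score(score - half_width)
    high = elo_from_score(score + half_width)
    return elo_from_score(score), (high - low) / 2.0


elo, margin = elo_with_margin(461, 352, 187)
print(f"Elo difference: {elo:.1f} +/- {margin:.1f}")
```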

60s + 1s
Score of SV_384x30_4206 vs SV_384x30_3010: 42 - 51 - 107  [0.477] 200
...      SV_384x30_4206 playing White: 42 - 0 - 58  [0.710] 100
...      SV_384x30_4206 playing Black: 0 - 51 - 49  [0.245] 100
...      White vs Black: 93 - 0 - 107  [0.733] 200
Elo difference: -15.6 +/- 32.9, LOS: 17.5 %, DrawRatio: 53.5 %
Finished match
The pentanomial error margins are 35% smaller than those shown.

The bad scaling is outside the error margins and beyond reasonable doubt. I guess it might scale badly to longer time controls too. An overclocked RTX 2070 was used.
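
A rough way to check the "outside error margins" statement is to treat the two matches as independent and add their margins in quadrature. This is only a sketch under that assumption; `elo_gap` is a hypothetical helper, and it uses the trinomial margins printed by the match tool, which are the larger (more conservative) ones.

```python
import math


def elo_gap(elo_a, margin_a, elo_b, margin_b, z=1.96):
    """Gap between two independent Elo estimates, its 95% margin and z-score."""
    diff = elo_a - elo_b
    # Margins of independent estimates combine in quadrature.
    margin = math.sqrt(margin_a ** 2 + margin_b ** 2)
    z_score = diff / (margin / z)
    return diff, margin, z_score


# 6s+0.1s result versus 60s+1s result from the two matches above.
diff, margin, z_score = elo_gap(38.0, 19.5, -15.6, 32.9)
print(f"gap {diff:.1f} +/- {margin:.1f} Elo, z = {z_score:.2f}")
```

The gap between the two time controls comes out larger than its combined 95% margin, and the smaller pentanomial margins would only widen that.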

Post by dkappe »

Which version of lc0 and which mlh settings are you using?

Post by Laskos »

dkappe wrote: Tue Jun 23, 2020 5:04 pm Which version of lc0 and which mlh settings are you using?

v0.25.1, cudnn-fp16 backend.
No MLH settings (moves left head, right?). They are not exposed as easy-to-use UCI options, and I don't know what values to use anyway.

Post by dkappe »

Laskos wrote: Tue Jun 23, 2020 9:21 pm v0.25.1 cudnn-fp16. No MLH settings (moves left head, right?). They are not accessible via easy to use UCI settings and I don't know what values to put anyway.

--show-hidden
That’s the flag that makes the MLH options visible in UCI. Don’t ask me why they aren’t visible by default.

Post by dkappe »

I suspect you don’t have MLH turned on. Here are some params:

                                !mlh  mlh1    mlh2    mlh3    mlh4
MovesLeftMaxEffect              0.15  0.2179  0.2     0.2     0.15
MovesLeftThreshold              0.3   0.0     0.0     0.0     0.3
MovesLeftSlope                  0.01  0.0346  0.007   0.007   0.015
MovesLeftQuadraticFactor        0.85  1.0     0.85    0.0     1.0
MovesLeftScaledFactor           0.0   0.0     0.15    1.0     0.0
MovesLeftConstantFactor         0.15  0.0     0.0     0.0     0.0
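
To apply one column of the table, each value can be sent as a standard UCI `setoption` command (or passed through whatever option mechanism your match runner offers). The sketch below just builds those command strings for the `mlh1` column. The option names are copied from the table, the dictionary and function names are hypothetical, and it assumes the engine was started with `--show-hidden` so the options are exposed.

```python
# Column "mlh1" of the table above; option names exactly as listed there.
MLH1 = {
    "MovesLeftMaxEffect": 0.2179,
    "MovesLeftThreshold": 0.0,
    "MovesLeftSlope": 0.0346,
    "MovesLeftQuadraticFactor": 1.0,
    "MovesLeftScaledFactor": 0.0,
    "MovesLeftConstantFactor": 0.0,
}


def uci_setoptions(params):
    """Render a parameter dict as UCI 'setoption' command lines."""
    return [f"setoption name {name} value {value}"
            for name, value in params.items()]


for line in uci_setoptions(MLH1):
    print(line)
```

Whether a given lc0 build accepts all six names unchanged is worth confirming against its own option list before starting a long match.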

Post by Laskos »

dkappe wrote: Tue Jun 23, 2020 10:01 pm I suspect you don’t have MLH turned on. Here are some params:

Thanks! I guess MLH shouldn't change Elo, if Leela was trolling in the endgames without actually missing conversion? But does MLH change the goal from scoring optimization to something like scoring + moves left optimization? That would change Elo.

Post by dkappe »

Laskos wrote: Tue Jun 23, 2020 11:43 pm Thanks! I guess MLH shouldn't change Elo, if Leela was trolling in the endgames without actually missing conversion? But does MLH change the goal from scoring optimization to something like scoring + moves left optimization? That would change Elo.

It’s an extra head, so a little bit of a slowdown. If MLH isn’t on, you probably search fewer nodes without a benefit.

Post by Laskos »

dkappe wrote: Wed Jun 24, 2020 3:03 am It’s an extra head, so a little bit of a slowdown. If MLH isn’t on, you probably search fewer nodes without a benefit.

I used the first mlh for both nets.

Here is the result for 6s+0.1s

Score of SV_384x30_4206_mlh vs SV_384x30_3010_mlh: 465 - 340 - 195  [0.563] 1000
...      SV_384x30_4206_mlh playing White: 305 - 97 - 98  [0.708] 500
...      SV_384x30_4206_mlh playing Black: 160 - 243 - 97  [0.417] 500
...      White vs Black: 548 - 257 - 195  [0.645] 1000
Elo difference: 43.7 +/- 19.4, LOS: 100.0 %, DrawRatio: 19.5 %
Finished match
This is within the error margins of the previous no-MLH result.

I am now running the same match at 60s+1s, and will then check net 4206 with MLH against 4206 without MLH.
mbabigian
Posts: 204
Joined: Tue Oct 15, 2013 2:34 am
Location: US
Full name: Mike Babigian

Post by mbabigian »

Laskos wrote: Here is the result for 6s+0.1s

How many time forfeits did you get running that time control? I tried reproducing your results under similar conditions and got a ridiculous number of time forfeits, more than 10%. I ran 5s+0.08s because I have a faster GPU and wanted to be close to your nodes per move. Perhaps that's just too fast for reliable results.

Did you adjust any of the time settings in LC0?

I was terribly disappointed with cutechess, as I see no suspend-and-resume capability like Arena has. Running a test that finishes overnight is doable, but tests that take days are not. I'm not giving up my machine for a week to run silly engine matches. I need a match player that can be killed and restarted where it left off. Is there something I'm missing with not-so-cutechess?

Mike
“Censorship is telling a man he can't have a steak just because a baby can't chew it.” ― Mark Twain