Why do the latest Sergio Vieri 384x30b networks scale so badly?

dkappe · Post by **dkappe** » Mon Jun 22, 2020 2:43 am

I’m not so much in contact as more of a thorn in their side.

There’s a lot of confirmation bias in the tests (most of which are at STC). Regardless of what’s being tested, the nets are always magically 20-60 Elo ahead of SF dev. Periodic checks of the PGNs show that they quite often contain a substantial number of duplicate games.
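A duplicate check of this kind can be sketched in a few lines. This is only an illustration: it assumes that "duplicate" means an identical move sequence, that every game starts with an `[Event ...]` tag, and that comments use `{...}`. The helper names `split_games`, `movetext_key` and `count_duplicates` are made up here.

```python
import re
from collections import Counter


def split_games(pgn_text):
    """Split PGN text into games, assuming each starts with an [Event ...] tag."""
    starts = [m.start() for m in re.finditer(r'(?m)^\[Event ', pgn_text)]
    ends = starts[1:] + [len(pgn_text)]
    return [pgn_text[a:b] for a, b in zip(starts, ends)]


def movetext_key(game):
    """Reduce one game to its bare move sequence so equal games compare equal."""
    text = re.sub(r'(?m)^\[.*\]\s*$', '', game)    # drop tag pairs
    text = re.sub(r'\{[^}]*\}', ' ', text)          # drop comments (eval, time, ...)
    text = re.sub(r'\b\d+\.(\.\.)?', ' ', text)     # drop move numbers
    text = re.sub(r'(1-0|0-1|1/2-1/2|\*)\s*$', '', text.strip())  # drop result
    return ' '.join(text.split())


def count_duplicates(pgn_text):
    """Return (total_games, games_that_repeat_an_earlier_move_sequence)."""
    keys = [movetext_key(g) for g in split_games(pgn_text)]
    counts = Counter(keys)
    return len(keys), sum(c - 1 for c in counts.values() if c > 1)
```

Calling `count_duplicates(open("match.pgn").read())` on a match file (the file name is a placeholder) gives a rough duplicate rate; games that differ only in comments or move-number formatting are treated as the same game.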

Not sure that will change until there’s a decisive defeat.

Laskos · Post by **Laskos** » Tue Jun 23, 2020 12:43 am

Another badly scaling net from SV, 384x30-t60-4206.pb.gz. All the nets since 3290 LR 0.02 have the same bad scaling. It might explain the current performance in TCEC SuFi, although only 25 or so games were played.

6s + 0.1s
Score of SV_384x30_4206 vs SV_384x30_3010: 461 - 352 - 187  [0.554] 1000
...      SV_384x30_4206 playing White: 297 - 108 - 95  [0.689] 500
...      SV_384x30_4206 playing Black: 164 - 244 - 92  [0.420] 500
...      White vs Black: 541 - 272 - 187  [0.634] 1000
Elo difference: 38.0 +/- 19.5, LOS: 100.0 %, DrawRatio: 18.7 %
Finished match

The pentanomial error margins are 25% smaller than those shown.
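For anyone who wants to check figures like these, the Elo difference, its 95% margin and the LOS can be recomputed from the win/loss/draw counts alone. The sketch below assumes the usual logistic Elo model and a normal approximation on the trinomial score; the function names are made up for illustration, and it does not attempt the pentanomial (game-pair) correction mentioned above.

```python
import math
from statistics import NormalDist


def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, losses, draws, confidence=0.95):
    """Return (elo, margin, los) from W/L/D counts using a trinomial model."""
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    mu = w + 0.5 * d                                   # mean score per game
    var = w * (1.0 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * mu ** 2
    sd = math.sqrt(var / n)                            # std. error of the mean score
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)   # about 1.96 for 95%
    low, high = mu - z * sd, mu + z * sd
    margin = (elo_from_score(high) - elo_from_score(low)) / 2.0
    # Likelihood of superiority from decisive games only.
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return elo_from_score(mu), margin, los


if __name__ == "__main__":
    elo, margin, los = elo_with_margin(461, 352, 187)  # the 6s + 0.1s match above
    print(f"{elo:+.1f} +/- {margin:.1f} Elo, LOS {100 * los:.1f} %")
```

Fed the 461 - 352 - 187 result, this should land close to the reported 38.0 +/- 19.5.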

60s + 1s
Score of SV_384x30_4206 vs SV_384x30_3010: 42 - 51 - 107  [0.477] 200
...      SV_384x30_4206 playing White: 42 - 0 - 58  [0.710] 100
...      SV_384x30_4206 playing Black: 0 - 51 - 49  [0.245] 100
...      White vs Black: 93 - 0 - 107  [0.733] 200
Elo difference: -15.6 +/- 32.9, LOS: 17.5 %, DrawRatio: 53.5 %
Finished match

The pentanomial error margins are 35% smaller than those shown.

The bad scaling is outside the error margins and beyond reasonable doubt. I guess it might scale badly to longer time controls too. An overclocked RTX 2070 was used.
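The "outside error margins" claim can be checked with a little arithmetic. The sketch below treats the two matches as independent and the reported margins as 95% half-widths of roughly Gaussian estimates, so the margin of the difference is the root sum of squares. Both are simplifying assumptions, and the function names are illustrative.

```python
import math


def combined_margin(margin_a, margin_b):
    """Margin of the difference of two independent Elo estimates."""
    return math.hypot(margin_a, margin_b)


def scaling_gap(elo_fast, margin_fast, elo_slow, margin_slow):
    """Return (elo_drop, margin) going from the fast to the slow time control."""
    return elo_fast - elo_slow, combined_margin(margin_fast, margin_slow)


if __name__ == "__main__":
    # Reported results for SV_384x30_4206 vs SV_384x30_3010.
    fast = (38.0, 19.5)     # 6s + 0.1s
    slow = (-15.6, 32.9)    # 60s + 1s

    drop, margin = scaling_gap(*fast, *slow)
    print(f"trinomial margins:   drop {drop:.1f} Elo, margin {margin:.1f}")

    # Same check with the margins shrunk by the stated 25% and 35%.
    drop_p, margin_p = scaling_gap(fast[0], fast[1] * 0.75, slow[0], slow[1] * 0.65)
    print(f"pentanomial margins: drop {drop_p:.1f} Elo, margin {margin_p:.1f}")
```

The drop of about 53.6 Elo exceeds the combined margin of roughly 38 Elo, and exceeds it more clearly once the margins are shrunk, which is consistent with the statement above.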

dkappe · Post by **dkappe** » Tue Jun 23, 2020 5:04 pm

Which version of lc0 and which mlh settings are you using?

Laskos · Post by **Laskos** » Tue Jun 23, 2020 9:21 pm

dkappe wrote: ↑Tue Jun 23, 2020 5:04 pm Which version of lc0 and which mlh settings are you using?

v0.25.1 cudnn-fp16
No MLH settings (moves left head, right?). They are not accessible via easy to use UCI settings and I don't know what values to put anyway.

dkappe · Post by **dkappe** » Tue Jun 23, 2020 9:55 pm

Laskos wrote: ↑Tue Jun 23, 2020 9:21 pm
dkappe wrote: ↑Tue Jun 23, 2020 5:04 pm Which version of lc0 and which mlh settings are you using?
v0.25.1 cudnn-fp16
No MLH settings (moves left head, right?). They are not accessible via easy to use UCI settings and I don't know what values to put anyway.

--show-hidden

That’s the flag that makes the MLH options visible in UCI. Don’t ask me why they aren’t visible by default.

dkappe · Post by **dkappe** » Tue Jun 23, 2020 10:01 pm

I suspect you don’t have MLH turned on. Here are some params:

                                !mlh  mlh1    mlh2    mlh3    mlh4
MovesLeftMaxEffect              0.15  0.2179  0.2     0.2     0.15
MovesLeftThreshold              0.3   0.0     0.0     0.0     0.3
MovesLeftSlope                  0.01  0.0346  0.007   0.007   0.015
MovesLeftQuadraticFactor        0.85  1.0     0.85    0.0     1.0
MovesLeftScaledFactor           0.0   0.0     0.15    1.0     0.0
MovesLeftConstantFactor         0.15  0.0     0.0     0.0     0.0
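To apply one column of this table, the values have to reach the engine as UCI options. The sketch below copies the table into a small Python structure and prints either standard UCI `setoption` commands or `option.NAME=VALUE` strings in the form cutechess-cli accepts in its `-engine` arguments. The option names are taken from the table as written; the helper names are made up, and the options are assumed to be exposed by the engine (see the `--show-hidden` flag above).

```python
# Columns and values copied from the table above.
COLUMNS = ("!mlh", "mlh1", "mlh2", "mlh3", "mlh4")
TABLE = {
    "MovesLeftMaxEffect":       (0.15, 0.2179, 0.2, 0.2, 0.15),
    "MovesLeftThreshold":       (0.3, 0.0, 0.0, 0.0, 0.3),
    "MovesLeftSlope":           (0.01, 0.0346, 0.007, 0.007, 0.015),
    "MovesLeftQuadraticFactor": (0.85, 1.0, 0.85, 0.0, 1.0),
    "MovesLeftScaledFactor":    (0.0, 0.0, 0.15, 1.0, 0.0),
    "MovesLeftConstantFactor":  (0.15, 0.0, 0.0, 0.0, 0.0),
}


def setoption_lines(preset):
    """UCI commands for one column of the table, e.g. 'mlh1'."""
    idx = COLUMNS.index(preset)
    return [f"setoption name {name} value {vals[idx]}" for name, vals in TABLE.items()]


def cutechess_options(preset):
    """The same settings as option.NAME=VALUE strings for a cutechess-cli engine."""
    idx = COLUMNS.index(preset)
    return [f"option.{name}={vals[idx]}" for name, vals in TABLE.items()]


if __name__ == "__main__":
    print("\n".join(setoption_lines("mlh1")))
    print(" ".join(cutechess_options("mlh1")))
```

Keeping the table in one place makes it easy to run the same match with each column in turn and compare the results.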

Laskos · Post by **Laskos** » Tue Jun 23, 2020 11:43 pm

dkappe wrote: ↑Tue Jun 23, 2020 10:01 pm I suspect you don’t have MLH turned on. Here are some params:

                                !mlh  mlh1    mlh2    mlh3    mlh4
MovesLeftMaxEffect              0.15  0.2179  0.2     0.2     0.15
MovesLeftThreshold              0.3   0.0     0.0     0.0     0.3
MovesLeftSlope                  0.01  0.0346  0.007   0.007   0.015
MovesLeftQuadraticFactor        0.85  1.0     0.85    0.0     1.0
MovesLeftScaledFactor           0.0   0.0     0.15    1.0     0.0
MovesLeftConstantFactor         0.15  0.0     0.0     0.0     0.0

Thanks! I guess MLH shouldn't change Elo, if Leela was trolling in the endgames without actually missing conversion? But does MLH change the goal from scoring optimization to something like scoring + moves left optimization? That would change Elo.

dkappe · Post by **dkappe** » Wed Jun 24, 2020 3:03 am

Laskos wrote: ↑Tue Jun 23, 2020 11:43 pm Thanks! I guess MLH shouldn't change Elo, if Leela was trolling in the endgames without actually missing conversion? But does MLH change the goal from scoring optimization to something like scoring + moves left optimization? That would change Elo.

It’s an extra head, so a little bit of a slowdown. If MLH isn’t on, you probably search fewer nodes without a benefit.

Laskos · Post by **Laskos** » Wed Jun 24, 2020 10:38 am

dkappe wrote: ↑Wed Jun 24, 2020 3:03 am
Laskos wrote: ↑Tue Jun 23, 2020 11:43 pm Thanks! I guess MLH shouldn't change Elo, if Leela was trolling in the endgames without actually missing conversion? But does MLH change the goal from scoring optimization to something like scoring + moves left optimization? That would change Elo.
It’s an extra head, so a little bit of a slowdown. If MLH isn’t on, you probably search fewer nodes without a benefit.

I used the first mlh for both nets.

Here is the result for 6s+0.1s:

Score of SV_384x30_4206_mlh vs SV_384x30_3010_mlh: 465 - 340 - 195  [0.563] 1000
...      SV_384x30_4206_mlh playing White: 305 - 97 - 98  [0.708] 500
...      SV_384x30_4206_mlh playing Black: 160 - 243 - 97  [0.417] 500
...      White vs Black: 548 - 257 - 195  [0.645] 1000
Elo difference: 43.7 +/- 19.4, LOS: 100.0 %, DrawRatio: 19.5 %
Finished match

Within error margins of the previous no-mlh result.

I am now running the same at 60s+1s and then will check the net 4206_mlh versus 4206 no mlh.

mbabigian · Post by **mbabigian** » Wed Jun 24, 2020 10:25 pm

Laskos wrote: Here is the result for 6s+0.1s

How many time forfeits did you get running that time control? I tried reproducing your results under similar conditions and got a ridiculous number of time forfeits, more than 10%. I ran 5s+0.08s, as I have a faster GPU and wanted to be close to your nodes/move. Perhaps that's just too fast for reliable results.
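The forfeit rate can be read off the match PGN. The sketch below assumes the match runner writes a `[Termination "time forfeit"]` tag for games lost on time, which I believe cutechess does but which should be checked against the actual file; the function name and the `marker` parameter are illustrative.

```python
import re

RESULT_RE = re.compile(r'^\[Result\s+"([^"]*)"\]', re.MULTILINE)
TERMINATION_RE = re.compile(r'^\[Termination\s+"([^"]*)"\]', re.MULTILINE)


def forfeit_rate(pgn_text, marker="time forfeit"):
    """Return (forfeits, games, fraction), counting one [Result] tag per game."""
    games = len(RESULT_RE.findall(pgn_text))
    forfeits = sum(1 for t in TERMINATION_RE.findall(pgn_text) if t == marker)
    return forfeits, games, (forfeits / games if games else 0.0)
```

Usage would be along the lines of `forfeit_rate(open("match.pgn").read())`, with the file name as a placeholder. A rate anywhere near 10% suggests the time control or the engine's time-management settings need adjusting before the Elo numbers mean much.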

Did you adjust any of the time settings in LC0?

I was terribly disappointed with cutechess as I see no suspend and resume capability like Arena. Running a test that finishes overnight is doable, but tests that take days are not. I'm not giving up my machine for a week to run silly engine matches. I need a match player that can be killed and restarted where it left off. Is there something I'm missing with not-so-cutechess?

Mike
