Something goes wrong with lc0 since yesterday?

yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Something goes wrong with lc0 since yesterday?

Post by yanquis1972 »

i'd guess based on that that at classical TC they are. 40/40 equiv. would probably be close.

do you use a GUI to run STS? i tried w/ arena but kept getting hangs.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

yanquis1972 wrote: Sun Jul 22, 2018 7:38 pm i'd guess based on that that at classical TC they are. 40/40 equiv. would probably be close.

do you use a GUI to run STS? i tried w/ arena but kept getting hangs.
I am using Polyglot epd testing command line interface.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

Laskos wrote: Sun Jul 22, 2018 6:45 pm
yanquis1972 wrote: Sun Jul 22, 2018 4:35 pm
thanks for running those; the difference b/w 395 & 508 surprises me & i'm not sure what to make of it.

i think it's not till 15-20 seconds/move that the 256x20 nets should show their weight. 5s/move is probably around 0.2s/move relative to A0.
I checked ID10136 strength-wise in short time control games against AB engines, and it seems to be the best of all the 10xxx testserver nets, a bit above the 1008x nets, though within error margins. ID10136 is probably only 30-40 Elo points lower than ID508 from the mainserver in CCRL 40/4' conditions. But now I am checking the scaling to 10s/position on the STS 1500 suite: if testserver ID10136 scales significantly better than mainserver ID508 (or maybe a later net), then at LTC (say 40/40') maybe the testserver nets are already the best.
I have completed the tests to see the scaling of testserver ID10136 compared to mainserver ID508 using the STS 1500 positions suite. It seems that in CCRL 40/4' conditions ID10136 is weaker by 30-40 Elo points than ID508 against AB engines (although the error margins are pretty large). But the larger 20x256 net indeed scales better than the 15x192 net to longer time control, going from 2s/position to 10s/position.

ID10136
2.0s
score=767/1500 [averages on correct positions: depth=2.3 time=0.38 nodes=559]
10.0s
score=865/1500 [averages on correct positions: depth=2.4 time=1.19 nodes=2441]

+98


ID508
2.0s
score=1056/1500 [averages on correct positions: depth=2.6 time=0.40 nodes=1292]
10.0s
score=1114/1500 [averages on correct positions: depth=2.8 time=0.88 nodes=3710]

+58

Significantly better scaling of ID10136. So, I guess, ID10136 is about equal to ID508 in strength against AB engines at 40/40', and stronger than it at classical time control. Slowly, the testnet nets overcome the mainserver nets, for now at LTC.
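
For anyone who wants to redo the arithmetic behind the +98 and +58 figures, here is a minimal Python sketch. The scores are copied from the results above; the dictionary layout and variable names are only illustrative and not part of the original test setup.

```python
# STS 1500 scores copied from the results above: (score at 2s, score at 10s).
TOTAL_POSITIONS = 1500
sts_scores = {
    "ID10136": (767, 865),   # testserver net (the larger 20x256 one)
    "ID508": (1056, 1114),   # mainserver net (15x192)
}

gains = {}
for net, (at_2s, at_10s) in sts_scores.items():
    gains[net] = at_10s - at_2s
    print(f"{net}: {at_2s / TOTAL_POSITIONS:.1%} -> {at_10s / TOTAL_POSITIONS:.1%} "
          f"(+{gains[net]} positions)")

# How many more positions the larger net gained from the extra time.
scaling_edge = gains["ID10136"] - gains["ID508"]
print(f"ID10136 gains {scaling_edge} more positions than ID508")
```

Note that this only compares the gain from 2s to 10s. In absolute terms ID10136 still solves fewer positions than ID508 at both time limits.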
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: Something goes wrong with lc0 since yesterday?

Post by Werewolf »

My tests with 10147 are showing much improved tactical results on my Nvidia 1060 at this early stage of testing.
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: Something goes wrong with lc0 since yesterday?

Post by JJJ »

Seems there is an Elo jump on the main branch with ID 514.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

Laskos wrote: Mon Jul 23, 2018 1:06 am
Significantly better scaling of ID10136. So, I guess, ID10136 is about equal to ID508 in strength against AB engines at 40/40', and stronger than it at classical time control. Slowly, the testnet nets overcome the mainserver nets, for now at LTC.
Hmm, my theory is all well and nice, but here is a practical result of real games:
Komodo 12.1.1 on one core is about 3450 40/4' CCRL Elo.

At 10min + 10s time control, with a late mainserver net, I got:

+5 -1 =4 for Komodo.

With ID10148 of the testserver, I got:

+8 -1 =1 for Komodo.

Only 10 games each, but not a very lucky or happy result for a late testserver net at longer TC. Maybe people with more time and hardware could check how they behave at longer time control.
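
To put a number on how little 10 games can say, here is a rough Python sketch that turns a win/loss/draw record into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo formula and a normal approximation on the mean score, which is crude at this sample size; the function names are my own.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference for a mean score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, losses, draws, z=1.96):
    """Return (elo, low, high) using a normal approximation on the mean score."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win=1, draw=0.5, loss=0).
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(variance / games)

    def bounded(s):
        # Scores at or beyond 0 or 1 have no finite Elo equivalent.
        if s <= 0.0:
            return -math.inf
        if s >= 1.0:
            return math.inf
        return elo_from_score(s)

    return bounded(score), bounded(score - margin), bounded(score + margin)

# Komodo's results (wins, losses, draws) from the two 10-game matches above.
matches = {"late mainserver net": (5, 1, 4), "ID10148": (8, 1, 1)}
for label, (w, l, d) in matches.items():
    elo, low, high = elo_interval(w, l, d)
    print(f"Komodo vs {label}: {elo:+.0f} Elo, 95% interval [{low:+.0f}, {high:+.0f}]")
```

Under these assumptions the two intervals overlap heavily, so the 10-game samples cannot really separate the mainserver net from the testserver net.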
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Something goes wrong with lc0 since yesterday?

Post by yanquis1972 »

how many cores & what version of komodo? uhrr, NM. i think at the same TC, trying to find the right NPS ratio is important when comparing to A0, with the primary benchmark available to us being a rough idea of nodes/position.

K12 is +20 elo over SF8 4CPU (on my setup i get the closest to a 1:1 ratio using 3 SF8 cores); at 1s/move A0 was probably about +20 over SF8 with the 80KN/70MN ratio.

and while it remains to be seen how much of a factor it was, they used a very limited, extremely brief opening set of no more than 6 plies.

i think if you re-test some number of nets after the next drop in LR & it's not close it's cause for concern, but as i see it we're still one or two drops away from when A0 probably caught SF.
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Something goes wrong with lc0 since yesterday?

Post by yanquis1972 »

first win was a good portent: https://lichess.org/study/guI3lnix/s9zbEvnn

(draw is automatic at 60 moves & resign at -2.50; ideally i'll go back & finish some of the more interesting ones. the game against SF8 was id10157; all engines 1 core atm, sf5 was supposed to be 4)
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: Something goes wrong with lc0 since yesterday?

Post by Werewolf »

Here are a few results which may / may not be of interest.

LCZero on Nvidia 1060 vs HIARCS 14 on single core @ 4.2 GHz
TC: G5min + 5sec
Short Nooman book
HIARCS had access to its book, a big advantage, though a single core is surely slower than a 1060 I would guess.

May 29th
LCZero ID 349 CUDA

10 wins
6 draws
4 losses

..................................

July 23rd
ID 10147 Big Net

12 wins
6 draws
2 losses

Some progress...
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

Werewolf wrote: Wed Jul 25, 2018 5:45 pm Here are a few results which may / may not be of interest.

Interesting. ID10147 shows about 3300 CCRL 40/4' performance, even though Hiarcs 14 had access to its excellent book. That somehow confirms that test-framework networks are very close to main-branch networks, which at their peak show some 3350 CCRL 40/4' performance on GTX 1060 (I have the same GPU).
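
As a rough illustration of how such a performance figure relates to the match score, here is a small Python sketch. It only converts the two HIARCS 14 results into Elo differences with the standard logistic formula. No rating for HIARCS 14 is assumed; the last line just shows what opponent rating the 3300 estimate would imply, and the names are hypothetical.

```python
import math

def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score (logistic model, no error bars)."""
    score = (wins + 0.5 * draws) / (wins + draws + losses)
    return 400.0 * math.log10(score / (1.0 - score))

may_id349 = elo_diff(10, 6, 4)      # 13/20 against HIARCS 14
july_id10147 = elo_diff(12, 6, 2)   # 15/20 against HIARCS 14

print(f"ID 349:   {may_id349:+.0f} Elo vs HIARCS 14")
print(f"ID 10147: {july_id10147:+.0f} Elo vs HIARCS 14")
print(f"Apparent progress: {july_id10147 - may_id349:+.0f} Elo")

# Opponent rating implied by the ~3300 estimate above (not a measured value).
implied_opponent = 3300 - july_id10147
print(f"Implied HIARCS 14 level in these conditions: about {implied_opponent:.0f}")
```

With only 20 games per match, the error margins on these differences are large, so the apparent progress is indicative at best.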

I have performed a scaling test in direct matches between best test-nets and best main-nets. That's not exactly a correct procedure, it would be better to compare them in gauntlets against AB engines, but it would need 4 times more games for the same error bars when deriving the scaling. Here are the results:

6'' + 0.1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 50 - 97 - 53 [0.383] 200
Elo difference: -83.20 +/- 42.20
Finished match

60'' + 1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 42 - 65 - 93 [0.443] 200
Elo difference: -40.13 +/- 35.32
Finished match

So, it is expected that on GTX 1060 6GB, testnet overcomes mainnet at some 600'' + 10'' time control and longer. Error margins are still pretty large, though.
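
The extrapolation can be sketched in Python as follows. It assumes the Elo difference changes linearly with the logarithm of the base time, fits a line through the two measured points, and ignores the error margins. The increment is scaled along with the base time in the matches, so the base time is used here as a stand-in for the whole time control. The variable names are only illustrative.

```python
import math

# (base time in seconds, Elo difference of ID10190 vs ID521) from the matches above.
points = [(6.0, -83.20), (60.0, -40.13)]

(t1, e1), (t2, e2) = points
# Elo gained by the testnet per tenfold increase in time, assuming log-linearity.
slope_per_decade = (e2 - e1) / (math.log10(t2) - math.log10(t1))

def predicted_elo(base_seconds):
    """Predicted Elo difference (testnet minus mainnet) at a given base time."""
    return e1 + slope_per_decade * (math.log10(base_seconds) - math.log10(t1))

# Base time at which the predicted difference reaches zero.
crossover_seconds = t1 * 10 ** (-e1 / slope_per_decade)

print(f"Slope: {slope_per_decade:+.2f} Elo per 10x time")
print(f"Predicted difference at 600s base time: {predicted_elo(600.0):+.2f} Elo")
print(f"Predicted crossover at about {crossover_seconds:.0f}s base time")
```

With margins of +/- 42 and +/- 35 Elo on the two inputs, the crossover point is only an order-of-magnitude estimate.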