Something goes wrong with lc0 since yesterday?

Werewolf · Post by **Werewolf** » Sat Jul 28, 2018 10:28 am

Laskos wrote: ↑Sat Jul 28, 2018 10:07 am
Interesting. ID10147 shows about 3300 CCRL 40/4' performance, even though Hiarcs 14 had access to its excellent book. That somehow confirms that test-framework networks are very close to main-branch networks, which at their peak show some 3350 CCRL 40/4' performance on GTX 1060 (I have the same GPU).

I have performed a scaling test in direct matches between best test-nets and best main-nets. That's not exactly a correct procedure, it would be better to compare them in gauntlets against AB engines, but it would need 4 times more games for the same error bars when deriving the scaling. Here are the results:

6'' + 0.1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 50 - 97 - 53 [0.383] 200
Elo difference: -83.20 +/- 42.20
Finished match

60'' + 1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 42 - 65 - 93 [0.443] 200
Elo difference: -40.13 +/- 35.32
Finished match

So, it is expected that on GTX 1060 6GB, testnet overcomes mainnet at some 600'' + 10'' time control and longer. Error margins are still pretty large, though.
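The two matches above also give a rough estimate of the crossover time. This is a minimal sketch, assuming the Elo gap shrinks linearly in the logarithm of the base time; that linearity is an assumption made here for illustration, not something stated in the post, and `crossover_time` is a hypothetical helper name. It uses only the two point estimates and ignores both the increments and the large error bars.

```python
import math

def crossover_time(t1, elo1, t2, elo2):
    """Base time (seconds) at which the Elo gap would reach zero.

    Assumes the gap is linear in log10(base time), which is an
    illustrative assumption, not a measured scaling law.
    """
    # Elo gained per tenfold increase in base time.
    slope = (elo2 - elo1) / (math.log10(t2) - math.log10(t1))
    # Solve elo2 + slope * (log10(t0) - log10(t2)) = 0 for t0.
    log_t0 = math.log10(t2) - elo2 / slope
    return 10 ** log_t0

# Point estimates from the two matches: (base time in seconds, Elo gap).
t0 = crossover_time(6, -83.20, 60, -40.13)
print(f"estimated crossover at about {t0:.0f} s base time")
```

With these inputs the estimate comes out a little over 500 seconds, the same order as the 600'' + 10'' quoted, though with error margins of ±35 to ±42 Elo on each point the extrapolation is very loose.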

That's good to hear, though the current graph is flat as a pancake. Unless something new is tried, it's not going any higher.

Werewolf · Post by **Werewolf** » Sat Jul 28, 2018 2:25 pm

Laskos wrote: ↑Sat Jul 28, 2018 10:07 am
Werewolf wrote: ↑Wed Jul 25, 2018 5:45 pm Here are a few results which may / may not be of interest.

LCZero on Nvidia 1060 vs HIARCS 14 on single core @ 4.2 GHz
TC: G5min + 5sec
Short Nooman book
HIARCS had access to its book, a big advantage, though a single core is surely slower than a 1060 I would guess.

May 29th
LCZero ID 349 CUDA

10 wins
6 draws
4 losses

..................................

July 23rd
ID 10147 Big Net

12 wins
6 draws
2 losses

Some progress...
Interesting. ID10147 shows about 3300 CCRL 40/4' performance, even though Hiarcs 14 had access to its excellent book. That somehow confirms that test-framework networks are very close to main-branch networks, which at their peak show some 3350 CCRL 40/4' performance on GTX 1060 (I have the same GPU).

I have performed a scaling test in direct matches between best test-nets and best main-nets. That's not exactly a correct procedure, it would be better to compare them in gauntlets against AB engines, but it would need 4 times more games for the same error bars when deriving the scaling. Here are the results:

6'' + 0.1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 50 - 97 - 53 [0.383] 200
Elo difference: -83.20 +/- 42.20
Finished match

60'' + 1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 42 - 65 - 93 [0.443] 200
Elo difference: -40.13 +/- 35.32
Finished match

So, it is expected that on GTX 1060 6GB, testnet overcomes mainnet at some 600'' + 10'' time control and longer. Error margins are still pretty large, though.

This has prompted a thought:

How much slower is the 1060 than the hardware AlphaZero ran on? I get about 2500 nps on average and A0 got 80,000. But it had either 4 or 8 (can't remember) special cards, so there must be some parallel search loss there. I wonder if the true speed difference is about 20x.

So, if we gave the latest LC0 20x more time vs an alpha beta engine forced to run at normal CCRL time controls (no ponder, obviously) - what elo would LC0 get?

I'm wondering whether LC0 would be close to A0 in strength if it had the same hardware. If so...we might have got as high as we can go with the project.
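The arithmetic behind the 20x guess can be written out. This is a small sketch using only the two nps figures quoted in the post; `implied_efficiency` is a hypothetical helper, and treating nps as directly comparable between the two systems is itself an assumption.

```python
def implied_efficiency(nps_slow, nps_fast, effective_speedup):
    """Return the raw nps ratio and the parallel efficiency implied
    by a guessed effective speedup (both plain ratios)."""
    raw_ratio = nps_fast / nps_slow
    return raw_ratio, effective_speedup / raw_ratio

# 2500 nps on the GTX 1060, 80,000 nps quoted for A0, 20x guessed as the true gap.
raw, efficiency = implied_efficiency(2500, 80_000, 20)
print(f"raw ratio: {raw:.0f}x, implied parallel efficiency: {efficiency:.1%}")
```

In other words, the raw gap is 32x, and a true difference of 20x would mean that a bit more than a third of the multi-card throughput is lost to parallel search.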

Laskos · Post by **Laskos** » Sat Jul 28, 2018 4:19 pm

Werewolf wrote: ↑Sat Jul 28, 2018 2:25 pm
Laskos wrote: ↑Sat Jul 28, 2018 10:07 am
Werewolf wrote: ↑Wed Jul 25, 2018 5:45 pm Here are a few results which may / may not be of interest.

LCZero on Nvidia 1060 vs HIARCS 14 on single core @ 4.2 GHz
TC: G5min + 5sec
Short Nooman book
HIARCS had access to its book, a big advantage, though a single core is surely slower than a 1060 I would guess.

May 29th
LCZero ID 349 CUDA

10 wins
6 draws
4 losses

..................................

July 23rd
ID 10147 Big Net

12 wins
6 draws
2 losses

Some progress...
Interesting. ID10147 shows about 3300 CCRL 40/4' performance, even though Hiarcs 14 had access to its excellent book. That somehow confirms that test-framework networks are very close to main-branch networks, which at their peak show some 3350 CCRL 40/4' performance on GTX 1060 (I have the same GPU).

I have performed a scaling test in direct matches between best test-nets and best main-nets. That's not exactly a correct procedure, it would be better to compare them in gauntlets against AB engines, but it would need 4 times more games for the same error bars when deriving the scaling. Here are the results:

6'' + 0.1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 50 - 97 - 53 [0.383] 200
Elo difference: -83.20 +/- 42.20
Finished match

60'' + 1''
Score of lc0_v16 ID10190 vs lc0_v16 ID521: 42 - 65 - 93 [0.443] 200
Elo difference: -40.13 +/- 35.32
Finished match

So, it is expected that on GTX 1060 6GB, testnet overcomes mainnet at some 600'' + 10'' time control and longer. Error margins are still pretty large, though.
This has prompted a thought:

How much slower is the 1060 than the hardware AlphaZero ran on? I get about 2500 nps on average and A0 got 80,000. But it had either 4 or 8 (can't remember) special cards, so there must be some parallel search loss there. I wonder if the true speed difference is about 20x.

So, if we gave the latest LC0 20x more time vs an alpha beta engine forced to run at normal CCRL time controls (no ponder, obviously) - what elo would LC0 get?

I'm wondering whether LC0 would be close to A0 in strength if it had the same hardware. If so...we might have got as high as we can go with the project.

No, I guess Lc0 (testnets) is maybe 300 Elo points weaker than A0 on the same hardware, if they scale similarly; otherwise one has to specify the conditions. The Elo scale is also an issue. It might seem fine: just 300 Elo points, 2-3 LR lowerings. But on hardware as strong as A0's, 300 Elo points mean a lot. Gaining 300 Elo points at the 2500 CCRL level is not the same as gaining them at the 3600 CCRL level. I suspect this is because chess simply caps out: it has a pretty low upper bound, and Elo becomes a somewhat flawed measure near it. I also suspect that Go caps out at much higher values, so progress there should be steadier Elo-wise. So there is room for Lc0 to progress to A0's level, but let's see when it comes close.

JJJ · Post by **JJJ** » Mon Jul 30, 2018 8:17 pm

Is Leela still making progress? I ask because all the new IDs are still losing against ID 390. The graph doesn't seem reliable at all.

megamau · Post by **megamau** » Tue Jul 31, 2018 1:59 am

JJJ wrote: ↑Mon Jul 30, 2018 8:17 pm Is Leela still making progress? I ask because all the new IDs are still losing against ID 390. The graph doesn't seem reliable at all.

What do you mean?

ID390 has lost all of the latest tests.

http://lczero.org/matches

Id   Candidate ID  Current ID  Pass  Score           Elo Delta  Elo Error Margin  Time
549  390           527         test  +137 -212 =341  -37.9      ±18.4             2018-07-30 12:57:34.145877 -0400 EDT
526  390           505         test  +145 -184 =349  -20.0      ±18.2             2018-07-19 14:11:49.229712 -0400 EDT
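
For anyone who wants to check which way such a row points, the Elo delta and error margin can be recomputed from the win/loss/draw counts. This is a minimal sketch, assuming the standard logistic Elo model and a 95% normal-approximation interval; it is not taken from the lczero.org code, and the function names are made up for illustration. Counts are read from the candidate's side, so a negative delta means the candidate (ID 390 here) scored worse.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (Elo difference, approximate 95% error margin) for one match."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n  # mean score per game
    # Per-game variance of the score around its mean.
    var = (wins * (1.0 - mu) ** 2
           + losses * mu ** 2
           + draws * (0.5 - mu) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    low = elo_from_score(mu - z * se)
    high = elo_from_score(mu + z * se)
    return elo_from_score(mu), (high - low) / 2.0

elo, margin = elo_with_margin(wins=137, losses=212, draws=341)
print(f"Elo delta {elo:+.1f}, margin about {margin:.1f}")
```

Under these assumptions the first row (+137 -212 =341) comes out close to the listed -37.9 and ±18.4.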

JJJ · Post by **JJJ** » Tue Jul 31, 2018 2:03 am

megamau wrote: ↑Tue Jul 31, 2018 1:59 am
JJJ wrote: ↑Mon Jul 30, 2018 8:17 pm Is Leela still making progress? I ask because all the new IDs are still losing against ID 390. The graph doesn't seem reliable at all.
What do you mean?

ID390 has lost all of the latest tests.

http://lczero.org/matches
Id   Candidate ID  Current ID  Pass  Score           Elo Delta  Elo Error Margin  Time
549  390           527         test  +137 -212 =341  -37.9      ±18.4             2018-07-30 12:57:34.145877 -0400 EDT
526  390           505         test  +145 -184 =349  -20.0      ±18.2             2018-07-19 14:11:49.229712 -0400 EDT

My mistake! I thought it was the other way around. Thank you!

Werewolf · Post by **Werewolf** » Tue Jul 31, 2018 12:01 pm

What is going on with the test LC0? It's going up like a rocket!

jkiliani · Post by **jkiliani** » Tue Jul 31, 2018 12:03 pm

Werewolf wrote: ↑Tue Jul 31, 2018 12:01 pm What is going on with the test LC0? It's going up like a rocket!

The learning rate was reset to the initial value. The nets are probably still considerably weaker than the ones before this reset, but it seems to be learning fast...

Rubinus · Post by **Rubinus** » Tue Jul 31, 2018 8:46 pm

No longer. From heading 495, we're going to make a big difference. Before the reset, the strongest 359, not 390 and 495, surpassed her 44 ELO - gauntlet 120 games. I'm also interested in the new growth, I guess that could be another real 20-50 ELO addition, at version 529.
I have the feeling that they have to change the methodology; maybe they should add games against other UCI engines to the training files, not just LC0 playing against itself.

George Tsavdaris · Post by **George Tsavdaris** » Fri Aug 10, 2018 5:12 pm

I was away for half a month and came back to see that test10 nets are at something like 3400 CCRL 40/4 Elo!

It seems DeepMind didn't lie.

Even so, we should not forgive them for not releasing ALL the details and games (hopefully they will with the peer-reviewed paper, though I highly doubt it). And the weights.

In a gauntlet I ran, Lc0 test10 10480 had a very good performance of 3397±52 CCRL 40/4 Elo.
The TC was 40/2 to match CCRL 40/4 ratings. Lc0 ran on a GTX 1070 Ti.

Lc0 Test10 10480 after 100 games:

  Program          CCRL Elo   Error(cl 95%)          Games            Score 
Lc0! Test10 10480   3396.5      ±51.5            100 (+43,=47,-10)    66.5 %

   vs.                      :  games (  +,  =,  -),   (%) :    Diff,    SD, CFS (%)
   Stockfish 8              :     20 (  5, 12,  3),  55.0 :   -26.5,  26.3,   15.6
   Fire 7.1                 :     20 (  5, 14,  1),  60.0 :   +55.5,  26.3,   98.3
   Andscacs 9.3             :     20 (  8,  9,  3),  62.5 :  +189.5,  26.3,  100.0
   Gull 3                   :     20 ( 13,  5,  2),  77.5 :  +203.5,  26.3,  100.0
   Texel 1.07               :     20 ( 12,  7,  1),  77.5 :  +235.5,  26.3,  100.0
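
As a consistency check on the table, the per-opponent rows can be summed back into the overall result. This is a small sketch using only the win/draw/loss counts listed above; `score_pct` is a hypothetical helper, and the rating calculation itself (which depends on the opponents' CCRL ratings) is not reproduced here.

```python
# (wins, draws, losses) for Lc0 against each opponent, copied from the table.
results = {
    "Stockfish 8":  (5, 12, 3),
    "Fire 7.1":     (5, 14, 1),
    "Andscacs 9.3": (8, 9, 3),
    "Gull 3":       (13, 5, 2),
    "Texel 1.07":   (12, 7, 1),
}

def score_pct(wins, draws, losses):
    """Score percentage, counting a draw as half a point."""
    return 100.0 * (wins + 0.5 * draws) / (wins + draws + losses)

# Column-wise sums give the gauntlet totals.
totals = tuple(sum(column) for column in zip(*results.values()))

for name, wdl in results.items():
    print(f"{name:<13} {score_pct(*wdl):5.1f} %")
print(f"{'Total':<13} {score_pct(*totals):5.1f} %  (+{totals[0]},={totals[1]},-{totals[2]})")
```

The totals agree with the summary line: +43, =47, -10 over 100 games, for 66.5 %.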
