my Lc0 spreadsheet

Posted: Fri Apr 19, 2019 12:09 am
by Hugo
Hi all

I started a spreadsheet with my direct comparison between Stockfish 100419 (12 CPUs) and Lc0 NN 40 on an RTX 2060.
Leela Ratio is 1.1.

Games are played with ponder ON at 5 min + 3 sec. Each match is 100 games.
Every two days a new network will be added, so it will grow quickly.
https://docs.google.com/spreadsheets/d/ ... sp=sharing

regards, C.K.

Re: my Lc0 spreadsheet

Posted: Sat Apr 20, 2019 9:14 am
by Hugo
So far Lc0-41997 has made the best run:
only -14 Elo against SF100419 12cpu.

https://docs.google.com/spreadsheets/d/ ... sp=sharing

C.K.

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 10:43 am
by Hugo
This is the first time Lc0 has won a match. It was pretty close, decided by just one game.
A rematch is running.

1) Lc0 v0.21.1-42000                4 :    100 (+15,=71,-14),  50.5 %

    vs.                                :  games (  +,  =,  -),   (%) :   Diff
    Stockfish 100419 64 BMI2-x12       :    100 ( 15, 71, 14),  50.5 :     +4
C.K.
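
As a quick check on the table above, the score percentage and an Elo difference can be recomputed from the win/draw/loss counts. This is a minimal sketch assuming the standard logistic Elo model; the function names are illustrative, and reading the table's Diff column as an Elo difference is an assumption.

```python
import math


def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (logistic model).

    Only defined for 0 < score < 1.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)


# Lc0 v0.21.1-42000 vs. Stockfish 100419: +15 =71 -14
score = score_fraction(wins=15, draws=71, losses=14)
print(f"score = {score:.1%}, Elo difference = {elo_from_score(score):+.1f}")
```

For +15 =71 -14 this gives 50.5 % and roughly +3.5 Elo, a little under the +4 in the table, which may come down to rounding or a slightly different rating model in the GUI.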

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 11:52 am
by Laskos
Hugo wrote: Fri Apr 19, 2019 12:09 am Hi all

I started a spreadsheet with my direct comparison between Stockfish 100419 (12 CPUs) and Lc0 NN 40 on an RTX 2060.
Leela Ratio is 1.1.

Games are played with ponder ON at 5 min + 3 sec. Each match is 100 games.
Every two days a new network will be added, so it will grow quickly.
https://docs.google.com/spreadsheets/d/ ... sp=sharing

regards, C.K.
Why ponder ON? How many cores does your CPU have? If only 12, then "ponder ON" will harm the gameplay.
The "effective" in-game Leela Ratio is not that easy to calculate, and I am sure it's not 1.1 at all. The A0 team in their _paper_ (not the earlier preprint) spoke of in-game average or median performance, not benchmarks.

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 1:18 pm
by Hugo
Laskos wrote: Tue Apr 23, 2019 11:52 am Why ponder ON? How many cores does your CPU have? If only 12, then "ponder ON" will harm the gameplay.
The "effective" in-game Leela Ratio is not that easy to calculate, and I am sure it's not 1.1 at all. The A0 team in their _paper_ (not the earlier preprint) spoke of in-game average or median performance, not benchmarks.
Hi,

It is easy to find out with two clicks that the E5-2697A v4 is a 16-core CPU (turbo mode 3.0 - 3.1 GHz),
so 12 CPUs for SF, 2 for Lc0 (and still 2 left for a try with 2 GPUs later).

Ponder ON is a small performance plus for both participants in my eyes. At least it does no harm at all ;)
The Leela Ratio is indeed hard to estimate. I would say mine averages 1.1.
From the start position, Stockfish with 12 CPUs reaches about 18,000 kN/s within 1-2 minutes and Lc0 reaches 24,000 - 25,000 nps. So this is a ratio of 1.2.
But when I observe the games, SF is far over 20,000 kN/s after very few moves, while Lc0 is below 20,000 nps.
So 1.1 might be a realistic estimate for 5 min + 3 sec games.

have a nice day

C.K.
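
The arithmetic behind these ratio estimates can be sketched as follows. The thread never states the formula, so this assumes the commonly cited definition, Leela Ratio = 875 x (Lc0 nps / Stockfish nps), where 875 corresponds to roughly 70 Mnps for Stockfish against 80 knps for AlphaZero. Both the constant and the function name are assumptions, not something given in the posts.

```python
# Assumed normalization: Stockfish nps / AlphaZero nps, about 70 Mnps / 80 knps.
# This constant does not appear in the thread.
A0_NORMALIZATION = 875.0


def leela_ratio(lc0_nps: float, sf_nps: float,
                normalization: float = A0_NORMALIZATION) -> float:
    """Leela Ratio under the assumed definition above.

    Both speeds must be in plain nodes per second, so a Stockfish
    reading of 18,000 kN/s is passed as 18_000_000.
    """
    return normalization * lc0_nps / sf_nps


# Start-position speeds quoted in the post
start_low = leela_ratio(24_000, 18_000_000)
start_high = leela_ratio(25_000, 18_000_000)

# In-game speeds quoted in the post: Lc0 below 20,000 nps,
# Stockfish above 20,000 kN/s, so this value is an upper bound
in_game_bound = leela_ratio(20_000, 20_000_000)

print(f"start position: {start_low:.2f} to {start_high:.2f}")
print(f"in-game upper bound: {in_game_bound:.3f}")
```

With these inputs the start-position ratio comes out between about 1.17 and 1.22, in line with the quoted 1.2, while 20,000 nps against 20,000 kN/s gives 0.875. If this definition is the right one, the in-game value would sit under that figure whenever Stockfish runs above 20,000 kN/s and Lc0 below 20,000 nps.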

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 2:31 pm
by Laskos
Hugo wrote: Tue Apr 23, 2019 1:18 pm
Laskos wrote: Tue Apr 23, 2019 11:52 am Why ponder ON? How many cores does your CPU have? If only 12, then "ponder ON" will harm the gameplay.
The "effective" in-game Leela Ratio is not that easy to calculate, and I am sure it's not 1.1 at all. The A0 team in their _paper_ (not the earlier preprint) spoke of in-game average or median performance, not benchmarks.
Hi,

It is easy to find out with two clicks that the E5-2697A v4 is a 16-core CPU (turbo mode 3.0 - 3.1 GHz),
so 12 CPUs for SF, 2 for Lc0 (and still 2 left for a try with 2 GPUs later).

Ponder ON is a small performance plus for both participants in my eyes. At least it does no harm at all ;)
The Leela Ratio is indeed hard to estimate. I would say mine averages 1.1.
From the start position, Stockfish with 12 CPUs reaches about 18,000 kN/s within 1-2 minutes and Lc0 reaches 24,000 - 25,000 nps. So this is a ratio of 1.2.
But when I observe the games, SF is far over 20,000 kN/s after very few moves, while Lc0 is below 20,000 nps.
So 1.1 might be a realistic estimate for 5 min + 3 sec games.

have a nice day

C.K.
Thanks, looks good.
I would guess your "effective" Leela Ratio at some 0.9 +/- 0.1, but you should know better. Try watching the NPS shown over the course of the games.
I have never seriously averaged over the stages of the games to get an "effective" Leela Ratio like the one in the paper. I tried to adjust the time controls of Lc0 and SF_dev to get an "effective" Leela Ratio of about 1.0 in games, and I guess I managed something in the range 1.0 +/- 0.2. But my results with these adjusted time controls (60s + 1s for Lc0 and 150s + 2.5s for SF_dev) look a bit different:

Score of lc0_42056 vs 2.5*SF_dev: 19 - 10 - 71 [0.545] 100
Elo difference: 31.35 +/- 36.54
Finished match

I used the 4-mover opening PGN of Stefan Pohl.
Leela seems to beat SF_dev in my experiment, but maybe I have an effective Leela Ratio of 1.2 and you have 0.9, and that would explain the discrepancy.
Testing with this Leela is nasty :).
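
The Elo line in the match output above can be reproduced from the win/loss/draw counts. The sketch below assumes the logistic Elo model and a normal-approximation 95 % interval built from the per-game score variance. That this is how the match tool computes its +/- figure is an assumption, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, losses: int, draws: int,
                    confidence: float = 0.95) -> tuple[float, float]:
    """Return (Elo difference, +/- margin) for a finished match.

    The margin is half the width of the Elo interval obtained by
    shifting the mean score by z standard errors in each direction.
    Assumes the shifted scores stay strictly between 0 and 1.
    """
    games = wins + losses + draws
    w, l, d = wins / games, losses / games, draws / games
    mean = w + 0.5 * d
    # Variance of a single game result (1, 0.5 or 0 points)
    variance = w * (1.0 - mean) ** 2 + l * (0.0 - mean) ** 2 + d * (0.5 - mean) ** 2
    std_error = math.sqrt(variance / games)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    low = elo_from_score(mean - z * std_error)
    high = elo_from_score(mean + z * std_error)
    return elo_from_score(mean), (high - low) / 2.0


# lc0_42056 vs 2.5*SF_dev: 19 wins, 10 losses, 71 draws
elo, margin = elo_with_margin(wins=19, losses=10, draws=71)
print(f"Elo difference: {elo:.2f} +/- {margin:.2f}")
```

For 19 - 10 - 71 this gives about +31.35 with a margin close to the reported 36.54. The same function applied to the +15 =71 -14 result from earlier in the thread gives a margin of similar size, so neither 100-game match separates the engines on its own.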

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 4:41 pm
by jp
Laskos wrote: Tue Apr 23, 2019 2:31 pm
Hugo wrote: Tue Apr 23, 2019 1:18 pm The Leela Ratio is indeed hard to estimate. I would say mine averages 1.1.
From the start position, Stockfish with 12 CPUs reaches about 18,000 kN/s within 1-2 minutes and Lc0 reaches 24,000 - 25,000 nps. So this is a ratio of 1.2.
But when I observe the games, SF is far over 20,000 kN/s after very few moves, while Lc0 is below 20,000 nps.
So 1.1 might be a realistic estimate for 5 min + 3 sec games.
I would guess your "effective" Leela Ratio at some 0.9 +/- 0.1, but you should know better. Try watching the NPS shown over the course of the games.
I have never seriously averaged over the stages of the games to get an "effective" Leela Ratio like the one in the paper. I tried to adjust the time controls of Lc0 and SF_dev to get an "effective" Leela Ratio of about 1.0 in games, and I guess I managed something in the range 1.0 +/- 0.2. But my results with these adjusted time controls (60s + 1s for Lc0 and 150s + 2.5s for SF_dev) look a bit different:
Are we sure Lc0 & A0 use the same definition of "node"?

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 6:22 pm
by Raphexon
jp wrote: Tue Apr 23, 2019 4:41 pm
Laskos wrote: Tue Apr 23, 2019 2:31 pm
Hugo wrote: Tue Apr 23, 2019 1:18 pm The Leela Ratio is indeed hard to estimate. I would say mine averages 1.1.
From the start position, Stockfish with 12 CPUs reaches about 18,000 kN/s within 1-2 minutes and Lc0 reaches 24,000 - 25,000 nps. So this is a ratio of 1.2.
But when I observe the games, SF is far over 20,000 kN/s after very few moves, while Lc0 is below 20,000 nps.
So 1.1 might be a realistic estimate for 5 min + 3 sec games.
I would guess your "effective" Leela Ratio at some 0.9 +/- 0.1, but you should know better. Try watching the NPS shown over the course of the games.
I have never seriously averaged over the stages of the games to get an "effective" Leela Ratio like the one in the paper. I tried to adjust the time controls of Lc0 and SF_dev to get an "effective" Leela Ratio of about 1.0 in games, and I guess I managed something in the range 1.0 +/- 0.2. But my results with these adjusted time controls (60s + 1s for Lc0 and 150s + 2.5s for SF_dev) look a bit different:
Are we sure Lc0 & A0 use the same definition of "node"?
The ratio is flawed to begin with because, for example, a single-core machine that manages 50 Mnps with SF is going to be stronger than a 36-core machine that manages 50 Mnps with SF.

A small Leela net is going to be weaker than a big Leela net with the same "ratio."
A 64x4 Leela net with a 1.2 ratio would be "unfair" to Leela, but a hypothetical 40x512 net with a 1.2 ratio would be "unfair" to SF.


Then again, what counts as fair or unfair anyway?

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 7:55 pm
by jp
Raphexon wrote: Tue Apr 23, 2019 6:22 pm Then again, what counts as fair or unfair anyway?
Yes, it's not about fairness but about comparison with A0, so the name itself is wrong because it has nothing to do with Leela.

And the number 1 does give the wrong idea that it is "fair", when really it just means "as unfair as A0's hardware advantage was".

Still, it's useful to be able to make a comparison with A0, and we know the NN size they used.

Re: my Lc0 spreadsheet

Posted: Tue Apr 23, 2019 8:04 pm
by Laskos
jp wrote: Tue Apr 23, 2019 4:41 pm
Laskos wrote: Tue Apr 23, 2019 2:31 pm
Hugo wrote: Tue Apr 23, 2019 1:18 pm The Leela Ratio is indeed hard to estimate. I would say mine averages 1.1.
From the start position, Stockfish with 12 CPUs reaches about 18,000 kN/s within 1-2 minutes and Lc0 reaches 24,000 - 25,000 nps. So this is a ratio of 1.2.
But when I observe the games, SF is far over 20,000 kN/s after very few moves, while Lc0 is below 20,000 nps.
So 1.1 might be a realistic estimate for 5 min + 3 sec games.
I would guess your "effective" Leela Ratio at some 0.9 +/- 0.1, but you should know better. Try watching the NPS shown over the course of the games.
I have never seriously averaged over the stages of the games to get an "effective" Leela Ratio like the one in the paper. I tried to adjust the time controls of Lc0 and SF_dev to get an "effective" Leela Ratio of about 1.0 in games, and I guess I managed something in the range 1.0 +/- 0.2. But my results with these adjusted time controls (60s + 1s for Lc0 and 150s + 2.5s for SF_dev) look a bit different:
Are we sure Lc0 & A0 use the same definition of "node"?
No, but it's quite possible. Also, I was thinking of the upcoming TCEC superfinal, where the effective Leela Ratio (as defined in the paper) is close to 1.0. According to my results, Lc0 t40 should win. According to Hugo's results, that is very doubtful. I would bet on Lc0; is betting open anywhere on the TCEC sites?
The result depends on the openings, but I don't think Jeroen will choose very inhuman, tactical openings that heavily favor SF for the superfinal. Note that SF is leading comfortably in the Premier Division, but I am betting on Lc0 for the title :).