my Lc0 spreadsheet

Discussion of computer chess matches and engine tournaments.

Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

my Lc0 spreadsheet

Post by Hugo »

Hi all

I started a spreadsheet with my direct comparison between Stockfish 100419 12cpu and Lc0 NN 40 on an RTX 2060.
Leela Ratio is 1.1.

Games are played with ponder ON and 5min + 3sec. Each match is 100 games.
Every two days a new network will be added, so it will grow fast.
https://docs.google.com/spreadsheets/d/ ... sp=sharing

regards, C.K.
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: my Lc0 spreadsheet

Post by Hugo »

So far Lc0-41997 has made the best run!
Only -14 Elo against SF100419 12cpu.

https://docs.google.com/spreadsheets/d/ ... sp=sharing

C.K.
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: my Lc0 spreadsheet

Post by Hugo »

This is the first time a match has been won by Lc0. Pretty close, by just one game.
A rematch is running.

1) Lc0 v0.21.1-42000                4 :    100 (+15,=71,-14),  50.5 %

    vs.                                :  games (  +,  =,  -),   (%) :   Diff
    Stockfish 100419 64 BMI2-x12       :    100 ( 15, 71, 14),  50.5 :     +4
C.K.
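For reference, the score percentage and Elo difference in the table above can be reproduced approximately with the standard logistic Elo formula. This is a minimal sketch, not the tool that produced the output; the function names are made up for illustration, and the rating tool may use a different model or rounding, which would explain a small deviation from the +4 shown.

```python
import math


def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Fraction of points scored: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# Lc0 v0.21.1-42000 vs. Stockfish 100419: +15 =71 -14
s = score_fraction(wins=15, draws=71, losses=14)
print(f"score = {s:.1%}, Elo difference = {elo_diff(s):+.2f}")
```

With these inputs the score is 50.5 % and the implied difference is a few Elo points in favour of Lc0, far inside the noise of a 100-game match.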
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: my Lc0 spreadsheet

Post by Laskos »

Hugo wrote: Fri Apr 19, 2019 12:09 am Hi all

I started a spreadsheet with my direct comparison between Stockfish 100419 12cpu and Lc0 NN 40 on an RTX 2060.
Leela Ratio is 1.1.

Games are played with ponder ON and 5min + 3sec. Each match is 100 games.
Every two days a new network will be added, so it will grow fast.
https://docs.google.com/spreadsheets/d/ ... sp=sharing

regards, C.K.
Why ponder ON? How many cores does your CPU have? If only 12, then "ponder ON" will harm the game-play.
The "effective" in-game Leela Ratio is not that easy to calculate and I am sure it's not 1.1 at all. The A0 team in their _paper_ (not the earlier preprint) spoke of in-game average or median performance, not some benchmarks.
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: my Lc0 spreadsheet

Post by Hugo »

Laskos wrote: Tue Apr 23, 2019 11:52 am Why ponder ON? How many cores does your CPU have? If only 12, then "ponder ON" will harm the game-play.
The "effective" in-game Leela Ratio is not that easy to calculate and I am sure it's not 1.1 at all. The A0 team in their _paper_ (not the earlier preprint) spoke of in-game average or median performance, not some benchmarks.
hi

It is easy to find out with two clicks that the E5-2697Av4 is a 16-core CPU (turbo mode 3.0 - 3.1 GHz),
so 12 CPUs for SF, 2 for Lc0 (and still 2 left to try 2 GPUs later).

In my eyes, ponder ON is a small performance plus for both participants. At least it does no harm at all ;)
Leela Ratio is indeed hard to estimate. I would say mine is 1.1 on average.
From the start position, Stockfish with 12 CPUs reaches about 18,000 kNps within 1-2 minutes and Lc0 24,000 - 25,000 nps. So this is a 1.2 ratio.
But when I observe the games, SF is far over 20,000 kNps after very few moves, while Lc0 is below 20,000 nps.
So 1.1 might be a realistic estimate for 5m + 3s games.
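The arithmetic behind these figures can be sketched as follows. The constant 875 is an assumption here: it is the commonly quoted normalisation for the Leela Ratio, taken from the A0 preprint figures of roughly 70,000 kNps for Stockfish against 80 kNps for A0, and it is not stated anywhere in this thread. The function name is likewise only illustrative.

```python
# Assumed normalisation (not given in this thread): SF nps / A0 nps
# from the A0 preprint, i.e. 70,000,000 / 80,000 = 875.
A0_FACTOR = 70_000_000 / 80_000


def leela_ratio(lc0_nps: float, sf_nps: float, factor: float = A0_FACTOR) -> float:
    """Lc0-to-Stockfish speed ratio, scaled so that 1.0 matches the assumed A0 setup."""
    return lc0_nps / sf_nps * factor


# Start position: SF about 18,000 kNps, Lc0 24,000 - 25,000 nps
print(f"start position: {leela_ratio(24_000, 18_000_000):.2f}"
      f" to {leela_ratio(25_000, 18_000_000):.2f}")

# In games: SF above 20,000 kNps, Lc0 below 20,000 nps, so this is an upper bound
print(f"in-game upper bound: {leela_ratio(20_000, 20_000_000):.3f}")
```

Under that assumed constant the start-position figures give roughly 1.2, while the in-game observations point to a value below about 0.9, so any single number for a whole game depends on how the stages of the game are averaged.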

have a nice day

C.K.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: my Lc0 spreadsheet

Post by Laskos »

Thanks, looks good.
I would guess your "effective" Leela Ratio at some 0.9 +/- 0.1, but you should know better. Try to peek over NPS shown across the games.
I never seriously performed an average over the stages of the games for an "effective" Leela Ratio like that in the paper. I tried to adjust my time controls of Lc0 and SF_dev to have an "effective" Leela Ratio of about 1.0 in games, and I guess I managed something in the range 1.0 +/- 0.2. But my results with adjusted time controls (60s + 1s for Lc0 and 150s + 2.5s for SF_dev) to have a Leela Ratio of about 1.0 look a bit different:

Score of lc0_42056 vs 2.5*SF_dev: 19 - 10 - 71 [0.545] 100
Elo difference: 31.35 +/- 36.54
Finished match
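The Elo difference and error margin in this output can be approximated from the win/loss/draw counts alone. The sketch below assumes the result line lists wins - losses - draws, a logistic Elo model, a per-game trinomial variance and a 95 % normal-approximation interval; whether the match tool uses exactly this procedure is an assumption, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, losses: int, draws: int, confidence: float = 0.95):
    """Return (Elo difference, half-width of the confidence interval in Elo)."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which is 1, 0.5 or 0
    var = (wins * (1.0 - mu) ** 2
           + draws * (0.5 - mu) ** 2
           + losses * (0.0 - mu) ** 2) / n
    stdev = math.sqrt(var / n)  # standard error of the mean score
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    low, high = mu - z * stdev, mu + z * stdev
    return elo_diff(mu), (elo_diff(high) - elo_diff(low)) / 2.0


# lc0_42056 vs 2.5*SF_dev: 19 - 10 - 71
elo, margin = elo_with_margin(wins=19, losses=10, draws=71)
print(f"Elo difference: {elo:.2f} +/- {margin:.2f}")
```

Under these assumptions the result should land close to the reported values. The margin is larger than the difference itself, so a 100-game match of this kind cannot separate the two engines with confidence.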

I used the 4-mover opening PGN of Stephan Pohl.
Leela seems to beat SF_dev in my experiment, but maybe I have an effective Leela Ratio of 1.2 and you of 0.9, and that would explain the discrepancy.
Nasty testing with this Leela :).
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: my Lc0 spreadsheet

Post by jp »

Laskos wrote: Tue Apr 23, 2019 2:31 pm
I would guess your "effective" Leela Ratio at some 0.9 +/- 0.1, but you should know better. Try to peek over NPS shown across the games.
I never seriously performed an average over the stages of the games for an "effective" Leela Ratio like that in the paper. I tried to adjust my time controls of Lc0 and SF_dev to have an "effective" Leela Ratio of about 1.0 in games, and I guess I managed something in the range 1.0 +/- 0.2. But my results with adjusted time controls (60s + 1s for Lc0 and 150s + 2.5s for SF_dev) to have a Leela Ratio of about 1.0 look a bit different:
Are we sure Lc0 & A0 use the same definition of "node"?
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: my Lc0 spreadsheet

Post by Raphexon »

jp wrote: Tue Apr 23, 2019 4:41 pm
Are we sure Lc0 & A0 use the same definition of "node"?
The ratio is flawed to begin with because (for example) a single-core machine that manages 50 Mnps with SF is going to be stronger than a 36-core machine that manages 50 Mnps with SF.

A small Leela net is going to be weaker than a big Leela net with the same "ratio."
A 64x4 Leela net with a 1.2 ratio would be "unfair" to Leela, but a hypothetical 40x512 net with a 1.2 ratio would be "unfair" to SF.


Then again, what counts as fair or unfair anyway?
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: my Lc0 spreadsheet

Post by jp »

Raphexon wrote: Tue Apr 23, 2019 6:22 pm Then again, what counts as fair or unfair anyway?
Yes, it's not about fairness but about comparison with A0, so the name itself is wrong because it has nothing to do with Leela.

And the number 1 does give the wrong idea that it is "fair", when really it just means "equally unfair as A0's hardware advantage was".

Still, it's useful to be able to make a comparison with A0, and we know the NN size they used.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: my Lc0 spreadsheet

Post by Laskos »

jp wrote: Tue Apr 23, 2019 4:41 pm
Are we sure Lc0 & A0 use the same definition of "node"?
No, but it's quite possible. Also, I was thinking of the upcoming TCEC superfinal, where the effective Leela Ratio (as defined in the paper) is close to 1.0. According to my results, Lc0 t40 should win. According to Hugo's results, it is very doubtful. I would bet on Lc0; is betting open anywhere on the TCEC sites?
The result depends on the openings, but I don't think Jeroen will pick very inhuman, tactical openings that heavily favor SF for the superfinal. Observe that SF is leading comfortably in the Premier Division, but I am betting on Lc0 for the title :).