LC0-Leela0 Rating Controversy..Whats The Strength?

supersharp77 · Post by **supersharp77** » Mon Jun 24, 2019 3:09 am

I was looking around for the latest LeelaFish/LC0 for some final testing and stumbled by accident on some new "revelations": people around the globe are having problems coming up with a rating for the "engine/program" LC0.
Behold the latest entry:

https://github.com/LeelaChessZero/lc0/wiki/FAQ

What is the current strength of Lc0 ?

The Elo chart seems inflated.

The chart is not calibrated to CCRL or any other common list. It sets 'the first net' to Elo 0, so it is not comparable, even between different training runs.
The different points are calculated from self-play matches. Self-play tends to exaggerate gains in Elo compared to gains when playing other chess engines.

Where can I find Lc0's current Elo?

Many people are keeping their own rating lists, here are some examples:

Aggregated list
Elo Summary collects from many sources and graphs them all on one page.
LCZ vs Stockfish (!zz on Discord)
LCZ CCRL Estimate (!ccrl on Discord)
L.e.e.l.a LcZero ELO Estimate List Approximation by Cscuile (!sheet2 on Discord)
CCLS Rating for LCZ from these gauntlet results. The games can be watched here
LCZ Basic Checkmates
LCZ vs SF Time Handicap (!sheet on Discord)
Reinfeld's Win at Chess
Lc0 test30 Elo estimates (!sheet3 on Discord)
lc0 Elo Ratings -- ID Progression (!sheet4 on Discord)

https://docs.google.com/spreadsheets/d/ ... edit#gid=0

LC0 FIDE rating 2739

LC0 FIDE Rating 2675

LC0 FIDE rating 2671

LC0 FIDE Rating 2656

ALPHA ZERO Paper Estimate ELO 4000+(Minimum)

https://docs.google.com/spreadsheets/d/ ... edit#gid=0

It's been quite a while since late 2017, when the Google DeepMind team swore that Alpha Zero defeated Stockfish with only 24 hours of self-learning game training... In the hours, days and months since, the entire globe has been trying without success to recreate these fantastic results. Although much progress has been made and more is ahead, it is time to "tell it like it is": those claims (2017) were PURE FICTION! Where are the advocates of Alpha Zero now? Where are they hiding? Should we put an asterisk beside those fantastic original results vs Stockfish?

Robert Pope · Post by **Robert Pope** » Mon Jun 24, 2019 4:23 am

supersharp77 wrote: ↑Mon Jun 24, 2019 3:09 am Happened to be surfing looking for the latest LeelaFish/LC0 for some final testing and stumbled by accident to some new "Revelations" People around the Globe are having problems coming up with a rating for the "Engine/Program" LC0

I think you mixed up your thread title. It should be more like:
LC0-Leela0 Rating Strength..What Controversy?

Laskos · Post by **Laskos** » Mon Jun 24, 2019 9:53 am

supersharp77 wrote: ↑Mon Jun 24, 2019 3:09 am Happened to be surfing looking for the latest LeelaFish/LC0 for some final testing and stumbled by accident to some new "Revelations" People around the Globe are having problems coming up with a rating for the "Engine/Program" LC0

What are these statements?
Lc0 T40 on an RTX 2070 GPU is stronger head to head than SF_dev on 8 i7 or Ryzen cores, from not too outrageous openings. What FIDE rating that means is for you to guess. I think it's impossible to _measure_ the FIDE rating of Lc0 T40 on an RTX 2070 in a pool of human FIDE players, because a draw would occur far too rarely. Aside from that, Lc0 in a pool of regular engines doesn't obey the Elo rating scheme (say CCRL Elo), as I posted here many months ago. So, besides the hardware and openings issues, the whole concept of an "Elo rating number" for Lc0 is theoretically superfluous.
The Alpha Zero preprint and paper were great scientific work, which was confirmed by Lc0, and it doesn't need the advocacy of some imbeciles.

supersharp77 · Post by **supersharp77** » Mon Jun 24, 2019 8:40 pm

Laskos wrote: ↑Mon Jun 24, 2019 9:53 am


Some imbeciles? Like who.. you? I've been testing LC0 for months and the engine actually rarely wins any games at all. It looks to be not a chess engine but only a NN list of played games with a basic search... My research shows a program that is erratic, disjointed and incapable of beating basic chess engines on CPU. It struggles with strategy, fumbles during the middlegame and has little to no endgame knowledge. The program seems not to understand basic chess strategy and performs like a random mover when out of its normal channels... And that's with me giving it the benefit of the doubt by adjudicating numerous games where LC0 should win because it is up material, but it struggles to finish games. It misses simple continuations to win games. 3300 ELO, 3500 ELO... 4000 ELO? Not in the games I'm viewing... A semi-random mover with a HUGE NN opening book that needs huge speeds to be effective at all... Lost to Bikjump 1.8 x 64 yesterday and King 3.32.

Robert Pope · Post by **Robert Pope** » Mon Jun 24, 2019 11:03 pm

supersharp77 wrote: ↑Mon Jun 24, 2019 8:40 pm
Laskos wrote: ↑Mon Jun 24, 2019 9:53 am


I don't think anyone disputes that Lc0 plays much worse on a CPU than on a GPU. That's hardly a revelation. And it has nothing to do with AlphaZero, which never played on just a CPU, as far as I am aware.

supersharp77 · Post by **supersharp77** » Tue Jun 25, 2019 4:19 am

Robert Pope wrote: ↑Mon Jun 24, 2019 11:03 pm
supersharp77 wrote: ↑Mon Jun 24, 2019 8:40 pm
Laskos wrote: ↑Mon Jun 24, 2019 9:53 am


So, back to my original premise: was the Alpha Zero claim of 4000+ Elo with 24 hours of self-play games a proper claim? And what is the actual strength of Alpha Zero and/or LC0? Thx AR

mwyoung · Post by **mwyoung** » Tue Jun 25, 2019 9:52 am

supersharp77 wrote: ↑Tue Jun 25, 2019 4:19 am
Robert Pope wrote: ↑Mon Jun 24, 2019 11:03 pm
supersharp77 wrote: ↑Mon Jun 24, 2019 8:40 pm
Laskos wrote: ↑Mon Jun 24, 2019 9:53 am


Whether Alpha Zero's 24-hour NN was really 4000+ Elo is not knowable by us, unless there is some game data I have not seen.

And the actual strength of Lc0 and Alpha Zero depends on which NN and which hardware are being used.

I believe the Alpha Zero that played Stockfish had 9 hours of training. Depending on which match and which games you accept for rating, AZ was between 44 and 78 Elo better than Stockfish 8 by performance. The hardware was Stockfish 8 on 44 CPU cores with Syzygy endgame tablebases and a 32GB hash, vs Alpha Zero's four TPUs.

Let's assume the following numbers are more or less accurate for this speculation.

Stockfish 8 64-bit 3302 CCRL 40/40 rating. Equivalent to 40 moves in 40 minutes on Athlon 64 X2 4600+ (2.4 GHz)
Stockfish 10 64-bit 3389 CCRL 40/40 rating. Equivalent to 40 moves in 40 minutes on Athlon 64 X2 4600+ (2.4 GHz)

What is the playing strength of Stockfish 8 on a 44-core system vs a 1-core Athlon 64 X2 4600+ (2.4 GHz)?
Let's assume a score of 95%, i.e. 9 wins and 1 draw every 10 games. Under the usual logistic Elo formula that is a gap of about 512 Elo, which would give Stockfish 8 on the 44-core system a rating of 3814 Elo. And at best AZ could be 78 Elo better than Stockfish 8 on the hardware used. So AZ's 9-hour NN could be as strong as 3892 Elo.
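
The arithmetic above can be sketched in Python under the standard logistic Elo model. This is only an illustration: the function and constant names are made up for the example, and the 3302 base rating, the 95% score and the 78 Elo edge are the assumptions of this post, not measured results.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo gap implied by an expected score (0 < score < 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def expected_score(diff: float) -> float:
    """Inverse of the above: expected score for a given Elo gap."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


# Figures taken from the post; all of them are assumptions for the speculation.
SF8_CCRL_1_CORE = 3302   # CCRL 40/40 rating quoted for Stockfish 8
ASSUMED_SCORE = 0.95     # 9 wins and 1 draw per 10 games (44 cores vs 1 core)
AZ_BEST_EDGE = 78        # upper end of the 44-78 Elo performance range

sf8_44_cores = SF8_CCRL_1_CORE + elo_diff_from_score(ASSUMED_SCORE)
az_9h_upper = sf8_44_cores + AZ_BEST_EDGE

print(round(sf8_44_cores), round(az_9h_upper))  # 3814 3892
```

Note how sensitive the result is to the assumed score: near 95% every extra percentage point is worth many more Elo than it is near 50%, so the 3814 figure moves a lot if the 95% guess is off.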

Now, could AZ's 24-hour trained NN be 4000+ Elo? Sure, it is possible and probable.

Stockfish 10 is 87 Elo stronger than Stockfish 8 by CCRL, so it is also possible that Stockfish 10 is stronger than AZ. And it is also probable that Lc0 is now stronger than AZ, since Lc0 is stronger by performance than Stockfish 10.

Speculation is not real games, but this is the best data we have.

corres · Post by **corres** » Tue Jun 25, 2019 10:48 am

Laskos wrote: ↑Mon Jun 24, 2019 9:53 am ...

Talking about "FIDE ratings" in connection with chess engines was an established thing in the early 1990s. At that time chess engines were common participants in FIDE competitions. After the GMs protested against the participation of engines, FIDE banished them from its competitions.
There are a lot of sites with Elo(?) ratings of chess engines, but there are as many test methods as there are testers.
Because of this we can meaningfully talk only about CCRL ratings, SSDF ratings, CEGT ratings, IPON ratings, etc.

Note
Really, Lc0 is a confirmation of the AlphaZero team's result, even if it is not a scientific work.
But as an engineering work it is outstanding.
