Something goes wrong with lc0 since yesterday?

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Something goes wrong with lc0 since yesterday?

Post by Laskos » Wed Jun 13, 2018 6:06 am

The latest cuDNN engine is here:
https://crem.xyz/lc0/

They now build their own networks specifically for this engine, and they build them fast, as the engine is much faster than the main LCZero branch (an order of magnitude faster on an Nvidia GPU; mine is a GTX 1060 6GB). The networks are here:
http://testserver.lczero.org/networks

For reasons of their own, they have already started from scratch (random MCTS player) 3 times. The first 2 times they achieved pretty remarkable results in a very short amount of time, with not that many people contributing to training and building the networks. I came back from holidays yesterday and tested ID75 from the second build-up from scratch (rated 2823 in their Elo), which was already a 128x10 network. It performed at about the 3100 CCRL 40/4' Elo level against Arasan 20.5 and Houdini 1.5a. That is not far from the best performance of the main branch ID373 with the cuDNN engine, about 3200+ CCRL 40/4' Elo. It was progressing fast, and the 128x10 network was far from saturating. I expected that in a day or two they would have the strongest Leela Chess.

Then they started from scratch again, for whatever reasons. It progressed extremely fast in their Elo, which I expected to be consistent with their earlier Elo calculations. By ID213, still a 64x6 net, they had their rating at 2934, 100+ points above ID75. So, waking up this morning and seeing such amazing progress with the smallest of networks, I expected to see some sort of 3200 CCRL 40/4' engine built in practically a day of self-learning, if their ratings were consistent. Not so. It performs at a meager 2200-2300 CCRL Elo level against regular AB engines.

Something is going wrong with their ratings, if not with their training and networks, in this 3rd build from scratch. Does somebody know what happened? Albert was actively participating in discussions there, so maybe he or some other people can say something?

Scally
Posts: 95
Joined: Thu Sep 28, 2017 7:34 pm
Location: Bermondsey, London
Full name: Alan Cooper

Re: Something goes wrong with lc0 since yesterday?

Post by Scally » Wed Jun 13, 2018 7:28 am

Hi,

Can I ask how the Elo is calculated in the main branch?

On the test server (http://testserver.lczero.org/networks) the Elo numbers look like they equate to real Elo.

However, on the main server (http://lczero.org/networks) the Elo numbers are too high to be real Elo.

Is there a formula to work out their real Elo?

Apologies if this has been covered before, but there’s a lot of posts re LCZero.

Thanks,

Al
Alan Cooper
My Chess Computers

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos » Wed Jun 13, 2018 7:42 am

Scally wrote:
Wed Jun 13, 2018 7:28 am
Hi,

Can I ask how the Elo is calculated in the main branch?

On the test server (http://testserver.lczero.org/networks) the Elo numbers look like they equate to real Elo.

However, on the main server (http://lczero.org/networks) the Elo numbers are too high to be real Elo.

Is there a formula to work out their real Elo?

Apologies if this has been covered before, but there’s a lot of posts re LCZero.

Thanks,

Al
I have little idea. They play self-games from the standard opening position and derive the progress from them. I can understand the rating inflation on the main branch: self-games generally inflate the differences, and they played (still play?) with a very narrow opening repertoire, given how little randomness LCZero has (they introduce noise in the rating matches, but the variety is still low). So rating inflation, and even some perverse effects like rating inversions, can happen. I don't know why the lc0 branch showed no rating inflation. Well, now the third build from scratch does exhibit inflation, but something odd happened compared to the first two builds from scratch.

crem
Posts: 116
Joined: Wed May 23, 2018 7:29 pm

Re: Something goes wrong with lc0 since yesterday?

Post by crem » Wed Jun 13, 2018 7:42 am

There is indeed something wrong and we are investigating what exactly.

We've fixed some small bugs after previous runs and just restarted training to confirm everything is good.
But it turned out not to be that good.

Scally
Posts: 95
Joined: Thu Sep 28, 2017 7:34 pm
Location: Bermondsey, London
Full name: Alan Cooper

Re: Something goes wrong with lc0 since yesterday?

Post by Scally » Wed Jun 13, 2018 9:13 am

Ok,

I had a search around the net and found this; it doesn’t explain the formula, but it does give estimated grades for each network ID.

Al.
Alan Cooper
My Chess Computers

Milos
Posts: 3387
Joined: Wed Nov 25, 2009 12:47 am

Re: Something goes wrong with lc0 since yesterday?

Post by Milos » Wed Jun 13, 2018 12:43 pm

crem wrote:
Wed Jun 13, 2018 7:42 am
There is indeed something wrong and we are investigating what exactly.

We've fixed some small bugs after previous runs and just restarted training to confirm everything is good.
But it turned out not to be that good.
By we, you mean you, like Alex, right? :D

yanquis1972
Posts: 1762
Joined: Tue Jun 02, 2009 10:14 pm

Re: Something goes wrong with lc0 since yesterday?

Post by yanquis1972 » Wed Jun 13, 2018 1:54 pm

has inflated elo been looked at? i don't see anything wrong with the progression. do very high (+300, +500) deltas correspond to higher margins of error? the second test had its elo adjusted at some point but i didn't see any indication of that being done with the third.
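On the margin-of-error question, a rough sketch may help. It assumes the usual logistic Elo model and a simple binomial error on the match score; neither assumption is taken from the lc0 server code, and the function names and the 100-game match size are made up for illustration. Ignoring draws overstates the error somewhat, and the plain normal approximation is crude near 0% or 100%.

```python
import math

def elo_diff(score: float) -> float:
    """Elo gap implied by a match score in (0, 1) under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(score: float, games: int, z: float = 1.96):
    """Rough Elo interval: map score +/- z * stderr through the logistic curve.

    Uses a binomial standard error (assumption: no draws), clamped so the
    bounds stay inside (0, 1).
    """
    se = math.sqrt(score * (1.0 - score) / games)
    lo = max(score - z * se, 1e-6)
    hi = min(score + z * se, 1.0 - 1e-6)
    return elo_diff(lo), elo_diff(hi)

# Hypothetical 100-game matches: a close one and a lopsided one (~ +500 Elo).
close_lo, close_hi = elo_interval(0.55, 100)
wide_lo, wide_hi = elo_interval(0.95, 100)
print(f"55% score: {close_lo:+.0f} .. {close_hi:+.0f} Elo")
print(f"95% score: {wide_lo:+.0f} .. {wide_hi:+.0f} Elo")
```

Under these assumptions the interval for the lopsided match comes out several times wider, because the logistic curve is nearly flat at extreme scores, so a small change in score moves the Elo estimate a lot.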

crem
Posts: 116
Joined: Wed May 23, 2018 7:29 pm

Re: Something goes wrong with lc0 since yesterday?

Post by crem » Wed Jun 13, 2018 4:18 pm

Milos wrote:
Wed Jun 13, 2018 12:43 pm
crem wrote:
Wed Jun 13, 2018 7:42 am
There is indeed something wrong and we are investigating what exactly.

We've fixed some small bugs after previous runs and just restarted training to confirm everything is good.
But it turned out not to be that good.
By we, you mean you, like Alex, right? :D
No, they weren't really bugs, more like infrastructural things to tune (test/training data separation, how data moves from one server to another, training multiple network sizes in parallel, etc.), and I personally did very little of that.
It was mostly Tilps, Error323 and nousian (those are nicks on Discord) who did that.

Sven
Posts: 3822
Joined: Thu May 15, 2008 7:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Something goes wrong with lc0 since yesterday?

Post by Sven » Wed Jun 13, 2018 9:38 pm

Laskos wrote:
Wed Jun 13, 2018 7:42 am
I have little idea. They play self-games from the standard opening position and derive the progress from them. I can understand the rating inflation on the main branch: self-games generally inflate the differences, and they played (still play?) with a very narrow opening repertoire, given how little randomness LCZero has (they introduce noise in the rating matches, but the variety is still low). So rating inflation, and even some perverse effects like rating inversions, can happen. I don't know why the lc0 branch showed no rating inflation. Well, now the third build from scratch does exhibit inflation, but something odd happened compared to the first two builds from scratch.
According to the server script code on GitHub (see the calcElo() call in function getProgress()), the Elo rating of network ID X is calculated directly from the rating of X-1 by adding the rating difference derived from the match result of X vs X-1. So all match results are simply "chained". Do you think this is a valid method? If I had to do it, I would always calculate the ratings from scratch based on all existing match results together.
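The chaining described above can be sketched roughly as follows. This is not the server's code: the function names, the base rating and the match scores are invented for illustration, and the standard logistic Elo formula is an assumption about what calcElo() computes.

```python
import math

def elo_diff(score: float) -> float:
    """Elo gap implied by a match score in (0, 1) under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def chained_ratings(base_rating: float, match_scores: list[float]) -> list[float]:
    """Rate each network as the previous network's rating plus the Elo gap
    from its match against that previous network only."""
    ratings = [base_rating]
    for score in match_scores:
        ratings.append(ratings[-1] + elo_diff(score))
    return ratings

# Hypothetical scores of network X against network X-1.
scores = [0.60, 0.55, 0.64]
ratings = chained_ratings(0.0, scores)
for net_id, rating in enumerate(ratings):
    print(f"ID{net_id}: {rating:7.1f}")
```

Each rating depends only on the one before it, so no network is ever compared directly with anything older than its immediate predecessor.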
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos » Wed Jun 13, 2018 9:58 pm

Sven wrote:
Wed Jun 13, 2018 9:38 pm
According to the server script code on GitHub (see the calcElo() call in function getProgress()), the Elo rating of network ID X is calculated directly from the rating of X-1 by adding the rating difference derived from the match result of X vs X-1. So all match results are simply "chained". Do you think this is a valid method? If I had to do it, I would always calculate the ratings from scratch based on all existing match results together.
No, it's not a valid method, but it can work as a rough indicator of progress. It was shown to have flaws not only as an absolute scale but also as a measure of progress itself; even long-term inversions can happen, especially considering the poor opening repertoire here.
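One of the weaknesses of a chained scale can be put in numbers with a small sketch. It assumes each match contributes an independent random error to the chain; the per-match error of 15 Elo and the chain length of 200 are invented figures, and systematic effects such as self-play inflation are not modelled at all.

```python
import math

def chained_std_error(per_match_errors: list[float]) -> float:
    """Standard error of a sum of independent per-match Elo errors:
    the variances add, so the errors combine in quadrature."""
    return math.sqrt(sum(e * e for e in per_match_errors))

# Hypothetical chain: 200 consecutive matches, each uncertain by 15 Elo.
total = chained_std_error([15.0] * 200)
print(f"uncertainty of the last rating vs the first: +/- {total:.0f} Elo")
```

Even with purely random errors, the absolute rating at the end of a long chain is much less certain than any single step, which is one reason a joint fit over all match results is usually preferred.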
