Leela data publicly available for use

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Dann Corbit, Harvey Williamson

Madeleine Birchfield
Posts: 407
Joined: Tue Sep 29, 2020 2:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Leela data publicly available for use

Post by Madeleine Birchfield » Tue Jun 15, 2021 8:04 pm

Recently the Leela development team put out the following blog post:

https://lczero.org/blog/2021/06/the-imp ... open-data/
2021-06-14

The importance of open data

In the Leela Chess project, we generate a huge amount of data. We use them to generate the network files to use with Lc0 for further data generation, but also with other chess engines, like Ceres. The same data are often used by individual project contributors to generate additional network files using the “supervised learning” approach.

Our intention has always been for “our” data to be open and available to everyone to use. To that end, we adopted an open license to allow their wide use:
This collection of training data for Leela Chess Zero is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/
Therefore we are very pleased that Stockfish, starting from today, is using a NNUE network file trained on the same data.

Both projects have mentioned before that our “teams will join forces to demonstrate our commitment to open source chess engines and training tools, and open data.” This is the first concrete result stemming from this effort, and we promise it won’t be the last.
This might be of great use for training networks and tuning evaluations.
Currently taking a 3-month break from talkchess

noobpwnftw
Posts: 501
Joined: Sun Nov 08, 2015 10:10 pm

Re: Leela data publicly available for use

Post by noobpwnftw » Tue Jun 15, 2021 10:27 pm

It always has been. Efforts were made to keep the data available for everyone; the only problem is that a few people don't seem willing to pay any tribute, or even to care enough to spell people's last names right.

Andrew
Posts: 183
Joined: Wed Mar 08, 2006 11:51 pm
Location: Australia

Re: Leela data publicly available for use

Post by Andrew » Wed Jun 16, 2021 8:00 am

There have already been two versions on Abrok using this, which is nice to see!

Andrew

Ozymandias
Posts: 1337
Joined: Sun Oct 25, 2009 12:30 am

Re: Leela data publicly available for use

Post by Ozymandias » Wed Jun 16, 2021 6:13 pm

So they already did the equivalent of what AS did for FF2? It already looks stronger, and it can only get better. Time for an SF14 that will retake ALL the first spots in the rating lists.

dkappe
Posts: 1025
Joined: Tue Aug 21, 2018 5:52 pm
Full name: Dietrich Kappe

Re: Leela data publicly available for use

Post by dkappe » Thu Jun 24, 2021 5:17 pm

So I started training Night Nurse from data generated by Bad Gyal 8 (then 9) as an exercise to see what kind of NNUE net an MCTS/NN engine would spawn. This data was generated over UCI from a set of random openings and also a very large set of human openings evaluated within ±200 cp.

After generating a large amount of this data, I thought about converting all the “free” Bad Gyal self-play training data I had sitting about. A few lines of Python later and I had maybe 250m positions I could add to my existing 300m positions. Instant Elo boost, right?

Nope. They added maybe 10 Elo. A net trained on just the training data was maybe 80 Elo weaker than an NNUE net trained on the non-training data.
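To put those Elo figures in perspective, here is a minimal sketch using the standard logistic Elo formula to turn a rating difference into an expected score. The function name `expected_score` is just an illustrative choice, and the formula ignores draw rates and error bars.

```python
def expected_score(elo_diff):
    """Expected score (win = 1, draw = 0.5, loss = 0) for the stronger side,
    using the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# The differences mentioned in this post.
for diff in (10, 20, 80):
    print(f"{diff:>3} Elo -> expected score {expected_score(diff):.3f}")
```

Under this model a 10 Elo edge is barely above a 50% score, while 80 Elo corresponds to scoring a bit over 61%.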

Now I had found a sweet spot of lambda = 0.7 for the Bad Gyal data. Moving to lambda = 1.0 reduced the difference to 20 Elo but didn’t wipe it out, and the resulting nets were weaker than the 0.7 nets. My hypothesis is that the use of temperature makes the data perform worse when used with lambda < 1.0.
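For readers unfamiliar with the term: in NNUE trainers, lambda usually denotes the weight that interpolates between the search score and the game outcome in the training target. The sketch below assumes that convention. The function names and the logistic scale of 400 are hypothetical, chosen only for illustration, and are not taken from any particular trainer.

```python
import math


def cp_to_expected_score(cp, scale=400.0):
    """Map a centipawn score to an expected score in [0, 1] with a logistic
    curve. The scale constant is an assumption made for this illustration."""
    return 1.0 / (1.0 + math.exp(-cp / scale))


def blended_target(search_cp, game_result, lam, scale=400.0):
    """Training target as a mix of search score and game outcome.

    lam = 1.0 uses only the search score; lam = 0.0 uses only the game
    result (1.0 win, 0.5 draw, 0.0 loss from the side to move's view).
    """
    return lam * cp_to_expected_score(search_cp, scale) + (1.0 - lam) * game_result


# A position scored +100 cp by the search, taken from a game that was lost.
for lam in (1.0, 0.7):
    print(f"lambda = {lam}: target = {blended_target(100, 0.0, lam):.3f}")
```

With lambda below 1.0 the game result pulls the target away from the search score. If temperature-driven move choices make self-play outcomes noisier, that noise would leak into the target, which is one way to read the hypothesis above.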

There’s quite a bit to critique about my experiment — data that doesn’t exactly match the source nets, etc. — but the difference was big enough that I stopped using training data as a source.

Wilson
Posts: 80
Joined: Tue Oct 29, 2019 2:20 am
Full name: Anthony Wilson

Re: Leela data publicly available for use

Post by Wilson » Sun Jun 27, 2021 8:24 pm

And what about the "NNUE bandwagon"?

https://lczero.org/blog/2021/04/jumping ... bandwagon/

Madeleine Birchfield
Posts: 407
Joined: Tue Sep 29, 2020 2:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Leela data publicly available for use

Post by Madeleine Birchfield » Sun Jun 27, 2021 10:07 pm

Wilson wrote:
Sun Jun 27, 2021 8:24 pm
And what about the "NNUE bandwagon"?

https://lczero.org/blog/2021/04/jumping ... bandwagon/
That particular blog post about jumping on the NNUE bandwagon was written on April Fools' Day.
