Leela data publicly available for use

Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Leela data publicly available for use

Post by Madeleine Birchfield »

Recently the Leela development team put out the following blog post:

https://lczero.org/blog/2021/06/the-imp ... open-data/
2021-06-14

The importance of open data

In the Leela Chess project, we generate a huge amount of data. We use them to generate the network files to use with Lc0 for further data generation, but also with other chess engines, like Ceres. The same data are often used by individual project contributors to generate additional network files using the “supervised learning” approach.

Our intention has always been for “our” data to be open and available to everyone to use. To that end, we adopted an open license to allow their wide use:
This collection of training data for Leela Chess Zero is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/
Therefore we are very pleased that Stockfish, starting from today, is using a NNUE network file trained on the same data.

Both projects have mentioned before that our “teams will join forces to demonstrate our commitment to open source chess engines and training tools, and open data.” This is the first concrete result stemming from this effort, and we promise it won’t be the last.
This might be of great use for training networks and tuning evaluations.
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: Leela data publicly available for use

Post by noobpwnftw »

It always has been. Efforts were made to keep this data available to everyone; the only problem is that a few people don't seem willing to pay any tribute, or even care enough to spell people's last names right.
Andrew
Posts: 231
Joined: Thu Mar 09, 2006 12:51 am
Location: Australia

Re: Leela data publicly available for use

Post by Andrew »

There have already been two versions on Abrok using this, which is nice to see!

Andrew
Ozymandias
Posts: 1532
Joined: Sun Oct 25, 2009 2:30 am

Re: Leela data publicly available for use

Post by Ozymandias »

So they already did the equivalent of what AS did for FF2? It already looks stronger, and it can only get better. Time for an SF14 that will retake ALL the first spots in the rating lists.
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Leela data publicly available for use

Post by dkappe »

So I started training Night Nurse from data generated by Bad Gyal 8 (then 9) as an exercise to see what kind of NNUE net an MCTS/NN engine would spawn. This data was generated over UCI from a set of random openings and also a very large set of human openings evaluated within ±200 cp.

After generating a large amount of this data, I thought about converting all the “free” Bad Gyal self-play training data I had sitting about. A few lines of Python later and I had maybe 250m positions I could add to my existing 300m positions. Instant Elo boost, right?

Nope. They added maybe 10 Elo. A net trained on just the training data was maybe 80 Elo weaker than an NNUE net trained on the non-training data.

Now I had found a sweet spot of lambda = 0.7 for the Bad Gyal data. Moving to lambda = 1.0 reduced the difference to 20 Elo, but didn’t wipe it out, and the resulting nets were weaker than the 0.7 nets. My hypothesis is that the use of temperature makes the data perform worse when used with lambda < 1.0.
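For readers unfamiliar with lambda, here is a minimal sketch of what it usually controls in NNUE training. It assumes the common convention that lambda = 1.0 trains purely on the search evaluation and lambda = 0.0 purely on the game result. The function names and the logistic scale of 400 are hypothetical choices for illustration and are not taken from the experiment described in this post.

```python
import math


def cp_to_expected_score(cp: float, scale: float = 400.0) -> float:
    """Map a centipawn score to an expected score in [0, 1].

    The logistic form and the scale of 400 are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(-cp / scale))


def blended_target(search_cp: float, game_result: float, lam: float) -> float:
    """Blend search evaluation and game outcome into one training target.

    search_cp   -- evaluation of the position from the data generator, in cp
    game_result -- final result from the side to move: 1.0, 0.5 or 0.0
    lam         -- assumed convention: 1.0 = eval only, 0.0 = result only
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    return lam * cp_to_expected_score(search_cp) + (1.0 - lam) * game_result


if __name__ == "__main__":
    # Same position (+100 cp) from a game that was eventually lost.
    for lam in (1.0, 0.7, 0.0):
        print(f"lambda={lam}: target={blended_target(100.0, 0.0, lam):.3f}")
```

Under this convention, any lambda below 1.0 lets the game result pull on the target. One possible reading of the hypothesis above is that temperature-driven move choices make those results noisier, so weighting them more heavily hurts.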

There’s quite a bit to critique about my experiment — data that doesn’t exactly match the source nets, etc. — but the difference was big enough that I stopped using training data as a source.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
Wilson
Posts: 81
Joined: Tue Oct 29, 2019 3:20 am
Full name: Anthony Wilson

Re: Leela data publicly available for use

Post by Wilson »

And what about the "NNUE bandwagon"?

https://lczero.org/blog/2021/04/jumping ... bandwagon/
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Leela data publicly available for use

Post by Madeleine Birchfield »

Wilson wrote: Sun Jun 27, 2021 10:24 pm
And what about the "NNUE bandwagon"?

https://lczero.org/blog/2021/04/jumping ... bandwagon/
That particular blog post about jumping on the NNUE bandwagon was written on April Fools' Day.