3 million games for training neural networks

AlvaroBegue
Posts: 912
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

3 million games for training neural networks

Post by AlvaroBegue » Sat Feb 24, 2018 3:08 am

Hi,

I had my 8-core Ryzen 7 computer spend about 2 months generating quick Stockfish-vs-Stockfish games. The result is 3 million games that I think could be used for training neural networks, or to tune evaluation functions.

I have the games available in PGN format with comments that include the score and search depth, or in a much more compact and easy to parse format: One game per line, consisting of moves in UCI notation followed by the result at the end.

d2d4 g8f6 g1f3 g7g6 b1c3 f8g7 e2e4 d7d6 c1e3 c7c6 d1d2 b8d7 a2a4 e8g8 e3h6 e7e5 e1c1 d8a5 d2g5 g7h6 g5h6 d7b6 d1d3 c8e6 f3g5 b6d7 f1e2 b7b5 d4e5 d7e5 d3g3 b5b4 c3b1 b4b3 c2b3 c6c5 b1d2 c5c4 g5e6 f7e6 b3c4 a8b8 h6h3 b8b2 h3e6 g8g7 d2b3 b2b3 g3b3 f6e4 e6d5 e4c5 b3b5 a5c3 c1b1 f8f2 d5d6 f2e2 d6e7 g7h6 e7h4 h6g7 h4e7 g7h6 0-1
I used openings from swcr-fq-openings-v3.5.pgn, whose lines have a fixed length of 8 moves for each player.
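
For anyone who wants to read the one-game-per-line format described above, here is a minimal Python sketch. The function name `parse_game_line` is hypothetical, and the set of result tokens is an assumption: only `0-1` appears in the sample, so `1-0` and `1/2-1/2` are guesses based on the usual PGN result strings.

```python
from typing import List, Tuple

# Assumed result tokens; only "0-1" is confirmed by the sample line above.
RESULTS = {"1-0", "0-1", "1/2-1/2"}

def parse_game_line(line: str) -> Tuple[List[str], str]:
    """Split one line of the compact format into (moves, result)."""
    tokens = line.split()
    if not tokens or tokens[-1] not in RESULTS:
        raise ValueError(f"line does not end in a known result: {line!r}")
    moves, result = tokens[:-1], tokens[-1]
    for move in moves:
        # UCI moves are 4 characters, or 5 with a promotion piece (e.g. e7e8q).
        if len(move) not in (4, 5):
            raise ValueError(f"unexpected move token: {move!r}")
    return moves, result

# Usage on a shortened line in the same shape as the sample game.
moves, result = parse_game_line("d2d4 g8f6 g1f3 g7g6 0-1")
print(len(moves), result)
```

The check on token length is only a sanity filter; it does not verify that the moves are legal.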

At least one person in this forum has expressed an interest in the data. I will make it available soon, but I wanted to ask this: Would you be interested in the PGN version of the games, or is the simple format enough?

Ferdy
Posts: 3645
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: 3 million games for training neural networks

Post by Ferdy » Sat Feb 24, 2018 4:33 am

What is the time control? Do you apply game adjudication? I would like the full PGN file with move comments. I am interested in extracting positions with material imbalance and positions with mate in N (N of 10 or less), then writing them out as EPD with a ce value taken from the move comments in the game itself, for evaluation tuning.

Ozymandias
Posts: 1001
Joined: Sun Oct 25, 2009 12:30 am

Re: 3 million games for training neural networks

Post by Ozymandias » Sat Feb 24, 2018 7:27 am

Full PGN is preferable. Thx in advance.

smatovic
Posts: 548
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: 3 million games for training neural networks

Post by smatovic » Sat Feb 24, 2018 9:45 am

Would you be interested in the PGN version of the games, or is the simple format enough?
PGN preferred, thanks.

--
Srdja

abulmo2
Posts: 144
Joined: Fri Dec 16, 2016 10:04 am

Re: 3 million games for training neural networks

Post by abulmo2 » Sat Feb 24, 2018 10:02 am

AlvaroBegue wrote:I used openings from swcr-fq-openings-v3.5.pgn, whose lines have a fixed length of 8 moves for each player.
Are the 5120 opening lines varied enough for 3 million games?
To tune Amoeba's evaluation function, I play the first moves randomly, to be sure to have enough variety. During an alphabeta search, the leaves of the tree can reach some strange positions that need to be evaluated correctly.
Richard Delorme

AlvaroBegue
Posts: 912
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: 3 million games for training neural networks

Post by AlvaroBegue » Sat Feb 24, 2018 10:15 am

Ferdy wrote:What is the time control? Do you apply game adjudication? I would like the full PGN file with move comments. I am interested in extracting positions with material imbalance and positions with mate in N (N of 10 or less), then writing them out as EPD with a ce value taken from the move comments in the game itself, for evaluation tuning.
Time control is 1 s + 0.1 s/move. No adjudication, so you'll get plenty of mate-in-n positions. My notes are below.

What do you mean by "ce"?

----------

Engine: Stockfish 101217 64 BMI2

CPU: AMD Ryzen 7 1800X Eight-Core Processor

This is the command used to generate games:

nohup time unbuffer ~/Downloads/cutechess-master/projects/cli/cutechess-cli -recover -engine cmd=stockfish proto=uci option.SyzygyPath=/home/alvaro/syzygy option.Hash=256 tc=inf/1+.1 -engine cmd=stockfish proto=uci option.SyzygyPath=/home/alvaro/syzygy option.Hash=256 tc=inf/1+.1 -games 1000000 -concurrency 8 -openings file=~/ruy/swcr-fq-openings-v3.5.pgn format=pgn order=random -pgnout db.pgn &
It takes 19.8 days to complete one of these runs.
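
As a rough cross-check of those numbers, here is a small Python calculation. It uses only the figures from the command and the run time above; treating the 3 million games as three such runs done back to back is an assumption.

```python
# Figures taken from the post: -games 1000000, -concurrency 8, 19.8 days per run.
games_per_run = 1_000_000
days_per_run = 19.8
concurrency = 8

seconds_per_run = days_per_run * 24 * 3600

# Overall throughput of the machine, all 8 concurrent games combined.
games_per_second = games_per_run / seconds_per_run

# Average wall-clock duration of a single game in one of the 8 slots.
seconds_per_game = concurrency * seconds_per_run / games_per_run

# Assumption: the 3 million games are three such runs done one after another.
total_days = 3 * days_per_run

print(f"{games_per_second:.2f} games/s overall")
print(f"{seconds_per_game:.1f} s per game per slot")
print(f"{total_days:.1f} days for three runs")
```

That works out to about 0.58 games per second, a little under 14 seconds per game, and 59.4 days in total, which matches the "about 2 months" mentioned in the first post.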

AlvaroBegue
Posts: 912
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: 3 million games for training neural networks

Post by AlvaroBegue » Sat Feb 24, 2018 10:25 am

Here's the link: https://drive.google.com/drive/folders/ ... itamyJD5_k

Enjoy!


Ferdy
Posts: 3645
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: 3 million games for training neural networks

Post by Ferdy » Sat Feb 24, 2018 11:41 am

AlvaroBegue wrote:What do you mean by "ce"?
It is an opcode in the EPD standard; it stands for centipawn evaluation.

https://chessprogramming.wikispaces.com ... escription

Example.

rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq - bm Nf3; ce 15;
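
A minimal Python sketch of how such a line could be produced from a move comment. It assumes the comments have the shape `+0.15/12 0.10s` (score in pawns, a slash, the depth, then the time); I have not checked that against the actual PGN files, and the helper names `comment_to_ce` and `make_epd` are hypothetical.

```python
import re
from typing import Optional

# Assumed comment shape: "+0.15/12 0.10s" (score in pawns / depth, then time).
COMMENT_RE = re.compile(r"^([+-]?\d+(?:\.\d+)?)/(\d+)")

def comment_to_ce(comment: str) -> Optional[int]:
    """Return the score in centipawns, or None if the comment does not match."""
    match = COMMENT_RE.match(comment.strip())
    if match is None:
        return None  # e.g. book moves or mate scores, which this sketch skips
    return round(float(match.group(1)) * 100)

def make_epd(fen: str, ce: int) -> str:
    """Build an EPD line with a ce opcode from a FEN string."""
    # EPD keeps only the first four FEN fields (no move counters).
    fields = fen.split()[:4]
    return " ".join(fields) + f" ce {ce};"

fen = "rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq - 0 2"
ce = comment_to_ce("+0.15/12 0.10s")
if ce is not None:
    print(make_epd(fen, ce))
```

Which position a comment should be paired with, and whether the sign needs flipping so that ce is from the side to move's point of view, depends on the convention used in the PGN, so that would have to be verified before using the values for tuning.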

Ferdy
Posts: 3645
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: 3 million games for training neural networks

Post by Ferdy » Sat Feb 24, 2018 11:45 am

AlvaroBegue wrote:Here's the link: https://drive.google.com/drive/folders/ ... itamyJD5_k

Enjoy!
Thanks.

Joost Buijs
Posts: 783
Joined: Thu Jul 16, 2009 8:47 am
Location: Almere, The Netherlands

Re: 3 million games for training neural networks

Post by Joost Buijs » Sat Feb 24, 2018 2:45 pm

AlvaroBegue wrote:Here's the link: https://drive.google.com/drive/folders/ ... itamyJD5_k

Enjoy!
Thanks for making these games available!

At the moment I'm not really into NNs, but I'm curious to see whether these games make a difference compared with the self-play games I use to tune the evaluation function.
