Re: 3 million games for training neural networks

Posted: Sun Feb 25, 2018 12:33 am
by zenpawn
Likewise, I plan to use them for tuning. Thank you, Alvaro!

To everyone downloading these, don't forget to prune the time forfeits if you're using the result. Either that, or go through them by hand and decide what the results should have been. That might not be too daunting as there are only 8 of them in db_3 (haven't checked the count in the others yet).
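
For anyone working from the PGN files, the pruning suggested above could look roughly like the sketch below. It assumes each game starts with an [Event ...] tag and that forfeited games carry a [Termination "time forfeit"] header (the value defined by the PGN standard); whether these particular files are tagged that way is an assumption to check first. The function names are made up for illustration.

```python
def split_pgn_games(pgn_text):
    """Split PGN text into games, assuming each game starts with an [Event ...] tag."""
    games, current = [], []
    for line in pgn_text.splitlines(keepends=True):
        if line.startswith("[Event ") and current:
            games.append("".join(current))
            current = []
        current.append(line)
    if "".join(current).strip():
        games.append("".join(current))
    return games


def is_time_forfeit(game_text):
    """True if the game has a Termination header mentioning a time forfeit (assumed tagging)."""
    for line in game_text.splitlines():
        if line.startswith("[Termination ") and "time forfeit" in line.lower():
            return True
    return False


def prune_time_forfeits(pgn_text):
    """Return (remaining PGN text, number of games dropped)."""
    games = split_pgn_games(pgn_text)
    kept = [g for g in games if not is_time_forfeit(g)]
    return "".join(kept), len(games) - len(kept)
```

The dropped count gives a quick sanity check against the numbers reported in this thread before the pruned file replaces the original.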

Re: 3 million games for training neural networks

Posted: Sun Feb 25, 2018 3:46 am
by AlvaroBegue
zenpawn wrote: To everyone downloading these, don't forget to prune the time forfeits if you're using the result. Either that, or go through them by hand and decide what the results should have been. That might not be too daunting as there are only 8 of them in db_3 (haven't checked the count in the others yet).
That's a very good point. There are 2 on db_2 and none in db_1. I guess the computer had some issues in the third run (swapping?).

They should probably just be removed. Or you can ignore the problem since the pollution is minimal.

Re: 3 million games for training neural networks

Posted: Sun Feb 25, 2018 4:10 am
by AlvaroBegue
AlvaroBegue wrote: That's a very good point. There are 2 on db_2 and none in db_1. I guess the computer had some issues in the third run (swapping?).
Sorry, only 1 on db_2. I don't know what I did earlier.

Re: 3 million games for training neural networks

Posted: Fri Oct 12, 2018 12:04 am
by Daniel Shawul
AlvaroBegue wrote: Sat Feb 24, 2018 4:08 am Hi,

I had my 8-core Ryzen 7 computer spend about 2 months generating quick Stockfish-vs-Stockfish games. The result is 3 million games that I think could be used for training neural networks, or to tune evaluation functions.

I have the games available in PGN format with comments that include the score and search depth, or in a much more compact and easy to parse format: One game per line, consisting of moves in UCI notation followed by the result at the end.

d2d4 g8f6 g1f3 g7g6 b1c3 f8g7 e2e4 d7d6 c1e3 c7c6 d1d2 b8d7 a2a4 e8g8 e3h6 e7e5 e1c1 d8a5 d2g5 g7h6 g5h6 d7b6 d1d3 c8e6 f3g5 b6d7 f1e2 b7b5 d4e5 d7e5 d3g3 b5b4 c3b1 b4b3 c2b3 c6c5 b1d2 c5c4 g5e6 f7e6 b3c4 a8b8 h6h3 b8b2 h3e6 g8g7 d2b3 b2b3 g3b3 f6e4 e6d5 e4c5 b3b5 a5c3 c1b1 f8f2 d5d6 f2e2 d6e7 g7h6 e7h4 h6g7 h4e7 g7h6 0-1
I used openings from swcr-fq-openings-v3.5.pgn, whose lines have a fixed length of 8 moves for each player.

At least one person in this forum has expressed an interest in the data. I will make it available soon, but I wanted to ask this: Would you be interested in the PGN version of the games, or is the simple format enough?
Alvaro, did you have success training neural networks with these games?
I have had no success using CCRL-CEGT 40/40 games for training. The accuracy is about 50%, and the resulting network is unable
to beat a network trained from quiet.epd+your_epd_set (2M positions in total). I have even trained with ~1 billion positions
extracted from various PGN databases, using the game outcome, but I still get a weaker network than the one trained on 2M positions.
I guess what I am seeing is that quality is more important than quantity.
Maybe your games are of better quality since they are all Stockfish games (though at a fast time control), so I will try.

Daniel
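
As a rough illustration of the compact format quoted above (one game per line, UCI moves followed by the result), a minimal parser could look like the following. It is only a sketch: the function names are hypothetical, the draw token "1/2-1/2" is assumed since the sample line only shows "0-1", and the per-ply labels assume White makes the first move on each line. Turning the moves into actual positions would additionally need a move generator, which is left out here.

```python
# Result token -> score from White's point of view.
# "1/2-1/2" is an assumed spelling for draws; the sample line only shows "0-1".
RESULT_TO_SCORE = {"1-0": 1.0, "0-1": 0.0, "1/2-1/2": 0.5}


def parse_game_line(line):
    """Split one game line into (list of UCI moves, White's score)."""
    tokens = line.split()
    if not tokens or tokens[-1] not in RESULT_TO_SCORE:
        raise ValueError("line does not end in a recognized result token")
    return tokens[:-1], RESULT_TO_SCORE[tokens[-1]]


def per_ply_targets(moves, white_score):
    """Yield (ply, move, target), with target seen from the side making the move.

    Assumes White makes the first move on the line.
    """
    for ply, move in enumerate(moves):
        white_to_move = ply % 2 == 0
        yield ply, move, white_score if white_to_move else 1.0 - white_score
```

Labeling every ply with the final result is the "game outcome" style of training discussed in this post; it treats early and late positions alike, which may be part of why it is a noisy signal.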

Re: 3 million games for training neural networks

Posted: Fri Oct 12, 2018 1:53 am
by AlvaroBegue
Daniel Shawul wrote: Fri Oct 12, 2018 12:04 am
Alvaro, did you have success training neural networks with these games?
I have had no success using CCRL-CEGT 40/40 games for training. The accuracy is about 50%, and the resulting network is unable
to beat a network trained from quiet.epd+your_epd_set (2M positions in total). I have even trained with ~1 billion positions
extracted from various PGN databases, using the game outcome, but I still get a weaker network than the one trained on 2M positions.
I guess what I am seeing is that quality is more important than quantity.
Maybe your games are of better quality since they are all Stockfish games (though at a fast time control), so I will try.

Daniel
Daniel,

No, life caught up with me and I haven't done much NN training or chess development in general in many months. If you end up using these games for training, I would love to know how it went.

Álvaro.

Re: 3 million games for training neural networks

Posted: Fri Oct 12, 2018 2:28 am
by Dann Corbit
Daniel Shawul wrote: Fri Oct 12, 2018 12:04 am
Alvaro, did you have success training neural networks with these games?
I have had no success using CCRL-CEGT 40/40 games for training. The accuracy is about 50%, and the resulting network is unable
to beat a network trained from quiet.epd+your_epd_set (2M positions in total). I have even trained with ~1 billion positions
extracted from various PGN databases, using the game outcome, but I still get a weaker network than the one trained on 2M positions.
I guess what I am seeing is that quality is more important than quantity.
Maybe your games are of better quality since they are all Stockfish games (though at a fast time control), so I will try.

Daniel
Perhaps collections like high level TCEC games would be useful for training. They have scores attached to each position.
That should give very high quality answers for the actual value of a position.

Re: 3 million games for training neural networks

Posted: Fri Oct 12, 2018 5:41 pm
by Daniel Shawul
Dann Corbit wrote: Fri Oct 12, 2018 2:28 am
Perhaps collections like high level TCEC games would be useful for training. They have scores attached to each position.
That should give very high quality answers for the actual value of a position.
Yes, I think that is what I need. Training from PGN games using the result tag just doesn't seem to work for me for some reason.
I just finished training with Alvaro's 3 million games; while the resulting network plays good chess, it still loses to the network trained from 2M positions.

Daniel
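
To make the difference between the two kinds of labels concrete: when each position carries an engine score, that score can be mapped to an expected result and used as the training target, either alone or mixed with the game outcome. The sketch below shows one common way to do this with a logistic curve. The scale of 400 centipawns, the clamp, and the 50/50 mixing weight are illustrative assumptions, not values anyone in this thread reported using.

```python
def cp_to_expected_score(cp, scale=400.0, clamp=10000.0):
    """Map a centipawn score to an expected result in (0, 1) with a logistic curve.

    `scale` and `clamp` are hypothetical defaults; the scale is normally fitted to data.
    """
    cp = max(-clamp, min(clamp, cp))  # keep huge mate-like scores from overflowing
    return 1.0 / (1.0 + 10.0 ** (-cp / scale))


def blended_target(cp, game_score, weight=0.5, scale=400.0):
    """Mix the score-based target with the game result (1, 0.5, or 0).

    weight=1.0 uses only the engine score, weight=0.0 only the game outcome.
    """
    return weight * cp_to_expected_score(cp, scale) + (1.0 - weight) * game_score
```

Both arguments must be given from the same side's point of view, for example the side to move.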

Re: 3 million games for training neural networks

Posted: Fri Oct 12, 2018 9:16 pm
by AlvaroBegue
If you do use games where White and Black are different engines (TCEC or CCRL), it would be interesting to inform the network of the Elo difference between the players (assuming Elo estimates are available). That input can then be used to implement something like contempt factor in a more principled way.
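
A minimal sketch of what such an extra input might look like, assuming ratings for both engines are known: the rating difference is taken from the side to move's perspective, clipped, and scaled to [-1, 1] before being appended to the position's feature vector. The clip value of 800 and the function names are hypothetical. The standard Elo expected-score formula is included only to show what the difference implies.

```python
def elo_expected_score(elo_diff):
    """Standard Elo expected score for a player rated `elo_diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_feature(white_elo, black_elo, white_to_move, clip=800.0):
    """Rating difference from the side to move's view, clipped and scaled to [-1, 1].

    `clip` is a hypothetical choice to keep the input in a bounded range.
    """
    diff = white_elo - black_elo
    if not white_to_move:
        diff = -diff
    diff = max(-clip, min(clip, diff))
    return diff / clip


def with_elo_input(features, white_elo, black_elo, white_to_move):
    """Return a new feature list with the Elo input appended as the last element."""
    return list(features) + [elo_feature(white_elo, black_elo, white_to_move)]
```

At play time the same input could be set by hand, which is where the contempt-like behaviour would come from: telling the network it is the stronger side should make it value drawish positions less.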

Re: 3 million games for training neural networks

Posted: Fri Oct 12, 2018 9:38 pm
by Ratosh
Are those 2M positions available?

Re: 3 million games for training neural networks

Posted: Fri Oct 12, 2018 11:19 pm
by Daniel Shawul
Ratosh wrote: Fri Oct 12, 2018 9:38 pm Are those 2M positions available?
Those are 770k positions from quiet_labeled.epd (zurichess) and 1.x million positions from Alvaro that everybody here has.
I don't know why I am not able to beat those networks with PGN-trained networks so far...
I just tried Cheng's 12 million positions mentioned in this thread, with the same outcome. But John's big3.epd + lichess.epd did help improve my network.

Daniel