Training a Transformer on UCI lines

Discussion of chess software programming and technical issues.

Moderator: Ras

Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Training a Transformer on UCI lines

Post by Fulvio »

I am thinking of trying, as an experiment, to train a transformer on UCI lines.
Probably starting with nanoGPT (https://github.com/karpathy/nanoGPT) and then trying to fine-tune 7B models like LLaMA. Microsoft researchers claim that starting from a model pre-trained on text still produces better results, even in specialized fields:
I am undecided whether to feed it plain UCI lines (e2e4 e7e5 g1f3 etc.), assigning one token to each possible move (about 4100 tokens), or to also include the type of the moving piece (Pe2e4 Pe7e5 Ng1f3 etc.), which gives about 25000 tokens.
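To make the two vocabulary sizes concrete, here is a minimal sketch of how such a move-level vocabulary could be enumerated. It assumes one token per ordered pair of distinct squares and ignores promotion suffixes (e7e8q and similar), so the counts come out slightly below the rough figures above. The names SQUARES, plain_vocab, piece_vocab and encode are made up for illustration and are not part of nanoGPT or any chess library.

```python
from itertools import product

FILES = "abcdefgh"
RANKS = "12345678"
SQUARES = [f + r for f in FILES for r in RANKS]  # 64 squares, a1 .. h8
PIECES = "PNBRQK"  # assumed piece letters for the prefixed variant


def plain_vocab():
    """Every ordered pair of distinct squares, e.g. 'e2e4'.

    Assumption: promotion suffixes are left out, and geometrically
    impossible moves are not filtered, so this is only a rough bound.
    """
    return [a + b for a, b in product(SQUARES, repeat=2) if a != b]


def piece_vocab():
    """The same moves prefixed with the moving piece, e.g. 'Ng1f3'."""
    return [p + m for p in PIECES for m in plain_vocab()]


def encode(line, index):
    """Map a whitespace-separated UCI line to a list of token ids."""
    return [index[tok] for tok in line.split()]


plain = plain_vocab()
pieced = piece_vocab()
plain_index = {tok: i for i, tok in enumerate(plain)}
ids = encode("e2e4 e7e5 g1f3", plain_index)

print(len(plain), len(pieced))  # 4032 24192
```

Filtering out moves that no piece can ever make (for example a1b4) would shrink both vocabularies considerably, at the cost of a slightly more involved generator.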
I expect it to learn how the pieces move, but I have no idea what level it can reach.
The most time-consuming part is preparing the dataset. I am thinking of starting with 100k games that should be error-free (so recent games between high-level engines, with a decent time per move) and highly varied in openings and moves.
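As a rough sketch of how the "very varied" requirement could be checked, assuming the games are already stored as one UCI line per game: drop exact duplicates, then count how many games share the same opening prefix. The helper names dedupe and opening_diversity, the sample games and the choice of prefix length are all hypothetical.

```python
from collections import Counter


def dedupe(games):
    """Drop exact duplicate move lines, keeping first-seen order."""
    # Whitespace is normalized so that spacing differences do not matter.
    return list(dict.fromkeys(" ".join(g.split()) for g in games))


def opening_diversity(games, plies=8):
    """Count how many games share each opening prefix of `plies` half-moves."""
    return Counter(" ".join(g.split()[:plies]) for g in games)


# Tiny made-up sample, just to show the shape of the data.
games = [
    "e2e4 e7e5 g1f3 b8c6",
    "e2e4 e7e5 g1f3 g8f6",
    "d2d4 d7d5 c2c4 e7e6",
    "e2e4  e7e5 g1f3 b8c6",  # duplicate of the first line (extra space)
]

unique = dedupe(games)
counts = opening_diversity(unique, plies=2)
for prefix, n in counts.most_common():
    print(n, prefix)
```

A heavily skewed count for a handful of prefixes would suggest that the opening book used to generate the games is too narrow.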
This leads me to two requests:
- has anyone already tried something similar?
- does anyone already have a dataset with those characteristics?
smatovic
Posts: 3225
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Training a Transformer on UCI lines

Post by smatovic »

Fulvio wrote: Wed Apr 12, 2023 6:53 pm - has anyone already tried something similar?
I have not tried it myself, but others have queried current large language models; obviously there was some PGN data in their training sets:

https://talkchess.com/forum3/viewtopic. ... 1&p=935124

Somebody tried the "descriptive approach", but I doubt that GPT-3 can currently hold an internal board representation and apply the rules of chess to it:

https://talkchess.com/forum3/viewtopic. ... 4&p=928550
Fulvio wrote: Wed Apr 12, 2023 6:53 pm - does anyone already have a dataset with those characteristics?
Maybe try endgame tablebases (EGTB) as a shortcut, both for training and for evaluating the result?

--
Srdja
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: Training a Transformer on UCI lines

Post by Fulvio »

smatovic wrote: Wed Apr 12, 2023 7:01 pm I have not tried it myself, but others have queried current large language models; obviously there was some PGN data in their training sets:
Thanks, I should also mention that I got the idea after viewing this hilarious video: