I am thinking of trying, as an experiment, to train a transformer on UCI lines.
Probably starting with nanoGPT (https://github.com/karpathy/nanoGPT) and then trying to fine-tune 7B models like LLaMA (Microsoft researchers claim that starting from a model pre-trained on text still produces better results even in specialized fields).
I am undecided whether to feed it plain UCI lines (e2e4 e7e5 g1f3 etc.), assigning a token to each possible move (about 4100 tokens), or to also include the type of the moving piece (Pe2e4 Pe7e5 Ng1f3 etc.), which gives about 25000 tokens.
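Roughly, the two vocabularies could be enumerated like this (a sketch assuming python-chess is installed; build_vocab is just a name I made up). The counts come out close to the 4100 and 25000 figures above:
[code]
import chess

def build_vocab(with_piece_prefix: bool) -> dict[str, int]:
    """Assign an integer id to every from/to square pair, plus the
    pawn-promotion variants, optionally prefixed with a piece letter."""
    tokens = set()
    for frm in chess.SQUARES:
        for to in chess.SQUARES:
            if frm == to:
                continue
            uci = chess.square_name(frm) + chess.square_name(to)
            if with_piece_prefix:
                tokens.update(p + uci for p in "PNBRQK")
            else:
                tokens.add(uci)
            # promotions: a one-rank pawn push or capture onto the last rank
            if abs(chess.square_file(frm) - chess.square_file(to)) <= 1 \
                    and (chess.square_rank(frm), chess.square_rank(to)) in {(6, 7), (1, 0)}:
                for promo in "qrbn":
                    tok = uci + promo
                    tokens.add("P" + tok if with_piece_prefix else tok)
    return {tok: i for i, tok in enumerate(sorted(tokens))}

print(len(build_vocab(False)), len(build_vocab(True)))  # 4208 24368
[/code]
Either way the embedding table stays small; the piece-prefixed variant mainly trades a larger vocabulary for making the piece identity explicit to the model.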
I expect it to learn how the pieces move, but I have no idea what level it can reach.
What takes the most time is preparing the dataset. I thought of starting with 100k games that should be error-free (so recent games between strong engines, at a decent time per move) and very varied in terms of openings and moves.
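As a sketch of that preprocessing step (python-chess again; games.pgn and train.txt are placeholder file names), turning a PGN dump into one space-separated UCI line per game:
[code]
import chess.pgn

def pgn_to_uci_lines(pgn_path, max_games=100_000):
    """Yield each game's mainline as one space-separated UCI string."""
    with open(pgn_path) as f:
        for _ in range(max_games):
            game = chess.pgn.read_game(f)
            if game is None:
                break
            yield " ".join(move.uci() for move in game.mainline_moves())

with open("train.txt", "w") as out:
    for line in pgn_to_uci_lines("games.pgn"):
        out.write(line + "\n")
[/code]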
This leads me to two requests:
- has anyone already tried something similar?
- does anyone already have a dataset with those characteristics?
Re: Training a Transformer on UCI lines
I did not try it, but others have queried current large language models; obviously there was some PGN data in their training sets:
https://talkchess.com/forum3/viewtopic. ... 1&p=935124
Somebody tried the "descriptive approach", but I doubt that GPT-3 can currently hold an internal board representation and apply the rules of chess to it:
https://talkchess.com/forum3/viewtopic. ... 4&p=928550
Maybe try EGTBs as a shortcut, both for training and for evaluating the result?
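For example (a minimal sketch with python-chess; ./syzygy is a placeholder path to downloaded Syzygy files), probing WDL/DTZ for a position gives an exact label to train against or to score the model's moves with:
[code]
import chess
import chess.syzygy

board = chess.Board("8/8/8/8/8/4k3/4P3/4K3 w - - 0 1")  # bare KP vs K
with chess.syzygy.open_tablebase("./syzygy") as tb:
    wdl = tb.probe_wdl(board)  # 2 win, 0 draw, -2 loss, from the side to move
    dtz = tb.probe_dtz(board)  # distance to a zeroing move (50-move rule)
    print(wdl, dtz)
[/code]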
--
Srdja