I'm doing some experiments with training a model to predict the game outcome given a position. So, basically an evaluation function.
A database of existing games is the training dataset, mapping a game position representation to an outcome (-1, 0, 1), and I am using a neural network with backpropagation... the standard textbook stuff.
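For concreteness, here is a toy version of that textbook setup: a tiny numpy MLP with a tanh output so predictions land in (-1, 1), trained by backprop against the {-1, 0, 1} outcome with squared-error loss. All sizes, names, and the random stand-in data are illustrative, not from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: positions as 768-dim binary vectors (12 piece types
# x 64 squares, one-hot style), outcomes in {-1, 0, 1}. Random here,
# purely to make the sketch runnable.
N, D, H = 256, 768, 32
X = rng.integers(0, 2, size=(N, D)).astype(np.float64)
y = rng.choice([-1.0, 0.0, 1.0], size=N)

W1 = rng.normal(0.0, 0.05, (D, H))
W2 = rng.normal(0.0, 0.05, (H, 1))

def forward(X):
    h = np.tanh(X @ W1)                 # hidden layer
    return h, np.tanh(h @ W2).ravel()   # output squashed into (-1, 1)

def mse(p, y):
    return float(np.mean((p - y) ** 2))

_, pred = forward(X)
loss_before = mse(pred, y)

lr = 0.05
for _ in range(300):
    h, pred = forward(X)
    # Backprop for 0.5 * mean squared error through the tanh output.
    d_out = ((pred - y) * (1.0 - pred ** 2))[:, None] / N
    gW2 = h.T @ d_out
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)
    gW1 = X.T @ d_h
    W2 -= lr * gW2
    W1 -= lr * gW1

_, pred1 = forward(X)
loss_after = mse(pred1, y)
```

With random labels the network can only fit noise, but the training loss still goes down, which is all the sketch is meant to show.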
Now the question: if I have a sequence of game positions from a game, should they be weighted differently or not?
Some of the options I can think of:
1. Same weights
2. Weights inversely proportional to game length
3. Weights inversely proportional to number of moves left in the game
4. Weights determined by another trained model that predicts how soon the game is going to end. (Haven't tried this one yet.)
Right now, #3 seems most promising in my experiments (in the sense that I can get my network training to converge, though it is nowhere near useful as a chess engine evaluation function at this point).
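Schemes 1-3 are easy to state as a per-position weight function. A sketch (the function name and the 0-based ply convention are mine):

```python
def position_weight(ply, game_len, scheme):
    """Training weight for one position of a game.

    ply      -- 0-based index of the position within the game
    game_len -- total number of positions in the game
    scheme   -- 1: same weights
                2: inversely proportional to game length
                3: inversely proportional to moves left in the game
    """
    if scheme == 1:
        return 1.0
    if scheme == 2:
        return 1.0 / game_len
    if scheme == 3:
        moves_left = game_len - ply  # last position has 1 move left
        return 1.0 / moves_left
    raise ValueError(f"unknown scheme {scheme}")
```

Note that under #3 the final position of every game gets weight 1.0 regardless of game length, while early positions of long games are heavily discounted.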
I suspect the choice of weight distribution boils down to what we think of as the nature of winning in chess (or other 2-player games).
Do we expect the winning player to win by gradually accumulating an advantage with every move?
Or do we expect the winning player to snowball by capitalizing on an opponent's blunder?
Model Training: how to weight game outcomes
Moderator: Ras
Posts: 11 | Joined: Tue Sep 07, 2021 6:17 pm | Full name: Alex S
Posts: 119 | Joined: Sat Jul 30, 2022 12:12 pm | Full name: Jamie Whiting
Re: Model Training: how to weight game outcomes
osvitashev wrote: ↑Thu Apr 03, 2025 2:30 am
> Now the question: if I have a sequence of game positions from a game, should they be weighted differently or not?
> Some of the options I can think of:
> 1. Same weights
> 2. Weights inversely proportional to game length
> 3. Weights inversely proportional to number of moves left in the game
> 4. Weights determined by another trained model that predicts how soon the game is going to end. (Haven't tried this one yet.)
> Right now, #3 seems most promising in my experiments (in the sense that I can get my network training to converge, though it is nowhere near useful as a chess engine evaluation function at this point).

For #3, why would you want to downweight positions nearer the start of the game? That is when the eval matters more (search can resolve endgames just fine even with a not-so-great eval). #3 converges best simply because you are downweighting the positions that are more complicated, and thus more difficult to predict.

Don't weight positions differently at all at first. Start as simple as possible and progress in small steps that are each properly tested: put the new network and the current best network in an engine and play actual games to see which is stronger. Fixed-node games are very fast to run and a good enough test for a while.
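Whichever option is chosen, it only enters training as a per-sample weight on the loss, so uniform weights are just the special case that reduces to plain MSE. A minimal numpy sketch of that idea (function name is mine):

```python
import numpy as np

def weighted_mse(pred, target, weights):
    # Weighted mean squared error: each position's squared error is
    # scaled by its weight before averaging. This is the only place
    # the weighting scheme touches the training objective.
    w = np.asarray(weights, dtype=np.float64)
    se = (np.asarray(pred, dtype=np.float64)
          - np.asarray(target, dtype=np.float64)) ** 2
    return float(np.sum(w * se) / np.sum(w))
```

With all weights equal this is exactly `np.mean((pred - target) ** 2)`, so swapping schemes in and out does not change anything else in the training loop.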
Posts: 463 | Joined: Mon Jun 07, 2010 3:13 am | Location: Holland, MI | Full name: Martin W
Re: Model Training: how to weight game outcomes
osvitashev wrote: ↑Thu Apr 03, 2025 2:30 am
> I'm doing some experiments with training a model to predict the game outcome given a position. So, basically an evaluation function.
> A database of existing games is the training dataset, mapping a game position representation to an outcome (-1, 0, 1), and I am using a neural network with backpropagation... the standard textbook stuff.
> Now the question: if I have a sequence of game positions from a game, should they be weighted differently or not?
> Some of the options I can think of:
> 1. Same weights
> 2. Weights inversely proportional to game length
> 3. Weights inversely proportional to number of moves left in the game
> 4. Weights determined by another trained model that predicts how soon the game is going to end. (Haven't tried this one yet.)
> Right now, #3 seems most promising in my experiments (in the sense that I can get my network training to converge, though it is nowhere near useful as a chess engine evaluation function at this point).
> I suspect the choice of weight distribution boils down to what we think of as the nature of winning in chess (or other 2-player games).
> Do we expect the winning player to win by gradually accumulating an advantage with every move?
> Or do we expect the winning player to snowball by capitalizing on an opponent's blunder?

A very simple and useful metric might be game phase, which is proportional to the material on the board, disregarding material and positional imbalances. It scales from undecided (lots of material) to decided (little material). The scale and initiative (aka complexity) terms in the old Stockfish HCE do this more precisely.
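The game-phase idea can be sketched with the phase counting used in classical tapered evaluation. The particular piece weights (knight/bishop 1, rook 2, queen 4) are the common convention, assumed here rather than taken from the post:

```python
# Phase contribution per piece type (pawns and kings count as zero),
# in the spirit of classical tapered-eval phase counting.
PHASE = {'n': 1, 'b': 1, 'r': 2, 'q': 4}

# Full phase: both sides with 2 knights, 2 bishops, 2 rooks, 1 queen.
MAX_PHASE = 2 * (2 * PHASE['n'] + 2 * PHASE['b'] + 2 * PHASE['r'] + PHASE['q'])

def game_phase(piece_counts):
    """Return phase in [0, 1]: 1.0 = full material on the board
    (undecided/opening), 0.0 = bare kings and pawns (decided/endgame).
    piece_counts maps 'n', 'b', 'r', 'q' to total counts for both sides."""
    phase = sum(PHASE[p] * piece_counts.get(p, 0) for p in PHASE)
    return min(phase, MAX_PHASE) / MAX_PHASE
```

Used as a training weight, this downweights late, mostly decided positions without needing to know how many moves the game actually had left.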