Lately, I've been trying to find better positions for tuning with the Texel method. You can read about the Texel tuning method here: https://chessprogramming.wikispaces.com ... ing+Method
Until now, to extract test positions I did the following: play ~200,000 games at 40/1+0.03; extract all quiet positions from the games; out of the resulting ~10,000,000 positions, randomly pick 1,000,000 for tuning.
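For reference, the error that the Texel method minimizes over those positions can be sketched like this (a minimal Python sketch, not my engine's actual code; the scaling constant k and the evaluate callback are placeholders you would fit to your own engine):

```python
def texel_error(positions, results, evaluate, k=1.13):
    """Mean squared error between the sigmoid of each position's
    evaluation (in centipawns, white's point of view) and the game
    result r in {0, 0.5, 1}. k is the usual scaling constant; 1.13
    here is only a placeholder value."""
    total = 0.0
    for pos, r in zip(positions, results):
        score = evaluate(pos)  # static eval or qsearch score, in cp
        p = 1.0 / (1.0 + 10.0 ** (-k * score / 400.0))
        total += (r - p) ** 2
    return total / len(positions)
```

The tuner then adjusts the evaluation parameters to reduce this error over the whole position set.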
I tried several ideas, but none of them seem to work:
1. Pick 10 positions from each game. This makes the set less biased toward long drawn games. Idea from Pedone.
2. Disable the fifty-move draw rule. This gives very low values to the queen and rook, so it clearly doesn't work.
3. Pick the position at the end of the PV (the one that gives the actual score). Idea from TD-Leaf / KnightCap.
4. Proportional score: instead of assigning a fixed 0 / 0.5 / 1, the score is proportional to the game progress: 0.5 * (1-t) + r * t (where r is the game result and t is ply / total plies). Idea from Pedone.
5. Extract tactical positions, too. My engine does quiescence search when tuning.
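For idea 4, the blending is just a linear interpolation from 0.5 at the start of the game toward the actual result at the end; a small sketch (the function name is mine):

```python
def proportional_result(r, ply, total_plies):
    """Blend the game result r (0, 0.5 or 1) toward 0.5 for early
    positions: 0.5*(1 - t) + r*t, with t = ply / total_plies, so an
    opening position contributes almost nothing to the result signal."""
    t = ply / total_plies
    return 0.5 * (1.0 - t) + r * t
```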
I didn't try doing the other step from TD-Leaf: minimize the error between consecutive positions. It's not clear to me whether TD-Leaf can find better values than the Texel tuning method.
Despite my efforts over the last month, I've never managed to tune anything; the best I could do is +2 Elo.
Are there any other ideas I could try to pick better positions for tuning?
test positions for texel tuning
Re: test positions for texel tuning
brtzsnr wrote: I didn't try doing the other step from TD-Leaf: minimize the error between consecutive positions. It's not clear to me whether TD-Leaf can find better values than the Texel tuning method.

Well, it's not going to be clear if you don't try it. TD-Leaf helped me gain a few hundred Elo, so it clearly does work very well, at least in some setups.
brtzsnr wrote: Are there any other ideas I could try to pick better positions for tuning?

I start from actual game positions, then apply a random legal move before using them. This increases the diversity of training positions. It also resulted in a huge gain for Giraffe.
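In toy form the perturbation looks like this (a sketch only, not Giraffe's code; the legal_moves and apply_move callbacks are hypothetical stand-ins for whatever move generator and make-move routine your engine exposes):

```python
import random

def perturbed_position(position, legal_moves, apply_move, rng):
    """Return the position reached after one uniformly random legal
    move. legal_moves(position) must yield the legal moves and
    apply_move(position, move) must return the resulting position;
    both are assumed engine bindings, not real library calls."""
    move = rng.choice(list(legal_moves(position)))
    return apply_move(position, move)
```

Applied once per sampled game position, this spreads the training set over positions that are one move away from real play.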
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
Re: test positions for texel tuning
You could try extracting only positions where the search score doesn't differ too much from the qsearch score; this works only if you have saved the PV with the score of each position in the PGN.
Using Texel's tuning method I improved my engine a bit, but now it doesn't work anymore, even though my engine is still not well tuned.
I have had different results with different position sets. The best results came when I used high-quality games from CCRL or CEGT.
Now I'm trying TD-Leaf, because I think it's difficult to compare the final result of a game with an engine score; TD-Leaf compares only the scores of related positions. So far I have had some good results with it, and I'm trying to filter the positions to improve the tuning.
I removed positions where the score is too high (more than 1000cp), because the tuning should matter most when the game is not yet decided, and positions where the score is 0 because of a repetition.
I'm still testing; in a few weeks I can tell you the results.
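The filtering I'm describing amounts to a simple predicate like this (a sketch; the 1000cp cutoff is the one mentioned above, and treating an exact score of 0 as a repetition draw is my simplification):

```python
def keep_for_tuning(score_cp, max_abs=1000):
    """Keep a position for tuning only if the game is not yet decided
    (|score| <= max_abs centipawns) and the score is not an exact 0,
    which here is taken to mean a repetition draw."""
    return score_cp != 0 and abs(score_cp) <= max_abs
```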
Re: test positions for texel tuning
Fabio Gobbato wrote: You can try to extract positions where the search score doesn't differ too much from the qsearch score, only if you have saved in the pgn the pv with the score of the position.

I have tested that idea quite a bit (only using positions where the score is not too different), and ended up finding that it's actually best to leave them in.
However, measures should be taken to ensure that they don't completely dominate training. For example, in Giraffe, I am using L1-loss, which essentially only provides a direction for adjustment, and not magnitude.
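The difference between L1 and L2 loss is clearest in their gradients; a quick sketch (not Giraffe's actual training code):

```python
def grad_l1(pred, target):
    """Gradient of |pred - target| w.r.t. pred: only a direction,
    the magnitude is constant, so outliers can't dominate updates."""
    if pred > target:
        return 1.0
    if pred < target:
        return -1.0
    return 0.0

def grad_l2(pred, target):
    """Gradient of (pred - target)**2 w.r.t. pred: the magnitude
    grows with the error, so large-error positions dominate."""
    return 2.0 * (pred - target)
```

With L1, a wildly mis-scored tactical position pushes the parameters by the same step size as any other position, which is why leaving tactical positions in doesn't swamp the tuning.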
That way your evaluation function can learn patterns like the trapped bishop, where there is often a large positional swing within a few moves.
Re: test positions for texel tuning
Fabio Gobbato wrote: I removed the positions where the score is too high (more than 1000cp) because the tuning should be very good when the game is not yet decided, and positions where the score is 0 because of a repetition.

That didn't work for me, because the tuner tends to learn that queens and rooks are irrelevant to winning a game. Basically, you lack positions where one side is winning with a queen and a pawn up.
zurichess - http://www.zurichess.xyz
Re: test positions for texel tuning
I'm trying this idea with TD-Leaf, not with Texel's tuning method. The value of a rook or queen can also be learned from positions with a score of less than 1000cp, and I think a 2000cp threshold would work too. The idea is that it's useless to tune the parameters on a position where the score is already very high.
Re: test positions for texel tuning
matthewlai wrote: I start from actual game positions, then apply a random legal move before using them. This increases diversity of training positions. This also resulted in a huge gain for Giraffe.

Why is this better than selecting a larger set of games? I use the 2moves_v1.pgn opening book (here is v2: https://groups.google.com/d/msg/fishcoo ... dSlJdDmBEJ), which generates a quite diverse set of games. The 2moves books are generated by playing 2 moves on each side and removing all positions for which Stockfish gives a high score to one side.
Re: test positions for texel tuning
brtzsnr wrote: Why is this better than selecting a larger set of games? I use 2moves_v1.pgn opening book which generates a quite diverse set of games.

That opening book actually achieves the opposite of what I tried to achieve.
I wanted to have a training set with more imbalances, because although they don't show up as often in actual games, they are encountered in search all the time, and the evaluation function needs to know how to deal with them.
I had problems with that in Giraffe when I only used balanced positions. For example, it would give pawns away for free in openings, because it didn't see enough opening positions with large imbalances during training; those positions almost never show up in real games.
Re: test positions for texel tuning
matthewlai wrote: I start from actual game positions, then apply a random legal move before using them. This increases diversity of training positions. This also resulted in a huge gain for Giraffe.

I tried this with the Texel tuning method and failed miserably. I guess most random moves are really bad, e.g. putting your queen en prise to a pawn.
I also tried using only lost/drawn positions (from www.mini.pw.edu.pl/~mandziuk/PRACE/ICONIP04.pdf) and failed again.
Re: test positions for texel tuning
brtzsnr wrote: I tried with the texel tuning method and failed miserably. I guess most moves are really bad, e.g. putting your queen in the fire of a pawn.

It's possible that it may not work with the Texel method. It worked very well for me with TD, though.
The fact that many moves are really bad is intentional: those really bad positions will be seen at leaves too, and the evaluation function needs to be able to evaluate them accurately as well.