test positions for texel tuning

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

test positions for texel tuning

Post by brtzsnr » Sun Sep 20, 2015 9:34 am

Lately, I've been trying to find better positions for tuning with the Texel method. You can read about the Texel tuning method here: https://chessprogramming.wikispaces.com ... ing+Method

Previously, to extract test positions I would play ~200,000 games at 40/1+0.03 and extract all quiet positions from the games; out of the resulting 10,000,000 positions I randomly picked 1,000,000 for tuning.
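For reference, the heart of the method from the link above is minimizing the squared error between the game result and a sigmoid of the engine's evaluation over those sampled positions. A minimal sketch; the scaling constant K is an assumption here (it is normally fit to the data before any weights are tuned), and `evaluate` stands in for whatever eval your engine exposes:

```python
import math

def texel_error(positions, results, evaluate, K=1.13):
    """Mean squared error between game results (0 / 0.5 / 1) and the
    win probability predicted from the engine's eval in centipawns.
    K is a placeholder scaling constant, normally chosen to minimize
    this same error with the untuned eval."""
    total = 0.0
    for pos, r in zip(positions, results):
        q = evaluate(pos)  # eval in centipawns, from White's point of view
        sigmoid = 1.0 / (1.0 + 10.0 ** (-K * q / 400.0))
        total += (r - sigmoid) ** 2
    return total / len(positions)
```

Tuning then amounts to adjusting the eval parameters to reduce this error over the million sampled positions.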

I tried several ideas, but none of them seem to work:
1. Pick 10 positions from each game. This makes the positions less biased toward long drawn games. Idea from Pedone.
2. Disable the fifty-move draw rule. This assigns very low values to the queen and rook, so it clearly doesn't work.
3. Pick the position at the end of the PV (the one the score actually comes from). Idea from TD-Leaf / KnightCap.
4. Proportional score: instead of assigning a fixed 0 / 0.5 / 1, the score is proportional to the game's progress: 0.5 * (1 - t) + r * t, where r is the result and t is ply / total plies. Idea from Pedone.
5. Extract tactical positions, too. My engine does a quiescence search when tuning.
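Idea 4 above is simple enough to state directly in code; a one-function sketch of the blend between a neutral 0.5 and the final result:

```python
def proportional_score(result, ply, total_plies):
    """Idea 4: blend a neutral 0.5 toward the final result as the game
    progresses.  result is 0, 0.5 or 1; t = ply / total_plies runs from
    0 at the start of the game to 1 at its end."""
    t = ply / total_plies
    return 0.5 * (1.0 - t) + result * t
```

Early positions are thus treated as near-draws regardless of the eventual outcome, while positions near the end of the game carry the full result.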

I didn't try doing the other step from TD-Leaf: minimize the error between consecutive positions. It's not clear to me whether TD-Leaf can find better values than the Texel tuning method.

Despite my efforts over the last month, I've never managed to tune anything. The best I could do is +2 Elo.

Are there any other ideas I could try to pick better positions for tuning?

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: test positions for texel tuning

Post by matthewlai » Sun Sep 20, 2015 10:30 am

brtzsnr wrote: I didn't try doing the other step from TD-Leaf: minimize the error between consecutive positions. It's not clear to me whether TD-Leaf can find better values than the Texel tuning method.
Well, it's not going to be clear if you don't try it. TD-Leaf helped me gain a few hundred Elo, so it clearly does work very well, at least in some setups.
Are there any other ideas I could try to pick better positions for tuning?
I start from actual game positions, then apply a random legal move before using them. This increases diversity of training positions. This also resulted in a huge gain for Giraffe.
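That augmentation step can be sketched as follows; `legal_moves` and `make_move` are hypothetical hooks standing in for whatever move generator and make-move routine your engine exposes:

```python
import random

def perturb_position(position, legal_moves, make_move, rng=None):
    """Apply one random legal move to a game position to diversify the
    training set.  legal_moves(position) -> list of moves and
    make_move(position, move) -> new position are assumed engine hooks
    (hypothetical names, not a real library API)."""
    rng = rng or random.Random()
    moves = legal_moves(position)
    if not moves:                 # mate or stalemate: keep the original
        return position
    return make_move(position, rng.choice(moves))
```

Since the move is chosen uniformly, many of the resulting positions are deliberately lopsided; that is the point of the technique.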
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

Fabio Gobbato
Posts: 131
Joined: Fri Apr 11, 2014 8:45 am
Contact:

Re: test positions for texel tuning

Post by Fabio Gobbato » Sun Sep 20, 2015 11:13 am

You can try extracting positions where the search score doesn't differ too much from the qsearch score, but only if you have saved the PV with the score of each position in the PGN.

Using Texel's tuning method I improved my engine a bit, but now it no longer works, even though my engine is not well tuned.
I have had different results with different positions. The best results came when I used high-quality games from CCRL or CEGT.

Now I'm trying TD-Leaf because I think it's difficult to compare the final result of a position with an engine score. TD-Leaf compares only scores of related positions. So far I have had some good results with it, but I'm trying to filter the positions to improve the tuning.
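The quantity TD-Leaf works with can be illustrated with a short sketch; this is a minimal TD(λ)-style accumulation of temporal differences along one game, not Fabio's actual implementation, and the λ value is a common but arbitrary choice:

```python
def td_errors(scores, lam=0.7):
    """Given the engine's leaf scores along one game (all from the same
    side's point of view), compute the discounted sum of temporal
    differences for each position:
        e_t = sum_{j >= t} lam**(j - t) * d_j,  with d_j = s_{j+1} - s_j.
    Each position is nudged toward the scores of the positions after it,
    rather than toward the final game result directly."""
    diffs = [b - a for a, b in zip(scores, scores[1:])]
    errors = []
    for t in range(len(diffs)):
        errors.append(sum(lam ** (j - t) * diffs[j] for j in range(t, len(diffs))))
    return errors
```

With λ = 1 this collapses to comparing each score against the last score of the game; smaller λ weights nearby positions more heavily.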

I removed the positions where the score is too high (more than 1000cp), because the tuning should work best when the game is not yet decided, and the positions where the score is 0 because of a repetition.
I'm still testing; in a few weeks I can tell you the results.
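The filter described above can be sketched as a single predicate; `score_cp` and `is_repetition_draw` are hypothetical per-position fields, assumed to come from whatever format the training positions are stored in:

```python
def keep_for_tuning(score_cp, is_repetition_draw, limit_cp=1000):
    """Keep only positions useful for tuning: drop positions that are
    already decided (|score| above the limit) and positions whose score
    is 0 only because of a repetition draw."""
    if abs(score_cp) > limit_cp:
        return False
    if score_cp == 0 and is_repetition_draw:
        return False
    return True
```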

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: test positions for texel tuning

Post by matthewlai » Sun Sep 20, 2015 11:17 am

Fabio Gobbato wrote:You can try extracting positions where the search score doesn't differ too much from the qsearch score, but only if you have saved the PV with the score of each position in the PGN.

Using Texel's tuning method I improved my engine a bit, but now it no longer works, even though my engine is not well tuned.
I have had different results with different positions. The best results came when I used high-quality games from CCRL or CEGT.

Now I'm trying TD-Leaf because I think it's difficult to compare the final result of a position with an engine score. TD-Leaf compares only scores of related positions. So far I have had some good results with it, but I'm trying to filter the positions to improve the tuning.

I removed the positions where the score is too high (more than 1000cp), because the tuning should work best when the game is not yet decided, and the positions where the score is 0 because of a repetition.
I'm still testing; in a few weeks I can tell you the results.
I have tested that idea quite a bit (only using positions where the score is not too different), and ended up finding that it's actually best to leave them in.

However, measures should be taken to ensure that they don't completely dominate training. For example, in Giraffe, I am using L1-loss, which essentially only provides a direction for adjustment, and not magnitude.

That means your evaluation function can learn things like trapped bishops, where there is often a high positional gain within a few moves.
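The direction-only behavior of an L1 loss can be shown with a plain gradient step on a linear evaluation; a sketch under that assumption, not Giraffe's actual training code:

```python
def l1_step(weights, features, target, eta=1e-3):
    """One gradient step on the L1 loss |pred - target| for a linear
    eval pred = sum(w * f).  The gradient is sign(pred - target) * f:
    the error's magnitude sets no step size, only its direction, so a
    single wildly mis-scored position cannot dominate the update."""
    pred = sum(w * f for w, f in zip(weights, features))
    sign = (pred > target) - (pred < target)   # -1, 0 or +1
    return [w - eta * sign * f for w, f in zip(weights, features)]
```

Compare this with an L2 (squared) loss, where the step size grows with the error and extreme positions pull hardest on the weights.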
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

Re: test positions for texel tuning

Post by brtzsnr » Sun Sep 20, 2015 11:26 am

Fabio Gobbato wrote: I removed the positions where the score is too high (more than 1000cp), because the tuning should work best when the game is not yet decided, and the positions where the score is 0 because of a repetition.
I'm still testing; in a few weeks I can tell you the results.
That didn't work for me, because it tends to learn that queens and rooks are irrelevant to winning a game. Basically, you lack positions where one side is winning with a queen and a pawn up.

Fabio Gobbato
Posts: 131
Joined: Fri Apr 11, 2014 8:45 am
Contact:

Re: test positions for texel tuning

Post by Fabio Gobbato » Sun Sep 20, 2015 11:42 am

I'm trying this idea with TD-Leaf, not with Texel's tuning method. The value of a rook or queen can also be learned from positions with a score of less than 1000cp, though 2000 wouldn't be a bad threshold either. The idea is that it's useless to tune the parameters on a position where the score is already very high.

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

Re: test positions for texel tuning

Post by brtzsnr » Sun Sep 20, 2015 2:08 pm

matthewlai wrote:I start from actual game positions, then apply a random legal move before using them. This increases diversity of training positions. This also resulted in a huge gain for Giraffe.
Why is this better than selecting a larger set of games?

I use the 2moves_v1.pgn opening book (here is v2: https://groups.google.com/d/msg/fishcoo ... dSlJdDmBEJ), which generates a quite diverse set of games. The 2moves books are generated by playing 2 moves per side and removing all positions for which Stockfish gives a high score to one side.

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: test positions for texel tuning

Post by matthewlai » Sun Sep 20, 2015 6:15 pm

brtzsnr wrote:
matthewlai wrote:I start from actual game positions, then apply a random legal move before using them. This increases diversity of training positions. This also resulted in a huge gain for Giraffe.
Why is this better than selecting a larger set of games?

I use the 2moves_v1.pgn opening book (here is v2: https://groups.google.com/d/msg/fishcoo ... dSlJdDmBEJ), which generates a quite diverse set of games. The 2moves books are generated by playing 2 moves per side and removing all positions for which Stockfish gives a high score to one side.
That opening book actually achieves the opposite of what I tried to achieve.

I wanted to have a training set with more imbalances, because although they don't show up as often in actual games, they are encountered in search all the time, and the evaluation function needs to know how to deal with them.

I had problems with that in Giraffe when I only used balanced positions. For example, it would just give pawns away for free in openings: it simply didn't see enough opening positions with large imbalances in training, because those positions almost never show up in real games.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

Re: test positions for texel tuning

Post by brtzsnr » Wed Sep 23, 2015 4:48 pm

matthewlai wrote: I start from actual game positions, then apply a random legal move before using them. This increases diversity of training positions. This also resulted in a huge gain for Giraffe.
I tried this with the Texel tuning method and failed miserably. I guess most random moves are really bad, e.g. putting your queen en prise to a pawn.

I also tried using only lost/drawn positions (from www.mini.pw.edu.pl/~mandziuk/PRACE/ICONIP04.pdf) and failed again.

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: test positions for texel tuning

Post by matthewlai » Wed Sep 23, 2015 4:50 pm

brtzsnr wrote:
matthewlai wrote: I start from actual game positions, then apply a random legal move before using them. This increases diversity of training positions. This also resulted in a huge gain for Giraffe.
I tried this with the Texel tuning method and failed miserably. I guess most random moves are really bad, e.g. putting your queen en prise to a pawn.

I also tried using only lost/drawn positions (from www.mini.pw.edu.pl/~mandziuk/PRACE/ICONIP04.pdf) and failed again.
It's possible that it may not work with the Texel method. It worked very well for me with TD though.

The fact that many moves are really bad is intentional. Because those really bad positions will be seen at leaves, too, and the evaluation function needs to be able to evaluate them accurately as well.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
