Cooking Cheese with Texel's tuning method

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

AlvaroBegue
Posts: 922
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Cooking Cheese with Texel's tuning method

Post by AlvaroBegue » Wed Jun 21, 2017 3:13 pm

jdart wrote: Some people have also used methods such as https://en.wikipedia.org/wiki/BOBYQA or L-BFGS (https://en.wikipedia.org/wiki/Limited-memory_BFGS), which are available in many optimization libraries. These require approximating the gradient and the Hessian (2nd derivative), which is expensive, but on the other hand they converge fast.
I didn't know about BOBYQA, but that Wikipedia page says it does not need a gradient function; it builds a quadratic approximation to the function using only function evaluations.

L-BFGS requires a gradient function but, as I demonstrated with RuyTune, that can be done automatically from an existing function, with a bit of work and C++ template magic. The approximation of the Hessian is something internal to the workings of L-BFGS.

There are other related algorithms that are variants of the conjugate gradient method, which can use a function that computes the product of the Hessian and a vector. This can also be done using automatic differentiation, although I don't have a good reference that explains how. If anyone is interested, I can try to explain it.
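For instance, a Hessian-vector product can also be approximated without any autodiff machinery at all, just from central finite differences of the gradient. Here is a rough Python sketch of that finite-difference stand-in (not the automatic-differentiation version mentioned above; the example function and its gradient are made up for illustration):

```python
def hessian_vector_product(grad, x, v, eps=1e-5):
    """Approximate H(x) @ v via central differences of the gradient.

    grad: a function returning the gradient of f at a point (list of floats).
    This is a finite-difference stand-in for the autodiff approach.
    """
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad(x_plus), grad(x_minus)
    return [(gp - gm) / (2 * eps) for gp, gm in zip(g_plus, g_minus)]

# Toy example: f(x) = x0^2 + 3*x0*x1, so the Hessian is [[2, 3], [3, 0]].
def grad_f(x):
    return [2 * x[0] + 3 * x[1], 3 * x[0]]

hv = hessian_vector_product(grad_f, [1.0, 2.0], [1.0, 0.0])
print(hv)  # approximately the first Hessian column: [2.0, 3.0]
```

A conjugate-gradient-style optimizer only ever needs these H·v products, never the full Hessian, which is what makes this attractive for many parameters.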

Patrice Duhamel
Posts: 135
Joined: Sat May 25, 2013 9:17 am
Location: France
Full name: Patrice Duhamel

Re: Cooking Cheese with Texel's tuning method

Post by Patrice Duhamel » Wed Jun 21, 2017 4:33 pm

asanjuan wrote:I don't have time to work in my own engine, so let's help others instead.
:)
Thanks, I hope you will find more time.
asanjuan wrote:
1. Try to have an equal, or close to equal, distribution of draws, wins, and losses.
2. More varied openings.
3. More positions.
4. Collect positions having more common evaluation features, i.e. training positions
with passed pawns, piece outposts, closed/open positions for mobility, pins, pawn weaknesses such
as isolated, doubled, and backward pawns, rooks on open files, rooks on the 7th rank, and others.
Collect as many positions having these features as you can.
The way I achieve that is by collecting games where one side is stronger (say 50 Elo) than the other. It narrows the draw percentage, and you can consider the remaining draws as real draws, because the stronger side was not able to win against a weaker player.
Is the Elo difference really important, or is it just to reduce the draw rate?

I'm trying to do this:

- Generate a large number of games (50,000 or more)
- Load the games and, for each position, detect "important features" for the evaluation (e.g. rook on the 7th, outposts, passed pawns, ...)
- For each feature, add a value to a game score
- Sort all games by this score (into 3 arrays: win/loss/draw)
- Convert the PGN to EPD and save however many positions are needed to get the same count of wins/losses/draws

Do you think it's a good idea ?
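As a rough Python sketch of the last two steps, here is one way to balance the results by truncating each bucket to the smallest count (the `(epd, result)` tuple format is just an assumption for this sketch):

```python
import random

def balance_by_result(positions):
    """Balance (epd, result) pairs so wins, losses and draws are equal.

    result encoding assumed here: 1.0 = win, 0.0 = loss, 0.5 = draw.
    """
    buckets = {1.0: [], 0.0: [], 0.5: []}
    for epd, result in positions:
        buckets[result].append((epd, result))
    n = min(len(b) for b in buckets.values())  # size of the smallest bucket
    balanced = []
    for b in buckets.values():
        random.shuffle(b)  # avoid keeping only the earliest games
        balanced.extend(b[:n])
    return balanced

# Toy input: 5 wins, 3 losses, 7 draws -> 3 of each survive.
sample = [("pos%d" % i, r) for i, r in enumerate([1.0] * 5 + [0.0] * 3 + [0.5] * 7)]
print(len(balance_by_result(sample)))  # 9
```

The shuffle before truncation matters: without it you would systematically keep positions from the same early games in each bucket.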

jdart
Posts: 3866
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Cooking Cheese with Texel's tuning method

Post by jdart » Wed Jun 21, 2017 11:45 pm

The only reason gradient approximation (internal or otherwise) is a problem with these methods (BOBYQA, L-BFGS) is that they can require large numbers of function evaluations to converge fully (down to many decimal places). But for chess parameters you don't have to get the error down to some tiny level; you just need to get reasonably close to the minimum. For these purposes, and with this looser convergence condition, they are OK and gradient approximation is not an issue.

Note also that these algorithms are not designed for high-dimensional problems, although they may still work acceptably on them.
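As a toy illustration of how much the stopping tolerance dominates the evaluation count (plain gradient descent on a made-up quadratic stands in for L-BFGS here):

```python
def minimize_gd(grad, x0, lr=0.1, tol=1e-2, max_iter=100000):
    """Plain gradient descent; stops when the gradient norm drops below tol.

    A toy stand-in for L-BFGS, just to show the cost of tight convergence.
    """
    x = list(x0)
    for it in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            return x, it
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x, max_iter

# f(x) = (x0 - 3)^2 + (x1 + 1)^2, minimum at (3, -1)
grad = lambda x: [2 * (x[0] - 3), 2 * (x[1] + 1)]
_, its_loose = minimize_gd(grad, [0.0, 0.0], tol=1e-1)
_, its_tight = minimize_gd(grad, [0.0, 0.0], tol=1e-8)
print(its_loose, its_tight)  # the tight tolerance needs several times more steps
```

Stopping once the parameters are "close enough" is exactly the regime where these methods are cheap.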

--Jon

asanjuan
Posts: 211
Joined: Thu Sep 01, 2011 3:38 pm
Location: Seville, Spain

Re: Cooking Cheese with Texel's tuning method

Post by asanjuan » Thu Jun 22, 2017 10:55 am

Patrice Duhamel wrote:
asanjuan wrote:I don't have time to work in my own engine, so let's help others instead.
:)
Thanks, I hope you will find more time.
asanjuan wrote:
1. Try to have an equal, or close to equal, distribution of draws, wins, and losses.
2. More varied openings.
3. More positions.
4. Collect positions having more common evaluation features, i.e. training positions
with passed pawns, piece outposts, closed/open positions for mobility, pins, pawn weaknesses such
as isolated, doubled, and backward pawns, rooks on open files, rooks on the 7th rank, and others.
Collect as many positions having these features as you can.
The way I achieve that is by collecting games where one side is stronger (say 50 Elo) than the other. It narrows the draw percentage, and you can consider the remaining draws as real draws, because the stronger side was not able to win against a weaker player.
Is the Elo difference really important, or is it just to reduce the draw rate?

I'm trying to do this:

- Generate a large number of games (50,000 or more)
- Load the games and, for each position, detect "important features" for the evaluation (e.g. rook on the 7th, outposts, passed pawns, ...)
- For each feature, add a value to a game score
- Sort all games by this score (into 3 arrays: win/loss/draw)
- Convert the PGN to EPD and save however many positions are needed to get the same count of wins/losses/draws

Do you think it's a good idea ?
Well, there are two important things about the Elo criterion:

1. The outcome of the game has to be related to the positional advantage of the individual position that you are learning from. The Texel method works by correlating positional parameters with the outcome of the game, so you need the side with the advantage to be able to convert its current positional advantages into a winning endgame.
This is easier if the stronger side has the advantage.

2. You need to narrow the draws. If the training data has lots of draws (that is, evaluation = 0), the learning process will set the parameters close to 0. If the training set doesn't have draws, the resulting evaluation parameters will inflate, or even become exaggerated. So you need a compromise: 33% wins, 33% losses, 33% draws, for example.
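You can see the effect of draws directly from the Texel objective. A minimal Python sketch, treating the evaluation of a single position as the free variable (the scaling constant K = 1.13 is just an illustrative value, not anyone's tuned constant):

```python
def sigmoid(q, k=1.13):
    """Texel-style map from an evaluation q (centipawns) to an expected score.

    k is a scaling constant; 1.13 is only an illustrative value here.
    """
    return 1.0 / (1.0 + 10.0 ** (-k * q / 400.0))

def mse(results, q):
    """Mean squared error between game results and the predicted score."""
    return sum((r - sigmoid(q)) ** 2 for r in results) / len(results)

# An all-draw training set: the error is minimized by eval = 0,
# because sigmoid(0) = 0.5 matches the draw result exactly.
draws = [0.5] * 100
print(mse(draws, 0.0) < mse(draws, 50.0))  # True
```

So every draw in the set pulls the predicted evaluation toward 0, which is why a draw-heavy set shrinks the parameters.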

Take into account that slightly exaggerated positional parameters lead to an interesting or aggressive playing style.
