Piece weights with regression analysis (in Russian)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.
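To make that concrete, here is a minimal sketch of the shared core (my own Python, with made-up names and data; a real tuner would feed it eval or QS scores and batch the updates):

Code: Select all

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_weights(positions, results, n_features, lr=0.01, epochs=100):
    """positions: one feature vector per position (e.g. material differences);
    results: game outcomes in [0, 1] (0 = loss, 0.5 = draw, 1 = win)."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in zip(positions, results):
            score = sum(wi * xi for wi, xi in zip(w, x))   # linear eval
            p = sigmoid(score)                             # expected result
            g = p - y   # gradient of cross entropy w.r.t. the score;
                        # sum of squares would use (p - y) * p * (1 - p)
            for j in range(n_features):
                w[j] -= lr * g * x[j]
    return w

Whether the score comes from a static eval or from quiescence search only changes how the features are extracted; the fitting loop itself is identical in all three formulations.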

Fabien.
User avatar
Bloodbane
Posts: 154
Joined: Thu Oct 03, 2013 4:17 pm

Re: Piece weights with regression analysis (in Russian)

Post by Bloodbane »

Xann wrote:Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.

Fabien.
Seems that way to me.
Functional programming combines the flexibility and power of abstract mathematics with the intuitive clarity of abstract mathematics.
https://github.com/mAarnos
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: Piece weights with regression analysis (in Russian)

Post by mvk »

Xann wrote:Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.
I don't know. I remember having read about the idea of assigning Elo to Othello pattern features. Instead of guessing, do you have links to specific papers? (both Buro, and also ANN?)
[Account deleted]
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Piece weights with regression analysis (in Russian)

Post by Gerd Isenberg »

Xann wrote:Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.

Fabien.
Wow, so it's that simple ;-)
Slowly the dust settles. Thanks for the insight, yes, a single-neuron ANN.

I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

Hi Gerd,
Gerd Isenberg wrote:Wow, so it's that simple ;-)
Slowly the dust settles. Thanks for the insight, yes, a single-neuron ANN.

I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
You're right, I forgot to mention the error functions: cross entropy vs. sum of squares.

I would expect statisticians to be religious about using cross entropy when you want to interpret the output as a probability. In practice, we can just try both :)

I think the practical advice about ANN learning parameters that can be found on the net is both very useful and easier to understand (e.g. the bottom of https://visualstudiomagazine.com/Articl ... spx?Page=1, though that page focuses on data representation):
- if the output is continuous, use a linear output and sum of squares as error function
- if the output is binary/probability, use the logistic function + cross entropy
- if the output is discrete, use Softmax + log loss (not sure about the official name)

The binary case is just an optimisation of Softmax with two outputs (since q = 1 - p), but IMO it complicates the formulas.
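For what it's worth, here is how the three pairings look for a single training example (my own sketch, not taken from any of the papers):

Code: Select all

import math

def sse(output, target):
    """Continuous target, linear output: sum-of-squares error."""
    return 0.5 * (output - target) ** 2

def cross_entropy(p, target):
    """Binary target, logistic output p in (0, 1): cross entropy."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def softmax_log_loss(scores, target_index):
    """Discrete target: softmax over raw scores, then log loss of the true class."""
    exps = [math.exp(s) for s in scores]
    return -math.log(exps[target_index] / sum(exps))

With only two classes, the softmax probability of class 0 reduces to sigmoid(s0 - s1), so the binary logistic case above falls out as a special case, as mentioned.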

That was for choosing an error function though, which might not be what your question was about.

Fabien.
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

mvk wrote:I don't know. I remember having read about the idea of assigning Elo to Othello pattern features. Instead of guessing, do you have links to specific papers? (both Buro, and also ANN?)
I used your post but actually meant the question to be for everyone. Gerd found the original paper by Buro (his earlier PhD is in German). Later ones are very interesting too but he switched to linear regression since Othello has "continuous" game results.

The ANN literature is ginormous ... Any resource mentioning logistic output and (if possible) cross entropy should be relevant.
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

Gerd Isenberg wrote:I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
Are you referring to the log(L(beta)) ... page 377?

I think that Buro wanted to show that by assuming that the output is a probability and then maximising the likelihood, you obtain the cross-entropy error function. He needlessly complicated the formulas by introducing n_i though.
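Stripped of the n_i, the derivation is the standard one-liner (my notation, not Buro's: p_i is the predicted win probability of example i, and y_i in {0, 1} the outcome):

L(\beta) = \prod_i p_i^{y_i} (1 - p_i)^{1 - y_i}

-\log L(\beta) = -\sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]

so maximising the likelihood is exactly minimising the cross-entropy error.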
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

mvk wrote:I don't know. I remember having read about the idea of assigning Elo to Othello pattern features. Instead of guessing, do you have links to specific papers? (both Buro, and also ANN?)
Try this one for ANN (also for Gerd): http://www.cedar.buffalo.edu/%7Esrihari ... aining.pdf

It seems to mention all the subjects I talked about!
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Piece weights with regression analysis (in Russian)

Post by Gerd Isenberg »

Xann wrote:
Gerd Isenberg wrote:I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
Are you referring to the log(L(beta)) ... page 377?

I think that Buro wanted to show that by assuming that the output is a probability and then maximising the likelihood, you obtain the cross-entropy error function. He needlessly complicated the formulas by introducing n_i though.
Yep,
and further from Vladimir's article
https://chessprogramming.wikispaces.com ... n+Analysis
minimizing the cost function for the logistic regression

J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]

...

where the components of the gradient of J_{\mathrm{reg}} have the form

(\nabla J_{\mathrm{reg}})_0 = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}

(\nabla J_{\mathrm{reg}})_j = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j
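Transcribed into code, that gradient looks like this (my own sketch; X, y, theta and lam are hypothetical arrays/values, with x[0] == 1 as the unregularised bias feature):

Code: Select all

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient(theta, X, y, lam):
    """X: feature vectors with x[0] == 1 (bias); y: outcomes in [0, 1];
    lam: the regularisation strength lambda."""
    m = len(X)
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        h = sigmoid(sum(t * xj for t, xj in zip(theta, x_i)))
        for j, xj in enumerate(x_i):
            grad[j] += (h - y_i) * xj
    for j in range(len(grad)):
        grad[j] /= m
        if j > 0:                      # theta_0 is left unregularised
            grad[j] += (lam / m) * theta[j]
    return grad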
Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Piece weights with regression analysis (in Russian)

Post by Dann Corbit »

Xann wrote:Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.

Fabien.
I think that the Chess Programming Wiki has nothing new in it.
What it does contain are simple, clear explanations of how to do the things you need to do to write a good chess engine.
For instance, how do I do a population count? Probably everyone except the utter novice knows how, but including articles on it is still very nice, because they give a clear explanation for someone who wants to do it, with the how and the why, not just the what.
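For the record, the classic SWAR answer to that question, transcribed into Python for illustration (an engine would do this in C or use a hardware popcount instruction):

Code: Select all

def popcount64(x):
    """Count the set bits of a 64-bit integer with the SWAR trick."""
    x -= (x >> 1) & 0x5555555555555555                              # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)  # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                         # byte sums
    return ((x * 0x0101010101010101) & 0xFFFFFFFFFFFFFFFF) >> 56    # top byte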
This sort of parameter fitting was not found in the wiki before, so I think it is a very good inclusion.
For all of the expert programmers, this stuff is probably old hat.
But it is the clearest and simplest explanation that I have read.