Piece weights with regression analysis (in Russian)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.
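To make that concrete, here is a minimal sketch of the shared core (my own Python, with made-up names and data; a real tuner would feed it eval or QS scores and batch the updates):

Code: Select all

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_weights(positions, results, n_features, lr=0.01, epochs=100):
    """positions: one feature vector per position (e.g. material differences);
    results: game outcomes in [0, 1] (0 = loss, 0.5 = draw, 1 = win)."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in zip(positions, results):
            score = sum(wi * xi for wi, xi in zip(w, x))   # linear eval
            p = sigmoid(score)                             # expected result
            g = p - y   # gradient of cross entropy w.r.t. the score;
                        # sum of squares would use (p - y) * p * (1 - p)
            for j in range(n_features):
                w[j] -= lr * g * x[j]
    return w

Whether the score comes from a static eval or from quiescence search only changes how the features are extracted; the fitting loop itself is identical in all three formulations.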

Fabien.
User avatar
Bloodbane
Posts: 154
Joined: Thu Oct 03, 2013 4:17 pm

Re: Piece weights with regression analysis (in Russian)

Post by Bloodbane »

Xann wrote:Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.

Fabien.
Seems that way to me.
Functional programming combines the flexibility and power of abstract mathematics with the intuitive clarity of abstract mathematics.
https://github.com/mAarnos
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: Piece weights with regression analysis (in Russian)

Post by mvk »

Xann wrote:Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.
I don't know. I remember having read about the idea of assigning Elo to Othello pattern features. Instead of guessing, do you have links to specific papers? (both Buro, and also ANN?)
[Account deleted]
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Piece weights with regression analysis (in Russian)

Post by Gerd Isenberg »

Xann wrote:Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.

Fabien.
Wow, so it's that simple ;-)
Slowly the dust settles. Thanks for the insight, yes, a single-neuron ANN.

I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

Hi Gerd,
Gerd Isenberg wrote:Wow, so it's that simple ;-)
Slowly the dust settles. Thanks for the insight, yes, a single-neuron ANN.

I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
You're right, I forgot to mention the error functions: cross entropy vs. sum of squares.

I would expect statisticians to be religious about using cross entropy when you want to interpret the output as a probability. In practice, we can just try both :)

I think the practical advice about ANN learning parameters that can be found on the net is both very useful and easier to understand (e.g. the bottom of https://visualstudiomagazine.com/Articl ... spx?Page=1, though that page focuses on data representation):
- if the output is continuous, use a linear output and sum of squares as error function
- if the output is binary/probability, use the logistic function + cross entropy
- if the output is discrete, use Softmax + log loss (not sure about the official name)

The binary case is just an optimisation of Softmax with two outputs (since q = 1 - p), but IMO it complicates the formulas.
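For what it's worth, here is how the three pairings look for a single training example (my own sketch, not taken from any of the papers):

Code: Select all

import math

def sse(output, target):
    """Continuous target, linear output: sum-of-squares error."""
    return 0.5 * (output - target) ** 2

def cross_entropy(p, target):
    """Binary target, logistic output p in (0, 1): cross entropy."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def softmax_log_loss(scores, target_index):
    """Discrete target: softmax over raw scores, then log loss of the true class."""
    exps = [math.exp(s) for s in scores]
    return -math.log(exps[target_index] / sum(exps))

With only two classes, the softmax probability of class 0 reduces to sigmoid(s0 - s1), so the binary logistic case above falls out as a special case, as mentioned.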

That was for choosing an error function though, which might not be what your question was about.

Fabien.
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

mvk wrote:I don't know. I remember having read about the idea of assigning Elo to Othello pattern features. Instead of guessing, do you have links to specific papers? (both Buro, and also ANN?)
I used your post but actually meant the question to be for everyone. Gerd found the original paper by Buro (his earlier PhD is in German). Later ones are very interesting too but he switched to linear regression since Othello has "continuous" game results.

The ANN literature is ginormous ... Any resource mentioning logistic output and (if possible) cross entropy should be relevant.
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

Gerd Isenberg wrote:I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
Are you referring to the log(L(beta)) ... page 377?

I think that Buro wanted to show that by assuming that the output is a probability and then maximising the likelihood, you obtain the cross-entropy error function. He needlessly complicated the formulas by introducing n_i though.
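Stripped of the n_i, the derivation is the standard one-liner (my notation, not Buro's: p_i is the predicted win probability of example i, and y_i in {0, 1} the outcome):

L(\beta) = \prod_i p_i^{y_i} (1 - p_i)^{1 - y_i}

-\log L(\beta) = -\sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]

so maximising the likelihood is exactly minimising the cross-entropy error.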
Xann
Posts: 127
Joined: Sat Jan 22, 2011 7:14 pm
Location: Lille, France

Re: Piece weights with regression analysis (in Russian)

Post by Xann »

mvk wrote:I don't know. I remember having read about the idea of assigning Elo to Othello pattern features. Instead of guessing, do you have links to specific papers? (both Buro, and also ANN?)
Try this one for ANN (also for Gerd): http://www.cedar.buffalo.edu/%7Esrihari ... aining.pdf

It seems to mention all the subjects I talked about!
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Piece weights with regression analysis (in Russian)

Post by Gerd Isenberg »

Xann wrote:
Gerd Isenberg wrote:I don't exactly get the connection between the cost function requiring gradient descent to find the optimum, i.e. the least-squares error formula as used in Texel Tuning and Deep Thought Tuning, and the "wild formulas" in Buro's paper:
http://www.jair.org/media/179/live-179-1475-jair.pdf
Are you referring to the log(L(beta)) ... page 377?

I think that Buro wanted to show that by assuming that the output is a probability and then maximising the likelihood, you obtain the cross-entropy error function. He needlessly complicated the formulas by introducing n_i though.
Yep,
and further from Vladimir's article
https://chessprogramming.wikispaces.com ... n+Analysis
minimizing the cost function for the logistic regression

J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]

...

where the components of the gradient of J_{\mathrm{reg}} have the form

(\nabla J_{\mathrm{reg}})_0 = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}

(\nabla J_{\mathrm{reg}})_j = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j
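Transcribed into code, that gradient looks like this (my own sketch; X, y, theta and lam are hypothetical arrays/values, with x[0] == 1 as the unregularised bias feature):

Code: Select all

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient(theta, X, y, lam):
    """X: feature vectors with x[0] == 1 (bias); y: outcomes in [0, 1];
    lam: the regularisation strength lambda."""
    m = len(X)
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        h = sigmoid(sum(t * xj for t, xj in zip(theta, x_i)))
        for j, xj in enumerate(x_i):
            grad[j] += (h - y_i) * xj
    for j in range(len(grad)):
        grad[j] /= m
        if j > 0:                      # theta_0 is left unregularised
            grad[j] += (lam / m) * theta[j]
    return grad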
Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Piece weights with regression analysis (in Russian)

Post by Dann Corbit »

Xann wrote:Hi Marcel,
mvk wrote:Isn't the essence of this method exactly the same as what everybody else has been doing for a couple of years now, and which has become known as "Texel's tuning method"? (That is, a fit of the evaluation in the percentage domain against the game results?) The only real difference is that we use much larger data sets. All other differences seem inessential at first glance.
Sorry to hijack your post but this topic has been bugging me for a while: isn't "Texel's tuning method" just logistic regression that has been used in games for about twenty years (see Buro's papers for instance)?

I don't see any difference between all three apart from QS vs. eval, which seems a minor issue to me. A single-neuron ANN is also the same.

Fabien.
I think that the Chess Programming Wiki has nothing new in it.
What it does contain are simple, clear explanations of how to do the things you need to do to write a good chess engine.
For instance, how do I do a population count? Probably everyone except the utter novice knows how, but including articles on it is still very nice, because they give a clear explanation for someone who wants to do it, with the how and the why, not just the what.
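For the record, the classic SWAR answer to that question, transcribed into Python for illustration (an engine would do this in C or use a hardware popcount instruction):

Code: Select all

def popcount64(x):
    """Count the set bits of a 64-bit integer with the SWAR trick."""
    x -= (x >> 1) & 0x5555555555555555                              # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)  # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                         # byte sums
    return ((x * 0x0101010101010101) & 0xFFFFFFFFFFFFFFFF) >> 56    # top byte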
This sort of parameter fitting was not found in the wiki before, so I think it is a very good inclusion.
For all of the expert programmers, this stuff is probably old hat.
But it is the clearest and simplest explanation that I have read.