How to calculate piece weights with logistic regression?

maksimKorzh · Post by **maksimKorzh** » Thu Oct 01, 2020 8:50 pm

Hi guys

I'm trying to calculate piece weights by applying logistic regression (from the Python statsmodels package).

I have created the following test data set:

Code:

[
  {
    "P": 8,
    "R": 2,
    "N": 2,
    "B": 2,
    "Q": 2,
    "p": 8,
    "n": 2,
    "r": 2,
    "b": 2,
    "q": 2,
    "Result": "1-0"
  },
  {
    "P": 8,
    "R": 2,
    "N": 2,
    "B": 2,
    "Q": 2,
    "p": 8,
    "n": 2,
    "r": 2,
    "b": 3,
    "q": 2,
    "Result": "1-0"
  },
  {
    "P": 8,
    "R": 2,
    "N": 2,
    "B": 2,
    "Q": 2,
    "p": 8,
    "n": 2,
    "r": 2,
    "b": 2,
    "q": 2,
    "Result": "1-0"
  },
  ...
  ]

I built it by extracting the FEN for every move of each PGN game and counting the number of pieces of each type.
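The counting step can be sketched as follows. This is a minimal illustration, not the poster's actual script: the function name `count_pieces` is made up here. It counts only within the first FEN field (piece placement), because the castling field of a FEN also contains the letters K, Q, k and q, and counting over the whole string would inflate those totals.

```python
from collections import Counter

# Piece letters used as feature names in the data set (kings are left out).
PIECE_LETTERS = "PNBRQpnbrq"

def count_pieces(fen):
    """Count pieces using only the piece-placement field of a FEN string."""
    placement = fen.split()[0]  # drop side to move, castling rights, etc.
    counts = Counter(ch for ch in placement if ch in PIECE_LETTERS)
    return {letter: counts[letter] for letter in PIECE_LETTERS}

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(count_pieces(start))
```

For the starting position this gives one queen per side, whereas a naive `start.count("Q")` over the full string returns 2 because of the `KQkq` castling field.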

Here's my Python code, inspired by this post:
https://www.r-bloggers.com/2015/06/big- ... ss-pieces/

Code:

# packages
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# create data frame from JSON file
positions = pd.read_json('positions.json')

# filter draws
positions = positions[positions['Result'] != '1/2-1/2']

# replace results with binary outcomes
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', -1)
positions['white_win'] = (positions['Result'] == 1)

# define formula variables
win = positions['white_win']
P = positions['P']
N = positions['N']
B = positions['B']
R = positions['R']
Q = positions['Q']
p = positions['p']
n = positions['n']
b = positions['b']
r = positions['r']
q = positions['q']

# define formula
formula = '''
    win ~ I(P - p) +
          I(N - n) +
          I(B - b) +
          I(R - r) +
          I(Q - q)
'''

# build model
piece_weights = smf.glm(formula, family=sm.families.Binomial(), data=positions).fit()

print(piece_weights.summary())

This code produces the following output:

Code:

                      Generalized Linear Model Regression Results                      
=======================================================================================
Dep. Variable:     ['win[False]', 'win[True]']   No. Observations:                 1209
Model:                                     GLM   Df Residuals:                     1203
Model Family:                         Binomial   Df Model:                            5
Link Function:                           logit   Scale:                          1.0000
Method:                                   IRLS   Log-Likelihood:                -660.14
Date:                         Thu, 01 Oct 2020   Deviance:                       1320.3
Time:                                 21:40:23   Pearson chi2:                 1.38e+03
No. Iterations:                              5                                         
Covariance Type:                     nonrobust                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0514      0.100      0.514      0.607      -0.145       0.248
I(P - p)      -0.5222      0.104     -5.022      0.000      -0.726      -0.318
I(N - n)      -0.5374      0.131     -4.116      0.000      -0.793      -0.282
I(B - b)      -0.4953      0.107     -4.650      0.000      -0.704      -0.287
I(R - r)      -0.2619      0.140     -1.870      0.062      -0.536       0.013
I(Q - q)      -2.5064      0.259     -9.695      0.000      -3.013      -2.000
==============================================================================

The piece weights are in the "coef" column, but the values look wrong.
I've been struggling with this for two days in a row and am completely stuck.
I just want to calculate piece weights, but it feels like I'll never get there.
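One possible reading of the table above: the "Dep. Variable" line lists `win[False]` first. Assuming statsmodels' Binomial family treats a two-column response as [successes, failures], the fit would be modelling the probability that White does not win, which would explain the uniformly negative signs. Under that assumption, the sketch below flips the signs and rescales the printed coefficients to pawn units. It is only arithmetic on the numbers shown above, and the variable names are illustrative.

```python
# Coefficients copied from the summary table above.
coef = {"P": -0.5222, "N": -0.5374, "B": -0.4953, "R": -0.2619, "Q": -2.5064}

# Assumption: the fit modelled P(White does not win), so negate each value.
flipped = {piece: -value for piece, value in coef.items()}

# Express every weight relative to the pawn.
pawn_units = {piece: value / flipped["P"] for piece, value in flipped.items()}

for piece, value in pawn_units.items():
    print(f"{piece}: {value:.2f}")
```

Even after the sign flip, the rook comes out at roughly half a pawn, which suggests the input data deserves a closer look, not just the model call.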

The code from the original post is in R, which I don't know:

Code:

games <- as.data.frame(readRDS("milionbase_matrix.rds"))

# filter draws
games <- games[games$result != 0, ]

# create white_win col, assign 1 to it if white wins
games$white_win <- games$result == 1

fit <- glm(white_win ~ I(white_pawn - black_pawn) + I(white_knight - black_knight) + 
                    I(white_bishop - black_bishop) + I(white_rook - black_rook) + I(white_queen - black_queen), 
           data = games, family="binomial")

This one produces "normal" weights.
I feel very confused and discouraged because my efforts aren't helping.
How on earth can I calculate piece weights using Python?

Please help. Thanks in advance.

Tony P. · Post by **Tony P.** » Thu Oct 01, 2020 10:51 pm

Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression.

As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now; let us know if it doesn't come back up soon.

mvanthoor · Post by **mvanthoor** » Fri Oct 02, 2020 1:43 am

Tony P. wrote: ↑Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.

Because he wants to understand how it's done, and then do it himself (except for inventing the math).

Ferdy · Post by **Ferdy** » Sat Oct 03, 2020 4:56 am

maksimKorzh wrote: ↑Thu Oct 01, 2020 8:50 pm
# replace results with binary outcomes
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', -1)

Perhaps the win probability should be mapped to [0, 1], not [-1, 1]. In that case it would be:
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', 0)
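A library-free sketch of the same 0/1 labelling is shown here for illustration. The names `RESULT_TO_LABEL` and `label_positions` are hypothetical. Note that in the code quoted above, `white_win` is derived from `Result == 1`, so the -1 value itself never reaches the model.

```python
# Map decisive results to binary labels: 1 = White wins, 0 = Black wins.
RESULT_TO_LABEL = {"1-0": 1, "0-1": 0}

def label_positions(records):
    """Return copies of the decisive records with an integer 'white_win' label."""
    labelled = []
    for record in records:
        label = RESULT_TO_LABEL.get(record["Result"])
        if label is None:  # draw or unknown result: skip it
            continue
        labelled.append({**record, "white_win": label})
    return labelled

sample = [{"Result": "1-0"}, {"Result": "1/2-1/2"}, {"Result": "0-1"}]
print(label_positions(sample))
```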

AndrewGrant · Post by **AndrewGrant** » Sat Oct 03, 2020 6:04 am

mvanthoor wrote: ↑Fri Oct 02, 2020 1:43 am
Tony P. wrote: ↑Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.
Because he wants to understand how its done, and then do it himself (except for inventing the math).

+1. What I've done is an end product. If you do it yourself, you'll learn more. And maybe you'll improve upon already known ideas.

EDIT: If the PDF is dead, I can share it. I'll soon upload it to Ethereal's repo, once I finalize some things.

Milos · Post by **Milos** » Sat Oct 03, 2020 8:17 am

AndrewGrant wrote: ↑Sat Oct 03, 2020 6:04 am
mvanthoor wrote: ↑Fri Oct 02, 2020 1:43 am
Tony P. wrote: ↑Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.
Because he wants to understand how its done, and then do it himself (except for inventing the math).
+1 . What I've done is an end product. If you do it yourself, you'll learn more. And maybe you'll improve upon already known ideas

EDIT: IF the pdf is dead, I can share it. I'm soon to upload it into Ethereal's repo, once I finalize somethings.

Man, you reinvented the wheel so many times. Astonishing. A small piece of advice: instead of reinventing logistic regression, try using something that is known to give much better results, like gradient or tree boosting. And there's no need to write it from scratch; existing frameworks like XGBoost or scikit-learn would give you more accurate results much faster.

Gerd Isenberg · Post by **Gerd Isenberg** » Sat Oct 03, 2020 9:17 am

Hi Maksim,
Offhand, I have no idea what is wrong with your attempts and code. But see Vladimir Medvedev's article on that topic.
Gerd

maksimKorzh · Post by **maksimKorzh** » Sat Oct 03, 2020 10:14 am

Ferdy wrote: ↑Sat Oct 03, 2020 4:56 am
maksimKorzh wrote: ↑Thu Oct 01, 2020 8:50 pm
# replace results with binary outcomes
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', -1)
Perhaps the win probability should be mapped to [0, 1], not [-1, 1]. In that case that would be:
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', 0)

I've already tracked down the issue: I simply had a malformed dataset.
There's not even a need to replace the strings with integers, because the Python statsmodels library creates dummy variables automatically in this case.

maksimKorzh · Post by **maksimKorzh** » Sat Oct 03, 2020 10:20 am

OK, I'm finally done with the piece weights. After fixing the malformed dataset (it was counting queens and kings from the castling rights in addition to the position itself!), I eventually got reasonable piece weights.

Now the challenge is to add the impact of piece-square tables (PST) to the logistic regression model.
What I can't figure out at the moment is how to define the impact of a particular square being occupied by a piece on the game result.
Should I have a set of all 64 squares for every piece type as features to predict their impact?
Or is there a better way of doing this?
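One possible encoding, sketched under assumptions and not taken from any post in this thread: use one feature per (piece type, square) pair, 6 x 64 = 384 in total. A white piece adds +1 on its own square, and a black piece adds -1 on the vertically mirrored square, so both colours share a single table. The function name `pst_features` and the square indexing (a1 = 0, h8 = 63) are choices made for this illustration.

```python
PIECES = "PNBRQK"  # feature blocks, 64 squares each

def pst_features(fen):
    """Return 384 signed occupancy features from a FEN's placement field."""
    placement = fen.split()[0]
    features = [0] * (len(PIECES) * 64)
    for row_index, row in enumerate(placement.split("/")):
        rank = 7 - row_index  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():  # a digit encodes a run of empty squares
                file += int(ch)
                continue
            piece = PIECES.index(ch.upper())
            if ch.isupper():  # white piece: own square, +1
                square, sign = rank * 8 + file, 1
            else:  # black piece: mirrored square, -1
                square, sign = (7 - rank) * 8 + file, -1
            features[piece * 64 + square] += sign
            file += 1
    return features

after_nf3 = "rnbqkbnr/pppppppp/8/8/8/5N2/PPPPPPPP/RNBQKB1R b KQkq - 1 1"
x = pst_features(after_nf3)
print(x[64 + 21], x[64 + 6])  # knight block: f3 and g1
```

With this encoding the 64 features of a piece type sum to that piece's material difference, so they would be collinear with the `I(N - n)`-style terms if both were put in the same model. With only a few thousand positions, many of the 384 squares would also be rarely occupied, so the estimates for them would likely be noisy.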

AndrewGrant · Post by **AndrewGrant** » Sat Oct 03, 2020 8:23 pm

Milos wrote: ↑Sat Oct 03, 2020 8:17 am
AndrewGrant wrote: ↑Sat Oct 03, 2020 6:04 am
mvanthoor wrote: ↑Fri Oct 02, 2020 1:43 am
Tony P. wrote: ↑Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.
Because he wants to understand how its done, and then do it himself (except for inventing the math).
+1 . What I've done is an end product. If you do it yourself, you'll learn more. And maybe you'll improve upon already known ideas

EDIT: IF the pdf is dead, I can share it. I'm soon to upload it into Ethereal's repo, once I finalize somethings.
Man you reinvented the wheel so many times. Astonishing. A small piece of advice instead of reinventing logistic regression try using something that is known to give much better results like gradient or tree boosting. And no need for writing it from scratch, existing frameworks like XGBoost or Sklearn would give you more accurate results much faster .

In my eyes, reinventing the wheel is the only way to understand the wheel.
