
How to calculate piece weights with logistic regression?

Posted: Thu Oct 01, 2020 8:50 pm
by maksimKorzh
Hi guys

I'm trying to calculate piece weights by applying a logistic regression algorithm (from the Python statsmodels package).

I have created the following test data set:

Code:

[
  {
    "P": 8,
    "R": 2,
    "N": 2,
    "B": 2,
    "Q": 2,
    "p": 8,
    "n": 2,
    "r": 2,
    "b": 2,
    "q": 2,
    "Result": "1-0"
  },
  {
    "P": 8,
    "R": 2,
    "N": 2,
    "B": 2,
    "Q": 2,
    "p": 8,
    "n": 2,
    "r": 2,
    "b": 3,
    "q": 2,
    "Result": "1-0"
  },
  {
    "P": 8,
    "R": 2,
    "N": 2,
    "B": 2,
    "Q": 2,
    "p": 8,
    "n": 2,
    "r": 2,
    "b": 2,
    "q": 2,
    "Result": "1-0"
  },
  ...
  ]
It was built by extracting a FEN for every move in a PGN file and counting the number of pieces in each position.
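That counting step can be sketched in a few lines of plain Python (the function name is illustrative, not from the thread). Note that it parses only the board field of the FEN, so castling-rights letters like K and Q cannot leak into the counts:

```python
def count_pieces(fen):
    """Count material for both sides from a FEN string."""
    board = fen.split()[0]              # first FEN field: piece placement only
    counts = {c: 0 for c in "PNBRQpnbrq"}
    for ch in board:
        if ch in counts:
            counts[ch] += 1
    return counts

record = count_pieces(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
record["Result"] = "1-0"                # attach the game result separately
```

Each record can then be appended to a list and dumped to JSON, giving the dataset shape shown above.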

Here's my Python code, inspired by this post:
https://www.r-bloggers.com/2015/06/big- ... ss-pieces/

Code:

# packages
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# create data frame from JSON file
positions = pd.read_json('positions.json')

# filter draws
positions = positions[positions['Result'] != '1/2-1/2']

# replace results with binary outcomes
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', -1)
positions['white_win'] = (positions['Result'] == 1)

# define formula variables
win = positions['white_win']
P = positions['P']
N = positions['N']
B = positions['B']
R = positions['R']
Q = positions['Q']
p = positions['p']
n = positions['n']
b = positions['b']
r = positions['r']
q = positions['q']

# define formula
formula = '''
    win ~ I(P - p) +
          I(N - n) +
          I(B - b) +
          I(R - r) +
          I(Q - q)
'''

# build model
piece_weights = smf.glm(formula, family=sm.families.Binomial(), data=positions).fit()

print(piece_weights.summary())
This code produces the following output:

Code:

                      Generalized Linear Model Regression Results                      
=======================================================================================
Dep. Variable:     ['win[False]', 'win[True]']   No. Observations:                 1209
Model:                                     GLM   Df Residuals:                     1203
Model Family:                         Binomial   Df Model:                            5
Link Function:                           logit   Scale:                          1.0000
Method:                                   IRLS   Log-Likelihood:                -660.14
Date:                         Thu, 01 Oct 2020   Deviance:                       1320.3
Time:                                 21:40:23   Pearson chi2:                 1.38e+03
No. Iterations:                              5                                         
Covariance Type:                     nonrobust                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0514      0.100      0.514      0.607      -0.145       0.248
I(P - p)      -0.5222      0.104     -5.022      0.000      -0.726      -0.318
I(N - n)      -0.5374      0.131     -4.116      0.000      -0.793      -0.282
I(B - b)      -0.4953      0.107     -4.650      0.000      -0.704      -0.287
I(R - r)      -0.2619      0.140     -1.870      0.062      -0.536       0.013
I(Q - q)      -2.5064      0.259     -9.695      0.000      -3.013      -2.000
==============================================================================

The piece weights are in the "coef" column, but the values are weird.
I've been struggling for two days in a row and I'm completely stuck.
I just want to calculate piece weights, but it feels like I'll never achieve that goal.
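As an aside, the coefficients in the "coef" column are log-odds per unit of material difference, so they are easier to judge after rescaling. A quick sanity check (a sketch using the magnitudes from the summary above; the signs evidently came out inverted, consistent with the response being encoded as ['win[False]', 'win[True]'] so the model fitted the losing side) is to normalize so a pawn is worth 1:

```python
# Rescale the fitted coefficients so a pawn is worth 1. The numbers are
# the "coef" column from the summary above; magnitudes are used because
# the signs came out flipped by the response encoding.
coefs = {"P": -0.5222, "N": -0.5374, "B": -0.4953, "R": -0.2619, "Q": -2.5064}
pawn = abs(coefs["P"])
values = {piece: round(abs(c) / pawn, 2) for piece, c in coefs.items()}
# values: P=1.0, N=1.03, B=0.95, R=0.5, Q=4.8
```

Even rescaled, the rook lands below the bishop here, which suggests a data problem rather than a method problem.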

The code from the original post is in R, which I don't know:

Code:

games <- as.data.frame(readRDS("milionbase_matrix.rds"))

# filter draws
games <- games[games$result != 0, ]

# create white_win col, assign 1 to it if white wins
games$white_win <- games$result == 1

fit <- glm(white_win ~ I(white_pawn - black_pawn) + I(white_knight - black_knight) + 
                    I(white_bishop - black_bishop) + I(white_rook - black_rook) + I(white_queen - black_queen), 
           data = games, family="binomial")
This one produces "normal" weights.
I feel very confused and frustrated, because my efforts are getting me nowhere.
How on earth can I calculate piece weights using Python?

Please help. Thanks in advance.

Re: How to calculate piece weights with logistic regression?

Posted: Thu Oct 01, 2020 10:51 pm
by Tony P.
Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression :D As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.

Re: How to calculate piece weights with logistic regression?

Posted: Fri Oct 02, 2020 1:43 am
by mvanthoor
Tony P. wrote: Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression :D As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.
Because he wants to understand how it's done, and then do it himself (except for inventing the math).

Re: How to calculate piece weights with logistic regression?

Posted: Sat Oct 03, 2020 4:56 am
by Ferdy
maksimKorzh wrote: Thu Oct 01, 2020 8:50 pm
# replace results with binary outcomes
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', -1)
Perhaps the win probability should be mapped to [0, 1], not [-1, 1]. In that case that would be:
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', 0)
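To see why the [0, 1] mapping matters, here is a minimal from-scratch sketch (pure Python, synthetic data, all names illustrative): the logistic model predicts P(white wins), so the target must be 1 for '1-0' and 0 for '0-1'. Gradient descent then recovers a positive weight for a material-advantage feature:

```python
import math
import random

# Synthetic data: one feature (a material difference in [-3, 3]) and a
# binary outcome drawn from a logistic model with true weight 1.2.
random.seed(1)
data = []
for _ in range(500):
    diff = random.randint(-3, 3)                 # e.g. pawn-count difference
    p_win = 1 / (1 + math.exp(-1.2 * diff))      # true win probability
    data.append((diff, 1 if random.random() < p_win else 0))

# Plain gradient descent on the logistic log-loss.
w, b = 0.0, 0.0
for _ in range(2000):
    gw = gb = 0.0
    for x, y in data:
        pred = 1 / (1 + math.exp(-(w * x + b)))  # predicted P(white wins)
        gw += (pred - y) * x
        gb += pred - y
    w -= 0.1 * gw / len(data)
    b -= 0.1 * gb / len(data)
```

With targets in {0, 1}, the fitted `w` lands near the true 1.2; with targets in {-1, 1}, the gradient above is no longer the log-loss gradient and the estimates stop being interpretable as log-odds.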

Re: How to calculate piece weights with logistic regression?

Posted: Sat Oct 03, 2020 6:04 am
by AndrewGrant
mvanthoor wrote: Fri Oct 02, 2020 1:43 am
Tony P. wrote: Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression :D As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.
Because he wants to understand how its done, and then do it himself (except for inventing the math).
+1 . What I've done is an end product. If you do it yourself, you'll learn more. And maybe you'll improve upon already known ideas :)

EDIT: If the PDF is dead, I can share it. I'll soon upload it to Ethereal's repo, once I finalize some things.

Re: How to calculate piece weights with logistic regression?

Posted: Sat Oct 03, 2020 8:17 am
by Milos
AndrewGrant wrote: Sat Oct 03, 2020 6:04 am
mvanthoor wrote: Fri Oct 02, 2020 1:43 am
Tony P. wrote: Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression :D As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.
Because he wants to understand how its done, and then do it himself (except for inventing the math).
+1 . What I've done is an end product. If you do it yourself, you'll learn more. And maybe you'll improve upon already known ideas :)

EDIT: IF the pdf is dead, I can share it. I'm soon to upload it into Ethereal's repo, once I finalize somethings.
Man, you reinvented the wheel so many times. Astonishing. A small piece of advice: instead of reinventing logistic regression, try something that is known to give much better results, like gradient or tree boosting. And there's no need to write it from scratch; existing frameworks like XGBoost or scikit-learn would give you more accurate results much faster ;).

Re: How to calculate piece weights with logistic regression?

Posted: Sat Oct 03, 2020 9:17 am
by Gerd Isenberg
Hi Maksim,
No idea off the top of my head what is wrong with your code, but see Vladimir Medvedev's article on that topic.
Gerd

Re: How to calculate piece weights with logistic regression?

Posted: Sat Oct 03, 2020 10:14 am
by maksimKorzh
Ferdy wrote: Sat Oct 03, 2020 4:56 am
maksimKorzh wrote: Thu Oct 01, 2020 8:50 pm
# replace results with binary outcomes
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', -1)
Perhaps the win probability should be mapped to [0, 1], not [-1, 1]. In that case that would be:
positions['Result'] = positions['Result'].replace('1-0', 1).replace('0-1', 0)
I've already tracked down the issue - I just had a malformed dataset.
There's no need even to replace the strings with integers, since the Python statsmodels library creates dummy variables automatically in this case.

Re: How to calculate piece weights with logistic regression?

Posted: Sat Oct 03, 2020 10:20 am
by maksimKorzh
OK, I'm finally done with piece weights - after fixing the malformed dataset (it was counting queens and kings from the castling-rights field of the FEN, not just from the position!), I eventually got reasonable piece weights.

Now the challenge is to add the PST (piece-square table) impact to the logistic regression model.
What I can't figure out at the moment is how to define the impact of a certain square, occupied by a piece, on the game result.
Should I use all 64 squares for every piece type as features to predict their impact?
Or is there some better way of doing this?
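One way to fold square information into the same kind of regression (a sketch; the helper name and the mirroring convention are my own choices, not from the thread): give each piece type a 64-element feature vector with +1 on each square holding the white piece and -1 on the vertically mirrored square of the black counterpart, so a single weight per square serves both colors. That is 64 features per piece type, not per piece type per color:

```python
def square_features(fen, white_piece):
    """64-long difference vector for one piece type, a1 = index 0."""
    board = fen.split()[0]
    feats = [0] * 64
    sq = 56                          # FEN lists ranks from a8; a8 = 56 here
    for ch in board:
        if ch == "/":
            sq -= 16                 # jump back to the start of the rank below
        elif ch.isdigit():
            sq += int(ch)            # skip empty squares
        else:
            if ch == white_piece:
                feats[sq] += 1
            elif ch == white_piece.lower():
                feats[sq ^ 56] -= 1  # mirror black's square vertically
            sq += 1
    return feats
```

In a symmetric position the vector is all zeros, exactly like the I(P - p) material differences in the model above, so the fitted per-square weights play the role of a PST entry.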

Re: How to calculate piece weights with logistic regression?

Posted: Sat Oct 03, 2020 8:23 pm
by AndrewGrant
Milos wrote: Sat Oct 03, 2020 8:17 am
AndrewGrant wrote: Sat Oct 03, 2020 6:04 am
mvanthoor wrote: Fri Oct 02, 2020 1:43 am
Tony P. wrote: Thu Oct 01, 2020 10:51 pm Did you see Andrew Grant's article? It's the go-to for all things chess logistic regression :D As a bonus, his code examples are in C. Why would you need Python at all?

The PDF hosting is down now, let us know if it fails to go back up soon.
Because he wants to understand how its done, and then do it himself (except for inventing the math).
+1 . What I've done is an end product. If you do it yourself, you'll learn more. And maybe you'll improve upon already known ideas :)

EDIT: IF the pdf is dead, I can share it. I'm soon to upload it into Ethereal's repo, once I finalize somethings.
Man you reinvented the wheel so many times. Astonishing. A small piece of advice instead of reinventing logistic regression try using something that is known to give much better results like gradient or tree boosting. And no need for writing it from scratch, existing frameworks like XGBoost or Sklearn would give you more accurate results much faster ;).
In my eyes, reinventing the wheel is the only way to understand the wheel.