Rating System

cms271828 · Post by **cms271828** » Fri May 01, 2009 4:32 pm

Hi, I'm building a chess server with a 3D board, each new player will start off with a 1200 rating.

I need a way to increase/decrease ratings in a game.

Eg, Player 1 is 1800, Player 2 is 1400
If player 1 wins, then his score should go up by X, and 2's down by X

If player 2 wins, then his score should go up by Y, and 1's down by Y

But Y should be bigger than X, as player 2 winning deserves more points.

Similarly, a draw should work the same, but with a smaller size score

So a draw between 2 equally rated players, should result in no change.

Can anyone think of the best way to do this?

I was considering something like, for 1800 and 1400, player 1 has 1800/1400 = 1.286 times player 2's rating, and player 2 has 1400/1800 = 0.778 times player 1's.

Suppose we multiply by a constant 25, this gives...
32 points and 19 points

So if 1(1800) wins, he gets 19 points, and 2(1400) loses 19 points

If 1(1800) loses, he loses 32 points, and 2(1400) gains 32 points.

But that doesn't quite work in the case of a draw between 2 players of the same rating, if you were to take a smaller constant, say 10, for a draw score.

Any thoughts? Thanks

Laskos · Post by **Laskos** » Fri May 01, 2009 8:57 pm

I think the FIDE formula with a K-factor would work for you. Try the calculator,
http://ratings.fide.com/calculator_rtd.phtml
then google to see the actual formula, if I remember well it is something very simple.

Kai

cms271828 · Post by **cms271828** » Sat May 02, 2009 12:23 am

Thanks, thats helpful.

I'm just a little unsure of the K thing.
I guess I could just use a fixed value of K, its not that important.
But still, I donno what the formula is, and I can't seem to google it.

Any thoughts? Thanks

Laskos · Post by **Laskos** » Sat May 02, 2009 2:11 am

cms271828 wrote:Thanks, thats helpful.

I'm just a little unsure of the K thing.
I guess I could just use a fixed value of K, its not that important.
But still, I donno what the formula is, and I can't seem to google it.

Any thoughts? Thanks

Maybe this link would help you, search the part with the K-factor
http://en.wikipedia.org/wiki/Elo_rating_system
I guess you would use a fixed K-factor. Too low (8 for example) will make the rating less sensitive to each game's result, too high (50) will make it more sensitive. Choose some value (I guess 25-30 for not a life-time job), and play on

Regards,
Kai

MattieShoes · Post by **MattieShoes** » Sat May 02, 2009 4:39 am

The problem with dividing ratings is 1800/1400 is different than 2400/2000. So the difference in ratings would change meaning based on the ratings. It's generally preferable to use an absolute rating difference.
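A quick way to see this is to compare the two pairs directly. The following is only a small sketch; the helper name `ratio_and_difference` is made up for illustration.

```python
def ratio_and_difference(rating_a, rating_b):
    """Return (ratio, difference) for two ratings. Hypothetical helper."""
    return rating_a / rating_b, rating_a - rating_b

low_pair = ratio_and_difference(1800, 1400)
high_pair = ratio_and_difference(2400, 2000)

# Both pairs are 400 points apart, but the ratios disagree.
print(round(low_pair[0], 3), low_pair[1])    # 1.286 400
print(round(high_pair[0], 3), high_pair[1])  # 1.2 400
```

With a ratio-based scheme the same 400 point gap would be rewarded differently depending on where on the scale the two players sit.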

Normally rating systems do something like this.

1) Calculate rating difference

Code: Select all

ratingDiff = opponentRating - ownRating

2) calculate expected score based on rating difference

Code: Select all

expectedScore = 1/(1+10^(ratingDiff/spread))

3) calculate difference between real and expected score

Code: Select all

scoreDiff = actualScore - expectedScore

4) multiply the difference between real and expected score by a number, the K factor.

Code: Select all

newRating = oldRating + scoreDiff * kFactor

Most rating systems use this scheme and it works well. How they do each step may vary....
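The four steps can be sketched in Python roughly as follows. This is a minimal illustration, not any federation's official code: the function and parameter names are my own, and the defaults (spread 400, K factor 25) are just the values discussed in this thread. The rating difference is taken as opponent minus own rating, so the higher rated player gets an expected score above 0.5.

```python
def expected_score(own_rating, opponent_rating, spread=400):
    """Steps 1 and 2: rating difference, then logistic expected score."""
    rating_diff = opponent_rating - own_rating
    return 1 / (1 + 10 ** (rating_diff / spread))

def new_rating(own_rating, opponent_rating, actual_score, k_factor=25, spread=400):
    """Steps 3 and 4: actual_score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss."""
    score_diff = actual_score - expected_score(own_rating, opponent_rating, spread)
    return own_rating + score_diff * k_factor

# The original 1800 vs 1400 question: the upset is worth far more.
favourite_wins = new_rating(1800, 1400, 1.0)  # gains roughly 2.3 points
underdog_wins = new_rating(1400, 1800, 1.0)   # gains roughly 22.7 points
```

Two equally rated players who draw both have an expected score of 0.5, so their ratings do not move.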

Some add a "home field advantage" value to the rating difference in step 1 (player with white pieces is expected to outperform his rating slightly, player with black pieces to underperform slightly). Using Elo'ish numbers, the home field advantage is probably between 15 and 40 points. For chess variants, home field advantage might be significantly higher or lower.
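One way this could look in code is sketched below. The `white_bonus` value of 30 is an assumption picked from inside the 15 to 40 range mentioned above, and the function name is made up for the example.

```python
def expected_score_with_white_bonus(own_rating, opponent_rating, plays_white,
                                    white_bonus=30, spread=400):
    """Expected score when whoever has the white pieces gets a rating bonus.

    white_bonus=30 is an illustrative guess, not a measured value.
    """
    own_effective = own_rating + (white_bonus if plays_white else 0)
    opponent_effective = opponent_rating + (0 if plays_white else white_bonus)
    rating_diff = opponent_effective - own_effective
    return 1 / (1 + 10 ** (rating_diff / spread))

# Two 1500 players: white is expected to score a little over half.
white_expected = expected_score_with_white_bonus(1500, 1500, plays_white=True)
black_expected = expected_score_with_white_bonus(1500, 1500, plays_white=False)
```

The two expected scores still add up to 1, so the rest of the update works unchanged.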

Spread is usually 400, which means that if you're 200 points higher than your opponent, you should score about 76%. This means a 1400 should score 76% against a 1200, and a 2800 should score 76% against a 2600.

The exact formula for expected score can be different -- Elo used some complex formula and constructed a table. USCF and FIDE have done away with this and gone to a logistic regression. Sonas advocated a linear prediction scheme.

Number 3 is the same about everywhere, (score-expected) gives you how much you outperformed or underperformed. I think Bayeselo does more complex things here, but with sufficient games against a range of ratings, it's probably not terribly significant.

K factor can be a fixed number, 25 perhaps, or shifted based on ratings (FIDE makes high rated players shift ratings slower), or it can be dynamic (for instance, Glicko). Lower K factor makes for slower rating changes, higher makes it react faster. For real OTB chess, something in the neighborhood of 25 seems about ideal, but for online chess where there's more, faster games, a lower value is probably better.
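As a rough illustration of a K factor that shifts with rating, here is a sketch. The thresholds and K values are made up for the example and are not FIDE's (or anyone's) actual table.

```python
def k_factor_for(rating):
    """Hypothetical K schedule: higher rated players move more slowly."""
    if rating < 2000:
        return 30
    if rating < 2400:
        return 20
    return 10

def new_rating(own_rating, opponent_rating, actual_score, spread=400):
    """Standard update, but K depends on the player's own rating."""
    expected = 1 / (1 + 10 ** ((opponent_rating - own_rating) / spread))
    return own_rating + (actual_score - expected) * k_factor_for(own_rating)

# Beating an equally rated opponent is worth half the K factor.
club_player = new_rating(1500, 1500, 1.0)  # 1515.0
top_player = new_rating(2500, 2500, 1.0)   # 2505.0
```

One side effect worth knowing about: when the two players have different K factors, the points one gains no longer equal the points the other loses.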

How one deals with newbies is different from system to system too. Ideally, you want to get the newbie to his real rating as fast as possible so he doesn't screw up everybody else's ratings in the meantime. Glicko does this by making the system not conserve points -- with newbie vs established player, the newbie's rating changes a lot and the established player's rating changes very little. This makes sense but it's a bit cumbersome. There are other systems.

A few other things:

Rating systems tend to be deflationary, because people tend to get better as they play games. Nobody has really made a good system to counteract this.

In a chess server environment, people pick their own opponents. Cherry picking can distort ratings severely. There's not really a good way to prevent this other than perhaps keeping track of a separate rating where the player doesn't get to pick their opponents. One could try to enforce playing a wider range of opponents, but that's problematic too.

Ratings are relative to the pool in which they were achieved. If you suddenly have an influx of bad players, the average strength will decrease while average rating won't. Higher rated players will have their ratings skyrocket because now they're further above the new average strength. I think the availability of chess to the masses is part of the reason GM ratings are inflating in FIDE -- there are more weak players choosing to play competitively in the system than there were 50 years ago. That's just my own opinion though, nothing to back it.

I don't think it has anything to do with GMs being better than they were 50 years ago. They might be, but that's not what's causing the inflation IMHO.

One interesting idea:
Rather than having ratings be relative to the pool, have them relative to a computer that plays at a relatively fixed strength. This could be achieved by simply not adjusting the computer's rating at the end of a game, but adjusting the human's as normal. Care would have to be taken to make sure the computer plays randomly enough that repeated games with the same losing line aren't likely. Now if bad players join the pool, the average strength of the pool will decrease. The comp will win more, and take more points from the pool than it puts in, moving the average rating down to where it should be. The inverse would also be true. This would also help counteract ratings deflation due to players improving -- the comp would always distribute or remove points from the pool to keep everything relative to the comp's strength. It's not a perfect system but I think it has merit.
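A minimal sketch of that idea, assuming the same logistic update as before: the human is adjusted as normal and the computer's rating is simply returned unchanged. The function name and the 1500 anchor rating are made up for illustration.

```python
def play_vs_anchor(human_rating, anchor_rating, human_score,
                   k_factor=25, spread=400):
    """Update only the human; the computer (anchor) keeps its rating."""
    expected = 1 / (1 + 10 ** ((anchor_rating - human_rating) / spread))
    new_human = human_rating + (human_score - expected) * k_factor
    return new_human, anchor_rating

# A 1500 human loses to a 1500 anchor: 12.5 points leave the pool entirely.
human_after, anchor_after = play_vs_anchor(1500, 1500, 0.0)
```

Because the anchor never absorbs or gives back the points, each game against it adds points to or removes points from the human pool.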

cms271828 · Post by **cms271828** » Sun May 03, 2009 3:49 am

Thanks for the reply, but I don't really get it all.

I'm not sure what 'spread' is, and also 'actual score'.

And how does it work for wins/draws?

I believe two equally rated players that obtain a draw should have their ratings remain the same.

Thanks again

MattieShoes · Post by **MattieShoes** » Sun May 03, 2009 7:06 am

Spread controls how far apart the ratings are. Lower spread would create a bunch of tightly packed ratings, and a smaller rating difference would be more significant. Higher spread would do the opposite. Since most rating systems use about 400, it's best to use something near that just so people familiar with other rating systems correctly interpret what the rating system is telling them.
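To get a feel for it, here is a small sketch that computes the expected score of a player rated 100 points above the opponent under three different spreads. The 200 and 800 values are only there for comparison, and the function name is my own.

```python
def expected_score(own_rating, opponent_rating, spread=400):
    """Logistic expected score; rating difference is opponent minus own."""
    return 1 / (1 + 10 ** ((opponent_rating - own_rating) / spread))

# Same 100 point edge, different spreads.
for spread in (200, 400, 800):
    print(spread, round(expected_score(1600, 1500, spread), 2))
# 200 0.76
# 400 0.64
# 800 0.57
```

With a spread of 200, a 100 point edge means what a 200 point edge means under the usual 400.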

Actual score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.

Suppose a 2000 plays an 1800. Let's assume no home field advantage, a spread of 400, and a K factor of 25.

Code: Select all

For the 2000, the rating difference is 1800 - 2000 = -200
For the 1800, the rating difference is 2000 - 1800 = 200

The expected score for the 2000 is 
1/(1+10^(-200/400)) = ~0.76

The expected score for the 1800 is 
1/(1+10^(200/400)) = ~0.24

If the 2000 wins the game...
The 2000's actual score would be 1.0
New rating would be 2000 + (1.0 - 0.76) * 25 = 2006

The 1800's actual score would be 0.0
New rating would be 1800 + (0.0 - 0.24) * 25 = 1794


If they got a draw instead....
The 2000's actual score would be 0.5
New rating would be 2000 + (0.5 - 0.76) * 25 = 1993.5

The 1800's actual score would be 0.5
New rating would be 1800 + (0.5 - 0.24) * 25 = 1806.5


And if the 1800 won the game...
The 2000's actual score would be 0.0
New rating would be 2000 + (0.0 - 0.76) * 25 = 1981

The 1800's actual score would be 1.0
New rating would be 1800 + (1.0 - 0.24) * 25 = 1819

The expected score for two equally rated players would be 0.5 so in the case of a draw, the ratings would stay the same.

... Unless you're taking the advantage of playing the white pieces into account, in which case the player with the white pieces might lose about a point, the player with the black pieces might gain a point.
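For anyone who wants to check the 2000 vs 1800 numbers (without home field advantage), they can be reproduced with a short script. This is only a sketch under the same assumptions as the example (spread 400, K factor 25); the function names are my own.

```python
def expected_score(own_rating, opponent_rating, spread=400):
    """Logistic expected score; rating difference is opponent minus own."""
    return 1 / (1 + 10 ** ((opponent_rating - own_rating) / spread))

def new_rating(own_rating, opponent_rating, actual_score, k_factor=25):
    return own_rating + (actual_score - expected_score(own_rating, opponent_rating)) * k_factor

# (score of the 2000, score of the 1800) for each possible result
outcomes = {"2000 wins": (1.0, 0.0), "draw": (0.5, 0.5), "1800 wins": (0.0, 1.0)}

results = {}
totals = {}
for label, (score_2000, score_1800) in outcomes.items():
    higher = new_rating(2000, 1800, score_2000)
    lower = new_rating(1800, 2000, score_1800)
    results[label] = (round(higher, 1), round(lower, 1))
    totals[label] = higher + lower  # stays at 3800 with a shared fixed K
```

The rounded results match the figures in the worked example, and with a single fixed K factor the sum of the two ratings is the same before and after the game.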

cms271828 · Post by **cms271828** » Sat May 16, 2009 2:48 am

Excellent, I only just saw this post since I last logged in.

This is exactly what I need. I don't want to involve number of games played or anything like that, so this is perfect.

Thanks again.
