ELO inflation ha ha ha


Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: About expected scores and draw ratios.

Post by Adam Hair »

Ajedrecista wrote: Hello Henk:
Henk wrote: Someone told me that a draw between a 2200 and a 2600 player would be more likely than a draw between an 1800 and a 2200 player. So that is nonsense.

By the way, how do you convert an ELO rating back into a chance?
I guess that you mean 'Elo difference into expected score'. The classical, well-known Elo formula is:

(Elo difference) = (own rating) - (opponent's rating)
(Expected score) = 1/{1 + 10^[-(Elo difference)/400]} = (win ratio) + 0.5*(draw ratio)

(Win ratio) + (draw ratio) + (lose ratio) = 1

It is useful with thousands of games. You cannot claim anything from 10 games or so.
The error bars are proportional to (games)^(-0.5).
There is much information about error bars in this forum.
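The (games)^(-0.5) scaling of the error bars can be sketched as follows. This is a minimal illustration, not a method taken from the thread: the function name `score_standard_error` is hypothetical, and it assumes every game is an independent draw from fixed win, draw, and loss ratios.

```python
import math

def score_standard_error(win_ratio, draw_ratio, games):
    """Standard error of the mean score per game (assumes independent games)."""
    loss_ratio = 1.0 - win_ratio - draw_ratio
    mean_score = win_ratio + 0.5 * draw_ratio
    # Variance of one game's score, which is 1, 0.5 or 0.
    variance = (
        win_ratio * (1.0 - mean_score) ** 2
        + draw_ratio * (0.5 - mean_score) ** 2
        + loss_ratio * (0.0 - mean_score) ** 2
    )
    return math.sqrt(variance / games)

# Same ratios, increasing number of games: the error bar shrinks slowly.
for games in (10, 100, 1000, 10000):
    print(games, round(score_standard_error(0.4, 0.3, games), 4))
```

Quadrupling the number of games only halves the error bar, which is why a handful of games says very little.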
Kai said something like 'the expected score is the same in a match of 3800 vs. 3000 or 1800 vs. 1000' but the draw ratio would be very different, of course.

Kai stated a 700-Elo difference and 3% draws, IIRC. Just using the Elo formula:

1/{1 + 10^[-(700)/400]}  ~ 0.9825 = 98.25% (for the stronger engine).
1/{1 + 10^[-(-700)/400]} ~ 0.0175 =  1.75% (for the weaker engine).

If the weaker engine cannot win a single game after lots of games (win ratio = 0):
(Draw ratio) = [(expected score) - (win ratio)]/0.5 = (0.0175 - 0)/0.5 = 0.035 = 3.5%
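The arithmetic above can be checked with a short sketch. The function names `expected_score` and `implied_draw_ratio` are illustrative only; the formulas are the ones quoted in this post.

```python
def expected_score(elo_difference):
    """Logistic Elo expected score; elo_difference = own rating - opponent's rating."""
    return 1.0 / (1.0 + 10.0 ** (-elo_difference / 400.0))

def implied_draw_ratio(expected, win_ratio):
    """Solve (expected score) = (win ratio) + 0.5*(draw ratio) for the draw ratio."""
    return (expected - win_ratio) / 0.5

stronger = expected_score(700)
weaker = expected_score(-700)
print(f"stronger engine: {stronger:.4f}")
print(f"weaker engine:   {weaker:.4f}")
# If the weaker engine never wins, its whole score must come from draws.
print(f"implied draw ratio: {implied_draw_ratio(weaker, 0.0):.3f}")
```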
Given a fixed Elo gap, the draw ratio is expected to be higher when the average rating is higher, because fewer blunders are expected; however, the expected score should be the same. You have an example here:

Draw Rate

I hope this info could be useful to you.

Regards from Spain.

Ajedrecista.
Always great to see a reference to Kirill's excellent KCEC, even if he has mostly retired from computer chess.
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: ELO inflation ha ha ha

Post by Dirt »

JJJ wrote:
Henk wrote:TCEC rapid: Stockfish-Delphil 1/2-1/2 but ELO difference 900 points.

I can't imagine I would draw against an 800 player if my ELO rating were 1700.
1700 is more than double 800
2300 is 72% of 3200

So it's still unlikely to see a draw, but it can happen on very rare occasions.
Double means nothing. The base is arbitrary, so it would be the same if it were 1000 Elo playing 100 Elo.
Deasil is the right way to go.
Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: ELO inflation ha ha ha

Post by Uri Blass »

Laskos wrote:
Henk wrote: Isn't it so that the further away from average, the less impact an ELO difference has? I guess ELO 1600 is the average strength of a player, or is it more like 1200?
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
If the elo model is correct.

I believe it is not correct, and that it is impossible to have the same expected score for every constant difference.

Suppose the expected score of the weaker player at a 100 elo difference is 38%.

A1 has a rating of 1000.
A2 scores 62% against A1, so A2 has a rating of 1100.
A3 scores 62% against A2, so A3 has a rating of 1200.
A4 scores 62% against A3, so A4 has a rating of 1300.

I believe that the expected score of A9 against A1 is not the same as the expected score of A29 against A21, and if I am correct, it means that the elo model is wrong.
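For reference, here is a small sketch of what a logistic model predicts for such a ladder. It is an illustration under stated assumptions, not a test of whether real players or engines behave this way. The function name `score_after_steps` is hypothetical; the only inputs taken from the post are the 62% per-step score and the 8-step gap (A9 vs. A1, A29 vs. A21).

```python
def score_after_steps(step_score, steps):
    """Expected score across `steps` rungs if each adjacent rung scores `step_score`.

    Under a logistic model the winning odds multiply from rung to rung,
    so the result depends only on the number of rungs, not on where the
    two players sit on the ladder.
    """
    odds = step_score / (1.0 - step_score)
    total_odds = odds ** steps
    return total_odds / (1.0 + total_odds)

# A9 vs. A1 and A29 vs. A21 are both 8 rungs apart, so the model gives one number.
print(f"{score_after_steps(0.62, 8):.3f}")
```

With these inputs the model gives roughly 98% for any pair that is 8 rungs apart. Whether measured results agree is the empirical question raised here.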
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: ELO inflation ha ha ha

Post by Laskos »

Uri Blass wrote:
Laskos wrote:
Henk wrote: Isn't it so that the further away from average, the less impact an ELO difference has? I guess ELO 1600 is the average strength of a player, or is it more like 1200?
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
If the elo model is correct.

I believe it is not correct, and that it is impossible to have the same expected score for every constant difference.

Suppose the expected score of the weaker player at a 100 elo difference is 38%.

A1 has a rating of 1000.
A2 scores 62% against A1, so A2 has a rating of 1100.
A3 scores 62% against A2, so A3 has a rating of 1200.
A4 scores 62% against A3, so A4 has a rating of 1300.

I believe that the expected score of A9 against A1 is not the same as the expected score of A29 against A21, and if I am correct, it means that the elo model is wrong.
Uri, I made such experiments, look here:
http://www.talkchess.com/forum/viewtopic.php?t=60791

The logistic model comes off fairly well as an ELO model; the Gaussian, not so much. Also, if you read the thread to the end, I showed that the Rao-Kupper draw model used in Bayeselo is pretty much ruled out.
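To see how the two model shapes differ, here is a minimal sketch. It does not reproduce the experiments in the linked thread. The Gaussian parametrization is an assumption: each player's performance is taken as normally distributed with a standard deviation of 200 Elo, the classical choice, so the difference has a standard deviation of 200*sqrt(2). The function names are illustrative.

```python
import math

def logistic_expected_score(elo_difference):
    """Logistic Elo curve: 1 / (1 + 10^(-d/400))."""
    return 1.0 / (1.0 + 10.0 ** (-elo_difference / 400.0))

def gaussian_expected_score(elo_difference, player_sd=200.0):
    """Normal-CDF Elo curve (assumed per-player standard deviation of 200)."""
    # The difference of two independent normal performances has sd = player_sd * sqrt(2).
    z = elo_difference / (player_sd * math.sqrt(2.0))
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for diff in (100, 400, 700):
    print(diff,
          round(logistic_expected_score(diff), 4),
          round(gaussian_expected_score(diff), 4))
```

The two curves nearly coincide for small differences and separate in the tails, where the Gaussian predicts more lopsided scores, so large rating gaps are where the models can be told apart.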
Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: ELO inflation ha ha ha

Post by Norm Pollock »

Can someone explain the basis for saying that Komodo 10.1 is 500 Elo points better than Carlsen? That implies Carlsen would score less than 8% in a head-to-head match.

CCRL 40/40 Komodo 10.1 = 3375
CEGT 40/20 Komodo 10.1 = 3356

FIDE Standard Carlsen = 2857
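The "less than 8%" figure can be checked with the logistic Elo formula. This sketch assumes, purely for the sake of the arithmetic, that the engine lists and the FIDE list share one scale, which is exactly what is in dispute here. The variable names are illustrative; the ratings are the ones listed above.

```python
def expected_score(elo_difference):
    """Logistic Elo expected score; elo_difference = own rating - opponent's rating."""
    return 1.0 / (1.0 + 10.0 ** (-elo_difference / 400.0))

human_rating = 2857  # FIDE Standard rating listed above
engine_ratings = {"CCRL 40/40": 3375, "CEGT 40/20": 3356}

# Expected score of the human against the engine, per rating list.
human_scores = {
    rating_list: expected_score(human_rating - engine_rating)
    for rating_list, engine_rating in engine_ratings.items()
}

for rating_list, score in human_scores.items():
    print(f"{rating_list}: expected human score {score:.1%}")
```

Both results fall around 5%, so the 8% bound holds if the ratings are taken at face value.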
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: ELO inflation ha ha ha

Post by Henk »

I doubt that computer ratings are FIDE ratings. Also, engines are tuned to play against other engines. Maybe grandmasters will adapt to engines if they use them a lot.
Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: ELO inflation ha ha ha

Post by Norm Pollock »

Elo inflation is to be expected as more is known about the openings and the other intricacies of the game. Carlsen is about 100 points higher than Fischer was, and that's over a 45-year gap.

But the inflation of computer chess engine Elo has been much faster due to the rapid development of hardware and software. There is no organization comparable to FIDE for chess engine Elo ratings. Each rating list uses its own proprietary ratings.

And there is no basis to compare human chess and computer chess ratings. The Deep Blue v Kasparov match showed they were equally competitive. But that was a very short match.

There should either be a new Elo for computers, call it "EloC", or some basis for comparison. I could see a 200 to 300 point advantage for computer engines at the top level, but 500 is ridiculous.
APassionForCriminalJustic
Posts: 417
Joined: Sat May 24, 2014 9:16 am

Re: ELO inflation ha ha ha

Post by APassionForCriminalJustic »

Laskos wrote:
Henk wrote: Isn't it so that the further away from average, the less impact an ELO difference has? I guess ELO 1600 is the average strength of a player, or is it more like 1200?
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
ELO difference is NOT the only thing that counts. The playing strength of both players matters greatly. A 1600 rated player should win "most"/ALL games versus a 1000 rated player solely because a 1600 rated player is tournament strength with decent chess skills - whereas a 1000 rated player is nothing more than a novice who lacks even basic knowledge of chess literature and fundamental play. If you had a 2750 rated player playing a candidate master of 2000 FIDE strength, then I would argue that a draw is certainly more probable than with, say, a 1750 rated player facing a 1000 rated player. The 2000 rated player has good skill, good comprehension of both middlegame and endgame technique(s), his or her own opening-book repertoire, etc. What does a 1000 rated player have? Absolutely nothing but novice play. Math and statistical models have their place; something like this is just downright common sense. Delphil is a weak engine by engine standards - but it still plays PROFESSIONAL/MASTER level chess. Stockfish would win virtually every single game at every single time control versus the 2300 rated "David" (David versus Goliath), but not all. There is nothing inconceivable about Delphil's astonishing draw.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: ELO inflation ha ha ha

Post by Laskos »

APassionForCriminalJustic wrote:
Laskos wrote:
Henk wrote: Isn't it so that the further away from average, the less impact an ELO difference has? I guess ELO 1600 is the average strength of a player, or is it more like 1200?
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
ELO difference is NOT the only thing that counts. The playing strength of both players matters greatly. A 1600 rated player should win "most"/ALL games versus a 1000 rated player solely because a 1600 rated player is tournament strength with decent chess skills - whereas a 1000 rated player is nothing more than a novice who lacks even basic knowledge of chess literature and fundamental play. If you had a 2750 rated player playing a candidate master of 2000 FIDE strength, then I would argue that a draw is certainly more probable than with, say, a 1750 rated player facing a 1000 rated player. The 2000 rated player has good skill, good comprehension of both middlegame and endgame technique(s), his or her own opening-book repertoire, etc. What does a 1000 rated player have? Absolutely nothing but novice play. Math and statistical models have their place; something like this is just downright common sense. Delphil is a weak engine by engine standards - but it still plays PROFESSIONAL/MASTER level chess. Stockfish would win virtually every single game at every single time control versus the 2300 rated "David" (David versus Goliath), but not all. There is nothing inconceivable about Delphil's astonishing draw.
An ELO table is here:
http://www.pradu.us/old/Nov27_2008/Buzz/elotable.html