## ELO inflation ha ha ha

Ajedrecista
Posts: 1405
Joined: Wed Jul 13, 2011 7:04 pm

### About expected scores and draw ratios.

Hello Henk:
Henk wrote:Someone told me that the chance of a draw from 2200 against 2600 player would be higher than a draw from 1800 against 2200 player. So that is nonsense.

By the way how do you convert ELO rating back into chance.
I guess that you want to say 'Elo difference into expected score'. The classical, well-known Elo formula is:

```
(Elo difference) = (own rating) - (opponent's rating)
(Expected score) = 1/{1 + 10^[-(Elo difference)/400]} = (win ratio) + 0.5*(draw ratio)

(Win ratio) + (draw ratio) + (lose ratio) = 1

It is useful with thousands of games. You can not claim anything with 10 games or so.
There are error bars that are proportional to (games)^(-0.5).
There is much information about error bars in this forum.
```
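For anyone who wants to try numbers, here is a minimal Python sketch of the formula above. The function names `expected_score` and `score_std_error` are made up for illustration. The error-bar helper assumes independent games with fixed win and draw ratios, which is only one simple way to get the (games)^(-0.5) scaling.

```python
import math

def expected_score(elo_diff: float) -> float:
    """Logistic Elo formula; elo_diff = own rating - opponent's rating."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def score_std_error(win: float, draw: float, games: int) -> float:
    """Standard error of the average score over `games` games.

    Assumption: games are independent and each one is a win (1 point),
    draw (0.5) or loss (0) with fixed probabilities.
    """
    mean = win + 0.5 * draw
    variance = win * 1.0 + draw * 0.25 - mean ** 2  # E[X^2] - (E[X])^2
    return math.sqrt(variance / games)

if __name__ == "__main__":
    print(expected_score(0))      # equal ratings
    print(expected_score(100))    # 100 Elo stronger
    # 100 times more games shrinks the error bar by a factor of about 10.
    print(score_std_error(0.4, 0.3, 100) / score_std_error(0.4, 0.3, 10_000))
```

The last line shows why thousands of games are needed: the error bar only shrinks with the square root of the number of games.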
Kai said something like 'the expected score is the same in a match of 3800 vs. 3000 or 1800 vs. 1000' but the draw ratio would be very different, of course.

Kai stated a 700-Elo difference and a draw rate of 3%, IIRC. Just using the Elo formula:

```
1/{1 + 10^[-(700)/400]}  ~ 0.9825 = 98.25% (for the stronger engine).
1/{1 + 10^[-(-700)/400]} ~ 0.0175 =  1.75% (for the weaker engine).

If the weaker engine can not win a single game after lots of games (win ratio = 0):
(Draw ratio) = [(expected score) - (win ratio)]/0.5 = (0.0175 - 0)/0.5 = 0.035 = 3.5%
```
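The same arithmetic as a short Python sketch, in case someone wants to plug in other Elo gaps. The helper names `expected_score` and `draw_ratio` are only illustrative, and the default `win_ratio = 0` is the same assumption as above (the weaker engine never wins).

```python
def expected_score(elo_diff: float) -> float:
    """Logistic Elo formula; elo_diff = own rating - opponent's rating."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def draw_ratio(expected: float, win_ratio: float = 0.0) -> float:
    """Solve (expected score) = (win ratio) + 0.5*(draw ratio) for the draw ratio."""
    return (expected - win_ratio) / 0.5

if __name__ == "__main__":
    weaker = expected_score(-700)    # weaker engine, 700 Elo below
    stronger = expected_score(700)   # stronger engine, 700 Elo above
    print(f"stronger: {stronger:.2%}, weaker: {weaker:.2%}")
    print(f"draw ratio if the weaker side never wins: {draw_ratio(weaker):.2%}")
```

Without rounding the expected score to 0.0175 first, the draw ratio comes out a hair under 3.5%.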
Given a fixed Elo gap, the draw ratio is expected to be higher when the average rating is higher, because fewer blunders are expected; however, the expected score should be the same. Here is an example:

Draw Rate

I hope this info could be useful to you.

Regards from Spain.

Ajedrecista.

Norm Pollock
Posts: 1018
Joined: Thu Mar 09, 2006 3:15 pm
Location: Long Island, NY, USA

### Re: About expected scores and draw ratios.

Elo ratings are based on past performance, which predicts, on average, future performance. When a result greatly contradicts Elo expectations, like this game vs. Delphil, it gives the SF team a heads-up about a deficiency. Perhaps they will find the cause and make the correction. Without these "hiccups", engine development would be much slower.

Posts: 3205
Joined: Wed May 06, 2009 8:31 pm
Location: Fuquay-Varina, North Carolina

### Re: ELO inflation ha ha ha

Henk wrote:
Henk wrote:Isn't the further away from average the less impact an ELO difference has. I guess ELO 1600 is average strength of a player or is it more like 1200.
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
Someone told me that the chance of a draw from 2200 against 2600 player would be higher than a draw from 1800 against 2200 player. So that is nonsense.

By the way how do you convert ELO rating back into chance.
Kai's statement is correct. But the Elo model does not account for higher draw rates in matches between higher quality opponents.

Expected score of player A = 1/(1+10^((Rb-Ra)/400)) where Ra and Rb are the Elo ratings of players A and B.

Posts: 3205
Joined: Wed May 06, 2009 8:31 pm
Location: Fuquay-Varina, North Carolina

### Re: About expected scores and draw ratios.

Ajedrecista wrote:Hello Henk:
[...]
Given a fixed Elo gap, the draw ratio is expected to be higher when the average rating is higher, because fewer blunders are expected; however, the expected score should be the same. Here is an example:

Draw Rate

I hope this info could be useful to you.

Regards from Spain.

Ajedrecista.
Always great to see a reference to Kirill's excellent KCEC, even if he has mostly retired from computer chess.

Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 9:01 pm
Location: Irvine, CA, USA

### Re: ELO inflation ha ha ha

JJJ wrote:
Henk wrote:TCEC rapid: Stockfish-Delphil 1/2-1/2 but ELO difference 900 points.

I can't imagine I would play a draw against a 800 player if my ELO rating would be 1700.
1700 is more than double of 800
2300 is 72% of 3200

So it's still unlikely to see a draw, but it can happen on very rare occasions.
Double means nothing. The base is arbitrary, so it would be the same if it were 1000 Elo playing 100 Elo.
Deasil is the right way to go.

Uri Blass
Posts: 8642
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

### Re: ELO inflation ha ha ha

Henk wrote:Isn't the further away from average the less impact an ELO difference has. I guess ELO 1600 is average strength of a player or is it more like 1200.
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
If the Elo model is correct.

I believe it is not correct, and that it is impossible to have the same expected score for every constant rating difference.

Suppose the expected score of the weaker player at a 100 Elo difference is 38%.

A1 has a rating of 1000.
A2 scores 62% against A1, so A2 has a rating of 1100.
A3 scores 62% against A2, so A3 has a rating of 1200.
A4 scores 62% against A3, so A4 has a rating of 1300.
And so on.

I believe that the expected score of A9 against A1 is not the same as the expected score of A29 against A21, and if I am correct, it means that the Elo model is wrong.
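To make the disagreement concrete, here is a small Python sketch of what a logistic (Elo-type) model predicts for such a chain. It only shows the model's prediction, not whether real players or engines behave that way, which is the open question here. The names `step_odds` and `chain_score` are made up, and the 62% per step is the supposition from the example, not a measured value.

```python
def score_from_odds(odds: float) -> float:
    """Convert odds (expected points for / expected points against) to a score."""
    return odds / (1.0 + odds)

# Supposition from the example: each player scores 62% against the one below.
step_odds = 0.62 / 0.38

def chain_score(i: int, j: int) -> float:
    """Predicted score of A_i against A_j under a logistic model.

    In a logistic model the odds multiply along the chain, so only the
    number of steps (i - j) matters, not where in the chain the pair sits.
    """
    return score_from_odds(step_odds ** (i - j))

if __name__ == "__main__":
    print(f"A2  vs A1 : {chain_score(2, 1):.4f}")
    print(f"A9  vs A1 : {chain_score(9, 1):.4f}")
    print(f"A29 vs A21: {chain_score(29, 21):.4f}")
```

If measured scores for pairs like A9 vs. A1 and A29 vs. A21 differed clearly, beyond the error bars, that would count as evidence against the model.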

Posts: 9725
Joined: Wed Jul 26, 2006 8:21 pm

### Re: ELO inflation ha ha ha

Uri Blass wrote:
[...]
I believe that the expected score of A9 against A1 is not the same as the expected score of A29 against A21 and if I am correct it means that the elo model is wrong.
Uri, I made such experiments, look here:
http://www.talkchess.com/forum/viewtopic.php?t=60791

The logistic model comes off fairly well as an Elo model; the Gaussian, not so much. Also, if you read the thread to the end, I showed that the Rao-Kupper draw model used in Bayeselo is pretty much ruled out.

Norm Pollock
Posts: 1018
Joined: Thu Mar 09, 2006 3:15 pm
Location: Long Island, NY, USA

### Re: ELO inflation ha ha ha

Can someone explain the basis for saying that Komodo 10.1 is 500 Elo points better than Carlsen? That implies Carlsen would score less than 8% in a head-to-head match.

CCRL 40/40 Komodo 10.1 = 3375
CEGT 40/20 Komodo 10.1 = 3356

FIDE Standard Carlsen = 2857
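For reference, a quick Python sketch of what the Elo formula gives for those numbers. It treats the CCRL, CEGT and FIDE ratings as if they were on one common scale, which is exactly the assumption being questioned, so take the output as an illustration only. The function and variable names are my own.

```python
def expected_score(own: float, opponent: float) -> float:
    """Logistic Elo formula for the player rated `own`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - own) / 400.0))

# Ratings quoted above (assumed, for illustration, to share one scale).
komodo_ratings = {"CCRL 40/40": 3375, "CEGT 40/20": 3356}
carlsen_fide = 2857

if __name__ == "__main__":
    for rating_list, komodo in komodo_ratings.items():
        score = expected_score(carlsen_fide, komodo)
        gap = komodo - carlsen_fide
        print(f"{rating_list}: gap {gap} Elo, expected human score {score:.1%}")
```

Both gaps give an expected score of roughly 5%, so "less than 8%" holds with some margin.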

Henk
Posts: 5855
Joined: Mon May 27, 2013 8:31 am

### Re: ELO inflation ha ha ha

I doubt that computer ratings are comparable to FIDE ratings. Also, engines are tuned to play against other engines. Maybe grandmasters will adapt to engines if they use them a lot.

Norm Pollock
Posts: 1018
Joined: Thu Mar 09, 2006 3:15 pm
Location: Long Island, NY, USA

### Re: ELO inflation ha ha ha

Elo inflation is to be expected as more is known about the openings and the other intricacies of the game. Carlsen is about 100 points higher than Fischer was, and that's over a 45-year gap.

But the inflation of computer chess engine Elo has been much faster due to the rapid development of hardware and software. There is no organization comparable to FIDE for computer chess Elo ratings. Each rating list uses its own proprietary ratings.

And there is no basis to compare human chess and computer chess ratings. The Deep Blue v Kasparov match showed they were equally competitive. But that was a very short match.

There should either be a new Elo, call it "EloC", for computers, or some basis for comparison. I could see a 200 to 300 point advantage for computer engines at the top level, but 500 is ridiculous.