ELO inflation ha ha ha

Adam Hair · Post by **Adam Hair** » Sat Sep 17, 2016 5:52 pm

Ajedrecista wrote:Hello Henk:

Henk wrote:Someone told me that the chance of a draw from 2200 against 2600 player would be higher than a draw from 1800 against 2200 player. So that is nonsense.

By the way how do you convert ELO rating back into chance.
I guess that you want to say 'Elo difference into expected score'. The classical, well-known Elo formula is:
(Elo difference) = (own rating) - (opponent's rating)
(Expected score) = 1/{1 + 10^[-(Elo difference)/400]} = (win ratio) + 0.5*(draw ratio)

(Win ratio) + (draw ratio) + (lose ratio) = 1
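For readers who want to try the formula above, here is a minimal Python sketch. The function name `expected_score` is only an illustrative choice, not something from the thread.

```python
def expected_score(own_rating: float, opponent_rating: float) -> float:
    """Expected score under the logistic Elo formula quoted above."""
    elo_difference = own_rating - opponent_rating
    return 1.0 / (1.0 + 10.0 ** (-elo_difference / 400.0))


# Only the rating difference enters the formula, so both pairings
# below produce the same expected score.
print(expected_score(3800, 3000))
print(expected_score(1800, 1000))
```

The result is win ratio plus half the draw ratio, so it says nothing by itself about how many of those points come from draws.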

It is useful with thousands of games. You can not claim anything with 10 games or so.
There are error bars that are proportional to (games)^(-0.5).
There is much information about error bars in this forum.
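A rough Python sketch of that (games)^(-0.5) behaviour, assuming independent games and treating each game as worth 1, 0.5 or 0 points. The helper name `score_standard_error` and the sample results are made up for illustration.

```python
import math


def score_standard_error(wins: int, draws: int, losses: int) -> float:
    """Standard error of the mean score per game, assuming independent games."""
    games = wins + draws + losses
    win_ratio = wins / games
    draw_ratio = draws / games
    score = win_ratio + 0.5 * draw_ratio
    # Per-game variance of a result worth 1 (win), 0.5 (draw) or 0 (loss).
    variance = win_ratio + 0.25 * draw_ratio - score ** 2
    return math.sqrt(variance / games)


# Same win/draw/loss ratios, but 100 times as many games:
# the error bar shrinks by a factor of about 10.
small = score_standard_error(5, 3, 2)
large = score_standard_error(500, 300, 200)
print(small, large, small / large)
```

This is why a 10-game result tells you very little: the uncertainty on the score is still more than ten percentage points.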
Kai said something like 'the expected score is the same in a match of 3800 vs. 3000 or 1800 vs. 1000' but the draw ratio would be very different, of course.

Kai stated a 700-Elo difference and 3% draws, IIRC. Just using the Elo formula:
1/{1 + 10^[-(700)/400]}  ~ 0.9825 = 98.25% (for the stronger engine).
1/{1 + 10^[-(-700)/400]} ~ 0.0175 =  1.75% (for the weaker engine).

If the weaker engine can not win a single game after lots of games (win ratio = 0):
(Draw ratio) = [(expected score) - (win ratio)]/0.5 = (0.0175 - 0)/0.5 = 0.035 = 3.5%
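The arithmetic above can be checked with a few lines of Python. This is only a sketch; `expected_score` and `draw_ratio` are illustrative names, and the zero win ratio is the same assumption made in the post.

```python
def expected_score(elo_difference: float) -> float:
    """Logistic Elo expected score for a given rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_difference / 400.0))


def draw_ratio(expected: float, win_ratio: float) -> float:
    """Solve (expected score) = (win ratio) + 0.5*(draw ratio) for the draw ratio."""
    return (expected - win_ratio) / 0.5


weaker = expected_score(-700)
print(round(weaker, 4))                   # 0.0175
print(round(draw_ratio(weaker, 0.0), 3))  # 0.035
```

Any wins by the weaker engine would lower the draw ratio, so 3.5% is the upper bound at this Elo gap.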
Given a fixed Elo gap, the draw ratio is expected to be higher when the average rating is higher, because fewer blunders are expected; however, the expected score should be the same. You have an example here:

Draw Rate

I hope this info could be useful to you.

Regards from Spain.

Ajedrecista.

Always great to see a reference to Kirill's excellent KCEC, even if he has mostly retired from computer chess.

Dirt · Post by **Dirt** » Sun Sep 18, 2016 11:56 am

JJJ wrote:
Henk wrote:TCEC rapid: Stockfish-Delphil 1/2-1/2 but ELO difference 900 points.

I can't imagine I would play a draw against a 800 player if my ELO rating would be 1700.
1700 is more than double of 800
2300 is 72% of 3200

So it's still unlikely to see a draw, but it can happen on very rare occasions.

Double means nothing. The base is arbitrary, so it would be the same if it were 1000 Elo playing 100 Elo.

Uri Blass · Post by **Uri Blass** » Sun Sep 18, 2016 4:39 pm

Laskos wrote:
Henk wrote:Isn't the further away from average the less impact an ELO difference has. I guess ELO 1600 is average strength of a player or is it more like 1200.
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.

If the Elo model is correct.

I believe it is not correct and it is impossible to have the same expected score for every constant difference.

Suppose the expected score for the weaker player at a 100 Elo difference is 38%

A1 has a rating of 1000
A2 scores 62% against A1 so A2 has a rating of 1100
A3 scores 62% against A2 so A3 has a rating of 1200
A4 scores 62% against A3 so A4 has a rating of 1300.

I believe that the expected score of A9 against A1 is not the same as the expected score of A29 against A21 and if I am correct it means that the elo model is wrong.
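As a reference point for this chain argument, here is a small Python sketch of what the logistic formula itself predicts. It does not settle whether real players or engines behave this way, which is the empirical question being raised. The function names and the 40-player chain are illustrative assumptions; note that under the logistic formula a 62% score corresponds to roughly 85 Elo rather than 100.

```python
import math


def elo_difference(score: float) -> float:
    """Rating gap implied by an expected score under the logistic formula."""
    return 400.0 * math.log10(score / (1.0 - score))


def expected_score(difference: float) -> float:
    """Logistic Elo expected score for a given rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-difference / 400.0))


# Hypothetical chain: each player scores 62% against the previous one.
step = elo_difference(0.62)
ratings = [1000.0 + i * step for i in range(40)]  # ratings[0] is A1

a9_vs_a1 = expected_score(ratings[8] - ratings[0])
a29_vs_a21 = expected_score(ratings[28] - ratings[20])

# In this model the winning odds multiply along the chain.
odds = (0.62 / 0.38) ** 8
print(a9_vs_a1, a29_vs_a21, odds / (1.0 + odds))
```

Inside the model the two scores agree by construction; the claim that the model is wrong amounts to saying measured scores would deviate from these numbers.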

Laskos · Post by **Laskos** » Sun Sep 18, 2016 4:47 pm

Uri Blass wrote:
Laskos wrote:
Henk wrote:Isn't the further away from average the less impact an ELO difference has. I guess ELO 1600 is average strength of a player or is it more like 1200.
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
If the Elo model is correct.

I believe it is not correct and it is impossible to have the same expected score for every constant difference.

Suppose the expected score for the weaker player at a 100 Elo difference is 38%

A1 has a rating of 1000
A2 scores 62% against A1 so A2 has a rating of 1100
A3 scores 62% against A2 so A3 has a rating of 1200
A4 scores 62% against A3 so A4 has a rating of 1300.

I believe that the expected score of A9 against A1 is not the same as the expected score of A29 against A21 and if I am correct it means that the elo model is wrong.

Uri, I made such experiments, look here:
http://www.talkchess.com/forum/viewtopic.php?t=60791

The logistic model comes off fairly well as an ELO model. The Gaussian not so much. Also, if you read the thread to the end, I showed that the Rao-Kupper draw model used in Bayeselo is pretty much ruled out.
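To make the logistic versus Gaussian distinction concrete, here is a hedged Python sketch of the two expected-score curves. The Gaussian version assumes each player's performance is normally distributed with a standard deviation of 200 Elo (Elo's classical choice); that parameter and the function names are assumptions for illustration, not something stated in this thread.

```python
import math


def logistic_expected(difference: float) -> float:
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-difference / 400.0))


def gaussian_expected(difference: float, sigma: float = 200.0) -> float:
    """Expected score if both performances are normal with std dev `sigma`."""
    # The difference of two independent performances has std dev sigma*sqrt(2),
    # so the score is the normal CDF evaluated at difference/(sigma*sqrt(2)).
    return 0.5 * (1.0 + math.erf(difference / (2.0 * sigma)))


for diff in (100, 200, 400, 800):
    print(diff, round(logistic_expected(diff), 4), round(gaussian_expected(diff), 4))
```

The two curves are close for small gaps and drift apart at large ones, which is where long-chain experiments like the one linked above can tell them apart.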

Norm Pollock · Post by **Norm Pollock** » Sun Sep 18, 2016 7:45 pm

Can someone explain the basis for saying that Komodo 10.1 is 500 Elo points better than Carlsen? That implies Carlsen would score less than 8% in a head to head match.

CCRL 40/40 Komodo 10.1 = 3375
CEGT 40/20 Komodo 10.1 = 3356

FIDE Standard Carlsen = 2857
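Plugging the ratings listed above into the logistic formula gives a sense of the implied scores. This sketch assumes, only for the sake of the calculation, that the CCRL, CEGT and FIDE scales are directly comparable, which is exactly what is in dispute here. The dictionary and function names are illustrative.

```python
def expected_score(own_rating: float, opponent_rating: float) -> float:
    """Logistic Elo expected score for `own_rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10.0 ** (-(own_rating - opponent_rating) / 400.0))


HUMAN_RATING = 2857  # FIDE Standard rating quoted in the post
ENGINE_RATINGS = {"CCRL 40/40": 3375, "CEGT 40/20": 3356}

implied = {
    rating_list: expected_score(HUMAN_RATING, engine_rating)
    for rating_list, engine_rating in ENGINE_RATINGS.items()
}
for rating_list, score in implied.items():
    print(f"{rating_list}: human expected score {score:.1%}")
```

Both values come out around 5%, consistent with the "less than 8%" estimate.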

Henk · Post by **Henk** » Mon Sep 19, 2016 12:02 am

Doubt if computer ratings are FIDE ratings. Also they are tuned to play against engines. Maybe grandmasters will adapt to engines if they are using them very much.

Norm Pollock · Post by **Norm Pollock** » Mon Sep 19, 2016 2:12 pm

Elo inflation is to be expected as more is known about the openings and the other intricacies of the game. Carlsen is about 100 points higher than Fischer was, and that's over a 45 year gap.

But the inflation with computer chess engine Elo has been much faster due to the rapid development of hardware and software. There is no organization comparable to FIDE for computer chess Elo ratings. Each rating list uses its own proprietary ratings.

And there is no basis to compare human chess and computer chess ratings. The Deep Blue v Kasparov match showed they were equally competitive. But that was a very short match.

There should either be new Elo, call it "EloC" for computers, or have some basis for comparison. I could see a 200 to 300 advantage for computer engines at the top level, but 500 is ridiculous.

APassionForCriminalJustic · Mon Sep 19, 2016 3:20 pm

Laskos wrote:
Henk wrote:Isn't the further away from average the less impact an ELO difference has. I guess ELO 1600 is average strength of a player or is it more like 1200.
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.

ELO difference is NOT the only difference that counts. The playing strength of both players matters greatly. A 1600 rated player should win "most"/ALL games versus a 1000 rated player solely because a 1600 rated player is tournament strength with decent chess skills - whereas a 1000 rated player is nothing more than a novice who would lack even basic chess literature and fundamental play. If you had a 2750 rated player playing a candidate master of 2000 FIDE strength then I would argue that the draw is certainly more probable than say a 1750 rated player matching a 1000 rated player. The 2000 rated player has good skill, good comprehension of both middlegame and endgame technique(s), his or her own opening-book repertoire, etc. What does a 1000 rated player have? Absolutely nothing but novice play. Math and statistical models have their place; something like this is just downright common sense. Delphi is a weak engine by engine standards - but it still plays PROFESSIONAL/MASTER level chess. Stockfish would win virtually every single game at every single time control versus the 2300 rated "David" (David versus Goliath), but not all. There is nothing inconceivable with Delphi's astonishing draw.

Laskos · Post by **Laskos** » Mon Sep 19, 2016 3:37 pm

APassionForCriminalJustic wrote:
Laskos wrote:
Henk wrote:Isn't the further away from average the less impact an ELO difference has. I guess ELO 1600 is average strength of a player or is it more like 1200.
I don't understand the question. With ELO, only the ELO difference counts, and the same ELO difference has the same winning percentage. So, if the ELO model is correct, then the score of 3800 against 3000 ELO is the same as 1800 versus 1000 ELO.
ELO difference is NOT the only difference that counts. The playing strength of both players matters greatly. A 1600 rated player should win "most"/ALL games versus a 1000 rated player solely because a 1600 rated player is tournament strength with decent chess skills - whereas a 1000 rated player is nothing more than a novice who would lack even basic chess literature and fundamental play. If you had a 2750 rated player playing a candidate master of 2000 FIDE strength then I would argue that the draw is certainly more probable than say a 1750 rated player matching a 1000 rated player. The 2000 rated player has good skill, good comprehension of both middlegame and endgame technique(s), his or her own opening-book repertoire, etc. What does a 1000 rated player have? Absolutely nothing but novice play. Math and statistical models have their place; something like this is just downright common sense. Delphi is a weak engine by engine standards - but it still plays PROFESSIONAL/MASTER level chess. Stockfish would win virtually every single game at every single time control versus the 2300 rated "David" (David versus Goliath), but not all. There is nothing inconceivable with Delphi's astonishing draw.

An ELO table is here:
http://www.pradu.us/old/Nov27_2008/Buzz/elotable.html
