Draw value

Kempelen · Post by **Kempelen** » Mon Aug 06, 2012 10:59 am

Hello,

During the last week I have been working on a tool to calculate Elo from PGN, like elostat or bayeselo. Not because I don't trust them, but because I want to learn how they work and have more control over what I test, and even to run experiments like the one I describe here.

I have modified my tool to take White's Elo advantage into account. While programming it, I started to debate with myself whether a draw should be worth something other than 0.5. As an experiment I downloaded all games from CCRL 40/40 and ran my tool with different draw values, and the standings changed noticeably. I have tested different draw values, even progressive ones that depend on the opponent's Elo.
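A minimal sketch of the kind of experiment described above, not the actual tool: the function names, the `white_adv` and `draw_value` parameters, and the two made-up result strings are all assumptions for illustration. It shows how two engines that tie at a draw value of 0.5 can swap places once a draw is worth less.

```python
def expected_score(r_white, r_black, white_adv=0.0):
    """Logistic Elo expectation for White; white_adv is a bonus in Elo points."""
    diff = r_white + white_adv - r_black
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def score_points(results, draw_value=0.5):
    """Sum the points of one player; results is a string of 'W', 'D', 'L'."""
    value = {"W": 1.0, "D": draw_value, "L": 0.0}
    return sum(value[r] for r in results)


# Hypothetical results, invented for this example only.
engine_a = "WDDDDL"  # one win, four draws, one loss
engine_b = "WWLLDD"  # two wins, two draws, two losses

for dv in (0.5, 0.4):
    print(dv, score_points(engine_a, dv), score_points(engine_b, dv))
```

With a draw worth 0.5 both engines collect the same number of points; with 0.4 the engine with more decisive games comes out ahead.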

Here are a few questions that are going round in my head:

* Do you think it is a good testing strategy to give different draw values? E.g. 0.45 for a draw with Black and 0.40 for a draw with White, ... or 3-1-0 as in football. This way you encourage an engine that takes risks to win. In Spain we say the best defense is a good attack...

* Should the CCC community change the draw value used to calculate the public Elo lists?

...
Fermin

P.S.: Although it is still in a testing phase and may contain errors, I can give the CCRL lists I have computed with different draw values to anyone interested. Just PM me.

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 06, 2012 2:00 pm

Kempelen wrote:
Hello,
During the last week I have been working on a tool to calculate Elo from PGN, like elostat or bayeselo. Not because I don't trust them, but because I want to learn how they work and have more control over what I test, and even to run experiments like the one I describe here.

Hi Fermin,
That is great. I haven't actually programmed anything but have been studying Elo estimation algorithms. I think you could program elostat blindfold, and bayeselo is not that difficult either. Estimating the mean strength of the players with MM algorithms is easy, but determining the variance could take some matrix inversions.

Kempelen wrote:I have modified my tool to take White's Elo advantage into account. While programming it, I started to debate with myself whether a draw should be worth something other than 0.5. As an experiment I downloaded all games from CCRL 40/40 and ran my tool with different draw values, and the standings changed noticeably. I have tested different draw values, even progressive ones that depend on the opponent's Elo.

Well, you can't do much about that, as 1-0.5-0 is the standard used by FIDE. A draw is half as good as a win, so you cannot change it to an arbitrary value.

Kempelen wrote:Here are a few questions that are going round in my head:

* Do you think it is a good testing strategy to give different draw values? E.g. 0.45 for a draw with Black and 0.40 for a draw with White, ... or 3-1-0 as in football. This way you encourage an engine that takes risks to win. In Spain we say the best defense is a good attack...

I understand what you mean. For example, Barca has a big home advantage and never loses at home, and the draw ratio away from home is high for most teams. So there is a home advantage, and there are draws that reduce the winning ratio. I think when both are combined you will get what you need. Elostat doesn't differentiate, but bayeselo has 4 terms in the MM optimization algorithm that take care of those. One may argue that Barca's home advantage is significantly higher than other teams' and probably deserves a higher value than the rest of the teams. I am not sure, but a constant draw ratio is just a prior for bayeselo, which then modifies it to match the observed data.
Using a 3-1-0 system will change the whole complexion of the game, so you should forget about that. The rewards are fixed at 1-0.5-0 unless FIDE changes them.

Kempelen wrote:* Should the CCC community change the draw value used to calculate the public Elo lists?

Bayeselo uses a 33 Elo advantage based on some WBEC data. So the recommendation is to calculate it from the data itself first. Different tournament conditions (time control etc.) may have an effect on both the home advantage and the draw ratio.
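As a rough illustration of "calculate it from the data itself", here is a crude sketch that converts White's overall score into an Elo offset and reports the draw ratio. It assumes opponents are equally strong on average and ignores individual rating differences, so it is not how bayeselo fits these terms; the function name and the sample counts are made up.

```python
import math


def white_stats(results):
    """results: list of PGN result strings ('1-0', '1/2-1/2', '0-1')."""
    wins = sum(r == "1-0" for r in results)
    draws = sum(r == "1/2-1/2" for r in results)
    losses = sum(r == "0-1" for r in results)
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n  # White's score fraction
    draw_ratio = draws / n
    # Invert the logistic Elo curve to turn the score into an Elo offset.
    white_adv = 400.0 * math.log10(score / (1.0 - score))
    return white_adv, draw_ratio


# Hypothetical sample: 40 White wins, 35 draws, 25 Black wins.
sample = ["1-0"] * 40 + ["1/2-1/2"] * 35 + ["0-1"] * 25
adv, ratio = white_stats(sample)
print(f"white advantage ~ {adv:.1f} Elo, draw ratio = {ratio:.2f}")
```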

cheers

hgm · Post by **hgm** » Mon Aug 06, 2012 2:13 pm

I don't think you can get a consistent rating system by awarding draws anything but 0.5 point. Suppose a draw were worth 0.6 point, and that you have 2 players who play each other 100 times, with every game a draw. Then they both have a score of 60%, and thus each should be 70 Elo stronger as well as 70 Elo weaker than their opponent.
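The contradiction can be checked with the usual logistic Elo curve. This is only a sketch: it assumes a draw value of 0.6 (which is what a 60% score from 100 draws implies), and the function name `elo_diff` is made up.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (logistic Elo curve)."""
    return 400.0 * math.log10(score / (1.0 - score))


draw_value = 0.6
games = 100

# Every game is drawn, so both players collect draw_value per game.
score_a = draw_value * games / games
score_b = draw_value * games / games

# A is 'stronger than B' by this much, and B is 'stronger than A' by the same.
print(round(elo_diff(score_a)), round(elo_diff(score_b)))

# With the standard 0.5 the implied difference is zero for both.
print(elo_diff(0.5))
```

Both players come out about 70 Elo above the other, which cannot both be true; only a draw value of 0.5 gives a difference of 0 for each.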

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 06, 2012 2:50 pm

hgm wrote:I don't think you can get a consistent rating system by awarding draws anything but 0.5 point. Suppose a draw were worth 0.6 point, and that you have 2 players who play each other 100 times, with every game a draw. Then they both have a score of 60%, and thus each should be 70 Elo stronger as well as 70 Elo weaker than their opponent.

I think only the relative importance of a draw with respect to a win changes if the 3-1-0 system is used. Three draws will now be as good as a win and two losses. For your example, the percentages are still 50-50, since when two teams draw only 2 points are handed out and the remaining point is forfeited to penalize both teams. 3 points are always available, but only 2 are gained when a draw occurs. So we cannot calculate percentages right after a game, only after all the points are summed. The situation is similar for multi-player games that assign ranks. It doesn't make sense to calculate percentages in such cases.
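A small sketch of the two ways of turning 3-1-0 points into a percentage that are being contrasted here. The helper names are hypothetical, and results are given from one player's point of view as a string of 'W', 'D', 'L'.

```python
POINTS = {"W": 3, "D": 1, "L": 0}
MIRROR = {"W": "L", "D": "D", "L": "W"}  # the opponent's result in the same game


def share_of_awarded(results):
    """Own points divided by the points both sides actually received."""
    own = sum(POINTS[r] for r in results)
    opp = sum(POINTS[MIRROR[r]] for r in results)
    return own / (own + opp)


def share_of_available(results):
    """Own points divided by the 3 points nominally available per game."""
    return sum(POINTS[r] for r in results) / (3 * len(results))


all_draws = "D" * 100
print(share_of_awarded(all_draws), share_of_available(all_draws))
```

For a match of 100 draws the first measure gives 50% for each side, while the second gives one third for each side, so the two shares no longer add up to 100%.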

hgm · Post by **hgm** » Mon Aug 06, 2012 3:20 pm

Well, if you interpret it that way, I guess you would have to give 1 point for both a draw and a loss. Then, because your opponent also gets a point, the draw counts for two games: one win, one loss. This is exactly what BayesElo does.

Sven · Post by **Sven** » Mon Aug 06, 2012 3:42 pm

Daniel Shawul wrote:
hgm wrote:I don't think you can get a consistent rating system by awarding draws anything but 0.5 point. Suppose a draw were worth 0.6 point, and that you have 2 players who play each other 100 times, with every game a draw. Then they both have a score of 60%, and thus each should be 70 Elo stronger as well as 70 Elo weaker than their opponent.
I think only the relative importance of a draw with respect to a win changes if the 3-1-0 system is used. Three draws will now be as good as a win and two losses. For your example, the percentages are still 50-50, since when two teams draw only 2 points are handed out and the remaining point is forfeited to penalize both teams. 3 points are always available, but only 2 are gained when a draw occurs. So we cannot calculate percentages right after a game, only after all the points are summed. The situation is similar for multi-player games that assign ranks. It doesn't make sense to calculate percentages in such cases.

I think HGM is right. The existing Elo rating formulas would simply not work if a draw is not scored as 0.5, at least not without changes.

From the FIDE Handbook:

8.55

(a) Use table 8.1(b) to determine the player’s score probability PD
(b) ΔR = score – PD. For each game, the score is 1, 0.5 or 0.
(c) ΣΔR x K = the Rating Change for a given tournament, or Rating period.

Let's assume a draw gets assigned a score SCD (SCD != 0.5), the two players are in fact equally rated, and every game is drawn. Table 8.1(b) gives PD=0.5 for each game, so:
ΔR = SCD - 0.5
ΣΔR x K = NumberOfGames x (SCD - 0.5) x K
If NumberOfGames = 4, SCD = 0.6 and K = 10, then both players get a rating change of +4 after that event, for drawing 4 games. That is an "Easy ELO" system; I might be tempted to play 364 games per year under those rules.
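The arithmetic above can be reproduced with a few lines. This is a sketch of the quoted rule 8.55 only, with PD = 0.5 for equally rated players; the function name and argument layout are assumptions.

```python
def rating_change(scores, pd, k):
    """Rule 8.55: sum of (score - PD) over all games, times K."""
    return k * sum(s - pd for s in scores)


scd = 0.6  # hypothetical non-standard draw score
four_draws = [scd] * 4

# Equally rated players, so PD = 0.5 for every game.
change = rating_change(four_draws, pd=0.5, k=10)
print(round(change, 6))

# With the standard draw score the same match changes nothing.
print(rating_change([0.5] * 4, pd=0.5, k=10))
```

Both players would gain about 4 points from the same four draws, whereas with a draw scored as 0.5 the change is exactly 0.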

The reason it does not work that way is that chess is a zero-sum game: one player is rewarded with what his opponent loses. Perhaps a modified table 8.1(b) that is specific to the chosen value of SCD might help. In the special case above with two equally rated players it would solve the problem, since assigning a PD value of exactly SCD for a rating difference of 0 would result in a rating change of 0 for both after a match consisting only of draws. In the most frequent case of two players with different ratings we would get a result as in the following example:

Assume a rating difference of 20, and again NumberOfGames = 4 and K = 10. Currently 8.1(b) shows PD=0.53 and 0.47 respectively; with SCD=0.6 one could assign PD=0.63 for the better player P1 and PD=0.57 for the weaker player P2.

Now for P1:
ΔR = SCD - 0.63 = -0.03
ΣΔR x K = NumberOfGames x -0.03 x K = -1.2

And for P2:
ΔR = SCD - 0.57 = +0.03
ΣΔR x K = NumberOfGames x +0.03 x K = +1.2

So it would now be "zero-sum". The values in the modified table 8.1(b) would have to be chosen carefully, and I am not sure whether it is possible to correctly cover all cases.
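A sketch of this shifted-table example, assuming 4 games, K = 10 and the hypothetical PD values 0.63 and 0.57. The last part goes one step beyond the all-draws case and assumes that wins and losses still score 1 and 0 against the same shifted PD values, which is my assumption and not something stated above.

```python
def match_change(n_games, score_per_game, pd, k):
    """Rating change for n_games that all end with the same per-game score."""
    return n_games * (score_per_game - pd) * k


K = 10
SCD = 0.6                  # hypothetical draw score
PD_P1, PD_P2 = 0.63, 0.57  # hypothetical shifted expectancies

# Four draws: the two changes cancel, as in the example above.
p1_draws = match_change(4, SCD, PD_P1, K)
p2_draws = match_change(4, SCD, PD_P2, K)
print(round(p1_draws, 6), round(p2_draws, 6), round(p1_draws + p2_draws, 6))

# One decisive game, P1 wins: scores are 1 and 0 against the same table.
p1_win = match_change(1, 1.0, PD_P1, K)
p2_loss = match_change(1, 0.0, PD_P2, K)
print(round(p1_win + p2_loss, 6))
```

Under these assumptions the draws balance out, but a decisive game removes 2 rating points from the pool, which illustrates why the table values would have to be chosen carefully.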

But it is at least obvious that such a change of the draw score would also require a change of the winning probability table.

Note that my examples above are still very simple, since I only followed up on HGM's example of a match with 100% drawn games. Reality would be different, but I leave that to others.

Sven

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 06, 2012 3:56 pm

hgm wrote:Well, if you interpret it that way, I guess you would have to give 1 point for both a draw and a loss. Then, because your opponent also gets a point, the draw counts for two games: one win, one loss. This is exactly what BayesElo does.

Actually no. Bayeselo assumes that a win plus a loss is equal to one draw. The other draw model mentioned for the BT model (Davidson) assumes that a win plus a loss is equal to two draws.
Why should a loss be given one point? It is a 3-1-0 system.
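The "one draw" versus "two draws" distinction can be made concrete with the two draw models as they are commonly written (Rao-Kupper for bayeselo, Davidson for the alternative). This is a sketch under that assumption; the parameter values `draw_elo=100.0` and `nu=1.0` are arbitrary illustrations, not anyone's defaults.

```python
def f(x):
    """Logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))


def rao_kupper(delta, draw_elo=100.0):
    """Win/draw/loss probabilities with the draw handled as an Elo margin."""
    win = f(delta - draw_elo)
    loss = f(-delta - draw_elo)
    return win, 1.0 - win - loss, loss


def davidson(delta, nu=1.0):
    """Win/draw/loss probabilities with draw weight nu * sqrt(gi * gj)."""
    gi, gj = 10.0 ** (delta / 400.0), 1.0
    mid = nu * (gi * gj) ** 0.5
    total = gi + gj + mid
    return gi / total, mid / total, gj / total


for delta in (0, 100, 300):
    w, d, l = rao_kupper(delta)
    rk_ratio = d / (w * l)       # constant: one draw ~ win times loss
    w, d, l = davidson(delta)
    dv_ratio = d * d / (w * l)   # constant: two draws ~ win times loss
    print(delta, round(rk_ratio, 6), round(dv_ratio, 6))
```

In the first model P(draw) is proportional to P(win) x P(loss) at every rating difference, so one draw carries the same information as a win plus a loss; in the second it is P(draw) squared that is proportional to that product, so it takes two draws.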

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 06, 2012 4:01 pm

Sven wrote:I think HGM is right. The existing Elo rating formulas would simply not work if a draw is not scored as 0.5, at least not without changes.

From the FIDE Handbook:

By that you mean I am wrong? I pointed out that percentages shouldn't be calculated from the total available points but from the total points actually gained in a game. As I said, percentages don't make a lot of sense in this case, or in multi-player games, where different amounts of total points can be gained. The total number of points available varies from game to game; in the case of a draw it is 2 instead of 3. So you cannot say one side scored 60% when all they did was draw 100% of the games. Instead of 1/3 for a draw it is 1/2 = 50%, since only 2 points got used and the other point is used to penalize the teams so that they don't play for a draw. FIDE has a fixed 1-0.5-0, so I don't understand why you try to use their formulas...

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 06, 2012 4:13 pm

Never mind about a draw being a loss and a win. You said the same thing.

Sven · Post by **Sven** » Mon Aug 06, 2012 4:24 pm

Daniel Shawul wrote:
Sven wrote:I think HGM is right. The existing Elo rating formulas would simply not work if a draw is not scored as 0.5, at least not without changes.

From the FIDE Handbook:
By that you mean I am wrong? I pointed out that percentages shouldn't be calculated from the total available points but from the total points actually gained in a game. As I said, percentages don't make a lot of sense in this case, or in multi-player games, where different amounts of total points can be gained. The total number of points available varies from game to game; in the case of a draw it is 2 instead of 3. So you cannot say one side scored 60% when all they did was draw 100% of the games. Instead of 1/3 for a draw it is 1/2 = 50%, since only 2 points got used and the other point is used to penalize the teams so that they don't play for a draw. FIDE has a fixed 1-0.5-0, so I don't understand why you try to use their formulas...

Please calm down, Daniel

I wrote a long reply that was initially meant as a reply to HGM, but in the meantime you had already answered, so I chose to click on "Reply" at your post. I was not reacting directly to anything you wrote; I had not even read your post in full. So my intention was not to say that you are wrong.

Now regarding the topic itself: I don't favor that "draw score != 0.5" proposal at all. I just wanted to show that it already fails for a trivial case unless you change the percentage expectancy table AND of course the fixed "1 - 0.5 - 0" rule. So we both agree in general, I think!

Whether the actual draw score is 0.6 or 0.45 or 0.4 or 1/3 (as in the "3-1-0" system if I understand it correctly, see also further down) does not matter in principle when analyzing whether a modified Elo system would work at all that way. My current opinion is that it will break at some point.

Regarding the "3-1-0" system, what I understand is that a win scores 3 points, a draw 1 point and a loss 0 points. To easily transform this into the existing Elo system you would divide by 3 and thus keep wins and losses at 1 and 0 points respectively, and draws would get 1/3 point. That is not 50%, so I don't get your point in that part.

On everything else I believe we both agree!

Sven
