Hello,
Over the last week I have been working on a tool to calculate Elo from PGN files, like elostat or bayeselo. Not because I don't trust them, but because I want to learn how they work and have more control over what I test, and even to run experiments like the one I describe here.
I have modified my tool to take the white Elo advantage into account. While programming it, I started to debate with myself whether draws should be worth something other than 0.5. As an experiment I downloaded all games from CCRL 40/40 and ran my tool with different draw values, and the standings changed noticeably. I have tested different draw values, even progressive ones that depend on the opponent's Elo.
Here are a few questions that are going around in my head:
* Do you think it is a good testing strategy to give different draw values? E.g. 0.45 for a draw with black and 0.40 for a draw with white, or 3-1-0 like in football. This way you encourage an engine that takes risks to win. In Spain we say the best defense is a good attack...
* Should the CCC community change the draw value for calculating public Elo lists?
...
Fermin
P.S.: Although it is in a testing phase and may contain errors, I can give the CCRL list I have computed with different draw values to anyone interested. PM me.
Draw value
Moderator: Ras
-
Kempelen
- Posts: 620
- Joined: Fri Feb 08, 2008 10:44 am
- Location: Madrid - Spain
-
Daniel Shawul
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Draw value
Hi Fermin
Kempelen wrote: Hello, over the last week I have been working on a tool to calculate Elo from PGN files, like elostat or bayeselo. Not because I don't trust them, but because I want to learn how they work and have more control over what I test, and even to run experiments like the one I describe here.
That is great. I haven't actually programmed anything but have been studying Elo estimation algorithms. I think you could program elostat blindfold.
Well, you can't do much about that, as it is the standard used by FIDE: 1-0.5-0. So a draw is half as good as a win, and you cannot change it to any other value.
Kempelen wrote: I have modified my tool to take the white Elo advantage into account. While programming it, I started to debate with myself whether draws should be worth something other than 0.5. As an experiment I downloaded all games from CCRL 40/40 and ran my tool with different draw values, and the standings changed noticeably. I have tested different draw values, even progressive ones that depend on the opponent's Elo.
I understand what you mean. For example, Barca has a big home advantage and never loses at home, and the draw ratio away from home is high for most teams. So there is a home advantage, and there are draws that reduce the winning ratio. I think when both are combined you will get what you need. Elostat doesn't differentiate, but bayeselo has 4 terms in the MM optimization algorithm that take care of those. One may argue Barca's home advantage is significantly higher than other teams' and probably deserves a higher value than the rest of the teams. I am not sure, but a constant draw ratio is just a prior for bayeselo, which then modifies it to match the observed data.
Kempelen wrote: Here are a few questions that are going around in my head: * Do you think it is a good testing strategy to give different draw values? E.g. 0.45 for a draw with black and 0.40 for a draw with white, or 3-1-0 like in football. This way you encourage an engine that takes risks to win. In Spain we say the best defense is a good attack...
Using a 3-1-0 system will change the whole complexion of the game, so you should forget about that. The rewards are fixed at 1-0.5-0 unless FIDE changes it.
Bayeselo uses a 33 Elo advantage based on some WBEC data. So the recommendation is to calculate it from the data itself first. Different tournament situations (time control etc.) may have an effect on both home advantage and draw ratio.
Kempelen wrote: * Should the CCC community change the draw value for calculating public Elo lists?
cheers
-
hgm
- Posts: 28419
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Draw value
I don't think you can get a consistent rating system by awarding draws anything but 0.5 point. Suppose a draw were worth 0.6, and that you have 2 players, playing each other 100 times, with every game a draw. Then they both have a score of 60%, and thus each should be 70 Elo stronger as well as 70 Elo weaker than their opponent.
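The 70 Elo figure follows from the usual logistic Elo curve. A minimal Python sketch of that arithmetic, assuming the 60% comes from scoring each draw as 0.6 (the function names are made up for illustration, and the FIDE lookup table may give a slightly different number than the logistic formula):

```python
import math

def elo_diff_from_score(p):
    """Elo difference implied by a score fraction p under the logistic Elo curve."""
    return 400.0 * math.log10(p / (1.0 - p))

def all_draws_score(draw_value, games=100):
    """Score fraction of each player when every one of `games` games is drawn."""
    return draw_value * games / games

for draw_value in (0.5, 0.6):
    score = all_draws_score(draw_value)
    # Both players get the same score, so both get the same implied "advantage".
    print(draw_value, round(elo_diff_from_score(score), 1))
```

With a draw value of 0.5 the implied difference is 0 for both players, which is consistent. With 0.6 both players come out about 70 Elo ahead of each other, which is the contradiction described above.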
-
Daniel Shawul
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Draw value
hgm wrote: I don't think you can get a consistent rating system by awarding draws anything but 0.5 point. Suppose a draw were worth 0.6, and that you have 2 players, playing each other 100 times, with every game a draw. Then they both have a score of 60%, and thus each should be 70 Elo stronger as well as 70 Elo weaker than their opponent.
I think it is only the relative importance of draws with respect to a win that changes if the 3-1-0 system is used. So three draws will now be as good as a win and two losses. For your example, the percentages are still 50-50, since when two teams draw the available points are 2 and the rest are lost to penalize those teams. 3 points are always available but only 2 points are gained when a draw occurs. So we cannot calculate percentages right after a game, but only after all the points are summed. It is a similar situation with multi-player games that assign ranks. It doesn't make sense to calculate percentages in such cases.
-
hgm
- Posts: 28419
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Draw value
Well, if you interpret it that way, I guess you would have to give 1 point for both a draw and a loss. Then because your opponent also gets a point, the draw counts for two games: one win, one loss. This is exactly what BayesElo does.
-
Sven
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Draw value
hgm wrote: I don't think you can get a consistent rating system by awarding draws anything but 0.5 point. Suppose a draw were worth 0.6, and that you have 2 players, playing each other 100 times, with every game a draw. Then they both have a score of 60%, and thus each should be 70 Elo stronger as well as 70 Elo weaker than their opponent.
Daniel Shawul wrote: I think it is only the relative importance of draws with respect to a win that changes if the 3-1-0 system is used. So three draws will now be as good as a win and two losses. For your example, the percentages are still 50-50, since when two teams draw the available points are 2 and the rest are lost to penalize those teams. 3 points are always available but only 2 points are gained when a draw occurs. So we cannot calculate percentages right after a game, but only after all the points are summed. It is a similar situation with multi-player games that assign ranks. It doesn't make sense to calculate percentages in such cases.
I think HGM is right. The existing Elo rating formulas would simply not work when a draw is not scored as 0.5 - at least not without changes.
From the FIDE Handbook:
8.55
(a) Use table 8.1(b) to determine the player’s score probability PD
(b) ΔR = score – PD. For each game, the score is 1, 0.5 or 0.
(c) ΣΔR x K = the Rating Change for a given tournament, or Rating period.
Let's assume a draw gets assigned a score SCD (SCD != 0.5). The two players are in fact equally rated. Table 8.1(b) gives PD=0.5 for each game.
ΔR = SCD - 0.5
ΣΔR x K = NumberOfGames x (SCD - 0.5) x K
If NumberOfGames = 4 and SCD = 0.6 and K = 10 then both players get a rating change of +4 after that event, for drawing 4 games. That is an "Easy ELO" system. I might be tempted to play 364 games per year under those rules.
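The +4 for both players can be checked with a few lines of Python. This is only a sketch of rule 8.55 (b)-(c) as quoted above, with SCD = 0.6 as the hypothetical draw score; `rating_change` is an illustrative helper, not anything from FIDE or an existing tool:

```python
def rating_change(scores, expected, k):
    """K times the sum of (score - PD) over all games, as in 8.55 (b)-(c)."""
    return k * sum(s - pd for s, pd in zip(scores, expected))

SCD = 0.6        # hypothetical draw score (assumption from the example)
GAMES = 4
K = 10
PD_EQUAL = 0.5   # score probability for a rating difference of 0

# Both players draw every game, so both see exactly the same numbers.
change_a = rating_change([SCD] * GAMES, [PD_EQUAL] * GAMES, K)
change_b = rating_change([SCD] * GAMES, [PD_EQUAL] * GAMES, K)
print(round(change_a, 2), round(change_b, 2), round(change_a + change_b, 2))
```

Both changes come out at about +4, so the pair gains about 8 rating points in total from four draws. With SCD = 0.5 the same code gives 0 for both.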
The reason why it does not work that way is that chess is a zero-sum game. One player gets rewarded what his opponent loses. Perhaps a modified table 8.1(b) that is specific for the chosen value of SCD might help. In the special case above with two equally rated players it would solve the case since assigning a PD value of exactly SCD for a rating difference of 0 would result in a rating change of 0 for both after playing a match of only draws. In the most frequent case of two players with different ratings we would get a result as in the following example:
Assume a rating difference of 20. Currently table 8.1(b) shows PD=0.53 and 0.47 respectively. With SCD=0.6 one could assign PD=0.63 for the better player P1 and PD=0.57 for the weaker player P2.
Now for P1 (again with NumberOfGames = 4 and K = 10):
ΔR = SCD - 0.63 = -0.03
ΣΔR x K = NumberOfGames x -0.03 x K = -1.2
And for P2:
ΔR = SCD - 0.57 = +0.03
ΣΔR x K = NumberOfGames x +0.03 x K = +1.2
So it would now be "zero-sum". The values in the modified table 8.1(b) would have to be chosen carefully, and I am not sure whether it is possible to correctly cover all cases.
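A small Python sketch of this shifted-table idea, under the assumption that every PD value is simply raised by SCD - 0.5 (only one possible way to modify the table, and not something the FIDE Handbook defines). It reproduces the all-draws numbers and also tries one decisive game:

```python
SCD, GAMES, K = 0.6, 4, 10   # values from the example above
SHIFT = SCD - 0.5            # assumed uniform shift applied to table 8.1(b)

# Standard PD values for a rating difference of 20, then shifted.
pd_p1 = 0.53 + SHIFT
pd_p2 = 0.47 + SHIFT

def match_change(score_per_game, pd, games, k):
    """Rating change when every game of the match has the same score."""
    return games * (score_per_game - pd) * k

# Match of draws only: the two changes cancel.
draws_p1 = match_change(SCD, pd_p1, GAMES, K)
draws_p2 = match_change(SCD, pd_p2, GAMES, K)
print(round(draws_p1, 2), round(draws_p2, 2))

# One decisive game won by P1: the two changes no longer cancel.
win_p1 = K * (1.0 - pd_p1)
loss_p2 = K * (0.0 - pd_p2)
print(round(win_p1, 2), round(loss_p2, 2), round(win_p1 + loss_p2, 2))
```

For the draws-only match this gives roughly -1.2 and +1.2. For a single win by P1, though, the two changes add up to about -2 instead of 0 with this naive shift, which may be one of the cases that are hard to cover.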
But it is at least obvious that such a change of the draw score would also require a change of the winning probability table.
Note that my examples above are still very simple since I only followed up on HGM's example of a match with 100% drawn games. The reality would be different, though, but I leave that to others.
Sven
-
Daniel Shawul
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Draw value
hgm wrote: Well, if you interpret it that way, I guess you would have to give 1 point for both a draw and a loss. Then because your opponent also gets a point, the draw counts for two games: one win, one loss. This is exactly what BayesElo does.
Actually no. Bayeselo assumes a win and a loss are equal to one draw. The other model mentioned in the BT model (Davidson) assumes a win and a loss are equal to two draws.
Why should a loss be given one point? It is a 3-1-0 system.
Last edited by Daniel Shawul on Mon Aug 06, 2012 4:07 pm, edited 1 time in total.
-
Daniel Shawul
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Draw value
Sven wrote: I think HGM is right. The existing Elo rating formulas would simply not work when a draw is not scored as 0.5 - at least not without changes. From the FIDE Handbook:
By that you mean I am wrong? I pointed out that percentages shouldn't be calculated from the total available points but from the total points gained during a game. As I said, it doesn't make much sense in this case, or in multi-player games where different amounts of total points can be gained. The total number of points available varies from game to game; in the case of draws it is 2 instead of 3. So you cannot say one side scored 60% when all they did was draw 100%. So instead of 1/3 for a draw it is 1/2 = 50%, since only 2 points got used and the other 1 point is used to penalize the teams so that they don't play for a draw. FIDE has a fixed 1-0.5-0, so I don't understand why you try to use their formulas.
-
Daniel Shawul
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Draw value
Never mind about a draw being a loss and a win. You said the same thing.
-
Sven
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Draw value
Sven wrote: I think HGM is right. The existing Elo rating formulas would simply not work when a draw is not scored as 0.5 - at least not without changes. From the FIDE Handbook:
Daniel Shawul wrote: By that you mean I am wrong? I pointed out that percentages shouldn't be calculated from the total available points but from the total points gained during a game. As I said, it doesn't make much sense in this case, or in multi-player games where different amounts of total points can be gained. The total number of points available varies from game to game; in the case of draws it is 2 instead of 3. So you cannot say one side scored 60% when all they did was draw 100%. So instead of 1/3 for a draw it is 1/2 = 50%, since only 2 points got used and the other 1 point is used to penalize the teams so that they don't play for a draw. FIDE has a fixed 1-0.5-0, so I don't understand why you try to use their formulas.
Please calm down, Daniel.
Now regarding the topic itself: I don't favor that "draw score != 0.5" proposal at all. I just wanted to show that it already would not work for a trivial case unless you change the percentage expectancy table AND of course the fixed "1 - 0.5 - 0" rule. So we both agree in general, I think!
Whether the actual draw score is 0.6 or 0.45 or 0.4 or 1/3 (as in the "3-1-0" system if I understand it correctly - see also further down) does not matter in principle when analyzing whether a modified ELO system would work at all that way. My current opinion is, it will break at some point.
Regarding the "3-1-0" system, as I understand it, a win is scored 3 points, a draw 1 point and a loss 0 points. To transform this easily into the existing Elo system you would divide by 3 and thus keep wins and losses at 1 and 0 points respectively, and draws would get 1/3 point. That is not 50%, so I don't get your point in that part.
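The two readings of "3-1-0" in this thread can be put side by side in a short Python sketch. The function names are invented for illustration: one divides by the 3 points available per game, the other divides by the points actually handed out to both sides, which is how I read Daniel's description:

```python
POINTS = {"win": 3, "draw": 1, "loss": 0}
MIRROR = {"win": "loss", "draw": "draw", "loss": "win"}  # opponent's result

def fraction_of_maximum(results):
    """Points scored divided by 3 per game (the divide-by-3 reading)."""
    return sum(POINTS[r] for r in results) / (3 * len(results))

def fraction_of_awarded(results):
    """Points scored divided by the points actually awarded to both players."""
    mine = sum(POINTS[r] for r in results)
    theirs = sum(POINTS[MIRROR[r]] for r in results)
    return mine / (mine + theirs)

all_draws = ["draw"] * 100
print(fraction_of_maximum(all_draws))   # one third: each draw is worth 1 of 3 points
print(fraction_of_awarded(all_draws))   # one half: each side gets 1 of the 2 points awarded
```

For a match of 100 draws the first reading gives 1/3 per player and the second gives 1/2, so the disagreement is about the denominator rather than about the points themselves.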
On everything else I believe we both agree!
Sven