Ordo release (rating software, ELO-like)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Ordo release (rating software, ELO-like)

Post by michiguel »

https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
Ajedrecista
Posts: 1971
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Ordo release (rating software, ELO-like).

Post by Ajedrecista »

Hello Miguel:
michiguel wrote:https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
It is the first time I have tried rating software, and I must say that it was very easy for me. Congratulations! It was also very fast (surely less than five seconds with the PGN sample file that you provide with the download).

I read in the readme.txt file:
ordo -a 2500 -p results.pgn -o rating.txt
And I got this:

Code: Select all

C:\Documents and Settings\~[...]~\ordo-v0.2>ordo -a 2800 -p res
ults.pgn -o rating.txt
"ordo" no se reconoce como un comando interno o externo,
programa o archivo por lotes ejecutable.
There is no need to translate it because you understand Spanish. I had to try ordo-win32 instead of ordo:

Code: Select all

C:\Documents and Settings\~[...]~\ordo-v0.2>ordo-win32 -a 2800
-p results.pgn -o rating.txt

importing results (x1000):
****************************************    40k
****************************************    80k
****************************************   120k
****************************************   160k
*  total games:  161200

set average rating = 2800.000000

phase iteration  deviation
  0       2       26.53425
  1       1       13.71973
  2       3        6.75447
  3       4        3.41538
  4       2        1.74031
  5       2        0.91858
  6       8        0.41953
  7       1        0.18896
  8       1        0.11461
  9       1        0.06470
 10       7        0.02880
 11       2        0.01590
 12       4        0.00854
 13       7        0.00422
 14       3        0.00290
 15      17        0.00115
 16      13        0.00045
 17       1        0.00027
 18       7        0.00016
 19      11        0.00008
done
Could you please explain the meaning of iteration and deviation columns? Thanks in advance.

I get a rating list in a notepad (rating.txt); here is a short example:

Code: Select all

                        ENGINE:  RATING    POINTS  PLAYED    (%)
               Houdini 2.0 STD:  3090.2    2277.5    2900   78.5%
                  Houdini 1.5a:  3084.5    3162.5    4000   79.1%
             Critter 1.4 SSE42:  3055.5    1853.0    2400   77.2%
                Komodo 4 SSE42:  3049.8    1892.5    2500   75.7%
              Komodo64 3 SSE42:  3038.3    2075.5    2800   74.1%
          Deep Rybka 4.1 SSE42:  3032.8    2655.0    3700   71.8%
                   Critter 1.2:  3028.1    2232.0    3100   72.0%
                  Deep Rybka 4:  3027.7    3627.0    4900   74.0%
                 Houdini 1.03a:  3025.8    2520.0    3200   78.8%
          Komodo 2.03 DC SSE42:  3021.0    1985.5    2700   73.5%
            Stockfish 2.1.1 JA:  3012.4    2426.5    3500   69.3%

                          [...]
There are no uncertainties (± ... Elo), but IMHO the programme itself is quite good, gives elegant results and is very fast (processing 161,200 games in less than five seconds). At least, Ordo 0.2 is very fast from my POV... I do not know the speed of other programmes.

I want to congratulate you for the total success of the Gaviota TBs: I have the full set (up to 5-man) and it works flawlessly. Una vez más: ¡muchas gracias! (Once again: thank you very much!)

Regards from Spain.

Ajedrecista.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Ordo release (rating software, ELO-like)

Post by Adam Hair »

michiguel wrote:https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
Excuse me for saying this Miguel, but it is about damn time that you did this. :)

I have wanted to use your program since you told me about it last year.

Seriously, thanks for sharing it.

Adam
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Ordo release (rating software, ELO-like)

Post by Rémi Coulom »

michiguel wrote:https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
Hi,

That looks interesting, but I could not find a description of the algorithm. Your volleyball link is broken. Can you explain a little?

Thanks,

Rémi
Ajedrecista
Posts: 1971
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Ordo release (rating software, ELO-like).

Post by Ajedrecista »

Hello Rémi:
Rémi Coulom wrote:
michiguel wrote:https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
Hi,

That looks interesting, but I could not find a description of the algorithm. Your volleyball link is broken. Can you explain a little?

Thanks,

Rémi
That link works here. I repeat it:

https://www.msu.edu/~ballicor/vb/rankfaq.htm

I copy all the info here. It may contain errors, because I tried to reproduce all the links and the text formatting (bold, italic, ...) by hand (it took me a while):
Back to main page...

Volleyball Computer Ranking F.A.Q.

--------------------------------------------------------------------------------

Index

Did you make the computer rankings program yourself?
Can you give a brief overview of the ranking system?
Can you give more details of the ranking system?
Where do you get the results from?
Why does my team have a different record in the list you show?
Can I send you results from my team?
Is it correct that your computations operate entirely at the level of games won and lost, with no weight at all given specifically for winning or losing the match?
Does the ranking consider whether a team is improving?
What do the "points" mean?
Does it take into account home court advantage?

--------------------------------------------------------------------------------

Answers

·Did you make the computer rankings program yourself?
I made the program by myself for fun, in my spare time, very... slowly. Part is in C, part in Turbo Pascal.

·Can you give a brief overview of the ranking system?
In a thread in rec.sport.volleyball (1996; you may find some discussions about the rating system in the archives of rec.sport.volleyball at http://www.dejanews.com), Eric Wang from Illinois explained it better than I could:

"Miguel models his rankings after FIDE chess, which is a zero-sum system in which the winner takes some of the loser's points, depending on the point difference between them. Upsets are worth huge rewards (IIRC, ~50 in one match is "huge"), overruns vs fish are worth next to nothing. Points enter the system when new players join, and leave when they retire. But this system is actually quite similar to the poll system, since pollsters instinctively use a similar algorithm (even if they don't know it :-). A player/team doesn't get ranked above a player he just beat, unless they were within ~20 points of each other to begin with. Nor does this method take into account injuries, illnesses, blunders, external influences (e.g. got caught in traffic and missed your start time, or hand slipped and touched the wrong piece, which you're then obligated to move, or fluorescent lights caused distracting buzz, triggering a severe migraine). And it's possible for a player to build up so much of a lead over the rest of the field that he could conceivably lose several games in a row to his closest rivals and still be ranked #1, although very few players have ever built this kind of lead (IIRC, Fischer and Kasparov have done this for short periods)."

·Can you give more details of the ranking system?
It takes into account games (not matches) won and lost, and how good the opposition was. For instance, team A plays team B, and team A is 160 points above B. According to a table (based on the Gauss curve), A is then expected to win 76% of the games against B. In other words, if they play four games, a final result of ~3-1 would be expected. If team A beats B only 3-2, it means that A did not deserve to be 160 points above B; therefore, A's rating is decreased and B's rating is increased by a certain factor. This is done for every match played in the nation. When this cycle finishes, a rating for each team is obtained. These ratings are then used as "seeds" for the next round of calculation. The computer repeats the process iteratively until the ratings remain stable, without further change. It is important to note that when the iteration starts, all the teams have the same rating (2000).

At the end of the rating calculation, if you pick a team, you will see that for each match it played there is a difference between how many games it was expected to win and how many it won. This number is negative for some matches and positive for others. The sum of all those numbers has to be zero. In other words, the difference in rating between that team and its opponents reaches a value that reflects a balance between the results and the expectations.

The original theory has been used for a long time (~3 decades) in chess with great success and accuracy. It was developed by the mathematician and chess player Dr. Arpad Elo (I had the huge pleasure of meeting him in 1979! I was 15 when I represented Argentina in an international tournament for young chess players in Puerto Rico, and he was the Tournament Director). These ratings are used by federations all over the world, including the International Chess Federation (FIDE).

I modified the system a lot to adapt it to college volleyball. For instance, I introduced an iterative procedure. In chess there are some approximations that make the calculation easy to do, but they are not valid in college volleyball. The original system was developed to be used with paper and pencil, without computers.



·Where do you get the results from?
Weekly, I add results taken from the AVCA web site. You can take a look at all those results, sorted by team, alphabetically or by conference, on the web site that is maintained by Rich Kern at Nebraska. All the weekly results are contained in a file that fortunately has kept a constant format all this time. That allowed me to write a program that parses the scores. It handles differences in names and even misspellings through a list of synonyms (Michigan State = Michigan St). The list of weekly results from the AVCA site is not complete and occasionally contains errors. If I detect one, I check it in any of the several volleyball web pages that each team has constructed this year. A list of many volleyball team home pages is maintained at Penn State. Some people sent me results from teams that were not included in the AVCA web site. This is a list of the results of the 1997 season.


·Why does my team have a different record in the list you show?
The most common reason is that I don't have all the results of that particular team. Sometimes, the file from the AVCA site contains errors. Many times it gives two different results for the same match. Once, it showed that Michigan State defeated Purdue both 3-0 and 3-1 in the same week. These problems pop up when I see that a team has more matches than it should. When I observe this, I check the respective home page. Generally I try to do this for the first ~30 teams in the ranking.


·Can I send you results from my team?
Please do! However, if you really want me to update your team with a complete schedule, send me the results in a format like this:

5: Stanford % UCLA = 3-2
6: Texas % Hawaii = 2-3

Where "5" and "6" are the weeks when the matches were played. That would certainly help me to add them immediately to the match database. If there is any match that is already in the database, don't worry: it will be detected by the program and discarded automatically.


·Is it correct that your computations operate entirely at the level of games won and lost, with no weight at all given specifically for winning or losing the match?
Yes, that is correct. The program takes into account games won and lost. It then calculates the probability of winning or losing a game. Obviously, that has a relationship with the probability of winning a match, which can be calculated from the former.


·Does the ranking consider whether a team is improving?
Yes. Recent matches are considered more important than ones played several weeks ago. The value of a match decays exponentially: a result that is eight weeks old is worth only half the value of a result obtained last week. This number is empirical, but it seems to work OK.


·What do the points mean?
The absolute rating of a team does not mean anything; only the rating difference between two teams has a meaning. The following table illustrates what the points (ratings) mean. It shows how a team is expected to perform against another team at a given difference in points.

PROBABILITY TABLE

Code: Select all

---------------------------------------------
 Point    Prob. to     Prob. to              
 diff.   win a game  win a 5-game   Odds     
                        match                
---------------------------------------------
   0:       0.50        0.500       1.0 : 1  
  20:       0.54        0.566       1.3 : 1  
  40:       0.57        0.630       1.7 : 1  
  60:       0.60        0.691       2.2 : 1  
  80:       0.64        0.747       3.0 : 1  
 100:       0.67        0.797       3.9 : 1  
 120:       0.70        0.840       5.3 : 1  
 140:       0.73        0.877       7.1 : 1  
 160:       0.76        0.907       9.8 : 1  
 180:       0.79        0.931      13.6 : 1  
 200:       0.81        0.950      19.2 : 1  
 220:       0.83        0.965      27.5 : 1  
 240:       0.85        0.976      40.1 : 1  
 260:       0.87        0.983      59.4 : 1  
 280:       0.89        0.989      89.6 : 1  
 300:       0.91        0.993     137.5 : 1  
---------------------------------------------
Go to a more complete table...
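The match column of the table follows from the game column: a best-of-5 match is won 3-0, 3-1 or 3-2. A sketch of that conversion (the function names are mine):

```c
/* Probability of winning a best-of-5 (first to three) match, given the
 * probability p of winning a single game: the sum of the 3-0, 3-1 and
 * 3-2 outcomes, p^3 + 3*p^3*q + 6*p^3*q^2 with q = 1 - p. */
double match_prob(double p)
{
    double q = 1.0 - p;
    return p * p * p * (1.0 + 3.0 * q + 6.0 * q * q);
}

/* Odds column: probability of winning the match vs. losing it. */
double odds(double p)
{
    return match_prob(p) / (1.0 - match_prob(p));
}
```

For a 160-point gap (p = 0.76) this gives about 0.907 and odds of roughly 9.8 : 1, matching that row of the table.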


·Does it take into account home court advantage?
It does not consider whether a team played as a host or on the road when computing the results. However, that should not be a big problem, because all the teams play almost 50% of their matches in each condition. That should balance the final result, or at least the difference should be minimal. In other words, the rating can be interpreted as an average of the team's strength as a host and as a visitor.

I could calculate the home-court-advantage for each team, but I don't have the necessary information (i.e. who was host and visitor for each match).

At the end of the 1996 season, I did a calculation to see if the home court advantage was real in the Big Ten. It turned out that the host team is favored by ~60 points. I assume that this is similar in the rest of the country. Thus, if you want a prediction, add 60 points to the host team and go to the probability table.



--------------------------------------------------------------------------------

Miguel A. Ballicora
Created: 11/24/97
Last update: 09/19/99
mailto:ballicor@msu.edu
There are some links that do not work for me: the second and third links of ·Where do you get the results from?

Nice read and please enjoy. Thank you very much for all your work in BayesElo, CLOP, ...

Regards from Spain.

Ajedrecista.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Ordo release (rating software, ELO-like).

Post by Rémi Coulom »

I see, thanks. It is like bayeselo, except there is no prior, no confidence intervals, no LOS, and the model is Gaussian instead of logistic.

Rémi
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ordo release (rating software, ELO-like).

Post by michiguel »

Rémi Coulom wrote:I see, thanks. It is like bayeselo, except there is no prior, no confidence intervals, no LOS, and the model is Gaussian instead of logistic.

Rémi
Hi Remi,

It is logistic. The procedure has evolved since the beginning, when I used a Gaussian (there are also things that I do not need to do for engines, like giving more weight to "newer" games, home field advantage or neutral venue, etc.). In fact, this is a simpler rewrite in C of the original Pascal code. I still do not have a White-Black bonus calculation, for instance.

I just gave the volleyball link for historical purposes and out of laziness. I will try to give more details soon. The fact that it is logistic is a "mathematical convergence" of my "theoretical" framework. I did not pick the equation to mimic a Gaussian, as I mentioned here a few days ago. As Kai told me, it may look very similar to BayesELO mathematically.

Miguel
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ordo release (rating software, ELO-like)

Post by michiguel »

Adam Hair wrote:
michiguel wrote:https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
Excuse me for saying this Miguel, but it is about damn time that you did this. :)

I have wanted to use your program since you told me about it last year.

Seriously, thanks for sharing it.

Adam
Imagine that I promised this to Leo Dijksman almost 10 years ago...

:oops:

Miguel
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ordo release (rating software, ELO-like).

Post by michiguel »

Ajedrecista wrote:Hello Miguel:
michiguel wrote:https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
It is the first time I have tried rating software, and I must say that it was very easy for me. Congratulations! It was also very fast (surely less than five seconds with the PGN sample file that you provide with the download).

I read in the readme.txt file:
ordo -a 2500 -p results.pgn -o rating.txt
And I got this:

Code: Select all

C:\Documents and Settings\~[...]~\ordo-v0.2>ordo -a 2800 -p res
ults.pgn -o rating.txt
"ordo" no se reconoce como un comando interno o externo,
programa o archivo por lotes ejecutable.
There is no need to translate it because you understand Spanish. I had to try ordo-win32 instead of ordo:
I think I may not have been clear in the readme.txt: you have to rename ordo-win32.exe to ordo.exe. But that is OK, you can run it as you did.

Code: Select all

C:\Documents and Settings\~[...]~\ordo-v0.2>ordo-win32 -a 2800
-p results.pgn -o rating.txt

importing results (x1000):
****************************************    40k
****************************************    80k
****************************************   120k
****************************************   160k
*  total games:  161200

set average rating = 2800.000000

phase iteration  deviation
  0       2       26.53425
  1       1       13.71973
  2       3        6.75447
  3       4        3.41538
  4       2        1.74031
  5       2        0.91858
  6       8        0.41953
  7       1        0.18896
  8       1        0.11461
  9       1        0.06470
 10       7        0.02880
 11       2        0.01590
 12       4        0.00854
 13       7        0.00422
 14       3        0.00290
 15      17        0.00115
 16      13        0.00045
 17       1        0.00027
 18       7        0.00016
 19      11        0.00008
done
In each phase the program tries to minimize the difference between the data and the model (the rating values for all the players) by varying the ratings in chunks. In phase 0, the program adjusts each rating by adding or subtracting 100 points (a very coarse approximation). After trying several times (iterations), it reaches the best it can. The "best" is defined by the "deviation", which is the average difference between the data and the model. For instance, in the first case, the difference between the current rating and the optimum is as bad as 26.5 points per player (on average). In the next phase the increments are not 100 points but 50, then 25, then 12.5, etc. This is done until the deviation is very low. A deviation of "0.000000" would mean the model has been optimized to fit the data exactly, but it is not worth spending the extra calculation time to get there.

I hope this helps. I show this so the user has an idea of what the program is doing. With certain anomalous data, the calculation can last longer because convergence is problematic. The IPON data is really great (all players played each other), which is why it goes fast.

Could you please explain the meaning of iteration and deviation columns? Thanks in advance.

I get a rating list in a notepad (rating.txt); here is a short example:

Code: Select all

                        ENGINE:  RATING    POINTS  PLAYED    (%)
               Houdini 2.0 STD:  3090.2    2277.5    2900   78.5%
                  Houdini 1.5a:  3084.5    3162.5    4000   79.1%
             Critter 1.4 SSE42:  3055.5    1853.0    2400   77.2%
                Komodo 4 SSE42:  3049.8    1892.5    2500   75.7%
              Komodo64 3 SSE42:  3038.3    2075.5    2800   74.1%
          Deep Rybka 4.1 SSE42:  3032.8    2655.0    3700   71.8%
                   Critter 1.2:  3028.1    2232.0    3100   72.0%
                  Deep Rybka 4:  3027.7    3627.0    4900   74.0%
                 Houdini 1.03a:  3025.8    2520.0    3200   78.8%
          Komodo 2.03 DC SSE42:  3021.0    1985.5    2700   73.5%
            Stockfish 2.1.1 JA:  3012.4    2426.5    3500   69.3%

                          [...]
There are no uncertainties (± ... Elo)
That is because I personally do not use those numbers. Uncertainties really depend on what you are trying to compare. The uncertainty for player A may depend on whether you compare it with player B or with player C. The rating is not an absolute measure; it only has meaning when subtracted from another one.

but IMHO the programme itself is quite good, gives elegant results and is very fast (processing 161,200 games in less than five seconds). At least, Ordo 0.2 is very fast from my POV... I do not know the speed of other programmes.

I want to congratulate you for the total success of the Gaviota TBs: I have the full set (up to 5-man) and it works flawlessly. Una vez más: ¡muchas gracias! (Once again: thank you very much!)
Un placer saber que funcionan!! (A pleasure to know that they work!)

Saludos,
Miguel
Regards from Spain.

Ajedrecista.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Ordo release (rating software, ELO-like)

Post by Adam Hair »

michiguel wrote:
Adam Hair wrote:
michiguel wrote:https://sites.google.com/site/gaviotachessengine/ordo

Based on a recent discussion on the IPON rankings, I decided to clean up the command line interface and release it. It may be an alternative to BayesELO and ELOSTAT.

Miguel
Excuse me for saying this Miguel, but it is about damn time that you did this. :)

I have wanted to use your program since you told me about it last year.

Seriously, thanks for sharing it.

Adam
Imagine that I promised this to Leo Dijksman almost 10 years ago...

:oops:

Miguel
I will get in line behind Leo. He has been waiting nine years longer than I have. :)