Bayeselo problem with large number of players

Janzert · Post by **Janzert** » Mon Oct 25, 2010 10:47 am

I'm working on http://ai-contest.com, where we use bayeselo to generate the rankings. There are currently ~3600 participants and ~145k rated games. I would like to start recording and displaying confidence intervals alongside the Elo ratings, but when I calculate the intervals with exactdist or covariance, the results look wrong.

When using exactdist the top 65 players are given negative + intervals, for example:

Rank Name                            Elo    +    - games score oppo. draws 
   1 8565,bocsimacko,172520         1951 -562  589    38   82%  1582    0% 
   2 8737,Hazard,171835             1950 -560  579    56   75%  1678    0%

When using covariance, every player is given the same, obviously wrong, interval (-2147483648 is the smallest 32-bit signed integer):

Rank Name                            Elo    +    - games score oppo. draws 
   1 8565,bocsimacko,172520         1951 -2147483648 -2147483648    38   82%  1582    0% 
   2 8737,Hazard,171835             1950 -2147483648 -2147483648    56   75%  1678    0%

Certainly I've seen people here using bayeselo with many more games, but maybe none of them involved large numbers of players. Has anyone seen this before or have any idea on a fix?

Rémi Coulom · Post by **Rémi Coulom** » Wed Oct 27, 2010 9:44 am

Hi,

Your covariance matrix is probably singular, or close to singular. That may happen if the graph of games is not connected, or very loosely connected, which is likely if the number of players is high and the number of games is small. If I can download the PGN somewhere, I'll take a closer look.

Confidence intervals, whatever method is used to compute them, are completely meaningless anyway, since they can tell nothing about the relative strength of two players. I suggest using the default method (i.e. neither exactdist nor covariance), because it is the simplest and fastest. I am seriously considering removing confidence intervals from bayeselo, because every time somebody tries to make a statistical interpretation of data from them, the conclusion is very likely to be completely wrong.

If you wish to estimate the relative strength of two players, the likelihood-of-superiority matrix is what you should use. But, again, computation of LOS is likely to fail if the graph is not connected. Connected means there is at least a path made of wins and a path made of losses between any two players, a draw counting as both a win and a loss.
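
The connectivity condition described above can be sketched in Python as follows. This is only an illustration of the definition (a path of wins and a path of losses, with a draw counting as both), not a reimplementation of anything inside bayeselo. The `(first, second, result)` tuple format and all function names are assumptions made for the example.

```python
def build_beats_graph(games):
    """Directed graph with an edge winner -> loser; a draw adds both edges.

    `games` is a list of (first, second, result) tuples with result in
    {"1-0", "0-1", "1/2-1/2"}. This input format is hypothetical.
    """
    graph = {}
    for first, second, result in games:
        graph.setdefault(first, set())
        graph.setdefault(second, set())
        if result in ("1-0", "1/2-1/2"):
            graph[first].add(second)
        if result in ("0-1", "1/2-1/2"):
            graph[second].add(first)
    return graph


def reachable(graph, start):
    """All nodes reachable from `start` by following edges (iterative DFS)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def connected_players(games, anchor):
    """Players linked to `anchor` by both a path of wins and a path of losses."""
    beats = build_beats_graph(games)
    # Reverse every edge to get the "loses to" graph.
    loses_to = {player: set() for player in beats}
    for winner, losers in beats.items():
        for loser in losers:
            loses_to[loser].add(winner)
    return reachable(beats, anchor) & reachable(loses_to, anchor)
```

For example, with games `[("A", "B", "1-0"), ("B", "A", "1-0"), ("A", "C", "1-0")]`, calling `connected_players(games, "A")` keeps A and B but drops C, because C has only ever lost and so no path of wins leads from C back to A.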

Rémi

Janzert · Post by **Janzert** » Wed Oct 27, 2010 10:17 am

Thanks for the help. The pgn for the set of games I've been using to test with can be found at http://janzert.com/planetwars-pgn.zip.

I only wanted the confidence interval to give people an idea of how certain, or rather uncertain, a rating is. I think most people looking at the ranking list greatly overestimate the certainty of the ratings.

LoS does seem most appropriate but I'm not sure if there is a good way to present that in a ranking list with 3000+ players.

Janzert

Rein Halbersma · Post by **Rein Halbersma** » Wed Oct 27, 2010 11:01 am

Rémi Coulom wrote:Hi,

Your covariance matrix is probably singular, or close to singular. That may happen if the graph of games is not connected, or very loosely connected.

If you wish to estimate the relative strength of two players, the likelihood-of-superiority matrix is what you should use. But, again, computation of LOS is likely to fail if the graph is not connected. Connected means there is at least a path made of wins and a path made of losses between any two players, a draw counting as both a win and a loss.

Rémi

In the R package {igraph} you can supply a neighbourhood matrix (in this case: player x player) of a graph and it will give a list of the various connected components. How much work would it be for BayesElo to create the neighbourhood matrix from the PGN and give LOS matrices for each connected subgraph? In particular, this would give N/2 2x2 LOS matrices after the first round of an N-player tournament.
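
As a rough illustration of the idea (in Python rather than with {igraph}, and treating games as undirected links regardless of result), connected components can be found with a small union-find. The function name and the pairing format are assumptions for this sketch.

```python
def components(pairings):
    """Group players into connected components of the 'has played' graph.

    `pairings` is an iterable of (player_a, player_b) tuples, one per game.
    Results are ignored here: any game links the two players.
    """
    parent = {}

    def find(x):
        # Find the representative of x, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairings:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for player in list(parent):
        groups.setdefault(find(player), set()).add(player)
    return list(groups.values())


# First round of an 8-player tournament: every player has one opponent.
round_one = [(0, 1), (2, 3), (4, 5), (6, 7)]
first_round_components = components(round_one)
```

Here `first_round_components` holds four two-player groups, matching the N/2 observation; each later game between two different groups merges them into one.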

Rein

Rémi Coulom · Post by **Rémi Coulom** » Wed Oct 27, 2010 12:27 pm

Rein Halbersma wrote:
In the R package {igraph} you can supply a neighbourhood matrix (in this case: player x player) of a graph and it will give a list of the various connected components. How much work would it be for BayesElo to create the neighbourhood matrix from the PGN and give LOS matrices for each connected subgraph? In particular, this would give N/2 2x2 LOS matrices after the first round of an N-player tournament.

Rein

Bayeselo already has a command to find the connected subgraph of a given player. When applied to the Planet Wars PGN, it removes some players, which confirms my initial intuition that the set of games is not connected:

ResultSet>readpgn planetwars.pgn
141214 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>players >list.txt
ResultSet>connect 0
3628 players left
ResultSet>players >list2.txt
ResultSet>! diff list.txt list2.txt
3615,3631c3615,3629
< 3613 6669,luismi,176111                 1       1 
< 3614 11690,skowal,176106                1       0 
< 3615 12047,Tuke82,176150                3       0 
< 3616 11911,Superman_s,176151            1     0.5 
< 3617 11312,UweTUDD,176153               1       1 
< 3618 12025,Robert_Theed,176156          1       0 
< 3619 12105,drdoom,176160                2       0 
< 3620 5894,StanleyMT,176161              1       1 
< 3621 7423,Accoun,176162                 1       1 
< 3622 7090,Moonson,176163                1       1 
< 3623 9786,Roamboater,176164             1       0 
< 3624 10505,chs4314,176165               1       1 
< 3625 7157,Diversus,176166               1       1 
< 3626 11517,weWillWin,176167             1       1 
< 3627 11569,NarcolepticFrog,176168       2       2 
< 3628 3938,dabino,176169                 1       1 
< 3629 12104,Haatschii,176173             1       0 
---
> 3613 12047,Tuke82,176150                3       0 
> 3614 11911,Superman_s,176151            1     0.5 
> 3615 11312,UweTUDD,176153               1       1 
> 3616 12025,Robert_Theed,176156          1       0 
> 3617 12105,drdoom,176160                2       0 
> 3618 5894,StanleyMT,176161              1       1 
> 3619 7423,Accoun,176162                 1       1 
> 3620 7090,Moonson,176163                1       1 
> 3621 9786,Roamboater,176164             1       0 
> 3622 10505,chs4314,176165               1       1 
> 3623 7157,Diversus,176166               1       1 
> 3624 11517,weWillWin,176167             1       1 
> 3625 11569,NarcolepticFrog,176168       2       2 
> 3626 3938,dabino,176169                 1       1 
> 3627 12104,Haatschii,176173             1       0 
Exit status = 256

Rémi

Kirill Kryukov · Post by **Kirill Kryukov** » Wed Oct 27, 2010 6:08 pm

Janzert wrote:LoS does seem most appropriate but I'm not sure if there is a good way to present that in a ranking list with 3000+ players.

Janzert

Hi Brian,

One possible way is to show the LOS values only for neighboring entries in the list. This is what we do in CCRL lists (e.g. CCRL 40/40). Note that the LOS number is shown between the lines in the list, hopefully making it easy to interpret.

On KCEC tournament pages I extended this idea to multiple LOS columns (3 by default, up to 10 in custom comparison). I'm not sure it's easy to figure out for everyone, but in a programmer's contest perhaps it should be OK.
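
A minimal sketch of how LOS columns for neighbouring entries could be produced is shown below. It assumes each rating is approximately normally distributed with a known variance and ignores the covariance between players unless one is supplied; bayeselo computes LOS from its own model, so the numbers would differ. The function names and the `(name, elo, variance)` row format are hypothetical.

```python
import math


def los(elo_a, elo_b, var_a, var_b, cov_ab=0.0):
    """Probability that A is stronger than B under a normal approximation."""
    var_diff = var_a + var_b - 2.0 * cov_ab
    if var_diff <= 0.0:
        raise ValueError("variance of the rating difference must be positive")
    z = (elo_a - elo_b) / math.sqrt(var_diff)
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def los_columns(ranking, columns=1):
    """For each entry, LOS against the next `columns` entries below it.

    `ranking` is a list of (name, elo, variance) tuples sorted by elo,
    strongest first. Covariances are ignored (an assumption of this sketch).
    """
    rows = []
    for i, (name, elo, var) in enumerate(ranking):
        below = ranking[i + 1 : i + 1 + columns]
        rows.append((name, [los(elo, e, var, v) for _, e, v in below]))
    return rows
```

With `columns=1` this gives one LOS value between each pair of adjacent lines; larger values give several columns per entry. Entries near the bottom of the list simply get fewer values.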

Why don't you show the total number of games of each player? It's a very important number.

Personally I'd also appreciate it if you could show the average opponent's rating (relative to the player's own rating) and the average distance to the opponent.

Good luck and thanks for the efforts you are putting into this.

Best,
Kirill

marcelk · Post by **marcelk** » Wed Oct 27, 2010 8:13 pm

Janzert wrote:I'm working on http://ai-contest.com and we are using bayeselo to generate the rankings.

By default Bayeselo gives White a somewhat higher winning expectation than Black, because in chess this is true. If it is different in Planet Wars, you might need to look into how to remove this chess bias or your rankings will be influenced.
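
To make the effect concrete, here is a sketch of an Elo-style outcome model with a first-player offset and a draw parameter. Whether this matches bayeselo's internal model exactly is an assumption, and the parameter values used in the example are illustrative only, not bayeselo's defaults.

```python
def f(delta):
    """Logistic Elo curve: decreases from 1 to 0 as delta grows."""
    return 1.0 / (1.0 + 10.0 ** (delta / 400.0))


def outcome_probabilities(elo_first, elo_second, advantage, draw_elo):
    """Return (P(first wins), P(draw), P(second wins)).

    `advantage` shifts the first player's rating upward; `draw_elo`
    widens the draw region. Both names are hypothetical for this sketch.
    """
    p_first = f(elo_second - elo_first - advantage + draw_elo)
    p_second = f(elo_first - elo_second + advantage + draw_elo)
    return p_first, 1.0 - p_first - p_second, p_second


# Two equally rated players, no draws, with and without an offset.
no_bias = outcome_probabilities(1500.0, 1500.0, advantage=0.0, draw_elo=0.0)
with_bias = outcome_probabilities(1500.0, 1500.0, advantage=30.0, draw_elo=0.0)
```

With a zero offset the two equal players are expected to split the games evenly. With a positive offset the model expects the first player to win more often, so if the winner is always recorded as the first player, some of each win would be attributed to the offset rather than to strength.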

Janzert · Post by **Janzert** » Fri Oct 29, 2010 9:17 pm

Sorry, I dropped out of the conversation. A windstorm went through and knocked out power for two days.

marcelk wrote:
Janzert wrote:I'm working on http://ai-contest.com and we are using bayeselo to generate the rankings.
By default Bayeselo gives White a somewhat higher winning expectation than Black, because in chess this is true. If it is different in Planet Wars, you might need to look into how to remove this chess bias or your rankings will be influenced.

This was my belief as well, but checking the advantage command in bayeselo it returns 0. Also the default way the planetwars backend feeds data to bayeselo is with the winning player always as player one. I tried changing this to randomize which player was the winner and it made no difference in the results. So I'm pretty sure while bayeselo can be set to give a first player advantage it does not do so by default.

Janzert

Rémi Coulom · Post by **Rémi Coulom** » Sat Oct 30, 2010 9:29 am

Janzert wrote: This was my belief as well, but checking the advantage command in bayeselo it returns 0. Also the default way the planetwars backend feeds data to bayeselo is with the winning player always as player one. I tried changing this to randomize which player was the winner and it made no difference in the results. So I'm pretty sure while bayeselo can be set to give a first player advantage it does not do so by default.

Janzert

The default value is not zero. Did you compile bayeselo from source with a recent C++ compiler? Maybe you hit that problem:
http://stackoverflow.com/questions/3968 ... ut-failure
Although I could not replicate it.

Rémi

Janzert · Post by **Janzert** » Sat Oct 30, 2010 9:42 am

Rémi Coulom wrote:
Janzert wrote: This was my belief as well, but checking the advantage command in bayeselo it returns 0. Also the default way the planetwars backend feeds data to bayeselo is with the winning player always as player one. I tried changing this to randomize which player was the winner and it made no difference in the results. So I'm pretty sure while bayeselo can be set to give a first player advantage it does not do so by default.

Janzert
The default value is not zero. Did you compile bayeselo from source with a recent C++ compiler? Maybe you hit that problem:
http://stackoverflow.com/questions/3968 ... ut-failure
Although I could not replicate it.

Rémi

On the contest server it is using g++ 4.2.4 and locally I'm using g++ 4.4.3. Both versions of bayeselo respond with zero to the advantage command. I only tested locally with the randomized winning side.

Janzert
