best way to determine elos of a group
Re: best way to determine elos of a group
Daniel Shawul wrote: ↑Tue Jul 30, 2019 6:51 pm
There are two more options for computing intervals:
jointdist [p] ... compute intervals from joint distribution
exactdist [p] ... compute intervals assuming exact opponent Elos
With exactdist:
Rank Name        Elo    +    - games score oppo. draws
   1 scorpio146  415   40   40   200   56%   379   36%
   2 scorpio147  383   56   56   100   45%   415   34%
   3 scorpio145  374   40   40   200   50%   375   38%
   4 scorpio134  347   39   38   200   55%   317   43%
   5 scorpio144  335   39   39   200   48%   350   40%
   6 scorpio141  329   39   39   200   52%   315   38%
   7 scorpio143  326   40   40   200   49%   330   35%
   8 scorpio142  325   40   39   200   50%   327   37%
   9 scorpio133  319   40   39   200   51%   310   37%
  10 scorpio135  315   39   39   200   48%   327   43%
  11 scorpio137  311   40   40   200   52%   298   34%
  12 scorpio136  308   39   39   200   49%   313   37%
  13 scorpio139  308   39   39   200   52%   297   37%
  14 scorpio140  306   40   40   200   48%   318   34%
  15 scorpio138  288   39   40   200   47%   309   37%
  16 scorpio131  283   40   39   200   56%   247   39%
  17 scorpio132  273   40   40   200   46%   301   36%
  18 scorpio125  243   40   40   200   55%   210   32%
  19 scorpio128  224   39   39   200   51%   217   37%
  20 scorpio129  222   39   39   200   50%   222   40%
  21 scorpio130  221   39   39   200   45%   253   41%
  22 scorpio124  218   40   40   200   50%   218   34%
  23 scorpio127  212   39   39   200   50%   213   39%
  24 scorpio121  203   40   40   200   57%   154   35%
  25 scorpio126  201   40   40   200   46%   227   35%
  26 scorpio123  193   39   39   200   51%   185   38%
  27 scorpio118  172   39   39   200   54%   149   41%
  28 scorpio120  156   39   39   200   47%   176   38%
  29 scorpio122  152   39   40   200   43%   198   38%
  30 scorpio117  150   54   54   101   47%   170   42%
  31 scorpio119  149   39   39   200   48%   164   42%
  32 scorpio17    62   38   38   200   56%    24   50%
  33 scorpio16    38   38   38   200   50%    40   50%
  34 scorpio14    35   39   39   200   55%     5   41%
  35 scorpio15    18   38   38   200   47%    36   47%
  36 scorpio18    10   38   38   200   47%    27   50%
  37 scorpio115   10   39   39   200   53%    11   38%
  38 scorpio101    8   38   38   200   56%    25   48%
  39 scorpio116    8   55   55   101   50%    11   40%
  40 scorpio11     6   28   28   400   52%     8   42%
  41 scorpio21     6   38   38   200   53%    11   49%
  42 scorpio10     3   27   27   400   51%     2   44%
  43 scorpio99     0   39   38   200   55%    28   43%
  44 scorpio20     6   38   38   200   49%     0   50%
  45 scorpio19     7   37   37   200   49%     2   52%
  46 scorpio8      7   28   28   400   52%    17   41%
  47 scorpio13     8   38   38   200   48%     8   46%
  48 scorpio86    11   39   39   200   53%    30   41%
  49 scorpio9     11   27   27   400   49%     2   43%
  50 scorpio22    17   37   37   200   50%    16   55%
  51 scorpio12    20   31   31   300   47%     2   45%
  52 scorpio109   21   39   39   200   55%    53   39%
  53 scorpio104   23   39   39   200   55%    52   39%
  54 scorpio7     24   27   27   400   51%    28   45%
  55 scorpio112   24   39   38   200   52%    37   43%
  56 scorpio102   24   38   38   200   49%    18   48%
  57 scorpio100   25   38   38   200   45%     4   48%
  58 scorpio87    27   38   38   200   49%    21   46%
  59 scorpio97    27   39   39   200   53%    45   42%
  60 scorpio114   30   39   40   200   48%    15   36%
  61 scorpio98    30   39   39   200   48%    14   42%
  62 scorpio88    30   38   38   200   51%    34   50%
  63 scorpio55    32   38   38   200   53%    50   48%
  64 scorpio85    33   39   39   200   52%    43   42%
  65 scorpio56    35   38   38   200   51%    41   48%
  66 scorpio111   35   39   39   200   51%    37   40%
  67 scorpio107   35   39   39   200   54%    60   43%
  68 scorpio37    36   37   37   200   51%    44   60%
  69 scorpio24    37   38   37   200   54%    61   51%
  70 scorpio47    38   38   37   200   54%    63   52%
  71 scorpio23    39   37   37   200   48%    27   53%
  72 scorpio113   39   39   39   200   48%    27   42%
  73 scorpio90    40   39   39   200   51%    43   43%
  74 scorpio33    40   38   38   200   54%    64   48%
  75 scorpio89    42   38   38   200   49%    35   47%
  76 scorpio38    43   36   36   200   50%    42   62%
  77 scorpio103   44   39   39   200   47%    23   42%
  78 scorpio36    44   37   37   200   51%    48   56%
  79 scorpio91    44   38   38   200   51%    47   45%
  80 scorpio39    47   37   37   200   50%    47   59%
  81 scorpio6     48   27   27   400   49%    45   46%
  82 scorpio48    48   38   38   200   52%    60   46%
  83 scorpio40    50   37   37   200   53%    67   54%
  84 scorpio43    50   37   37   200   53%    66   55%
  85 scorpio57    50   38   38   200   52%    60   51%
  86 scorpio110   51   39   40   200   47%    28   37%
  87 scorpio34    51   37   37   200   50%    50   59%
  88 scorpio1     53   29   29   400   56%    96   30%
  89 scorpio108   54   38   38   200   46%    28   48%
  90 scorpio92    55   38   38   200   50%    54   47%
  91 scorpio35    60   37   37   200   48%    48   60%
  92 scorpio105   60   38   38   200   47%    44   45%
  93 scorpio95    61   38   38   200   51%    66   50%
  94 scorpio96    61   38   39   200   48%    44   43%
  95 scorpio93    63   39   39   200   50%    62   40%
  96 scorpio44    64   37   37   200   50%    62   54%
  97 scorpio4     64   28   28   400   50%    66   39%
  98 scorpio54    65   37   37   200   49%    58   54%
  99 scorpio106   65   39   39   200   48%    48   39%
 100 scorpio5     66   28   28   400   49%    56   42%
 101 scorpio3     66   28   28   400   51%    72   39%
 102 scorpio42    68   37   37   200   50%    69   57%
 103 scorpio94    70   38   38   200   49%    62   45%
 104 scorpio31    73   37   37   200   52%    84   57%
 105 scorpio59    74   38   38   200   56%   112   46%
 106 scorpio45    74   37   37   200   50%    71   57%
 107 scorpio84    75   38   38   200   49%    70   44%
 108 scorpio32    77   38   38   200   47%    57   49%
 109 scorpio46    78   37   37   200   46%    56   57%
 110 scorpio2     80   28   28   400   47%    59   38%
 111 scorpio49    83   38   38   200   51%    87   49%
 112 scorpio53    83   38   38   200   50%    86   49%
 113 scorpio25    84   38   38   200   47%    65   50%
 114 scorpio58    86   38   38   200   46%    62   46%
 115 scorpio41    87   37   37   200   45%    59   54%
 116 scorpio62    90   39   39   200   58%   138   39%
 117 scorpio30    91   37   37   200   49%    85   55%
 118 scorpio26    93   38   38   200   52%   106   50%
 119 scorpio29    96   38   38   200   54%   118   47%
 120 scorpio52   107   38   38   200   50%   104   51%
 121 scorpio83   107   38   38   200   53%   123   47%
 122 scorpio0    111   41   42   200   42%    53   24%
 123 scorpio51   125   37   37   200   49%   116   55%
 124 scorpio50   125   37   37   200   46%   104   54%
 125 scorpio27   128   38   38   200   48%   119   50%
 126 scorpio63   134   39   39   200   49%   131   41%
 127 scorpio60   137   38   38   200   45%   108   50%
 128 scorpio61   141   38   39   200   46%   114   44%
 129 scorpio71   141   39   39   200   53%   157   38%
 130 scorpio28   144   38   38   200   45%   112   47%
 131 scorpio66   150   38   38   200   52%   162   44%
 132 scorpio72   155   39   39   200   52%   164   41%
 133 scorpio65   156   38   38   200   51%   161   48%
 134 scorpio70   159   38   38   200   51%   165   48%
 135 scorpio67   168   39   39   200   49%   160   40%
 136 scorpio68   171   39   39   200   51%   178   42%
 137 scorpio82   171   38   39   200   46%   146   45%
 138 scorpio64   172   38   39   200   46%   145   43%
 139 scorpio76   179   39   39   200   54%   206   39%
 140 scorpio81   185   39   39   200   53%   205   42%
 141 scorpio73   186   38   38   200   48%   176   46%
 142 scorpio69   188   38   38   200   46%   165   47%
 143 scorpio74   196   39   39   200   50%   196   38%
 144 scorpio79   203   37   37   200   55%   233   52%
 145 scorpio75   205   39   39   200   47%   188   39%
 146 scorpio77   207   40   40   200   50%   203   35%
 147 scorpio78   228   38   39   200   46%   205   44%
 148 scorpio80   238   38   38   200   43%   194   47%
jointdist takes a lot of time. Will post again if it finishes.

I don't actually know what jointdist and exactdist stand for (given that we already have the two covariance options).
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Re: best way to determine elos of a group
Maybe these terms refer to the exact posterior distribution suitably discretized? But how would one avoid dimensional explosion? I looked in the source of BayesElo but I could not understand it.
I am not an expert in Bayesian statistics, but I thought the mathematically exact methods use Monte Carlo sampling from the posterior e.g. using MCMC (Markov Chain Monte Carlo). Such methods scale well with the dimension.
Re: best way to determine elos of a group
Ok I am guessing that exactdist uses the (1-dimensional) posterior for one elo assuming the other elos are exact.
And jointdist uses the true posterior. It seems to me that for more than a few players the naive (non-Monte Carlo) implementation would take a lot of memory and would be slow.
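To put a rough number on the memory concern, here is a back-of-the-envelope sketch in Python. It assumes a naive implementation that tabulates the joint posterior on a regular grid with one axis per rating. The resolution of 100 points per axis and the 8 bytes per cell are illustrative assumptions, not values taken from the BayesElo source.

```python
def joint_grid_cells(dims: int, points_per_axis: int = 100) -> int:
    """Number of cells in a regular grid with `dims` rating axes.

    The default resolution is a made-up illustrative value.
    """
    return points_per_axis ** dims


if __name__ == "__main__":
    for dims in (1, 2, 3, 10):
        cells = joint_grid_cells(dims)
        gigabytes = cells * 8 / 1e9  # assuming one 8-byte float per cell
        print(f"{dims:2d} dims: {cells:.1e} cells, about {gigabytes:.1e} GB")
```

Under these assumptions the cell count grows as a power of the number of players, so anything beyond a handful of dimensions is out of reach for a full grid.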

 Posts: 3810
 Joined: Tue Mar 14, 2006 10:34 am
 Location: Ethiopia
 Contact:
Re: best way to determine elos of a group
Michel wrote: ↑Tue Jul 30, 2019 9:50 pm
Ok I am guessing that exactdist uses the (1-dimensional) posterior for one elo assuming the other elos are exact.
And jointdist uses the true posterior. It seems to me that for more than a few players the naive (non-Monte Carlo) implementation would take a lot of memory and would be slow.

Yes, the exactdist method is the same as elostat's assumption that the opponents' elos are their true elos. Maybe Remi has exactdist as a benchmark to evaluate the other methods. The results from my quick test also seem to confirm that.
The jointdist run was still going after 10 minutes, when I had to stop it. I will try again later to see what kind of error bounds it produces compared to the full Hessian inverse method, which seems to be the better approach so far IMO.
Re: best way to determine elos of a group
Daniel Shawul wrote:
The jointdist run was still going after 10 minutes, when I had to stop it. I will try again later to see what kind of error bounds it produces compared to the full Hessian inverse method, which seems to be the better approach so far IMO.

Well, near its maximum the posterior is approximately multivariate Gaussian, but the posterior (if derived from a logistic function) has fatter tails (they are e^{-ax} instead of e^{-ax^2}), so it is not inconceivable that in degenerate situations (i.e. a poorly connected tournament graph and few games) the true Bayesian credibility intervals would be different from those estimated with a multivariate Gaussian (I am not saying they will be, just that it seems possible).
As usual I am interested in the mathematical side of this. Standard Bayesian practice to obtain point estimates and credibility intervals is to sample from the posterior. By accident I happen to have some experience with MCMC. It is a wonderful method that you can throw at anything, but it needs fine tuning for the particular problem at hand (one issue is that it is not so easy to see when it has converged). Since in this case we already have a good approximation of the posterior (a Gaussian), perhaps there are better methods to sample from the posterior.
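The difference between the two kinds of tails can be checked numerically in the simplest possible case. The sketch below is only an illustration under stated assumptions: one player against a single opponent whose rating is fixed at 0, no draws, a flat prior, and a brute-force grid in place of anything BayesElo actually does. The function names and grid settings are hypothetical.

```python
import math


def win_prob(delta_elo: float) -> float:
    """Logistic expected score for a rating difference given in Elo."""
    return 1.0 / (1.0 + 10.0 ** (-delta_elo / 400.0))


def exact_interval(wins, losses, level=0.95, lo=-4000.0, hi=4000.0, steps=80001):
    """Central credible interval from the posterior tabulated on a grid (flat prior)."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    logp = [wins * math.log(win_prob(x)) + losses * math.log(win_prob(-x)) for x in xs]
    peak = max(logp)
    weights = [math.exp(v - peak) for v in logp]  # shifted to avoid underflow at the mode
    total = sum(weights)
    tail = (1.0 - level) / 2.0
    cum, lower, upper, found_lower = 0.0, xs[0], xs[-1], False
    for x, w in zip(xs, weights):
        cum += w / total
        if not found_lower and cum >= tail:
            lower, found_lower = x, True
        if cum >= 1.0 - tail:
            upper = x
            break
    return lower, upper


def gaussian_interval(wins, losses, z=1.96):
    """Interval from the Gaussian approximation at the posterior mode (needs wins, losses > 0)."""
    n = wins + losses
    p = wins / n
    mode = 400.0 * math.log10(p / (1.0 - p))
    k = math.log(10.0) / 400.0
    # The second derivative of the log-posterior at the mode is -n * p * (1 - p) * k**2.
    sd = 1.0 / (k * math.sqrt(n * p * (1.0 - p)))
    return mode - z * sd, mode + z * sd


if __name__ == "__main__":
    for wins, losses in ((3, 1), (300, 100)):
        print(wins, losses, exact_interval(wins, losses), gaussian_interval(wins, losses))
```

With only a few games the grid-based interval comes out wider and lopsided towards the side where the logistic tail decays slowly, while with a few hundred games the two intervals nearly coincide. This is a one-dimensional toy and says nothing about how large the effect is in a real, well connected tournament.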
Re: best way to determine elos of a group
Michel wrote: ↑Wed Jul 31, 2019 6:32 am
Well, near its maximum the posterior is multivariate Gaussian, but the posterior (if derived from a logistic function) has fatter tails (they are e^{-ax} instead of e^{-ax^2}), so it is not inconceivable that in degenerate situations (i.e. a poorly connected tournament graph and few games) the true Bayesian credibility intervals would be different from those estimated with a multivariate Gaussian (I am not saying they will be, just that it seems possible).
As usual I am interested in the mathematical side of this. Standard Bayesian practice to obtain point estimates and credibility intervals is to sample from the posterior. By accident I happen to have some experience with MCMC. It is a wonderful method that you can throw at anything, but it needs fine tuning for the particular problem at hand (one issue is that it is not so easy to see when it has converged). Since in this case we already have a good approximation of the posterior (a Gaussian), perhaps there are better methods to sample from the posterior.

I ran it overnight and it still didn't finish! Note that I did not use the poorly connected graph we were discussing, but something else with just 10 players. It looks like even a 3-dimensional problem takes too long with the joint probability distribution method. The only case I managed to get a result for is 2 players, and the results are similar to the rest. The inverse diagonal and full inverse methods are also indistinguishable with only two players.
In any case, this method is impractical in its current non-Monte Carlo implementation. Since you have MCMC experience, maybe you can implement something for comparing it to the full Hessian inverse method?
Re: best way to determine elos of a group
Daniel Shawul wrote: ↑Wed Jul 31, 2019 3:24 pm
I ran it overnight and it still didn't finish! Note that I did not use the poorly connected graph we were discussing, but something else with just 10 players. It looks like even a 3-dimensional problem takes too long with the joint probability distribution method. The only case I managed to get a result for is 2 players, and the results are similar to the rest. The inverse diagonal and full inverse methods are also indistinguishable with only two players.
In any case, this method is impractical in its current non-Monte Carlo implementation. Since you have MCMC experience, maybe you can implement something for comparing it to the full Hessian inverse method?

It is an enticing idea but I am rather busy professionally right now. I'll see what I can do.
In any case Gibbs sampling (https://en.wikipedia.org/wiki/Gibbs_sampling) appears to be well adapted to elo estimation. It reduces the problem to sampling from the posterior for a single player against n-1 other players with given fixed elos (and then one cycles through the players, each time updating the elo with the latest sample).
The case of a single player is maybe easiest to do by discretization of the posterior.
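A minimal sketch of such a Gibbs sampler in Python follows. It is not BayesElo code and rests on several assumptions that are mine: draws are ignored, the prior is flat on a bounded rating grid (-1000 to 1000 in steps of 10), and player 0 is pinned at 0 to remove the arbitrary offset of the rating scale. All function names and the `results` layout (a dict mapping a pair `(a, b)` to `(wins_a, wins_b)`) are hypothetical.

```python
import math
import random


def expected_score(delta_elo):
    """Logistic expected score for a rating difference given in Elo."""
    return 1.0 / (1.0 + 10.0 ** (-delta_elo / 400.0))


def sample_conditional(player, ratings, results, grid, rng):
    """Draw one rating for `player` from its 1-D conditional posterior on a grid."""
    logp = []
    for x in grid:
        total = 0.0
        for (a, b), (wins_a, wins_b) in results.items():
            if a == player:
                d = x - ratings[b]
            elif b == player:
                d = ratings[a] - x
            else:
                continue  # pairings without `player` do not affect its conditional
            total += wins_a * math.log(expected_score(d))
            total += wins_b * math.log(expected_score(-d))
        logp.append(total)
    peak = max(logp)
    weights = [math.exp(v - peak) for v in logp]
    return rng.choices(grid, weights=weights, k=1)[0]


def gibbs_elo(results, num_players, sweeps=400, burn_in=100, seed=1):
    """Cycle through the players, resampling each rating given all the others."""
    rng = random.Random(seed)
    grid = [float(g) for g in range(-1000, 1001, 10)]  # assumed rating grid
    ratings = [0.0] * num_players
    samples = []
    for sweep in range(sweeps):
        for player in range(1, num_players):  # player 0 stays pinned at 0
            ratings[player] = sample_conditional(player, ratings, results, grid, rng)
        if sweep >= burn_in:
            samples.append(list(ratings))
    return samples


def summarize(samples, player, level=0.95):
    """Posterior mean and central interval of one player's sampled ratings."""
    values = sorted(s[player] for s in samples)
    tail = (1.0 - level) / 2.0
    lower = values[int(tail * (len(values) - 1))]
    upper = values[int((1.0 - tail) * (len(values) - 1))]
    return sum(values) / len(values), lower, upper
```

For example, `summarize(gibbs_elo({(0, 1): (70, 30), (1, 2): (70, 30), (0, 2): (85, 15)}, 3), 2)` gives a mean and interval for player 2 relative to player 0. The number of sweeps and the burn-in are guesses; as noted above, judging convergence is the hard part.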
Re: best way to determine elos of a group
Laskos wrote: ↑Tue Jul 30, 2019 2:40 pm
Michel wrote: ↑Tue Jul 30, 2019 11:08 am
To get an elo estimate with fixed variance you should test against a fixed engine (or group of engines). But this is then not good for comparing engines (it is well known that the variance of an elo difference measured against a 3rd engine is 4 times as big as when measured in direct play).
Well, 2 times as big if he starts the procedure from the beginning of his line of nets. The issue might become whether that engine as an opponent is not peculiar in some ways and what we really want to measure ("strength", I suppose, but in relation to "something"). How would the gating look now? For example, would it be an orderly gating to require that each successive net performs better against "fixed opponents" than the previous net?

I am on the phone now, on a vacation. I would propose an ad hoc approach adjusted to your needs.
First, you have to have a rough estimate of how many nets you will build. Say N=400. If you play successive nets, the error margins towards the end of the run will blow up to N^(1/2) times the error margins of the first net after the anchor net.
Set N^(1/2) "anchor" nets, in your case 400^(1/2) = 20 nets: ID20, ID40, ID60, ..., ID400. The true anchor is ID0. For ratings, play each net against the last "anchor". When you reach another "anchor" ID, say ID80, play it against the anchor ID60 with four times as many games as for usual nets (thereby setting the new "anchor" ID80).
This way, the final nets of the run will have error margins larger only by a factor of less than N^(1/4) times the initial error margins, better than the previous N^(1/2). ID400 will have, if I am not completely wrong, N^(1/4)/2 or, in your case with a whole run of 400 nets, only some 2 times larger error margins for absolute Elo (no gating) than the error margins of the first net.
I am not sure if it's close to the optimum use of resources (the effort is only slightly larger than playing successive nets, with a significant improvement of precision in absolute Elo measurement). The optimal use of resources will surely depend on the task you have to accomplish.
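The scaling argument can be checked with a few lines of Python. This is a sketch under the usual simplifying assumptions, which are not stated in the post itself: matches are independent, the error of one match scales as 1/sqrt(games), and errors along a chain of matches add in quadrature. `sigma` is the error margin of a single normal-length match, and the function names are made up for the illustration.

```python
import math


def successive_error(k: int, sigma: float = 1.0) -> float:
    """Error of net k relative to ID0 when each net only plays its predecessor."""
    return sigma * math.sqrt(k)


def anchored_error(k: int, spacing: int, sigma: float = 1.0, game_factor: int = 4) -> float:
    """Error of net k relative to ID0 with an anchor every `spacing` nets.

    Anchor-to-anchor matches use `game_factor` times as many games, so each
    such link contributes a variance of sigma**2 / game_factor.
    """
    anchor_links, offset = divmod(k, spacing)
    variance = anchor_links * sigma ** 2 / game_factor
    if offset:  # an ordinary net adds one normal-length match against the last anchor
        variance += sigma ** 2
    return math.sqrt(variance)


if __name__ == "__main__":
    n = 400
    spacing = round(math.sqrt(n))  # 20 anchors for N = 400
    print("successive, ID400:", successive_error(n))
    print("anchored,   ID400:", anchored_error(n, spacing))
    print("anchored,   ID399:", anchored_error(n - 1, spacing))
```

Under these assumptions ID400 ends up at sqrt(20)/2 = N^(1/4)/2 times the single-match error, and an ordinary net such as ID399 stays below N^(1/4), in line with the estimates above.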
Re: best way to determine elos of a group
Laskos wrote: ↑Thu Aug 01, 2019 6:43 pm
I am on the phone now, on a vacation. I would propose an ad hoc approach adjusted to your needs.
First, you have to have a rough estimate of how many nets you will build. Say N=400. If you play successive nets, the error margins towards the end of the run will blow up to N^(1/2) times the error margins of the first net after the anchor net.
Set N^(1/2) "anchor" nets, in your case 400^(1/2) = 20 nets: ID20, ID40, ID60, ..., ID400. The true anchor is ID0. For ratings, play each net against the last "anchor". When you reach another "anchor" ID, say ID80, play it against the anchor ID60 with four times as many games as for usual nets (thereby setting the new "anchor" ID80).
This way, the final nets of the run will have error margins larger only by a factor of less than N^(1/4) times the initial error margins, better than the previous N^(1/2). ID400 will have, if I am not completely wrong, N^(1/4)/2 or, in your case with a whole run of 400 nets, only some 2 times larger error margins for absolute Elo (no gating) than the error margins of the first net.
I am not sure if it's close to the optimum use of resources (the effort is only slightly larger than playing successive nets, with a significant improvement of precision in absolute Elo measurement). The optimal use of resources will surely depend on the task you have to accomplish.

Forgot to mention. This whole ad hoc procedure is for when you don't want to or cannot (too large an Elo span) play all the games against the true anchor ID0. Here you have some diversity of opponents, and the Elo differences in games shouldn't be too large. The final error margins are square-root tamed compared to the usual successive play. If you want even tamer error margins, a sparser set of, say, N^(1/4) "anchors" can be planned, but then the Elo differences between them may be too large and the number of these "anchors" too small.

Re: best way to determine elos of a group
Laskos wrote: ↑Thu Aug 01, 2019 6:43 pm
I am on the phone now, on a vacation. I would propose an ad hoc approach adjusted to your needs.
First, you have to have a rough estimate of how many nets you will build. Say N=400. If you play successive nets, the error margins towards the end of the run will blow up to N^(1/2) times the error margins of the first net after the anchor net.
Set N^(1/2) "anchor" nets, in your case 400^(1/2) = 20 nets: ID20, ID40, ID60, ..., ID400. The true anchor is ID0. For ratings, play each net against the last "anchor". When you reach another "anchor" ID, say ID80, play it against the anchor ID60 with four times as many games as for usual nets (thereby setting the new "anchor" ID80).
This way, the final nets of the run will have error margins larger only by a factor of less than N^(1/4) times the initial error margins, better than the previous N^(1/2). ID400 will have, if I am not completely wrong, N^(1/4)/2 or, in your case with a whole run of 400 nets, only some 2 times larger error margins for absolute Elo (no gating) than the error margins of the first net.
I am not sure if it's close to the optimum use of resources (the effort is only slightly larger than playing successive nets, with a significant improvement of precision in absolute Elo measurement). The optimal use of resources will surely depend on the task you have to accomplish.

Ok Kai, I will try your suggestion once the current run ends, or maybe I will start doing it from where it stopped now. I have given up on the idea of matching IDx against IDx+1 once we figured out that bayeselo gives really high variance.
From my quick tests, it looks like I am getting 150 elos for every 100 nets so far (i.e. each net with just 512 games).