Michel wrote: ↑Sun Aug 04, 2019 11:11 am
Laskos wrote: ↑Sun Aug 04, 2019 10:12 am
To add:
The effort in the first, "naive" case, is 5000 games. In the second, uniformly spaced --- 8000 games. In the third, k^2 spaced --- 8000 games again.
The effort with hundreds of nets will be only slightly, say 10% larger than the "naive" case, and the effort goes to "firm up" the anchors. More nets, sparser the anchors are (I have chosen somewhat arbitrarily the square root sparseness).
It seems you have come up with a pretty nice scheme!
I still like my own idea of a self organizing tree (rather than shaping the tree beforehand as you are doing) but I have not actually checked that it would work satisfactorily. The advantage of shaping the tree beforehand is that you can allocate an optimal number of games to the edges in advance (more to the stem, less to the side branches).
I think I can propose a less ad-hoc methodology. One important thing is the shape of the derivative of the logistic. We want to have anchors in the places of the run where the current net is not far Elo-wise towards the last anchor, i.e. the Elo is such that the derivative of the logistic for that Elo is high. Here is the shape of the derivative of the usual logistic:
D_logistic.jpg
It is pretty obvious that for 200 Elo points difference, the derivative deviates enough to warrant a new anchor, A new anchor added to N previous anchors adds a factor of 1/N to the new variance. The proposal is this: the first anchor is the true anchor ID-0. Play successive nets against this anchor for established number of games, say 200. As soon as you find an Elo difference above 200, set this net as the new anchor and play 4 or 8 times more games, say 1600 games ("firming up"). Do the same with successive nets against this new anchor. So on. Ad-hoc for diversity of opponents: estimate the length of the whole run, planned to contain very roughly N IDs. If for a 2 * sqrt(N) nets no net came 200 Elo above (or below, in fact) the last anchor, set the new anchor at this 2 * sqrt(N) distance from the last anchor.
A silly subtlety here, looking at the plot, is whether for long runs (say 1000 IDs) the threshold should be 200 Elo points or 150 Elo points, as on long runs and many anchors the cost of adding a new anchor diminishes. Full runs from random player to some good strength should run for 3000-4000 absolute Elo compared to fixed 0 of random net. So, about 20 anchors according to the threshold, and if some 1000 IDs, maybe some 10-15 more anchors due to ad-hoc "diversity criterion" (2 * sqrt(N) spacing). This way, no large Elo differences will occur and that without inflating the number of nets. No prior knowledge of run behavior is needed aside the rough estimated of its IDs length. Ad-hoc spacing might go to higher values if one is really protracting the end of the run.
Here are the results in Ordo for best behaved until now, but specific k^2 spacing, and this more general proposal:
k^2 spacing of anchors:
Code: Select all
# PLAYER : RATING ERROR POINTS PLAYED (%) CFS(next)
1 SF24 : 1459.15 62.99 141.5 200 70.8 70
2 SF25 : 1446.17 46.11 1107.0 1600 69.2 84
3 SF23 : 1422.71 62.29 132.5 200 66.3 55
4 SF22 : 1418.82 62.31 131.5 200 65.8 52
5 SF20 : 1416.88 61.29 131.0 200 65.5 91
6 SF21 : 1373.93 61.94 119.5 200 59.8 72
7 SF18 : 1355.94 60.41 114.5 200 57.3 83
8 SF17 : 1327.67 60.51 106.5 200 53.3 64
9 SF19 : 1317.17 61.23 103.5 200 51.8 72
10 SF16 : 1304.94 43.59 2410.5 4800 50.2 90
11 SF15 : 1271.38 63.79 156.0 200 78.0 88
12 SF13 : 1230.81 61.51 147.5 200 73.8 63
13 SF14 : 1219.71 60.87 145.0 200 72.5 96
14 SF11 : 1162.42 59.04 131.0 200 65.5 52
15 SF12 : 1160.49 59.42 130.5 200 65.3 100
16 SF10 : 1066.20 56.91 104.5 200 52.3 78
17 SF9 : 1050.47 39.30 2149.5 4400 48.9 97
18 SF8 : 986.44 69.00 176.0 200 88.0 99
19 SF7 : 894.80 60.91 162.5 200 81.3 90
20 SF6 : 849.82 57.11 154.0 200 77.0 100
21 SF5 : 707.91 52.96 119.5 200 59.8 100
22 SF4 : 638.91 30.14 1770.5 4000 44.3 100
23 SF3 : 536.33 58.23 167.5 200 83.8 100
24 SF2 : 400.24 48.72 140.5 200 70.3 100
25 SF1 : 250.23 18.42 1539.5 3600 42.8 100
26 SF0 : 0.00 ---- 308.0 1600 19.3 ---
The more general proposal. Anchors are found to be after 200 games as per usual nets and are: 1,3,6,9,14,23 (200 Elo points threshold):
Code: Select all
# PLAYER : RATING ERROR POINTS PLAYED (%) CFS(next)
1 SF24 : 1432.17 62.81 104.5 200 52.3 64
2 SF25 : 1421.72 62.36 101.5 200 50.8 60
3 SF23 : 1416.50 45.58 1393.0 2000 69.7 60
4 SF22 : 1410.19 63.55 148.5 200 74.3 68
5 SF20 : 1394.61 63.79 145.0 200 72.5 76
6 SF21 : 1371.30 61.40 139.5 200 69.8 99
7 SF18 : 1296.51 59.89 120.0 200 60.0 55
8 SF19 : 1292.89 61.64 119.0 200 59.5 52
9 SF17 : 1291.09 60.04 118.5 200 59.3 55
10 SF16 : 1287.49 59.08 117.5 200 58.8 68
11 SF15 : 1273.23 59.95 113.5 200 56.8 99
12 SF14 : 1225.97 41.86 2184.0 4800 45.5 64
13 SF13 : 1216.46 61.14 148.5 200 74.3 95
14 SF12 : 1161.37 59.39 135.5 200 67.8 79
15 SF11 : 1136.12 58.26 129.0 200 64.5 99
16 SF10 : 1065.39 56.88 109.5 200 54.8 94
17 SF9 : 1032.24 37.54 1891.5 4000 47.3 97
18 SF8 : 985.99 56.71 142.0 200 71.0 97
19 SF7 : 926.56 55.50 127.0 200 63.5 100
20 SF6 : 830.23 33.86 1845.5 3600 51.3 100
21 SF5 : 734.77 55.26 148.5 200 74.3 99
22 SF4 : 658.25 51.61 130.0 200 65.0 100
23 SF3 : 550.55 27.36 1735.0 3600 48.2 100
24 SF2 : 420.11 50.94 143.0 200 71.5 100
25 SF1 : 260.10 18.90 1617.5 3400 47.6 100
26 SF0 : 0.00 ---- 293.0 1600 18.3 ---
We see that this more general case behaves as well as the best previous k^2 ad-hoc procedure (compare the errors of ID25 in the first case and ID23 in the general case). Some discussion might be about choosing the threshold to be 200 or 150 Elo points for long runs, and the ad-hoc spacing for diversity to 2 or 3 or 4 times sqrt(N), if the runs are protracted.