Why all Lc0 runs result in such similarity of quiet moves selection?

Laskos · Post by **Laskos** » Wed May 01, 2019 6:38 pm

I ran the Sim03 tester by Don Dailey, slightly modified by Adam Hair so that the hash is cleared before each position. The 8,300 test positions are quiet positions from real games, each with several candidate best moves that are close in value. Engines run on one thread (except Lc0, on 2 threads), and the time per position is 100ms. Here are the cross-percentages of matched moves, engine by engine:

sim

  Key:

  1) Andscacs 0.95 (time: 100 ms  scale: 1.0)
  2) Ethereal 11.25 (time: 100 ms  scale: 1.0)
  3) Fire 7.1 (time: 100 ms  scale: 1.0)
  4) Fruit 2.1 (time: 100 ms  scale: 1.0)
  5) Komodo 12.3 (time: 100 ms  scale: 1.0)
  6) Lc0 11261 (time: 100 ms  scale: 1.0)
  7) Lc0 32930 (time: 100 ms  scale: 1.0)
  8) Lc0 42184 (time: 100 ms  scale: 1.0)
  9) Senpai 1.0 (time: 100 ms  scale: 1.0)
 10) SF 10 (time: 100 ms  scale: 1.0)
 11) SF 8 (time: 100 ms  scale: 1.0)
 12) SF dev (time: 100 ms  scale: 1.0)

         1     2     3     4     5     6     7     8     9    10    11    12
  1.  ----- 49.19 45.69 37.95 48.17 44.90 43.65 44.22 46.88 50.36 52.22 49.93
  2.  49.19 ----- 48.05 39.58 48.57 47.14 45.29 45.76 48.66 52.15 52.48 52.09
  3.  45.69 48.05 ----- 40.17 46.43 43.41 42.23 43.12 45.35 48.36 50.24 47.69
  4.  37.95 39.58 40.17 ----- 39.51 36.34 35.54 35.72 46.55 37.81 39.88 37.50
  5.  48.17 48.57 46.43 39.51 ----- 45.82 44.54 45.21 48.28 50.10 51.18 50.15
  6.  44.90 47.14 43.41 36.34 45.82 ----- 71.28 71.04 42.90 49.92 47.74 49.78
  7.  43.65 45.29 42.23 35.54 44.54 71.28 ----- 74.81 42.11 48.82 46.88 48.09
  8.  44.22 45.76 43.12 35.72 45.21 71.04 74.81 ----- 42.95 49.34 47.44 48.94
  9.  46.88 48.66 45.35 46.55 48.28 42.90 42.11 42.95 ----- 46.42 48.07 46.56
 10.  50.36 52.15 48.36 37.81 50.10 49.92 48.82 49.34 46.42 ----- 58.76 63.17
 11.  52.22 52.48 50.24 39.88 51.18 47.74 46.88 47.44 48.07 58.76 ----- 57.13
 12.  49.93 52.09 47.69 37.50 50.15 49.78 48.09 48.94 46.56 63.17 57.13 -----
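
As a rough illustration of what each cell in such a matrix represents, here is a minimal sketch. It assumes that a cell is simply the share of positions on which two engines picked the same move; the thread does not show Sim03's internals, so the function names and the toy move lists below are hypothetical.

```python
from itertools import combinations


def match_percentage(moves_a, moves_b):
    """Percentage of positions where both engines chose the same move."""
    if len(moves_a) != len(moves_b) or not moves_a:
        raise ValueError("need two non-empty move lists of equal length")
    matches = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * matches / len(moves_a)


def similarity_matrix(moves_by_engine):
    """Symmetric dict-of-dicts with the match percentage of every engine pair."""
    matrix = {name: {} for name in moves_by_engine}
    for a, b in combinations(moves_by_engine, 2):
        pct = match_percentage(moves_by_engine[a], moves_by_engine[b])
        matrix[a][b] = pct
        matrix[b][a] = pct
    return matrix


# Toy data (hypothetical): one chosen move per test position, in UCI notation.
toy_moves = {
    "EngineA": ["e2e4", "g1f3", "d2d4", "c2c4"],
    "EngineB": ["e2e4", "g1f3", "d2d4", "b1c3"],
    "EngineC": ["d2d4", "g1f3", "f1c4", "b1c3"],
}
sim = similarity_matrix(toy_moves)
for a, b in combinations(toy_moves, 2):
    print(f"{a} vs {b}: {sim[a][b]:.2f}")
```

With 8,300 positions instead of 4, the same counting would give percentages of the kind shown in the matrix above.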

And the dendrogram of similarity in move selection:

Lc0_Dendrogram.jpg
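
For readers who want to reproduce the grouping, here is a minimal sketch of agglomerative clustering on a few cells of the matrix above. The linkage method behind the posted dendrogram is not stated, so average linkage and the distance `100 - match%` are assumptions, and the function names are hypothetical.

```python
# Match percentages copied from the 100ms matrix above (subset of engines).
PAIRS = {
    ("Lc0 11261", "Lc0 32930"): 71.28, ("Lc0 11261", "Lc0 42184"): 71.04,
    ("Lc0 32930", "Lc0 42184"): 74.81, ("SF 10", "SF dev"): 63.17,
    ("SF 10", "Lc0 11261"): 49.92, ("SF 10", "Lc0 32930"): 48.82,
    ("SF 10", "Lc0 42184"): 49.34, ("SF dev", "Lc0 11261"): 49.78,
    ("SF dev", "Lc0 32930"): 48.09, ("SF dev", "Lc0 42184"): 48.94,
    ("Fruit 2.1", "Lc0 11261"): 36.34, ("Fruit 2.1", "Lc0 32930"): 35.54,
    ("Fruit 2.1", "Lc0 42184"): 35.72, ("Fruit 2.1", "SF 10"): 37.81,
    ("Fruit 2.1", "SF dev"): 37.50,
}


def distance(a, b):
    """Assumed distance: 100 minus the match percentage of the pair."""
    pct = PAIRS[(a, b)] if (a, b) in PAIRS else PAIRS[(b, a)]
    return 100.0 - pct


def average_linkage(names):
    """Merge the two closest clusters repeatedly; return the merge steps."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(distance(a, b) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], round(d, 2)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


ENGINES = ["Lc0 11261", "Lc0 32930", "Lc0 42184", "SF 10", "SF dev", "Fruit 2.1"]
merges = average_linkage(ENGINES)
for left, right, d in merges:
    print(f"{d:6.2f}  {left} + {right}")
```

Under these assumptions the three Lc0 nets merge first, then the two Stockfishes, and Fruit 2.1 joins last, which is the ordering the matrix suggests.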

Both the matrix and the dendrogram show that the 3 different Lc0 runs are so closely related in this Sim test that they are MUCH closer to one another than SF_dev is to SF10. I was expecting quite a different picture. Even inside a single run, there are many "drift areas" in the "optima landscape" of the NN values, many local "optima", and many ways to reach some run-dependent, more general optimum. So I expected to find many dissimilarities between nets even inside the same run, never mind between very different runs. But I do not understand this stuff well.

Also, we know that positionally (on these 8,300 quiet Sim positions too) the late Lc0 nets of a given run are VERY strong. Is it possible that the evals of these different NNs converge to common choices across runs simply because the objectively strongest move is quite often unique, even in quiet positions, so that sheer strength makes the nets convergent?
Another possibility is that all three runs are almost identical, varying only in irrelevant details.
And another is that all zero or quasi-zero runs give nets with similar move selection, at least positionally. That would mean that Lc0 must be very similar to Alpha0 positionally.

A note: observe how the Stockfishes cluster together, with some other new engines not far away, while the 2 engines by Fabien Letouzey, Fruit 2.1 and Senpai 1.0, form a separate cluster of their own (although they are not closely related). Also, all the Lc0s are very unrelated to the other engines (but extremely related among themselves).

Laskos · Post by **Laskos** » Wed May 01, 2019 10:55 pm

I also checked whether Lc0 runs are similar from the early stages to the end, by testing t40 nets from several stages and two t30 nets. Here is the matrix:

sim version 3

  Key:

  1) Lc0 11261 (time: 100 ms  scale: 1.0)
  2) Lc0 30643 (time: 100 ms  scale: 1.0)
  3) Lc0 32930 (time: 100 ms  scale: 1.0)
  4) Lc0 40320 (time: 100 ms  scale: 1.0)
  5) Lc0 41006 (time: 100 ms  scale: 1.0)
  6) Lc0 41654 (time: 100 ms  scale: 1.0)
  7) Lc0 42184 (time: 100 ms  scale: 1.0)

         1     2     3     4     5     6     7
  1.  ----- 50.07 71.28 61.31 69.57 71.13 71.04
  2.  50.07 ----- 48.06 52.21 49.08 48.53 48.36
  3.  71.28 48.06 ----- 61.60 72.13 74.80 74.81
  4.  61.31 52.21 61.60 ----- 65.80 63.13 62.65
  5.  69.57 49.08 72.13 65.80 ----- 75.80 74.87
  6.  71.13 48.53 74.80 63.13 75.80 ----- 82.05
  7.  71.04 48.36 74.81 62.65 74.87 82.05 -----

and the dendrogram:

Lc0_Dendrogram2.jpg

The scale of this second dendrogram differs from that of the first one (first post); here all the nets except the early ID30643 would be considered "clones" if standard engines were involved. You can see the numbers in the matrix.
We can probably infer that learning starts out differently in the early stages of different runs but then converges to very similar nets towards the end of the runs. I am not sure what that means. Maybe even a quiet position really does have a pretty unique best move, and through learning the nets converge on picking it? And does this convergence indicate extremely strong positional play towards the end of the runs (10xxx, 30xxx, 40xxx)?

But maybe these zero and quasi-zero runs simply produce very similar nets, which do not necessarily pick the best move, just the same move?

Laskos · Post by **Laskos** » Wed May 01, 2019 11:37 pm

Then again, maybe it is because the engine is the same (I used v0.21.1)? But then what about the early learning stages of the runs, where the engine+net combinations are not that similar? In the early stages of learning, different runs differ as much as unrelated regular engines do.

Laskos · Post by **Laskos** » Thu May 02, 2019 12:15 am

I will leave these 7 nets (engine v0.21.1) running overnight to see how they behave at a longer TC (300ms/position, or about 2000-2500 nodes/position on my GPU). 100ms is fine for regular engines but might be too short for Leela (it starts rather slowly).

Laskos · Post by **Laskos** » Thu May 02, 2019 9:01 am

At the longer time control (300ms/position, or about 2000-2500 nodes per position on my GPU), the similarity among Lc0 nets is even HIGHER, by 1-2 percentage points. I also included a 10b net, ID36091, as I was curious how this net from late in the t35 run would behave. Again, its similarity to the late-run 20b nets is very high. Here is the matrix for all included engines (cross-read the percentage of matched moves).

sim

  Key:

  1) Andscacs 0.95 (time: 100 ms  scale: 1.0)
  2) Ethereal 11.25 (time: 100 ms  scale: 1.0)
  3) Fire 7.1 (time: 100 ms  scale: 1.0)
  4) Fruit 2.1 (time: 100 ms  scale: 1.0)
  5) Komodo 12.3 (time: 100 ms  scale: 1.0)
  6) Lc0 11261 (time: 100 ms  scale: 3.0)
  7) Lc0 30643 (time: 100 ms  scale: 3.0)
  8) Lc0 32930 (time: 100 ms  scale: 3.0)
  9) Lc0 36091 (time: 100 ms  scale: 3.0)
 10) Lc0 40320 (time: 100 ms  scale: 3.0)
 11) Lc0 41006 (time: 100 ms  scale: 3.0)
 12) Lc0 41654 (time: 100 ms  scale: 3.0)
 13) Lc0 42184 (time: 100 ms  scale: 3.0)
 14) Senpai 1.0 (time: 100 ms  scale: 1.0)
 15) SF 10 (time: 100 ms  scale: 1.0)
 16) SF 8 (time: 100 ms  scale: 1.0)
 17) SF dev (time: 100 ms  scale: 1.0)

         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17
  1.  ----- 49.19 45.69 37.95 48.17 44.94 40.30 44.21 44.88 44.34 44.17 44.74 45.00 46.88 50.36 52.22 49.93
  2.  49.19 ----- 48.05 39.58 48.57 46.83 42.27 45.90 46.90 46.88 46.64 46.37 46.15 48.66 52.15 52.48 52.09
  3.  45.69 48.05 ----- 40.17 46.43 42.87 39.97 42.27 43.20 43.60 42.69 43.34 43.00 45.35 48.36 50.24 47.69
  4.  37.95 39.58 40.17 ----- 39.51 35.91 34.87 35.70 35.96 37.00 36.08 36.14 35.79 46.55 37.81 39.88 37.50
  5.  48.17 48.57 46.43 39.51 ----- 45.78 41.31 45.10 45.95 45.40 45.16 45.40 45.63 48.28 50.10 51.18 50.15
  6.  44.94 46.83 42.87 35.91 45.78 ----- 51.27 72.89 69.60 64.08 71.05 72.97 73.11 42.92 50.78 48.53 50.41
  7.  40.30 42.27 39.97 34.87 41.31 51.27 ----- 48.99 50.04 52.20 50.29 50.16 49.45 40.08 43.55 42.39 43.11
  8.  44.21 45.90 42.27 35.70 45.10 72.89 48.99 ----- 70.95 64.83 74.27 76.66 76.61 42.73 50.39 47.95 49.54
  9.  44.88 46.90 43.20 35.96 45.95 69.60 50.04 70.95 ----- 66.16 71.16 71.55 71.19 43.70 50.23 48.24 50.25
 10.  44.34 46.88 43.60 37.00 45.40 64.08 52.20 64.83 66.16 ----- 68.38 66.75 65.70 43.71 49.45 47.44 48.75
 11.  44.17 46.64 42.69 36.08 45.16 71.05 50.29 74.27 71.16 68.38 ----- 78.14 77.71 43.11 49.89 48.30 49.62
 12.  44.74 46.37 43.34 36.14 45.40 72.97 50.16 76.66 71.55 66.75 78.14 ----- 84.04 43.34 49.95 48.20 49.71
 13.  45.00 46.15 43.00 35.79 45.63 73.11 49.45 76.61 71.19 65.70 77.71 84.04 ----- 43.06 49.79 48.18 49.66
 14.  46.88 48.66 45.35 46.55 48.28 42.92 40.08 42.73 43.70 43.71 43.11 43.34 43.06 ----- 46.42 48.07 46.56
 15.  50.36 52.15 48.36 37.81 50.10 50.78 43.55 50.39 50.23 49.45 49.89 49.95 49.79 46.42 ----- 58.76 63.17
 16.  52.22 52.48 50.24 39.88 51.18 48.53 42.39 47.95 48.24 47.44 48.30 48.20 48.18 48.07 58.76 ----- 57.13
 17.  49.93 52.09 47.69 37.50 50.15 50.41 43.11 49.54 50.25 48.75 49.62 49.71 49.66 46.56 63.17 57.13 -----

The dendrogram is here:

Lc0_Dendrogram3.jpg

Only the early net ID30643 stands out as not very closely related.

If I separate out only the Leela engines (a sort of zoom into the previous dendrogram), the dendrogram looks like this:

Lc0_Dendrogram4.jpg

We see the same thing again at the longer time control: move selection in quiet positions can diverge in the early stages of learning, but it converges late in the runs for all the 20b nets and even the 10b net.

I am not sure what that means.

The curious thing is that, if you look at the matrix, the engines closest to Lc0 in move choice are SF_10 and SF_dev, the strongest AB engines. I will now run SF_dev on 4 cores at 500ms/position to see whether this much stronger SF_dev gets closer to the Lc0 nets than it does on 1 thread at 100ms/position. If so, it might indicate that strength is indeed a factor in the similarity among the results of Lc0 runs.

Laskos · Post by **Laskos** » Thu May 02, 2019 10:23 am

VERY interesting.

"SF dev Strong", that is, SF_dev on 4 cores at 500ms/position, is SIGNIFICANTLY closer than SF_dev on 1 core at 100ms/position to the late Lc0 nets of each run (10xxx, 30xxx, 35xxx, 40xxx), by a pretty hefty margin of about 3.5 percentage points of matched moves in quiet positions. The matrix is here:

sim

  Key:

  1) Andscacs 0.95 (time: 100 ms  scale: 1.0)
  2) Ethereal 11.25 (time: 100 ms  scale: 1.0)
  3) Fire 7.1 (time: 100 ms  scale: 1.0)
  4) Fruit 2.1 (time: 100 ms  scale: 1.0)
  5) Komodo 12.3 (time: 100 ms  scale: 1.0)
  6) Lc0 11261 (time: 100 ms  scale: 3.0)
  7) Lc0 30643 (time: 100 ms  scale: 3.0)
  8) Lc0 32930 (time: 100 ms  scale: 3.0)
  9) Lc0 36091 (time: 100 ms  scale: 3.0)
 10) Lc0 40320 (time: 100 ms  scale: 3.0)
 11) Lc0 41006 (time: 100 ms  scale: 3.0)
 12) Lc0 41654 (time: 100 ms  scale: 3.0)
 13) Lc0 42184 (time: 100 ms  scale: 3.0)
 14) Senpai 1.0 (time: 100 ms  scale: 1.0)
 15) SF 10 (time: 100 ms  scale: 1.0)
 16) SF 8 (time: 100 ms  scale: 1.0)
 17) SF dev (time: 100 ms  scale: 1.0)
 18) SF dev Strong (time: 100 ms  scale: 5.0)

         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18
  1.  ----- 49.19 45.69 37.95 48.17 44.94 40.30 44.21 44.88 44.34 44.17 44.74 45.00 46.88 50.36 52.22 49.93 49.13
  2.  49.19 ----- 48.05 39.58 48.57 46.83 42.27 45.90 46.90 46.88 46.64 46.37 46.15 48.66 52.15 52.48 52.09 50.78
  3.  45.69 48.05 ----- 40.17 46.43 42.87 39.97 42.27 43.20 43.60 42.69 43.34 43.00 45.35 48.36 50.24 47.69 46.04
  4.  37.95 39.58 40.17 ----- 39.51 35.91 34.87 35.70 35.96 37.00 36.08 36.14 35.79 46.55 37.81 39.88 37.50 37.23
  5.  48.17 48.57 46.43 39.51 ----- 45.78 41.31 45.10 45.95 45.40 45.16 45.40 45.63 48.28 50.10 51.18 50.15 50.05
  6.  44.94 46.83 42.87 35.91 45.78 ----- 51.27 72.89 69.60 64.08 71.05 72.97 73.11 42.92 50.78 48.53 50.41 54.18
  7.  40.30 42.27 39.97 34.87 41.31 51.27 ----- 48.99 50.04 52.20 50.29 50.16 49.45 40.08 43.55 42.39 43.11 43.57
  8.  44.21 45.90 42.27 35.70 45.10 72.89 48.99 ----- 70.95 64.83 74.27 76.66 76.61 42.73 50.39 47.95 49.54 53.12
  9.  44.88 46.90 43.20 35.96 45.95 69.60 50.04 70.95 ----- 66.16 71.16 71.55 71.19 43.70 50.23 48.24 50.25 53.82
 10.  44.34 46.88 43.60 37.00 45.40 64.08 52.20 64.83 66.16 ----- 68.38 66.75 65.70 43.71 49.45 47.44 48.75 50.79
 11.  44.17 46.64 42.69 36.08 45.16 71.05 50.29 74.27 71.16 68.38 ----- 78.14 77.71 43.11 49.89 48.30 49.62 52.83
 12.  44.74 46.37 43.34 36.14 45.40 72.97 50.16 76.66 71.55 66.75 78.14 ----- 84.04 43.34 49.95 48.20 49.71 53.39
 13.  45.00 46.15 43.00 35.79 45.63 73.11 49.45 76.61 71.19 65.70 77.71 84.04 ----- 43.06 49.79 48.18 49.66 53.24
 14.  46.88 48.66 45.35 46.55 48.28 42.92 40.08 42.73 43.70 43.71 43.11 43.34 43.06 ----- 46.42 48.07 46.56 44.70
 15.  50.36 52.15 48.36 37.81 50.10 50.78 43.55 50.39 50.23 49.45 49.89 49.95 49.79 46.42 ----- 58.76 63.17 60.94
 16.  52.22 52.48 50.24 39.88 51.18 48.53 42.39 47.95 48.24 47.44 48.30 48.20 48.18 48.07 58.76 ----- 57.13 56.54
 17.  49.93 52.09 47.69 37.50 50.15 50.41 43.11 49.54 50.25 48.75 49.62 49.71 49.66 46.56 63.17 57.13 ----- 61.60
 18.  49.13 50.78 46.04 37.23 50.05 54.18 43.57 53.12 53.82 50.79 52.83 53.39 53.24 44.70 60.94 56.54 61.60 -----

As percentages go, this SF_dev_Strong has the highest match rates with the late Lc0 nets of every run among all the AB engines. So positional strength IS a significant factor in the convergence of Lc0 runs. How much of a factor is hard to say. I am not sure what percentage of quiet positions have a unique best move. Some positions must have 2 or more best moves, for example because of transpositions, and some simply because, as with TBs, there are sometimes several ways to reach the same theoretical outcome. Still, a large fraction of quiet positions DO have unique best moves.
It is a distinct possibility that all completed or late Lc0 runs are extremely strong positionally and that the convergence of the runs is due to that, although several other explanations are plausible too (same engine, MCTS search, very similar runs, etc.).
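
The margin of about 3.5 points can be checked directly from columns 17 and 18 of the matrix. The sketch below uses the values posted above; which net counts as the "late net" of each run is my assumption (the highest ID shown for each of the 10xxx, 30xxx, 35xxx and 40xxx runs), and the variable names are hypothetical.

```python
# Match percentages with late Lc0 nets, copied from the matrix above.
# Assumption: the "late net" of a run is the highest net ID listed for it.
SF_DEV = {            # column 17: SF dev, 1 core, 100ms
    "Lc0 11261": 50.41,
    "Lc0 32930": 49.54,
    "Lc0 36091": 50.25,
    "Lc0 42184": 49.66,
}
SF_DEV_STRONG = {     # column 18: SF dev Strong, 4 cores, 500ms
    "Lc0 11261": 54.18,
    "Lc0 32930": 53.12,
    "Lc0 36091": 53.82,
    "Lc0 42184": 53.24,
}

# Gain in percentage points for each late net, then the average gain.
gains = {net: round(SF_DEV_STRONG[net] - SF_DEV[net], 2) for net in SF_DEV}
mean_gain = sum(gains.values()) / len(gains)

for net, gain in gains.items():
    print(f"{net}: +{gain:.2f}")
print(f"mean gain: +{mean_gain:.2f} percentage points")
```

The per-net gains fall between 3.57 and 3.77 points, so the "about 3.5%" figure holds for these four nets.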

Laskos · Post by **Laskos** » Fri May 03, 2019 4:42 am

I put SF_dev at an even longer time control, 3000ms/position on 4 cores, and called it "SF dev 30 x 4" (it is listed as "SF dev V.Strong" in the matrix key). This "Very Strong Stockfish" now clusters with the Leelas, not with the regular engines at 100ms (1 core)! So SF_dev is an Lc0 "wannabe" at long time control. In what department does SF_dev aspire to be a Leela at longer TC? Probably positional play, as positional test suites also show. So the "quiet" Sim positions chosen by Don Dailey are representative of differences in positional play. Another conclusion would be that, approaching from different sides (stronger nets in Leela's case, longer TC in SF's case, with very different engines), the choice of moves somewhat converges (probably nowhere near 100% matches, but still to a fairly high match ratio). Here are the clustering dendrograms with SF_dev, which clusters with all the regular engines, and with SF_dev 30x4, which becomes a Leela:

SF_dev

Lc0_Dendrogram5.jpg

SF_dev 30 x 4

Lc0_Dendrogram6.jpg

So it may be that the convergence of the strong Leela nets on this Sim tester is partly due to positional strength.

The matching matrix is here:

sim version 3

  Key:

  1) Andscacs 0.95 (time: 100 ms  scale: 1.0)
  2) Ethereal 11.25 (time: 100 ms  scale: 1.0)
  3) Fire 7.1 (time: 100 ms  scale: 1.0)
  4) Fruit 2.1 (time: 100 ms  scale: 1.0)
  5) Komodo 12.3 (time: 100 ms  scale: 1.0)
  6) Lc0 11261 (time: 100 ms  scale: 3.0)
  7) Lc0 30643 (time: 100 ms  scale: 3.0)
  8) Lc0 32930 (time: 100 ms  scale: 3.0)
  9) Lc0 36091 (time: 100 ms  scale: 3.0)
 10) Lc0 40320 (time: 100 ms  scale: 3.0)
 11) Lc0 41006 (time: 100 ms  scale: 3.0)
 12) Lc0 41654 (time: 100 ms  scale: 3.0)
 13) Lc0 42184 (time: 100 ms  scale: 3.0)
 14) Senpai 1.0 (time: 100 ms  scale: 1.0)
 15) SF 10 (time: 100 ms  scale: 1.0)
 16) SF 8 (time: 100 ms  scale: 1.0)
 17) SF dev (time: 100 ms  scale: 1.0)
 18) SF dev Strong (time: 100 ms  scale: 5.0)
 19) SF dev V.Strong (time: 100 ms  scale: 30.0)

         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
  1.  ----- 49.19 45.69 37.95 48.17 44.94 40.30 44.21 44.88 44.34 44.17 44.74 45.00 46.88 50.36 52.22 49.93 49.13 46.58
  2.  49.19 ----- 48.05 39.58 48.57 46.83 42.27 45.90 46.90 46.88 46.64 46.37 46.15 48.66 52.15 52.48 52.09 50.78 47.49
  3.  45.69 48.05 ----- 40.17 46.43 42.87 39.97 42.27 43.20 43.60 42.69 43.34 43.00 45.35 48.36 50.24 47.69 46.04 43.76
  4.  37.95 39.58 40.17 ----- 39.51 35.91 34.87 35.70 35.96 37.00 36.08 36.14 35.79 46.55 37.81 39.88 37.50 37.23 34.78
  5.  48.17 48.57 46.43 39.51 ----- 45.78 41.31 45.10 45.95 45.40 45.16 45.40 45.63 48.28 50.10 51.18 50.15 50.05 47.10
  6.  44.94 46.83 42.87 35.91 45.78 ----- 51.27 72.89 69.60 64.08 71.05 72.97 73.11 42.92 50.78 48.53 50.41 54.18 56.14
  7.  40.30 42.27 39.97 34.87 41.31 51.27 ----- 48.99 50.04 52.20 50.29 50.16 49.45 40.08 43.55 42.39 43.11 43.57 43.34
  8.  44.21 45.90 42.27 35.70 45.10 72.89 48.99 ----- 70.95 64.83 74.27 76.66 76.61 42.73 50.39 47.95 49.54 53.12 55.97
  9.  44.88 46.90 43.20 35.96 45.95 69.60 50.04 70.95 ----- 66.16 71.16 71.55 71.19 43.70 50.23 48.24 50.25 53.82 55.57
 10.  44.34 46.88 43.60 37.00 45.40 64.08 52.20 64.83 66.16 ----- 68.38 66.75 65.70 43.71 49.45 47.44 48.75 50.79 50.52
 11.  44.17 46.64 42.69 36.08 45.16 71.05 50.29 74.27 71.16 68.38 ----- 78.14 77.71 43.11 49.89 48.30 49.62 52.83 54.78
 12.  44.74 46.37 43.34 36.14 45.40 72.97 50.16 76.66 71.55 66.75 78.14 ----- 84.04 43.34 49.95 48.20 49.71 53.39 55.81
 13.  45.00 46.15 43.00 35.79 45.63 73.11 49.45 76.61 71.19 65.70 77.71 84.04 ----- 43.06 49.79 48.18 49.66 53.24 55.91
 14.  46.88 48.66 45.35 46.55 48.28 42.92 40.08 42.73 43.70 43.71 43.11 43.34 43.06 ----- 46.42 48.07 46.56 44.70 42.70
 15.  50.36 52.15 48.36 37.81 50.10 50.78 43.55 50.39 50.23 49.45 49.89 49.95 49.79 46.42 ----- 58.76 63.17 60.94 55.01
 16.  52.22 52.48 50.24 39.88 51.18 48.53 42.39 47.95 48.24 47.44 48.30 48.20 48.18 48.07 58.76 ----- 57.13 56.54 52.22
 17.  49.93 52.09 47.69 37.50 50.15 50.41 43.11 49.54 50.25 48.75 49.62 49.71 49.66 46.56 63.17 57.13 ----- 61.60 55.39
 18.  49.13 50.78 46.04 37.23 50.05 54.18 43.57 53.12 53.82 50.79 52.83 53.39 53.24 44.70 60.94 56.54 61.60 ----- 65.46
 19.  46.58 47.49 43.76 34.78 47.10 56.14 43.34 55.97 55.57 50.52 54.78 55.81 55.91 42.70 55.01 52.22 55.39 65.46 -----

Ferdy · Post by **Ferdy** » Fri May 03, 2019 5:55 am

Laskos wrote: ↑Wed May 01, 2019 6:38 pm I ran the Sim03 tester by Don Dailey, slightly modified by Adam Hair so that the hash is cleared before each position. The 8,300 test positions are quiet positions from real games, each with several candidate best moves that are close in value. Engines run on one thread (except Lc0, on 2 threads), and the time per position is 100ms. Here are the cross-percentages of matched moves, engine by engine:

I am currently analyzing the sim positions with the latest Stockfish dev at multipv 2, with the contempt options set to 0 and Off, at 1s/pos on 1 thread of a 3.4 GHz Intel CPU. For each position I record the difference between bs1, the score from multipv 1, and bs2, the score from multipv 2, and count it into score ranges, to get an idea of how many non-quiet positions the set contains at this movetime.

Results so far:

Pos: 562, FEN: r1b1rbk1/2q2ppp/pp3n2/3pp3/P3PPP1/2N2B2/1PPQ3P/R4RBK w - - 0 17, diff: +79
diff 200cp or more : 10 (1.8%) of 563 so far
diff 100cp to 199cp: 21 (3.7%) of 563 so far
diff 50cp  to 99cp : 65 (11.5%) of 563 so far
diff 25cp  to 49cp : 123 (21.8%) of 563 so far

Ferdy · Post by **Ferdy** » Fri May 03, 2019 6:56 am

After 4010 pos.

Pos: 4010, FEN: 2k5/1p1r1ppp/pqbBp3/4P3/4pQ2/8/PPP2PPP/1KR5 w - - 7 22, diff: +32
diff 200cp or more : 67 (1.7%) of 4011 so far
diff 100cp to 199cp: 189 (4.7%) of 4011 so far
diff 50cp  to 99cp : 498 (12.4%) of 4011 so far
diff 25cp  to 49cp : 846 (21.1%) of 4011 so far
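
The tallying described above can be sketched as follows. This is not the script that produced these numbers; it assumes each position yields an integer centipawn pair `(bs1, bs2)` from the side to move's point of view, and the bucket labels and function name are hypothetical. The last lines only do arithmetic on the counts reported after 4011 positions.

```python
# Buckets for diff = bs1 - bs2 in centipawns: (label, low, high).
# high=None means the bucket is open-ended.
BUCKETS = [
    ("200cp or more", 200, None),
    ("100cp to 199cp", 100, 199),
    ("50cp to 99cp", 50, 99),
    ("25cp to 49cp", 25, 49),
]


def bucket_score_gaps(score_pairs):
    """Count positions per bucket; gaps below 25cp are left uncounted."""
    counts = {label: 0 for label, _, _ in BUCKETS}
    for bs1, bs2 in score_pairs:
        diff = bs1 - bs2
        for label, low, high in BUCKETS:
            if diff >= low and (high is None or diff <= high):
                counts[label] += 1
                break
    return counts


# Toy (bs1, bs2) pairs, hypothetical: gaps of 79, 5, 220, 32 and 150 cp.
toy_pairs = [(35, -44), (10, 5), (120, -100), (50, 18), (0, -150)]
toy_counts = bucket_score_gaps(toy_pairs)
for label, count in toy_counts.items():
    print(f"diff {label}: {count} ({100 * count / len(toy_pairs):.1f}%)")

# Arithmetic on the counts reported above for 4011 positions.
reported = {"200cp or more": 67, "100cp to 199cp": 189,
            "50cp to 99cp": 498, "25cp to 49cp": 846}
total = 4011
below_25 = total - sum(reported.values())
print(f"below 25cp: {below_25} ({100 * below_25 / total:.1f}%)")
```

On the reported counts, 2411 of the 4011 positions (about 60%) have a gap below 25cp between the two best moves at this movetime.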

Uri Blass · Post by **Uri Blass** » Fri May 03, 2019 7:02 am

Laskos wrote: ↑Thu May 02, 2019 10:23 am VERY interesting.

"SF dev Strong", that is, SF_dev on 4 cores at 500ms/position, is SIGNIFICANTLY closer than SF_dev on 1 core at 100ms/position to the late Lc0 nets of each run (10xxx, 30xxx, 35xxx, 40xxx), by a pretty hefty margin of about 3.5 percentage points of matched moves in quiet positions.

As percentages go, this SF_dev_Strong has the highest match rates with the late Lc0 nets of every run among all the AB engines. So positional strength IS a significant factor in the convergence of Lc0 runs. How much of a factor is hard to say. I am not sure what percentage of quiet positions have a unique best move. Some positions must have 2 or more best moves, for example because of transpositions, and some simply because, as with TBs, there are sometimes several ways to reach the same theoretical outcome. Still, a large fraction of quiet positions DO have unique best moves.
It is a distinct possibility that all completed or late Lc0 runs are extremely strong positionally and that the convergence of the runs is due to that, although several other explanations are plausible too (same engine, MCTS search, very similar runs, etc.).

A position has a unique best move only if there is exactly one winning move or exactly one move that saves the game.
By my understanding, that means the position is not quiet.
