How to measure overall similarity


Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

How to measure overall similarity

Post by Ferdy »

Given:

Code:

positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

engine1_bm = [a, b, c, d, e, f, g, h, i, j]
engine2_bm = [a, x, c, d, e, v, t, r, i, s]
engine3_bm ...
...
The best move (bm) in the first position is a for engine1, and engine2 also picks a.

The similarity of bm between engine1 and engine2 is:
Matching moves = a, c, d, e, i
Similarity = 5/10 = 50%
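
The count above can be sketched in Python. This is only an illustration, not the code of the sim tool: the function name `bm_similarity` is made up here, and the single-letter moves are the placeholders from the example.

```python
def bm_similarity(bm_a, bm_b):
    """Percentage of positions where both engines chose the same best move."""
    if not bm_a or len(bm_a) != len(bm_b):
        raise ValueError("both engines need best moves for the same, non-empty position set")
    # Compare position by position and count identical best moves.
    matches = sum(1 for a, b in zip(bm_a, bm_b) if a == b)
    return 100.0 * matches / len(bm_a)


engine1_bm = list("abcdefghij")
engine2_bm = list("axcdevtris")

similarity = bm_similarity(engine1_bm, engine2_bm)
print(f"Similarity = {similarity:.0f}%")  # 5 matching moves out of 10 positions
```

The same function can then be called for engine1 vs engine3, engine1 vs engine4 and so on.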

Then compute the similarity between engine1 and engine3, engine1 and engine4, and so on.

So it would look like this, as in the sim output.

Code:

------ Stockfish 10 (time: 20 ms scale: 1.0) ------
 68.67  Stockfish 10 r1 (time: 20 ms scale: 1.0)
 38.06  Senpai 2.0 (time: 200 ms scale: 1.0)
 37.63  Demolito 2018-10-29 (time: 200 ms scale: 1.0)
 37.11  RofChade Version 2.0 (time: 200 ms scale: 1.0)
 37.10  Texel 1.07 (time: 200 ms scale: 1.0)
 36.94  Gull 3 x64 (time: 200 ms scale: 1.0)
 36.70  Arasan 21.1 (time: 200 ms scale: 1.0)
 35.28  Wasp 3.50 (time: 200 ms scale: 1.0)
 35.26  Booot 6.3.1 (time: 200 ms scale: 1.0)
 34.58  Atlas 3.91 (time: 200 ms scale: 1.0)
 34.38  Hannibal 1.7 x64 (time: 200 ms scale: 1.0)
 33.41  Andscacs 0.95 (time: 200 ms scale: 1.0)
 31.28  Stockfish 9 (time: 20 ms scale: 1.0)
 31.03  The Baron 3.44 (time: 200 ms scale: 1.0)
 30.18  Lc0 v0.21.1 w11258-120x9 (time: 5000 ms scale: 1.0)
 29.93  Pedone 1.9 (time: 200 ms scale: 1.0)
 29.15  Ethereal 11.25 (time: 200 ms scale: 1.0)
 26.40  Vajolet2 2.6.1 (time: 200 ms scale: 1.0)
 26.26  Fizbo 2 (time: 200 ms scale: 1.0)
 25.69  Laser 1.7 (time: 200 ms scale: 1.0)
 25.49  Cheng 4.39 (time: 200 ms scale: 1.0)
 24.87  Deuterium 2019.2.37.37 (time: 200 ms scale: 1.0)
 24.63  Xiphos 0.5 (time: 200 ms scale: 1.0)
 20.45  Nemorino 5.00 (time: 200 ms scale: 1.0)
 19.86  Amoeba 2.8 (time: 200 ms scale: 1.0)
 19.26  Schooner 2.0.34 (time: 200 ms scale: 1.0)
 18.06  SmarThink 1.98 (time: 200 ms scale: 1.0)
So what is the best approach to measure the overall similarity of Stockfish 10?

Currently I am using the average (mean). I will use this data to construct a dendrogram.
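
As a rough sketch of the averaging step: take an engine's row of pairwise similarities, skip the entry against itself, and take the mean. The four engines and values below are a subset of the sim results posted in this thread; the variable names are only for illustration. The last line shows how much the Stockfish 10 mean depends on whether the second run of the same engine (Stockfish 10 r1) is included.

```python
import numpy as np

names = ["Stockfish 10", "Stockfish 10 r1", "Stockfish 9", "Senpai 2.0"]

# Pairwise similarity in percent; NaN marks an engine against itself.
sim = np.array([
    [np.nan, 68.67, 31.28, 38.06],
    [68.67, np.nan, 31.54, 38.75],
    [31.28, 31.54, np.nan, 32.19],
    [38.06, 38.75, 32.19, np.nan],
])

# Mean similarity of each engine against all the others (NaN is ignored).
mean_sim = np.nanmean(sim, axis=1)
for name, value in zip(names, mean_sim):
    print(f"{name:16s} {value:6.2f}")

# Stockfish 10 again, this time leaving out the Stockfish 10 r1 column.
sf10_without_r1 = np.nanmean(np.delete(sim[0], 1))
print(f"Stockfish 10 without r1: {sf10_without_r1:.2f}")
```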
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: How to measure overall similarity

Post by Laskos »

Ferdy wrote: Tue Apr 02, 2019 4:21 pm So what is the best approach to measure the overall similarity of Stockfish 10?

Currently I am using average or mean. I will use this data to construct a dendrogram.
Why do you need to average? To get a general impression? Suspicions usually concern a specific pair, from X to Y, and dendrograms in this case are built from the matrix, or from even more detailed position-wise data, bootstrapping and so on, not from averages.
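
One way to read the matrix suggestion, sketched under assumptions that are not from the thread: turn each similarity into a distance with `100 - similarity`, put the distances in the condensed form that `linkage` expects, and cluster with average linkage. The four engines and values are a subset of the sim results posted in this thread; the distance transform and the linkage method are illustrative choices only.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

names = ["Stockfish 10", "Stockfish 10 r1", "Stockfish 9", "Senpai 2.0"]

# Pairwise similarity in percent, with 100 on the diagonal.
sim = np.array([
    [100.00, 68.67, 31.28, 38.06],
    [68.67, 100.00, 31.54, 38.75],
    [31.28, 31.54, 100.00, 32.19],
    [38.06, 38.75, 32.19, 100.00],
])

# Assumption: distance = 100 - similarity, so identical move choices give 0.
dist = 100.0 - sim
np.fill_diagonal(dist, 0.0)

# squareform turns the square symmetric matrix into the condensed
# upper-triangular vector that linkage accepts as a distance matrix.
condensed = squareform(dist)
Z = linkage(condensed, method="average")

# no_plot=True returns the leaf order without drawing anything.
leaves = dendrogram(Z, labels=names, no_plot=True)["ivl"]
print(Z)
print(leaves)
```

With this input the first merge joins the two Stockfish 10 runs, since they have the smallest distance (100 - 68.67).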

Re: How to measure overall similarity

Post by Ferdy »

Laskos wrote: Tue Apr 02, 2019 4:38 pm Why do you need to average? To get a general impression? Suspicions usually concern a specific pair, from X to Y, and dendrograms in this case are built from the matrix, or from even more detailed position-wise data, bootstrapping and so on, not from averages.
I am trying to create data points from the overall similarity and the rating of each engine.

Image


Then construct the hierarchical clustering.

Image

Stronger engines are grouped in the same cluster, which is understandable since their ratings are close to each other along x, but what about the similarity along y? Stockfish 10 and Stockfish 10 r1 are the same engine, so of course they have the highest similarity, at 68.67. Looking at the dendrogram, they are indeed clustered together. But I have not yet checked whether the other engines behave the same way in both the dendrogram and the detailed similarity test results. I am still running some engines.
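
A simple check of the dendrogram against the detailed results could be to list, for every engine, the engine it is most similar to, and then see whether the two sit close together in the tree. This is only a sketch on a four-engine subset of the sim results in this thread; the variable names are illustrative.

```python
import numpy as np

names = ["Stockfish 10", "Stockfish 10 r1", "Stockfish 9", "Senpai 2.0"]

# Pairwise similarity in percent; NaN marks an engine against itself.
sim = np.array([
    [np.nan, 68.67, 31.28, 38.06],
    [68.67, np.nan, 31.54, 38.75],
    [31.28, 31.54, np.nan, 32.19],
    [38.06, 38.75, 32.19, np.nan],
])

# For each row, pick the column with the highest similarity, ignoring NaN.
nearest = {
    names[i]: names[int(np.nanargmax(sim[i]))]
    for i in range(len(names))
}

for engine, partner in nearest.items():
    print(f"{engine:16s} -> {partner}")
```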

This is my data so far.

Code:

sim version 3

  Key:

  1) Amoeba 2.8 (time: 200 ms scale: 1.0)
  2) Andscacs 0.95 (time: 200 ms scale: 1.0)
  3) Arasan 21.1 (time: 200 ms scale: 1.0)
  4) Booot 6.3.1 (time: 200 ms scale: 1.0)
  5) Cheng 4.39 (time: 200 ms scale: 1.0)
  6) Demolito 2018-10-29 (time: 200 ms scale: 1.0)
  7) Deuterium 2019.2.37.37 (time: 200 ms scale: 1.0)
  8) Ethereal 11.25 (time: 200 ms scale: 1.0)
  9) Fizbo 2 (time: 200 ms scale: 1.0)
 10) Gull 3 x64 (time: 200 ms scale: 1.0)
 11) Hannibal 1.7 x64 (time: 200 ms scale: 1.0)
 12) Laser 1.7 (time: 200 ms scale: 1.0)
 13) Lc0 v0.21.1 w11258-120x9 (time: 5000 ms scale: 1.0)
 14) Nemorino 5.00 (time: 200 ms scale: 1.0)
 15) Pedone 1.9 (time: 200 ms scale: 1.0)
 16) RofChade Version 2.0 (time: 200 ms scale: 1.0)
 17) Schooner 2.0.34 (time: 200 ms scale: 1.0)
 18) Senpai 2.0 (time: 200 ms scale: 1.0)
 19) SmarThink 1.98 (time: 200 ms scale: 1.0)
 20) Stockfish 10 (time: 20 ms scale: 1.0)
 21) Stockfish 10 r1 (time: 20 ms scale: 1.0)
 22) Stockfish 9 (time: 20 ms scale: 1.0)
 23) Texel 1.07 (time: 200 ms scale: 1.0)
 24) The Baron 3.44 (time: 200 ms scale: 1.0)
 25) Vajolet2 2.6.1 (time: 200 ms scale: 1.0)
 26) Wasp 3.50 (time: 200 ms scale: 1.0)
 27) Xiphos 0.5 (time: 200 ms scale: 1.0)

         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23    24    25    26    27
  1.  ----- 20.32 25.20 22.55 19.66 25.46 16.57 18.17 17.36 24.41 23.49 16.64 19.92 14.65 21.61 23.04 13.39 25.53 14.36 19.86 20.02 18.40 24.87 22.99 18.38 23.55 15.99
  2.  20.32 ----- 40.16 40.40 29.70 42.69 27.14 32.82 29.16 43.53 40.18 28.74 32.57 21.09 31.71 41.15 22.04 43.03 20.73 33.41 33.99 28.90 41.85 32.90 29.97 40.08 26.95
  3.  25.20 40.16 ----- 41.56 33.53 46.72 28.98 34.07 31.03 46.65 42.72 29.23 33.66 23.27 35.37 44.72 21.98 47.73 22.49 36.70 36.96 31.55 47.45 38.91 32.73 44.77 27.74
  4.  22.55 40.40 41.56 ----- 30.26 45.52 28.21 33.52 29.63 45.24 41.67 29.51 31.82 23.08 33.47 42.51 22.08 43.28 20.48 35.26 36.25 31.35 42.52 34.40 31.74 43.28 26.73
  5.  19.66 29.70 33.53 30.26 ----- 32.92 22.35 24.62 23.71 33.66 30.83 22.14 24.81 19.12 26.68 31.97 17.84 34.12 17.54 25.49 26.27 23.08 34.45 28.91 23.96 32.20 21.57
  6.  25.46 42.69 46.72 45.52 32.92 ----- 31.82 35.13 31.46 48.92 45.55 30.49 33.90 25.42 38.77 47.35 24.47 49.14 23.27 37.63 38.50 32.24 46.30 39.33 34.62 47.45 29.40
  7.  16.57 27.14 28.98 28.21 22.35 31.82 ----- 24.10 21.35 30.30 28.74 21.91 21.78 18.56 24.82 30.67 17.54 30.34 16.47 24.87 25.30 22.03 29.52 25.37 23.96 30.09 20.68
  8.  18.17 32.82 34.07 33.52 24.62 35.13 24.10 ----- 24.97 35.65 32.06 26.40 28.19 19.59 26.88 36.96 19.90 36.21 17.37 29.15 29.40 25.49 35.34 28.07 25.65 33.76 22.98
  9.  17.36 29.16 31.03 29.63 23.71 31.46 21.35 24.97 ----- 31.35 29.36 22.29 29.07 17.35 24.99 31.31 16.45 32.47 17.43 26.26 26.81 23.17 33.02 26.60 22.83 29.59 21.50
 10.  24.41 43.53 46.65 45.24 33.66 48.92 30.30 35.65 31.35 ----- 44.45 31.31 34.43 24.13 36.36 46.78 23.83 49.24 23.02 36.94 37.22 32.19 47.24 37.93 34.07 45.87 29.01
 11.  23.49 40.18 42.72 41.67 30.83 45.55 28.74 32.06 29.36 44.45 ----- 28.82 31.65 23.84 34.32 41.85 22.08 44.76 21.49 34.38 35.08 30.17 42.57 37.00 32.14 43.18 26.94
 12.  16.64 28.74 29.23 29.51 22.14 30.49 21.91 26.40 22.29 31.31 28.82 ----- 23.51 17.39 25.14 30.66 17.99 31.20 14.76 25.69 25.78 21.76 30.71 24.42 23.05 30.35 20.61
 13.  19.92 32.57 33.66 31.82 24.81 33.90 21.78 28.19 29.07 34.43 31.65 23.51 ----- 19.03 26.77 33.98 18.07 35.46 18.03 30.18 30.59 25.56 34.60 29.11 24.69 32.07 23.28
 14.  14.65 21.09 23.27 23.08 19.12 25.42 18.56 19.59 17.35 24.13 23.84 17.39 19.03 ----- 20.49 22.93 15.54 25.01 13.18 20.45 21.36 18.39 23.82 20.98 20.66 24.70 17.32
 15.  21.61 31.71 35.37 33.47 26.68 38.77 24.82 26.88 24.99 36.36 34.32 25.14 26.77 20.49 ----- 35.52 19.79 37.11 17.88 29.93 30.81 26.44 36.77 30.77 27.77 35.65 24.12
 16.  23.04 41.15 44.72 42.51 31.97 47.35 30.67 36.96 31.31 46.78 41.85 30.66 33.98 22.93 35.52 ----- 23.05 46.49 21.63 37.11 37.96 31.42 46.32 35.89 32.35 43.70 29.61
 17.  13.39 22.04 21.98 22.08 17.84 24.47 17.54 19.90 16.45 23.83 22.08 17.99 18.07 15.54 19.79 23.05 ----- 24.08 13.21 19.26 19.71 17.80 22.31 19.65 19.34 22.51 17.61
 18.  25.53 43.03 47.73 43.28 34.12 49.14 30.34 36.21 32.47 49.24 44.76 31.20 35.46 25.01 37.11 46.49 24.08 ----- 22.88 38.06 38.75 32.19 48.59 39.99 33.42 46.41 29.97
 19.  14.36 20.73 22.49 20.48 17.54 23.27 16.47 17.37 17.43 23.02 21.49 14.76 18.03 13.18 17.88 21.63 13.21 22.88 ----- 18.06 18.39 16.01 22.12 20.53 16.81 21.91 15.22
 20.  19.86 33.41 36.70 35.26 25.49 37.63 24.87 29.15 26.26 36.94 34.38 25.69 30.18 20.45 29.93 37.11 19.26 38.06 18.06 ----- 68.67 31.28 37.10 31.03 26.40 35.28 24.63
 21.  20.02 33.99 36.96 36.25 26.27 38.50 25.30 29.40 26.81 37.22 35.08 25.78 30.59 21.36 30.81 37.96 19.71 38.75 18.39 68.67 ----- 31.54 37.79 31.62 27.23 36.03 24.82
 22.  18.40 28.90 31.55 31.35 23.08 32.24 22.03 25.49 23.17 32.19 30.17 21.76 25.56 18.39 26.44 31.42 17.80 32.19 16.01 31.28 31.54 ----- 31.94 26.32 24.10 30.82 21.87
 23.  24.87 41.85 47.45 42.52 34.45 46.30 29.52 35.34 33.02 47.24 42.57 30.71 34.60 23.82 36.77 46.32 22.31 48.59 22.12 37.10 37.79 31.94 ----- 38.65 31.88 45.74 29.11
 24.  22.99 32.90 38.91 34.40 28.91 39.33 25.37 28.07 26.60 37.93 37.00 24.42 29.11 20.98 30.77 35.89 19.65 39.99 20.53 31.03 31.62 26.32 38.65 ----- 27.29 37.11 23.80
 25.  18.38 29.97 32.73 31.74 23.96 34.62 23.96 25.65 22.83 34.07 32.14 23.05 24.69 20.66 27.77 32.35 19.34 33.42 16.81 26.40 27.23 24.10 31.88 27.29 ----- 32.51 21.74
 26.  23.55 40.08 44.77 43.28 32.20 47.45 30.09 33.76 29.59 45.87 43.18 30.35 32.07 24.70 35.65 43.70 22.51 46.41 21.91 35.28 36.03 30.82 45.74 37.11 32.51 ----- 26.56
 27.  15.99 26.95 27.74 26.73 21.57 29.40 20.68 22.98 21.50 29.01 26.94 20.61 23.28 17.32 24.12 29.61 17.61 29.97 15.22 24.63 24.82 21.87 29.11 23.80 21.74 26.56 -----
I am using these libraries to construct the dendrogram:
from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt
import numpy as np

Re: How to measure overall similarity

Post by Ferdy »

I took the rating out of the array, so the distances are based on the similarity alone.

Looking again at the function:

Code:

scipy.cluster.hierarchy.linkage(y, method='single', metric='euclidean', optimal_ordering=False)
where:
Parameters:
y : ndarray
A condensed distance matrix. A condensed distance matrix is a flat array containing the upper triangular of the distance matrix. This is the form that pdist returns. Alternatively, a collection of m observation vectors in n dimensions may be passed as an m by n array. All elements of the condensed distance matrix must be finite, i.e. no NaNs or infs.

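So I can pass the similarity matrix directly, as an m by n array in which each row is the observation vector of one engine. The only change is that the similarity of an engine against itself is set to a high value, say 80.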

Code:

si = 80  # sim vs itself
Y = np.array([
    [si, 44, 25, 40], 
    [44, si, 55, 25],
    [25, 55, si, 35],
    [40, 25, 35, si]
    ])
[si, 44, 25, 40], --> similarity vector of player 1
[44, si, 55, 25], ... of player 2
and so on
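
To make explicit what happens when the square matrix is passed this way, here is a small sketch using the same toy values. As far as I understand the documented behaviour, a 2-D input is treated as observation vectors, so `linkage` first computes the distances between rows, as `pdist` would, and then clusters on those. That means the value chosen for `si` also enters the distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

si = 80  # sim vs itself, same toy value as above
Y = np.array([
    [si, 44, 25, 40],
    [44, si, 55, 25],
    [25, 55, si, 35],
    [40, 25, 35, si],
], dtype=float)

# Euclidean distance between every pair of rows (similarity vectors).
row_dist = pdist(Y, metric="euclidean")
print(row_dist)

# Passing the 2-D array and passing the row distances give the same tree.
Z_rows = linkage(Y, method="ward", metric="euclidean")
Z_pdist = linkage(row_dist, method="ward")
same = np.allclose(Z_rows, Z_pdist)
print(same)
```

So two engines end up close when their whole similarity profiles against all engines look alike, not only when their direct pairwise similarity is high.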

Code:

# -*- coding: utf-8 -*-
"""
dg2.py

"""

from scipy.cluster.hierarchy import dendrogram, linkage  
from matplotlib import pyplot as plt
import numpy as np

si = 80  # sim vs itself
Y = np.array([
    [si, 44, 25, 40],  # sf10
    [44, si, 55, 25],  # sf9
    [25, 55, si, 35],  # sf8
    [40, 25, 35, si]  # sf7
    ])

print(f'Similarity matrix, similarity against itself is set at {si}')
print(Y)

mymethod = 'ward'
mymetric = 'euclidean'

# Y is 2-D, so linkage treats each row as an observation vector and
# clusters on the Euclidean distance between rows.
linked = linkage(Y, method=mymethod, metric=mymetric, optimal_ordering=True)

labelList = ['Sf10', 'Sf9', 'Sf8', 'Sf7']

plt.figure(figsize=(6, 4))
dendrogram(linked,
           orientation='right',
           labels=labelList,
           distance_sort='descending',
           show_leaf_counts=True)
plt.show()

Plot:

Image