Maybe not the best diversity of strongest chess engines under development

Laskos · Post by **Laskos** » Sat Nov 14, 2020 7:19 pm

With a modified Sim (I changed the set of positions), run at 100 ms/position, or at 1000 ms/position when specified as x10, I am getting the following picture of similarity, which is maybe not the nicest one given the current strength progress of engines. All NN-enabled engines cluster together, and the Leela NN branch joins the NNUE branch. Earlier engines show more diversity than the current strongest ones. NINU 0.3 is the Night Nurse 0.3 net.



  Key:

  1) Andsc095_100ms (time: 100 ms  scale: 1)
  2) Dragon_1000ms (time: 100 ms  scale: 10.0)
  3) Dragon_100ms (time: 100 ms  scale: 1.0)
  4) Ethe_1275 (time: 100 ms  scale: 1.0)
  5) Fruit21_100ms (time: 100 ms  scale: 1)
  6) Igel27_100ms (time: 100 ms  scale: 1)
  7) Kom_14_100ms (time: 100 ms  scale: 1)
  8) Lc0_LS15_1000ms (time: 100 ms  scale: 10.0)
  9) Lc0_LS15_100ms (time: 100 ms  scale: 1.0)
 10) Lc0_SV3010_100ms (time: 100 ms  scale: 1.0)
 11) SF11_100ms (time: 100 ms  scale: 1)
 12) SF12_1000ms (time: 100 ms  scale: 10)
 13) SF12_100ms (time: 100 ms  scale: 1)
 14) SF12_Igel_100ms (time: 100 ms  scale: 1)
 15) SF12_NightNurse03_100ms (time: 100 ms  scale: 1)
 16) SF8_100ms (time: 100 ms  scale: 1)
 17) Shredd_12_100ms (time: 100 ms  scale: 1)

         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17
  1.  ----- 47.15 46.85 48.25 37.55 50.40 53.40 50.35 50.30 48.20 55.10 48.25 49.55 48.85 47.10 52.85 49.50
  2.  47.15 ----- 67.60 52.00 33.35 59.05 53.20 66.90 67.10 65.35 54.55 67.40 65.25 59.45 59.35 51.20 47.45
  3.  46.85 67.60 ----- 52.90 33.50 60.25 54.00 63.70 64.15 62.75 56.55 65.30 64.70 61.35 60.00 52.90 48.55
  4.  48.25 52.00 52.90 ----- 39.30 53.95 53.10 55.05 55.70 54.00 55.05 54.85 55.40 52.90 53.10 53.40 52.40
  5.  37.55 33.35 33.50 39.30 ----- 35.75 36.40 35.35 35.95 34.70 37.65 35.00 35.20 34.40 33.35 39.60 45.40
  6.  50.40 59.05 60.25 53.95 35.75 ----- 54.15 61.95 64.25 60.85 56.15 59.95 61.30 62.85 58.70 53.30 50.45
  7.  53.40 53.20 54.00 53.10 36.40 54.15 ----- 53.70 54.35 53.25 61.65 52.85 53.70 52.45 50.95 60.25 53.30
  8.  50.35 66.90 63.70 55.05 35.35 61.95 53.70 ----- 88.20 77.50 56.70 72.00 66.95 62.50 60.45 53.10 51.95
  9.  50.30 67.10 64.15 55.70 35.95 64.25 54.35 88.20 ----- 77.50 57.30 70.25 66.25 62.80 60.95 54.00 52.05
 10.  48.20 65.35 62.75 54.00 34.70 60.85 53.25 77.50 77.50 ----- 55.30 70.75 66.00 60.05 58.80 51.15 50.35
 11.  55.10 54.55 56.55 55.05 37.65 56.15 61.65 56.70 57.30 55.30 ----- 55.25 56.35 54.75 52.80 63.15 52.85
 12.  48.25 67.40 65.30 54.85 35.00 59.95 52.85 72.00 70.25 70.75 55.25 ----- 70.45 61.80 60.80 51.95 50.10
 13.  49.55 65.25 64.70 55.40 35.20 61.30 53.70 66.95 66.25 66.00 56.35 70.45 ----- 63.55 62.60 53.35 51.70
 14.  48.85 59.45 61.35 52.90 34.40 62.85 52.45 62.50 62.80 60.05 54.75 61.80 63.55 ----- 60.40 53.00 48.00
 15.  47.10 59.35 60.00 53.10 33.35 58.70 50.95 60.45 60.95 58.80 52.80 60.80 62.60 60.40 ----- 52.20 48.75
 16.  52.85 51.20 52.90 53.40 39.60 53.30 60.25 53.10 54.00 51.15 63.15 51.95 53.35 53.00 52.20 ----- 53.50
 17.  49.50 47.45 48.55 52.40 45.40 50.45 53.30 51.95 52.05 50.35 52.85 50.10 51.70 48.00 48.75 53.50 -----
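One way to read the clustering off the matrix is to look at each engine's nearest neighbour and at the mean similarity inside and between groups. The sketch below does this for six of the 17 entries, with values copied from the table above. The split into an "NN" group and a "classical" group is an assumption made for illustration, and the helper names (`sim`, `nearest`, `mean_within`, `mean_between`) are hypothetical and not part of Sim.

```python
from itertools import combinations
from statistics import mean

# Pairwise similarity (%) copied from the table above for six engines.
SIM = {
    ("Lc0_LS15_100ms", "SF12_100ms"): 66.25,
    ("Lc0_LS15_100ms", "Dragon_100ms"): 64.15,
    ("Lc0_LS15_100ms", "SF11_100ms"): 57.30,
    ("Lc0_LS15_100ms", "Kom_14_100ms"): 54.35,
    ("Lc0_LS15_100ms", "Fruit21_100ms"): 35.95,
    ("SF12_100ms", "Dragon_100ms"): 64.70,
    ("SF12_100ms", "SF11_100ms"): 56.35,
    ("SF12_100ms", "Kom_14_100ms"): 53.70,
    ("SF12_100ms", "Fruit21_100ms"): 35.20,
    ("Dragon_100ms", "SF11_100ms"): 56.55,
    ("Dragon_100ms", "Kom_14_100ms"): 54.00,
    ("Dragon_100ms", "Fruit21_100ms"): 33.50,
    ("SF11_100ms", "Kom_14_100ms"): 61.65,
    ("SF11_100ms", "Fruit21_100ms"): 37.65,
    ("Kom_14_100ms", "Fruit21_100ms"): 36.40,
}

ENGINES = sorted({name for pair in SIM for name in pair})

# Assumed grouping, for illustration only.
NN_GROUP = ["Lc0_LS15_100ms", "SF12_100ms", "Dragon_100ms"]
CLASSICAL_GROUP = ["SF11_100ms", "Kom_14_100ms", "Fruit21_100ms"]


def sim(a, b):
    """Look up the symmetric similarity of two engines."""
    return SIM[(a, b)] if (a, b) in SIM else SIM[(b, a)]


def nearest(engine):
    """Return (name, similarity) of the engine most similar to `engine`."""
    others = [e for e in ENGINES if e != engine]
    best = max(others, key=lambda e: sim(engine, e))
    return best, sim(engine, best)


def mean_within(group):
    """Mean similarity over all pairs inside one group."""
    return mean(sim(a, b) for a, b in combinations(group, 2))


def mean_between(g1, g2):
    """Mean similarity over all pairs with one engine from each group."""
    return mean(sim(a, b) for a in g1 for b in g2)


for engine in ENGINES:
    print(engine, "->", nearest(engine))
print("NN within:       ", round(mean_within(NN_GROUP), 2))
print("classical within:", round(mean_within(CLASSICAL_GROUP), 2))
print("NN vs classical: ", round(mean_between(NN_GROUP, CLASSICAL_GROUP), 2))
```

On this subset the three NN entries are each other's nearest neighbours, while the classical group is pulled apart mainly by Fruit 2.1, which is dissimilar to everything.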

Alayan · Post by **Alayan** » Sat Nov 14, 2020 8:24 pm

I'm not surprised in the least.

Could you add Ethereal 12.75 data? I'm curious how it clusters.

Laskos · Post by **Laskos** » Sat Nov 14, 2020 8:54 pm

Alayan wrote: ↑Sat Nov 14, 2020 8:24 pm I'm not surprised in the least.

Could you add Ethereal 12.75 data ? I'm curious how it clusters.

It clusters with classical engines, and is quite individual.

Madeleine Birchfield · Post by **Madeleine Birchfield** » Sun Nov 15, 2020 8:03 am

What about Seer, Halogen 8, and Minic 3?

Laskos · Post by **Laskos** » Sun Nov 15, 2020 1:15 pm

Madeleine Birchfield wrote: ↑Sun Nov 15, 2020 8:03 am What about Seer, Halogen 8, and Minic 3?

Maybe I will check a few more, but I am not going to test every NNUE engine.

Frank Quisinsky · Post by **Frank Quisinsky** » Sun Nov 15, 2020 1:39 pm

Kai,

That's a really good point!!
I found out that all the NN engines I am looking at have problems with the complicated A80, A81, and E97-E99 openings.
Engines have lost understanding with NN.

Could it be a good idea to put the best A80 or E99 lines into your test?
The best lines can be found really easily with our FEOBOS database and the ranking system we developed.
The FEOBOS positions are sorted with that ranking system.

Thank you again for the graphic and your work!
It is interesting what you do all the time (I am still a reader).

Best
Frank

Laskos · Post by **Laskos** » Sun Nov 15, 2020 3:17 pm

Frank Quisinsky wrote: ↑Sun Nov 15, 2020 1:39 pm Kai,

That's a really good point!!
I found out that all the NN engines I am looking at have problems with the complicated A80, A81, and E97-E99 openings.
Engines have lost understanding with NN.

Could it be a good idea to put the best A80 or E99 lines into your test?
The best lines can be found really easily with our FEOBOS database and the ranking system we developed.
The FEOBOS positions are sorted with that ranking system.

Thank you again for the graphic and your work!
It is interesting what you do all the time (I am still a reader).

Best
Frank

I think classical engines have problems with the KID and the Dutch too, but NNUE engines are probably even worse in this respect, as these openings are a bit peculiar. NNUE engines excel in mainstream openings, where they are almost at the level of Lc0 positionally. But both NNUE engines and Lc0 underperform in positions that deviate from the usual openings, and in chess variants. I probably have more than 1000 KID openings, so maybe I will build a Sim that includes them in the set.

Frank Quisinsky · Post by **Frank Quisinsky** » Sun Nov 15, 2020 3:29 pm

Hi Kai,

I often test Dutch lines to get a first impression of an engine or engine version that is unknown to me!
The maestro here is SlowChess: great understanding in 92 of my 100 Dutch test positions!

Not important for your thread!
But all I would like to say is ...
What you found out seems to be a main problem for NN ideas.

At the end of the day, most chess programs do the same thing or lose their own face.

Best
Frank

Laskos · Post by **Laskos** » Mon Nov 16, 2020 9:22 pm

Madeleine Birchfield wrote: ↑Sun Nov 15, 2020 8:03 am What about Seer, Halogen 8, and Minic 3?

I seem to be unable to run Seer 1.1 and Halogen 8.1 with Sim, even after tinkering with the Sim.tcl file. I am not sure what the matter is; maybe they are not fully UCI compliant. I was curious about them, as they seem to be original NNUE implementations.
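A quick way to rule out the most basic compliance problem is to check the UCI handshake by hand: the engine should answer `uci` with a line `uciok` and `isready` with a line `readyok`. The sketch below is only a crude probe under stated assumptions. The function names `handshake_ok` and `probe` are hypothetical, the engine path is whatever binary you want to test, and all commands are sent in one batch, which some engines may not tolerate. It says nothing about what Sim itself sends, and passing it does not prove an engine will work with Sim.

```python
import subprocess


def handshake_ok(lines):
    """Return True if the output has 'uciok' and, later on, 'readyok'."""
    stripped = [ln.strip() for ln in lines]
    if "uciok" not in stripped:
        return False
    # 'readyok' must come after 'uciok' for the handshake to be in order.
    return "readyok" in stripped[stripped.index("uciok") + 1:]


def probe(engine_path, timeout=10):
    """Send uci / isready / quit to an engine and return its output lines.

    Commands are written in one batch; this is a simplification, since a
    real GUI waits for each reply before sending the next command.
    """
    result = subprocess.run(
        [engine_path],
        input="uci\nisready\nquit\n",
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.splitlines()
```

Usage would be along the lines of `handshake_ok(probe("/path/to/engine"))`, where the path is a placeholder for the binary under test.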

AndrewGrant · Post by **AndrewGrant** » Tue Nov 17, 2020 4:35 am

Unfortunate, and entirely unsurprising with regard to the NNUE similarities. There is perhaps some hope, though. Even though Komodo and Stockfish are, it appears, trained on the same code base, the differences between their evals, and in how those evals are used in training, are enough to give at least some diversity. You don't get quite near the level of intra-engine play. So one can return to the argument of "It's unique if it's trained on different data, even if the trainer is the same", which was the failed mantra of DeusX, but seemed to work for Leelenstein and Allie.
