Maybe not the best diversity of strongest chess engines under development


Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Maybe not the best diversity of strongest chess engines under development

Post by Laskos »

With a modified Sim (I changed the set of positions), run at 100 ms/position, or at 1000 ms/position where marked as x10, I am getting the following picture of similarity, which is maybe not the nicest one given the current strength progress of engines. All NN-enabled engines cluster together, and the Leela NN branch clusters together with the NNUE branch. Earlier engines show more diversity than the current strongest ones. NINU 0.3 is the Night Nurse 0.3 net.


  Key:

  1) Andsc095_100ms (time: 100 ms  scale: 1)
  2) Dragon_1000ms (time: 100 ms  scale: 10.0)
  3) Dragon_100ms (time: 100 ms  scale: 1.0)
  4) Ethe_1275 (time: 100 ms  scale: 1.0)
  5) Fruit21_100ms (time: 100 ms  scale: 1)
  6) Igel27_100ms (time: 100 ms  scale: 1)
  7) Kom_14_100ms (time: 100 ms  scale: 1)
  8) Lc0_LS15_1000ms (time: 100 ms  scale: 10.0)
  9) Lc0_LS15_100ms (time: 100 ms  scale: 1.0)
 10) Lc0_SV3010_100ms (time: 100 ms  scale: 1.0)
 11) SF11_100ms (time: 100 ms  scale: 1)
 12) SF12_1000ms (time: 100 ms  scale: 10)
 13) SF12_100ms (time: 100 ms  scale: 1)
 14) SF12_Igel_100ms (time: 100 ms  scale: 1)
 15) SF12_NightNurse03_100ms (time: 100 ms  scale: 1)
 16) SF8_100ms (time: 100 ms  scale: 1)
 17) Shredd_12_100ms (time: 100 ms  scale: 1)

         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17
  1.  ----- 47.15 46.85 48.25 37.55 50.40 53.40 50.35 50.30 48.20 55.10 48.25 49.55 48.85 47.10 52.85 49.50
  2.  47.15 ----- 67.60 52.00 33.35 59.05 53.20 66.90 67.10 65.35 54.55 67.40 65.25 59.45 59.35 51.20 47.45
  3.  46.85 67.60 ----- 52.90 33.50 60.25 54.00 63.70 64.15 62.75 56.55 65.30 64.70 61.35 60.00 52.90 48.55
  4.  48.25 52.00 52.90 ----- 39.30 53.95 53.10 55.05 55.70 54.00 55.05 54.85 55.40 52.90 53.10 53.40 52.40
  5.  37.55 33.35 33.50 39.30 ----- 35.75 36.40 35.35 35.95 34.70 37.65 35.00 35.20 34.40 33.35 39.60 45.40
  6.  50.40 59.05 60.25 53.95 35.75 ----- 54.15 61.95 64.25 60.85 56.15 59.95 61.30 62.85 58.70 53.30 50.45
  7.  53.40 53.20 54.00 53.10 36.40 54.15 ----- 53.70 54.35 53.25 61.65 52.85 53.70 52.45 50.95 60.25 53.30
  8.  50.35 66.90 63.70 55.05 35.35 61.95 53.70 ----- 88.20 77.50 56.70 72.00 66.95 62.50 60.45 53.10 51.95
  9.  50.30 67.10 64.15 55.70 35.95 64.25 54.35 88.20 ----- 77.50 57.30 70.25 66.25 62.80 60.95 54.00 52.05
 10.  48.20 65.35 62.75 54.00 34.70 60.85 53.25 77.50 77.50 ----- 55.30 70.75 66.00 60.05 58.80 51.15 50.35
 11.  55.10 54.55 56.55 55.05 37.65 56.15 61.65 56.70 57.30 55.30 ----- 55.25 56.35 54.75 52.80 63.15 52.85
 12.  48.25 67.40 65.30 54.85 35.00 59.95 52.85 72.00 70.25 70.75 55.25 ----- 70.45 61.80 60.80 51.95 50.10
 13.  49.55 65.25 64.70 55.40 35.20 61.30 53.70 66.95 66.25 66.00 56.35 70.45 ----- 63.55 62.60 53.35 51.70
 14.  48.85 59.45 61.35 52.90 34.40 62.85 52.45 62.50 62.80 60.05 54.75 61.80 63.55 ----- 60.40 53.00 48.00
 15.  47.10 59.35 60.00 53.10 33.35 58.70 50.95 60.45 60.95 58.80 52.80 60.80 62.60 60.40 ----- 52.20 48.75
 16.  52.85 51.20 52.90 53.40 39.60 53.30 60.25 53.10 54.00 51.15 63.15 51.95 53.35 53.00 52.20 ----- 53.50
 17.  49.50 47.45 48.55 52.40 45.40 50.45 53.30 51.95 52.05 50.35 52.85 50.10 51.70 48.00 48.75 53.50 -----
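
The grouping described above can be roughly checked from the matrix itself. What follows is a minimal Python sketch, not the method behind the posted image: it assumes a simple threshold rule (two engines are linked when their similarity is at least 60%, a cutoff picked here only for illustration) and uses seven of the 100 ms entries copied from the table. The shortened engine labels and the function names are hypothetical.

```python
# Similarity percentages copied from the 100 ms rows of the matrix above.
# Labels are shortened; the 60% cutoff used below is an illustrative assumption.
SIM = {
    ("Fruit21", "SF8"): 39.60, ("Fruit21", "SF11"): 37.65,
    ("Fruit21", "Kom_14"): 36.40, ("Fruit21", "SF12"): 35.20,
    ("Fruit21", "Lc0_LS15"): 35.95, ("Fruit21", "Dragon"): 33.50,
    ("SF8", "SF11"): 63.15, ("SF8", "Kom_14"): 60.25,
    ("SF8", "SF12"): 53.35, ("SF8", "Lc0_LS15"): 54.00,
    ("SF8", "Dragon"): 52.90, ("SF11", "Kom_14"): 61.65,
    ("SF11", "SF12"): 56.35, ("SF11", "Lc0_LS15"): 57.30,
    ("SF11", "Dragon"): 56.55, ("Kom_14", "SF12"): 53.70,
    ("Kom_14", "Lc0_LS15"): 54.35, ("Kom_14", "Dragon"): 54.00,
    ("SF12", "Lc0_LS15"): 66.25, ("SF12", "Dragon"): 64.70,
    ("Lc0_LS15", "Dragon"): 64.15,
}

NAMES = ["Fruit21", "SF8", "SF11", "Kom_14", "SF12", "Lc0_LS15", "Dragon"]


def similarity(a, b):
    """Look up the symmetric similarity of two engines."""
    return SIM[(a, b)] if (a, b) in SIM else SIM[(b, a)]


def threshold_groups(names, threshold):
    """Group engines that are connected by links with similarity >= threshold."""
    unvisited = list(names)
    groups = []
    while unvisited:
        stack = [unvisited.pop(0)]
        group = []
        while stack:
            current = stack.pop()
            group.append(current)
            linked = [n for n in unvisited if similarity(current, n) >= threshold]
            for n in linked:
                unvisited.remove(n)
            stack.extend(linked)
        groups.append(sorted(group))
    return groups


for group in threshold_groups(NAMES, 60.0):
    print(group)
```

On this subset the rule separates Fruit 2.1, the classical engines (SF8, SF11, Komodo 14) and the NN engines (SF12, Lc0, Dragon) into three groups. A different cutoff or linkage rule would give different groups.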

Image
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: Maybe not the best diversity of strongest chess engines under development

Post by Alayan »

I'm not surprised in the least.

Could you add Ethereal 12.75 data? I'm curious how it clusters.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Maybe not the best diversity of strongest chess engines under development

Post by Laskos »

Alayan wrote: Sat Nov 14, 2020 8:24 pm I'm not surprised in the least.

Could you add Ethereal 12.75 data? I'm curious how it clusters.

It clusters with classical engines, and is quite individual.


Image
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Maybe not the best diversity of strongest chess engines under development

Post by Madeleine Birchfield »

What about Seer, Halogen 8, and Minic 3?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Maybe not the best diversity of strongest chess engines under development

Post by Laskos »

Madeleine Birchfield wrote: Sun Nov 15, 2020 8:03 am What about Seer, Halogen 8, and Minic 3?
Maybe I will check a few more, but I am not going to test every NNUE engine.
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Maybe not the best diversity of strongest chess engines under development

Post by Frank Quisinsky »

Kai,

That's a really good point!!
I found out that all the NN engines I have looked at have problems with the complicated A80, A81 and E97-E99 openings.
Engines lose understanding with NN.

Could it be a good idea to include the best A80 or E99 lines in your test?
The best lines can be found really easily with our FEOBOS database and the ranking system we developed.
The FEOBOS positions are sorted with that ranking system.

Thank you again for the graphic and your work!
What you do is always interesting (I am still a reader).

Best
Frank
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Maybe not the best diversity of strongest chess engines under development

Post by Laskos »

Frank Quisinsky wrote: Sun Nov 15, 2020 1:39 pm Kai,

That's a really good point!!
I found out that all the NN engines I have looked at have problems with the complicated A80, A81 and E97-E99 openings.
Engines lose understanding with NN.

Could it be a good idea to include the best A80 or E99 lines in your test?
The best lines can be found really easily with our FEOBOS database and the ranking system we developed.
The FEOBOS positions are sorted with that ranking system.

Thank you again for the graphic and your work!
What you do is always interesting (I am still a reader).

Best
Frank
I think classical engines also have problems with the KID and the Dutch, but NNUE engines are probably even worse in this respect, as these openings are a bit peculiar. NNUE engines excel in mainstream openings, where they are almost at the level of Lc0 positionally. But both NNUE engines and Lc0 underperform in positions that deviate from the usual openings, and in chess variants. I probably have more than 1000 KID openings; maybe I will build a Sim that includes them in the set.
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Maybe not the best diversity of strongest chess engines under development

Post by Frank Quisinsky »

Hi Kai,

I often test Dutch lines to get a first impression of an engine or engine version that is new to me!
The maestro here is SlowChess: great understanding in 92 of my 100 Dutch test positions!

Not important for your thread!
But all I would like to say is ...
What you found seems to be a main problem for NN ideas.

At the end of the day, most chess programs do the same thing or lose their own face.

Best
Frank
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Maybe not the best diversity of strongest chess engines under development

Post by Laskos »

Madeleine Birchfield wrote: Sun Nov 15, 2020 8:03 am What about Seer, Halogen 8, and Minic 3?

I seem to be unable to run Seer 1.1 and Halogen 8.1 with Sim, even after tinkering with the Sim.tcl file. I am not sure what the matter is; maybe they are not fully UCI compliant. I was curious about them, as they seem to be original NNUE implementations.
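
One quick way to rule out the most basic UCI problem is to check that an engine answers `uci` with `uciok` and `isready` with `readyok`. The Python sketch below is only an assumption about what might be worth checking, not a description of how Sim talks to engines; `probe_engine` and `handshake_ok` are hypothetical helper names, and the engine path is a placeholder. It sends all commands at once, whereas a strict GUI would wait for `uciok` before sending `isready`.

```python
import subprocess


def handshake_ok(output_lines):
    """Return True if 'uciok' appears and 'readyok' appears somewhere after it."""
    lines = [line.strip() for line in output_lines]
    if "uciok" not in lines:
        return False
    return "readyok" in lines[lines.index("uciok") + 1:]


def probe_engine(path, timeout=10.0):
    """Start the engine binary at `path`, send uci/isready/quit, check the replies."""
    result = subprocess.run(
        [path],
        input="uci\nisready\nquit\n",  # all commands piped in at once
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return handshake_ok(result.stdout.splitlines())


# Example on a captured transcript (no engine binary needed):
print(handshake_ok(["id name Example", "option name Hash type spin", "uciok", "readyok"]))
```

An engine that passes this check can still fail in a tester for other reasons, for example by ignoring `go movetime` or by printing its `bestmove` in an unexpected form.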
AndrewGrant
Posts: 1754
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Maybe not the best diversity of strongest chess engines under development

Post by AndrewGrant »

Unfortunate, and entirely unsurprising as regards the NNUE similarities. There is perhaps some hope, though. Even though Komodo and Stockfish are, it appears, trained with the same code base, the differences between their evals, and in how those evals are used in training, are enough to give at least some diversity. You don't quite get near the level of intra-engine similarity. So one can return to the argument of "It's unique if it's trained on different data, even if the trainer is the same", which was the failed mantra of DeusX, but seemed to work for Leelenstein and Allie.