NNUE Research Project

Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: NNUE Research Project

Post by Ferdy »

I made a similar comparison from the data in the ff2 folder, using different metrics. The result is similar to table 4: as movetime increases, similarity increases. This comparison also includes ff2 vs. sf12. ff2 is more similar to sf13 than to sf12.

                    pair  MinErr  MaxErr  ErrSqStdev  RMS  PosTried  MvSimPct  TotalPos
 FF2-100ms vs SF12-100ms     340     454        9790   60      7984        58      8238
 FF2-100ms vs SF13-100ms     287     426        5411   44      8071        59      8238
 FF2-250ms vs SF12-250ms     317     370        8130   57      7967        62      8238
 FF2-250ms vs SF13-250ms     312     274        4080   40      8044        64      8238
 FF2-500ms vs SF12-500ms     304     378        7890   55      7951        63      8238
 FF2-500ms vs SF13-500ms     283     283        3923   38      8021        66      8238
 

MinErr    : The minimum absolute score difference between score1 and score2,
            i.e. the minimum over all abs(score1 - score2).
MaxErr    : The maximum absolute score difference between score1 and score2,
            i.e. the maximum over all abs(score1 - score2).
ErrSqStdev: The sample standard deviation of the squared error,
            i.e. sqrt(sum((errsq_i - mean) * (errsq_i - mean)) / (N - 1))
            where N = PosTried and errsq_i = (score1 - score2)^2 at position i.
            A higher value means the squared errors are more spread out and similarity is weaker.
RMS       : The root mean square, i.e. sqrt(sum(error * error) / PosTried)
            where error = score1 - score2.
PosTried  : The number of positions that are actually compared.
            A position where the engine score is above 500 or below -500
            is not included in the score comparison.
MvSimPct  : Move similarity percentage. If engine1 and engine2 play the same move,
            the position counts as similar: 100 * num_similar_move / TotalPos.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: NNUE Research Project

Post by Rebel »

Ferdy wrote: Sat Mar 13, 2021 1:12 am
Rebel wrote: Fri Mar 12, 2021 6:45 pm
Ferdy wrote: Fri Mar 12, 2021 3:11 pm
Rebel wrote: Thu Mar 11, 2021 9:05 pm Regarding the questions, I compiled a download with all the epd files at http://rebel13.nl/nnue-epd.7z

Contents:

11-03-2021  19:58    <DIR>          .
11-03-2021  19:58    <DIR>          ..
11-03-2021  19:54    <DIR>          ff2
04-03-2021  08:59           647.686 Nemorino.epd
09-03-2021  22:18           644.556 Orion_0.8.epd
08-03-2021  12:18           649.858 SF12-Igel-270.epd
08-03-2021  12:35           649.635 SF12-Igel-280.epd
08-03-2021  21:18           652.073 SF12-Igel-290.epd
08-03-2021  12:53           652.728 SF12-Minic.epd
08-03-2021  13:11           653.371 SF12-napping-nexus.epd
08-03-2021  13:29           650.594 SF12-nascent-nutrient.epd
09-03-2021  11:03           652.710 SF12-Orion_0.7.epd
08-03-2021  13:46           652.583 SF12-sf-0c6fc5ef48e1.epd
08-03-2021  17:44           652.711 SF12-sf-516f5b95189a.epd
08-03-2021  17:27           652.513 SF12-sf-dd0c4c630f7e.epd
09-03-2021  09:40           652.457 SF12-sv-20200720-1017.epd
09-03-2021  10:00           652.411 SF12-sv-20200721-0909.epd
09-03-2021  10:20           652.651 SF12-sv-20200721-1432.epd
09-03-2021  10:38           652.882 SF12-sv-20200906-1046.epd
09-03-2021  10:56           652.892 SF12-sv-20200908-1733.epd
09-03-2021  11:15           652.759 SF12-sv-20200914-1520.epd
08-03-2021  11:43           652.698 SF12.epd
08-03-2021  21:35           651.330 SF13-Igel-290.epd
08-03-2021  23:30           652.440 SF13-Rubi-2.01.epd
09-03-2021  09:06           651.983 SF13-sf-6b7a4192c303.epd
09-03-2021  09:26           651.878 SF13-sf-94816594b327.epd
08-03-2021  12:01           652.459 SF13.epd
Remarks:
1. Nets are tested as much as possible with SF12 and SF13.
2. Nemorino and Orion 0.8 are the exceptions since they have a different file format.
3. The Orion 0.7 version has the exact SF12 net.
4. The folder ff2 contains the EPD files of sf12, sf13 and ff2 at 100ms, 250ms and 500ms, made by someone else on a different PC.
5. EPD files labelled with "sv" are the tested "sergio" nets.
6. EPD files labelled with "sf" are Stockfish nets.
Under folder ff2:
FF2-100ms.epd is the output when ff2 engine uses the ff2 net?
Yes, out of the box.
What is SF12-100ms.epd?
To calculate the RMS and SIM -

sim-score ff2-100ms.epd sf12-100ms.epd
sim-score ff2-100ms.epd sf13-100ms.epd

sim-score ff2-250ms.epd sf12-250ms.epd
sim-score ff2-250ms.epd sf13-250ms.epd

sim-score ff2-500ms.epd sf12-500ms.epd
sim-score ff2-500ms.epd sf13-500ms.epd
There is also a file named SF12.epd, outside the ff2 folder, what is the difference between this file and SF12-100ms.epd in the ff2 folder?
From above:
4. The folder ff2 contains the EPD files of sf12, sf13 and ff2 at 100ms, 250ms and 500ms, made by someone else on a different (faster) PC.

The rest is mine: a different PC with a different speed, hence the separation. All were tested at 100ms. You can make any comparison.
90% of coding is debugging, the other 10% is writing bugs.

Re: NNUE Research Project

Post by Ferdy »

Rebel wrote: Sat Mar 13, 2021 2:20 am
The rest is mine: a different PC with a different speed, hence the separation. All were tested at 100ms. You can make any comparison.
Got it thanks.
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: NNUE Research Project

Post by David Carteau »

Rebel wrote: Fri Mar 12, 2021 6:29 pm Consider it done!
A big thank you!
Regards,

Re: NNUE Research Project

Post by Rebel »

I am looking for more engines that moved to NNUE.

Anyone?

The data so far is too little for conclusions.

Re: NNUE Research Project

Post by Rebel »

RubiChess wrote: Fri Mar 12, 2021 12:35 pm
Rebel wrote: Wed Mar 10, 2021 9:29 pm NNUE Research Project
March 10, 2021

It's generally known by now that similarity testing on moves does not work with NNUE nets. On this page we will research whether it is possible using other methods. One method is to calculate the root-mean-square deviation (RMS) of the scores instead of the moves, since after all NNUE is a set of scores. We will present data and the source code for discussion.

Let's start at the beginning of NNUE, in the summer of 2020, the starting point of the NNUE revolution, when the Stockfish team implemented the Sergio nets. Our first goal is to measure the stability of the RMS of Stockfish NNUE nets. From the Sergio nets we calculate the RMS of the very first 3 nets (July) and the last 3 (September) and compare the RMS with the final SF12 net; see table one. In table two the nets between SF12 and SF13 are compared, plus 5 nets after the release of SF13.

....

http://rebel13.nl/home/nnue.html
Observation/remark #2 looks strange. Talks about Igel-2.7/2.8 but then switches to Fat Fritz 2 instead of Igel-2.9 (which looks different from <=2.8).

Regards, Andreas
Updated the page.
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: NNUE Research Project

Post by Madeleine Birchfield »

Rebel wrote: Sat Mar 13, 2021 11:39 am I am looking for more engines that moved to NNUE.

Anyone?

The data so far is too little for conclusions.
The latest versions of Pedone, Marvin, Komodo, Scorpio, Halogen, Seer, and Tornado should have NNUE as well.
Last edited by Madeleine Birchfield on Sat Mar 13, 2021 4:59 pm, edited 1 time in total.
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: NNUE Research Project

Post by dkappe »

Rebel wrote: Sat Mar 13, 2021 11:39 am I am looking for more engines that moved to NNUE.

Anyone?

The data so far is too little for conclusions.
Toga III https://github.com/dkappe/TogaIII/releases
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".

Re: NNUE Research Project

Post by Rebel »

dkappe wrote: Sat Mar 13, 2021 4:59 pm
Rebel wrote: Sat Mar 13, 2021 11:39 am I am looking for more engines that moved to NNUE.

Anyone?

The data so far is too little for conclusions.
Toga III https://github.com/dkappe/TogaIII/releases
RMS=72.63
SIM=13.71

Nice :D

Re: NNUE Research Project

Post by Rebel »

dkappe wrote: Sat Mar 13, 2021 4:59 pm
Rebel wrote: Sat Mar 13, 2021 11:39 am I am looking for more engines that moved to NNUE.

Anyone?

The data so far is too little for conclusions.
Toga III https://github.com/dkappe/TogaIII/releases
Sorry, ignore the previous post. The real numbers are:

SF12 -> RMS=72.63 | SIM=46.22
SF13 -> RMS=72.20 | SIM=45.65