SPCC: Testrun of Fire 8.NN finished

pohl4711
Posts: 2439
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

SPCC: Testrun of Fire 8.NN finished

Post by pohl4711 »

AB-testrun of Fire 8.NN finished - impressive progress!

https://www.sp-cc.de

(You may need to clear your browser cache or reload the website.)
AndrewGrant
Posts: 1756
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: SPCC: Testrun of Fire 8.NN finished

Post by AndrewGrant »

Are you planning on adding any indications or marks to the engines that are using Stockfish networks?
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
pohl4711
Posts: 2439
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Fire 8.NN finished

Post by pohl4711 »

AndrewGrant wrote: Mon May 31, 2021 11:19 am Are you planning on adding any indications or marks to the engines that are using Stockfish networks?
I am thinking about this. I am not sure how to handle it without making the list too confusing.

Update: I did it. I wrote this below my rating list:

"Some engines use an nnue-net based on Stockfish evals. Because Stockfish's nnue-nets and Stockfish's code are freely downloadable, it is pretty hard for a tester to decide whether an engine uses much, little, or nothing of Stockfish (code, nnue-nets, coded ideas). So I decided to test these engines, too. As far as I know, the following engines use Stockfish-based nnue-nets (if I missed an engine, please contact me):
Fire 8.N, Fire 8.NN, Nemorino 6.00"
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: SPCC: Testrun of Fire 8.NN finished

Post by kranium »

Some notes:
The probing code used in Fire 8.NN is a modified version of Daniel Shawul's nnue-probe
https://github.com/dshawul/nnue-probe
https://github.com/dshawul/nncpu-probe

which is based on Ronald de Man's Cfish nnue probing functions
https://github.com/syzygy1/Cfish

This nnue-probe code is independent of, and quite different from, the official Stockfish NNUE implementation.

nnue-probe is actually similar in function to syzygy's tbprobe...
tbprobe queries endgame tablebases for info and evaluation data
nnue-probe queries the nnue network for position evaluations
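
To make the comparison above concrete, here is a minimal sketch of the "probe" idea: the engine asks a separate component for a score and does not care how the answer is produced. It does not use the real nnue-probe or tbprobe APIs. Every name below (Probe, make_tablebase_probe, make_network_probe, evaluate, TOY_WEIGHTS) is hypothetical, the "network" is just a toy material count, and the table entry is made up.

```python
from typing import Callable, Dict, Optional, Sequence

# A probe answers "what is this position worth?" for a FEN string,
# or returns None when it has no answer. Scores are from White's view.
Probe = Callable[[str], Optional[int]]

# Toy piece values in centipawns; a stand-in for real network weights.
TOY_WEIGHTS: Dict[str, int] = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900}


def make_tablebase_probe(table: Dict[str, int]) -> Probe:
    """Stand-in for a tablebase probe: an exact lookup that may miss."""
    def probe(fen: str) -> Optional[int]:
        return table.get(fen)
    return probe


def make_network_probe(weights: Dict[str, int]) -> Probe:
    """Stand-in for a network probe: always returns a computed estimate."""
    def probe(fen: str) -> Optional[int]:
        placement = fen.split()[0]  # first FEN field: piece placement
        score = 0
        for ch in placement:
            value = weights.get(ch.upper(), 0)  # digits, '/', kings count as 0
            score += value if ch.isupper() else -value
        return score
    return probe


def evaluate(fen: str, probes: Sequence[Probe]) -> int:
    """Ask each probe in order and return the first answer."""
    for probe in probes:
        score = probe(fen)
        if score is not None:
            return score
    raise LookupError("no probe could score this position")


KPK = "8/8/8/8/8/4k3/4P3/4K3 w - - 0 1"
KNIGHT_UP = "rnbqkb1r/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

table_probe = make_tablebase_probe({KPK: 0})  # made-up entry, illustration only
net_probe = make_network_probe(TOY_WEIGHTS)

print(evaluate(KPK, [table_probe, net_probe]))        # table answers first
print(evaluate(KNIGHT_UP, [table_probe, net_probe]))  # table misses, net answers
```

The point of the sketch is only the separation: the search code calls evaluate() and never needs to know whether the number came from a lookup table or from a network.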

The nnue used here is a top reinforcement-learning network trained by Sergio Viera. It was tested but not adopted by official SF.
Here is a complete listing of his efforts:
https://www.comp.nus.edu.sg/~sergio-v/nnue/

A big thanks goes to Daniel and Ronald for nnue-probe and for making it available to anyone wanting to try an NNUE implementation in their own engine.

And thanks Stefan for all your time and effort and clarity...
You are a star!
AndrewGrant
Posts: 1756
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: SPCC: Testrun of Fire 8.NN finished

Post by AndrewGrant »

pohl4711 wrote: Mon May 31, 2021 11:53 am
AndrewGrant wrote: Mon May 31, 2021 11:19 am Are you planning on adding any indications or marks to the engines that are using Stockfish networks?
I am thinking about this. I am not sure how to handle it without making the list too confusing.

Update: I did it. I wrote this below my rating list:

"Some engines use an nnue-net based on Stockfish evals. Because Stockfish's nnue-nets and Stockfish's code are freely downloadable, it is pretty hard for a tester to decide whether an engine uses much, little, or nothing of Stockfish (code, nnue-nets, coded ideas). So I decided to test these engines, too. As far as I know, the following engines use Stockfish-based nnue-nets (if I missed an engine, please contact me):
Fire 8.N, Fire 8.NN, Nemorino 6.00"
Thank you.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
pohl4711
Posts: 2439
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Fire 8.NN finished

Post by pohl4711 »

AndrewGrant wrote: Mon May 31, 2021 3:39 pm
Thank you.
I am not very happy about testing engines that use an nnue-net built on SF evals. But the problem is this: if I refused to test engines whose authors made clear that they use an SF-based nnue-net, while testing other engines whose authors say nothing about the net they use (or lie about the evals behind "their" nnue-net), then I would punish the authors who are honest about their engine's net and reward the authors who are not. That would be the worst way to handle this, I believe...
And what about the code? I have always found it difficult to draw a line between "using code taken from Stockfish" and "using ideas that were tested in the SF framework and coded in Stockfish, but recoding them in another engine". I think that having the ideas and using the hardware power of the SF framework to test them is the main part of the innovation and the work, not translating these ideas into some lines of program code.
Example: Is there really a fundamental difference between taking the code for, let's say, null move or hash tables out of the first engine that used these techniques, and just taking the idea behind these concepts and recoding it in a new engine? Some people may say yes, but I say no.
From that point of view (my point of view), I find it hard to believe that there is even one strong engine out there that would be as strong without open-source Stockfish existing. That is, I believe that today all strong engines are more or less SF derivatives (except Lc0...), even though they (perhaps) do not contain a single line of SF code (but do contain ideas taken from SF code and verified in the SF framework). IMHO that is the situation in top engine chess. And the Stockfish team itself took the idea of nnue-nets from a Shogi engine and coded this idea/concept "new" for Stockfish.
As a tester, I have to find a way to handle this (or stop testing). So, for myself, I decided to test all these engines except the nearly 100% clones of SF (testing all these clones would lead to a distorted rating list).
Is this the best way? I don't know. But I see no better solution at the moment.
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: SPCC: Testrun of Fire 8.NN finished

Post by connor_mcmonigle »

pohl4711 wrote: Tue Jun 01, 2021 7:37 am
I am not very happy about testing engines that use an nnue-net built on SF evals. But the problem is this: if I refused to test engines whose authors made clear that they use an SF-based nnue-net, while testing other engines whose authors say nothing about the net they use (or lie about the evals behind "their" nnue-net), then I would punish the authors who are honest about their engine's net and reward the authors who are not. That would be the worst way to handle this, I believe...
...
Ed developed a tool that can somewhat reliably detect whether a given engine is using an SF-based network. I've also been working on a similar tool that is a little more involved and will hopefully detect instances of foul play more accurately. Most of those using Stockfish training+inference code with their own engine's unique evaluations+search to produce original network weights (Rubi, Igel, etc.) have been pretty careful to document their training process, meaning their results should be reproducible.
In short, it should soon (if not already) be fairly difficult to mislead people about the origin of a network. Consequently, testing engines with Stockfish-based networks does not benefit the community. Actually, giving such engines attention and, in doing so, diverting attention from more original efforts does more harm than good imho.
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: SPCC: Testrun of Fire 8.NN finished

Post by connor_mcmonigle »

pohl4711 wrote: Tue Jun 01, 2021 7:37 am ...
And what about the code? I have always found it difficult to draw a line between "using code taken from Stockfish" and "using ideas that were tested in the SF framework and coded in Stockfish, but recoding them in another engine". I think that having the ideas and using the hardware power of the SF framework to test them is the main part of the innovation and the work, not translating these ideas into some lines of program code.
Example: Is there really a fundamental difference between taking the code for, let's say, null move or hash tables out of the first engine that used these techniques, and just taking the idea behind these concepts and recoding it in a new engine? Some people may say yes, but I say no.
From that point of view (my point of view), I find it hard to believe that there is even one strong engine out there that would be as strong without open-source Stockfish existing. That is, I believe that today all strong engines are more or less SF derivatives (except Lc0...), even though they (perhaps) do not contain a single line of SF code (but do contain ideas taken from SF code and verified in the SF framework). IMHO that is the situation in top engine chess. And the Stockfish team itself took the idea of nnue-nets from a Shogi engine and coded this idea/concept "new" for Stockfish.
As a tester, I have to find a way to handle this (or stop testing). So, for myself, I decided to test all these engines except the nearly 100% clones of SF (testing all these clones would lead to a distorted rating list).
Is this the best way? I don't know. But I see no better solution at the moment.
Q. Is there really a fundamental difference in taking code vs. taking ideas?

A. Yes. For any given well-known search heuristic, such as null move pruning, there exist innumerable potential variants one could opt for in a concrete implementation of that heuristic. For example, for null move pruning alone, many questions arise: Should back-to-back null moves be allowed? Under what conditions is it safe to perform a null move? Should a verification search be performed to ensure nothing is missed? Under what conditions should a verification search be performed? What formula should be used in computing the null move search depth? I could go on and on here. The details definitely matter.
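
The questions above can be written down as explicit knobs. The following is a rough sketch, not taken from Stockfish or any other engine: the names (NullMoveConfig, can_try_null_move, null_move_search_depth, needs_verification), the default values, and the reduction formula are all assumptions chosen for illustration. It covers only the decision points, not a full search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NullMoveConfig:
    """Hypothetical knobs; names and defaults are illustrative only."""
    allow_consecutive: bool = False       # may a null move follow a null move?
    min_depth: int = 3                    # never try a null move below this depth
    require_eval_above_beta: bool = True  # only try when static eval >= beta
    base_reduction: int = 3               # constant part of the reduction R
    depth_divisor: int = 4                # R grows by 1 per this many plies
    verify_min_depth: int = 12            # verify null-move cutoffs from this depth


def can_try_null_move(cfg, depth, in_check, has_non_pawn_material,
                      previous_was_null, static_eval, beta):
    """Answer: 'under what conditions is it safe to perform a null move?'"""
    if in_check:
        return False  # passing while in check would leave the king en prise
    if not has_non_pawn_material:
        return False  # pawn endings are prone to zugzwang
    if previous_was_null and not cfg.allow_consecutive:
        return False
    if depth < cfg.min_depth:
        return False
    if cfg.require_eval_above_beta and static_eval < beta:
        return False
    return True


def null_move_search_depth(cfg, depth):
    """Answer: 'what formula should be used for the null move search depth?'"""
    reduction = cfg.base_reduction + depth // cfg.depth_divisor
    return max(0, depth - 1 - reduction)


def needs_verification(cfg, depth):
    """Answer: 'when should a verification search be performed?'"""
    return depth >= cfg.verify_min_depth


# Two engines that both "use null move pruning" yet behave differently.
engine_a = NullMoveConfig()
engine_b = NullMoveConfig(allow_consecutive=True, base_reduction=2,
                          depth_divisor=6, verify_min_depth=8)

print(null_move_search_depth(engine_a, 10))  # 4
print(null_move_search_depth(engine_b, 10))  # 6
print(needs_verification(engine_a, 10), needs_verification(engine_b, 10))
```

Even with only six knobs, the two configurations search the same position to different depths and disagree on when to verify, which is the sense in which two engines can share the idea without sharing the implementation.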

If an engine makes all the same decisions as Stockfish and implements all the same heuristics found in Stockfish, then it's a clone. Most authors, myself included, test every patch they write because, unless you're just copying everything from another engine such as Stockfish, almost everything you try will fail. If a given search function isn't copied, many patches that worked for other engines will almost certainly fail for it.