NNUE - only from own engine?

Rebel · Post by **Rebel** » Mon Oct 25, 2021 12:27 pm

Gabor Szots wrote: ↑Mon Oct 25, 2021 10:20 am
amanjpro wrote: ↑Mon Oct 25, 2021 4:24 amI believe most of the major rating list testers (CEGT, CCRL and others) are not interested in SF NNUE unless the net is trained solely on the engine's own games

That's true.

I think the issue is important enough to open a discussion.

NNUE eval is the result of:

1. Training software (freely available)
2. Quality of the EPD's (hard own work)
3. NNUE implementation (freely available)

Since the Elo is in the EPD (and that is the gold-digging part), I see no good reason to put a limitation on the creative part (the quality of the EPD's). Further, it discriminates against beginners, who are forced to write a good HCE eval first. Everybody should be free to create his own EPD database as he pleases, train it as he pleases, implement it as he pleases.

I see only one limitation, using an existing NNUE from someone else. It means the creative (and hard) part is skipped. Fire comes to mind, I don't test it.

amanjpro · Post by **amanjpro** » Mon Oct 25, 2021 12:54 pm

Rebel wrote: ↑Mon Oct 25, 2021 12:27 pm
Gabor Szots wrote: ↑Mon Oct 25, 2021 10:20 am
amanjpro wrote: ↑Mon Oct 25, 2021 4:24 amI believe most of the major rating list testers (CEGT, CCRL and others) are not interested in SF NNUE unless the net is trained solely on the engine's own games

That's true.
I think the issue is important enough to open a discussion.

NNUE eval is the result of:

1. Training software (freely available)
2. Quality of the EPD's (hard own work)
3. NNUE implementation (freely available)

Since the Elo is in the EPD (and that is the gold-digging part), I see no good reason to put a limitation on the creative part (the quality of the EPD's). Further, it discriminates against beginners, who are forced to write a good HCE eval first. Everybody should be free to create his own EPD database as he pleases, train it as he pleases, implement it as he pleases.

I see only one limitation, using an existing NNUE from someone else. It means the creative (and hard) part is skipped. Fire comes to mind, I don't test it.

I know an engine that started with PSQT, then trained a net on that, and is probably already at 2600... So the eval part doesn't need to be that good.

After all, NNUE is not amazing because of an amazing eval, but because it's the result of a somewhat deep search and eval.

Btw, most of the OpenBench engines have a completely different architecture, as well as different probing code, from what's found in SF.

Zahak's is nothing more than a glorified PSQT.

yeni_sekme · Post by **yeni_sekme** » Mon Oct 25, 2021 2:24 pm

I, as a beginner at programming, use Stockfish and Leela data to train my networks. I don't see any problem with that. Writing a hand-crafted eval and generating data from it would take quite an amount of time. Instead of wasting my time on this, I think it is much more beneficial, both for me and for the chess community, to train different network architectures on already available data.

smatovic · Post by **smatovic** » Mon Oct 25, 2021 2:43 pm

I am working on alternative search methods on GPU and simply do not have the resources (time, hardware) to create/train networks on my own. I will gladly use any nets available so I can focus on other aspects of my engines.

--
Srdja

ChickenLogic · Post by **ChickenLogic** » Mon Oct 25, 2021 3:26 pm

Rebel wrote: ↑Mon Oct 25, 2021 12:27 pm
Gabor Szots wrote: ↑Mon Oct 25, 2021 10:20 am
amanjpro wrote: ↑Mon Oct 25, 2021 4:24 amI believe most of the major rating list testers (CEGT, CCRL and others) are not interested in SF NNUE unless the net is trained solely on the engine's own games

That's true.
I think the issue is important enough to open a discussion.

NNUE eval is the result of:

1. Training software (freely available)
2. Quality of the EPD's (hard own work)
3. NNUE implementation (freely available)

Since the Elo is in the EPD (and that is the gold-digging part), I see no good reason to put a limitation on the creative part (the quality of the EPD's). Further, it discriminates against beginners, who are forced to write a good HCE eval first. Everybody should be free to create his own EPD database as he pleases, train it as he pleases, implement it as he pleases.

I see only one limitation, using an existing NNUE from someone else. It means the creative (and hard) part is skipped. Fire comes to mind, I don't test it.

Lately, nearly all of the progress for SF nets has been made in the trainer, not the data. What you call the 'EPD' part can be a gold mine (incidentally, the only EPDs used are the opening books for data generation; EPD itself is not suitable for training). I've shown that with converted Leela data. However, you severely underestimate the work that goes into the trainer and into testing the trainer. Even if you were to build on top of SF's trainer, there are lots of things to do differently and a lot of things to improve. It is just as much of a 'gold mine' as the training data itself.

As for Fire, 'his' implementation wasn't new. He didn't even bother to train a net on his own; he straight up took it from Sergio's series of nets specifically trained for SF on SF data with SF's trainer. That's the highest level of lazy one can 'achieve' really.

Uri Blass · Post by **Uri Blass** » Mon Oct 25, 2021 3:48 pm

amanjpro wrote: ↑Mon Oct 25, 2021 12:54 pm
Rebel wrote: ↑Mon Oct 25, 2021 12:27 pm
Gabor Szots wrote: ↑Mon Oct 25, 2021 10:20 am
amanjpro wrote: ↑Mon Oct 25, 2021 4:24 amI believe most of the major rating list testers (CEGT, CCRL and others) are not interested in SF NNUE unless the net is trained solely on the engine's own games

That's true.
I think the issue is important enough to open a discussion.

NNUE eval is the result of:

1. Training software (freely available)
2. Quality of the EPD's (hard own work)
3. NNUE implementation (freely available)

Since the Elo is in the EPD (and that is the gold-digging part), I see no good reason to put a limitation on the creative part (the quality of the EPD's). Further, it discriminates against beginners, who are forced to write a good HCE eval first. Everybody should be free to create his own EPD database as he pleases, train it as he pleases, implement it as he pleases.

I see only one limitation, using an existing NNUE from someone else. It means the creative (and hard) part is skipped. Fire comes to mind, I don't test it.

I know an engine that started with PSQT, then trained a net on that, and is probably already at 2600... So the eval part doesn't need to be that good.

After all, NNUE is not amazing because of an amazing eval, but because it's the result of a somewhat deep search and eval.

Btw, most of the OpenBench engines have a completely different architecture, as well as different probing code, from what's found in SF.

Zahak's is nothing more than a glorified PSQT.

1) I read of somebody who reached an Elo above 3000 with only PSQT and search, so 2600 with PSQT and a net is not impressive.
2600 is low in computer chess today.

2) I do not see why programmers care about more Elo instead of caring about some knowledge that top programs do not have.

For example, top programs do not know how to evaluate the following drawn position correctly. To do it, you only need an evaluation function that simply asks the engine to play against itself at depth 5 and returns the result of the game as the static evaluation.

The evaluation may be expensive because of the time the engine needs to play against itself at depth 5, and the engine may be weaker, but who cares when the engine can see things that other engines do not see, like the fact that the position is a draw, even some plies earlier?

5 is of course an arbitrary number and people may change this number to make the static evaluation more expensive and more accurate if they like to do it.

[fen]2k5/1pP5/pP6/P7/8/8/P6B/4K3 w - - 0 1 [/fen]
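The "play the engine against itself and use the game result as the static evaluation" idea can be sketched as below. This is only an illustration under loud assumptions: the game is a toy subtraction game (take 1 to 3 stones, taking the last stone wins) standing in for chess, and every name (`cheap_eval`, `negamax`, `best_move`, `playout_eval`) is hypothetical rather than taken from any engine in this thread.

```python
from typing import List

# Toy stand-in for a chess position (assumption): a pile of stones.
# Players alternate removing 1-3 stones; whoever takes the last stone wins.
State = int


def legal_moves(state: State) -> List[int]:
    return [n for n in (1, 2, 3) if n <= state]


def cheap_eval(state: State) -> int:
    # Placeholder for a fast conventional eval that knows nothing useful.
    return 0


def negamax(state: State, depth: int) -> int:
    """Fixed-depth negamax; the score is from the side to move's view."""
    if state == 0:
        return -1  # the previous player took the last stone
    if depth == 0:
        return cheap_eval(state)
    return max(-negamax(state - m, depth - 1) for m in legal_moves(state))


def best_move(state: State, depth: int) -> int:
    # Pick the move whose resulting position is worst for the opponent.
    return max(legal_moves(state), key=lambda m: -negamax(state - m, depth - 1))


def playout_eval(state: State, depth: int = 5, max_plies: int = 200) -> int:
    """Self-play at a fixed depth; return +1 (win), 0 (draw) or -1 (loss)
    from the point of view of the side to move in the starting position."""
    sign = 1  # +1 while the root side is to move, -1 otherwise
    for _ in range(max_plies):
        if state == 0:
            return -sign  # the side to move has lost
        state -= best_move(state, depth)
        sign = -sign
    # Ply cap reached: a finished game is scored, anything else is a draw.
    return -sign if state == 0 else 0


print(negamax(13, 5), playout_eval(13, depth=5))
```

In this toy, the depth-5 search from 13 stones cannot reach the end of the game and falls back on the uninformative eval, while the playout returns a decisive result. The price is the one named in the post: each "static" evaluation now costs a whole game of depth-5 searches, and the ply cap plays the role of the arbitrary knob.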

amanjpro · Post by **amanjpro** » Mon Oct 25, 2021 4:08 pm

Uri Blass wrote: ↑Mon Oct 25, 2021 3:48 pm
amanjpro wrote: ↑Mon Oct 25, 2021 12:54 pm
Rebel wrote: ↑Mon Oct 25, 2021 12:27 pm
Gabor Szots wrote: ↑Mon Oct 25, 2021 10:20 am
amanjpro wrote: ↑Mon Oct 25, 2021 4:24 amI believe most of the major rating list testers (CEGT, CCRL and others) are not interested in SF NNUE unless the net is trained solely on the engine's own games

That's true.
I think the issue is important enough to open a discussion.

NNUE eval is the result of:

1. Training software (freely available)
2. Quality of the EPD's (hard own work)
3. NNUE implementation (freely available)

Since the Elo is in the EPD (and that is the gold-digging part), I see no good reason to put a limitation on the creative part (the quality of the EPD's). Further, it discriminates against beginners, who are forced to write a good HCE eval first. Everybody should be free to create his own EPD database as he pleases, train it as he pleases, implement it as he pleases.

I see only one limitation, using an existing NNUE from someone else. It means the creative (and hard) part is skipped. Fire comes to mind, I don't test it.

I know an engine that started with PSQT, then trained a net on that, and is probably already at 2600... So the eval part doesn't need to be that good.

After all, NNUE is not amazing because of an amazing eval, but because it's the result of a somewhat deep search and eval.

Btw, most of the OpenBench engines have a completely different architecture, as well as different probing code, from what's found in SF.

Zahak's is nothing more than a glorified PSQT.
1) I read of somebody who reached an Elo above 3000 with only PSQT and search, so 2600 with PSQT and a net is not impressive.
2600 is low in computer chess today.

Well, the point still applies: you create a very basic eval, generate data, train a net, then use the net to generate more data and train a new net, and so on. With 8 cores, you can generate at least 40M FENs a day. Zahak's first network was trained on 57M FENs (that is one day's worth of data), and it was still able to beat my somewhat elaborate eval.
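As a rough check on those throughput figures, here is a small back-of-the-envelope calculation. The only inputs are the numbers quoted in the post (8 cores, at least 40M positions a day, 57M positions for the first net); the function names are made up for the illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_core_rate(positions_per_day: float, cores: int) -> float:
    """Positions labelled per second on each core."""
    return positions_per_day / cores / SECONDS_PER_DAY


def days_needed(positions: float, positions_per_day: float) -> float:
    """Days of data generation needed for a training set of this size."""
    return positions / positions_per_day


rate = per_core_rate(40_000_000, 8)          # figures quoted in the post
days = days_needed(57_000_000, 40_000_000)   # first Zahak net, per the post
print(f"{rate:.1f} positions/s per core, {days:.1f} days for 57M positions")
```

At exactly 40M a day this works out to a little under 60 positions per second per core and roughly a day and a half for 57M positions, which fits "one day's worth" if the real rate was somewhat above the stated minimum.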

Uri Blass wrote: ↑Mon Oct 25, 2021 3:48 pm
2) I do not see why programmers care about more Elo instead of caring about some knowledge that top programs do not have.

Again, if rating lists are here, if tournaments are here, then the Elo craze stays... We make it competition-like, and then when somebody cares about it, he is blamed for it? I don't get the logic really...

I care about other things apart from making the engine stronger (MORE ELO), but so do most programmers. That is why you see MultiPV there, even though it adds no strength, or extra features like "own book", "searchmoves", "go mate", etc.

Uri Blass wrote: ↑Mon Oct 25, 2021 3:48 pm
For example, top programs do not know how to evaluate the following drawn position correctly. To do it, you only need an evaluation function that simply asks the engine to play against itself at depth 5 and returns the result of the game as the static evaluation.

The evaluation may be expensive because of the time the engine needs to play against itself at depth 5, and the engine may be weaker, but who cares when the engine can see things that other engines do not see, like the fact that the position is a draw, even some plies earlier?

5 is of course an arbitrary number and people may change this number to make the static evaluation more expensive and more accurate if they like to do it.

[fen]2k5/1pP5/pP6/P7/8/8/P6B/4K3 w - - 0 1 [/fen]

Sorry, I don't understand the message here.

amanjpro · Post by **amanjpro** » Mon Oct 25, 2021 4:14 pm

Uri Blass wrote: ↑Mon Oct 25, 2021 3:48 pm For example, top programs do not know how to evaluate the following drawn position correctly. To do it, you only need an evaluation function that simply asks the engine to play against itself at depth 5 and returns the result of the game as the static evaluation.

The evaluation may be expensive because of the time the engine needs to play against itself at depth 5, and the engine may be weaker, but who cares when the engine can see things that other engines do not see, like the fact that the position is a draw, even some plies earlier?

5 is of course an arbitrary number and people may change this number to make the static evaluation more expensive and more accurate if they like to do it.

[fen]2k5/1pP5/pP6/P7/8/8/P6B/4K3 w - - 0 1 [/fen]

Oh, after re-reading it, I think I get what you mean... OK, lemme tell you that in Zahak 6.2 I spent a whole month trying to recognize draw patterns, and the wrong-color bishop was one of them, and it still fails on this particular example. You need to "search/play" 100 plies to know it, which is a long time; basically, if you want to play this in blitz, you will certainly time out.
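For readers unfamiliar with the pattern, a wrong-color bishop check for the textbook case might look like the sketch below. It is not Zahak's code; it assumes the simplest form of the rule (White has king, one bishop and pawns on a single rook file against a bare black king, and the bishop does not control the promotion corner), and all function names are hypothetical. It only flags candidates: whether the defending king can actually reach the corner is not checked.

```python
def square_is_light(file: int, rank: int) -> bool:
    # file and rank are 0-based; a1 = (0, 0) is a dark square.
    return (file + rank) % 2 == 1


def parse_board(fen: str):
    """Return (piece, file, rank) tuples from the board field of a FEN."""
    pieces = []
    for row_index, row in enumerate(fen.split()[0].split("/")):
        rank = 7 - row_index  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # run of empty squares
            else:
                pieces.append((ch, file, rank))
                file += 1
    return pieces


def wrong_bishop_candidate(fen: str) -> bool:
    """True for K + B + rook-file pawn(s) vs bare k with the wrong bishop.

    Only White as the attacking side is handled in this sketch.
    """
    pieces = parse_board(fen)
    white = [(p, f, r) for p, f, r in pieces if p.isupper()]
    black = [p for p, _, _ in pieces if p.islower()]
    if black != ["k"]:
        return False  # the simple rule requires a bare defending king
    bishops = [(f, r) for p, f, r in white if p == "B"]
    pawn_files = {f for p, f, _ in white if p == "P"}
    others = [p for p, _, _ in white if p not in "KBP"]
    if others or len(bishops) != 1 or pawn_files not in ({0}, {7}):
        return False
    corner_file = next(iter(pawn_files))
    # Wrong bishop: its square color differs from the promotion corner's.
    return square_is_light(*bishops[0]) != square_is_light(corner_file, 7)


textbook = "k7/8/8/8/8/8/P6B/4K3 w - - 0 1"
thread_position = "2k5/1pP5/pP6/P7/8/8/P6B/4K3 w - - 0 1"
print(wrong_bishop_candidate(textbook), wrong_bishop_candidate(thread_position))
```

The textbook position is flagged, but the position from this thread is not, because Black still has pawns and White has pawns on three files. That is one way a narrow material-based rule can miss a blocked position like this one, leaving only a very long search or playout to find the draw.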

Madeleine Birchfield · Post by **Madeleine Birchfield** » Mon Oct 25, 2021 4:30 pm

This is entirely the fault of TCEC and their stupid NNUE guidelines. On 19 October 2020, they added the following to the TCEC rules:

Guidelines for use of NNUE at TCEC:
1. NNUE code can be used and considered as if it was a library (even if it is not literally one).
2. Custom modifications to the basic NNUE code are strongly encouraged, it should be considered rather like a starting point.
3. All NNUE training data should be generated by the unique engine's own search and/or eval code.
Do the NNUE guidelines apply outside NNUE technology?
No, of course not.

https://wiki.chessdom.org/index.php?tit ... oldid=1468

Point number 3 is the relevant point here and is the reason the rating lists only care about engines trained with their own search and eval. It is also something that the rating lists have violated multiple times, by testing Nemorino 6.00, which used Stockfish's evaluation for training, and Stockfish 14, which used Leela's evaluation for training.

Recently, on 22 October 2021, TCEC deprecated their NNUE guidelines, removing them from their rules section:

https://wiki.chessdom.org/index.php?tit ... oldid=2077
https://wiki.chessdom.org/index.php?tit ... oldid=2029

which means it should be about time the rating lists followed suit.

purechess · Post by **purechess** » Mon Oct 25, 2021 4:57 pm

Madeleine Birchfield wrote: ↑Mon Oct 25, 2021 4:30 pm Stockfish 14, which used Leela's evaluation for training.

Not exactly....

The LCZero team has provided a collection of billions of positions evaluated by Leela that we have combined with billions of positions evaluated by Stockfish to train the NNUE net that powers Stockfish 14.
