I've applied the same regression testing that I always do for releases. The last release clocked in at ~+18.XX; this version is already coming in at +26.08. It will be a while until I make my way up to Ethereal 12.75 for a new release, and hopefully by that time I will have trained some better networks and added more into the evaluation. The training code is built using PyTorch, and works by mapping (NN output + static evaluation) to win rates. I will not be sharing that code at this time.
If you are someone who uses Ethereal and are used to building your own, I suggest you go update. This is the single largest patch committed to Ethereal in the last two years. It massively reduces the evaluation's inaccuracy.
The network is embedded (unlike Stockfish, as my networks are only ~140KB). At this time there is no option to swap out the file, nor is there an option to disable the network. Someone interested can easily figure out the format by taking a look at weights/pknet_224x32x1.net.
If there is interest, I can generate some compilations for Windows / Android users. That said, I have not measured what kind of performance hit weaker architectures take, but I expect it to be very low.
The commit:
https://github.com/AndyGrant/Ethereal/c ... f359399567
The commit message:
Add a [224, 32, 1] NN to augment the existing Pawn King evaluation.
This network has 224 inputs, mapped to white King bb, the white Pawn bb (minus the promotion ranks), the black King bb, and the black Pawn bb (minus the promotion ranks). We incrementally update the 1st layer of Neurons. Due to Pawn Hashing, Eval Caching, and the TT, we very rarely actually have to perform the computation to move from the 1st layer to the output Neuron.
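The 224 inputs described above break down as 64 King squares plus 48 Pawn squares (the 64 squares minus the two promotion ranks) for each side: 64 + 48 + 64 + 48 = 224. A minimal sketch of that encoding is below; the ordering of the four bitboard segments is my assumption for illustration, and the authoritative layout lives in weights/pknet_224x32x1.net.

```python
def encode_pawn_king_inputs(wk_bb, wp_bb, bk_bb, bp_bb):
    """Map four bitboards (64-bit ints, LSB = a1) to a 224-long 0/1 vector.

    Segment ordering (an assumption, not taken from the commit):
      [0..63]    white King squares
      [64..111]  white Pawn squares, ranks 2-7 only
      [112..175] black King squares
      [176..223] black Pawn squares, ranks 2-7 only
    """
    inputs = [0] * 224
    for sq in range(64):                 # white King occupancy
        if (wk_bb >> sq) & 1:
            inputs[sq] = 1
    for sq in range(8, 56):              # white Pawns, promotion ranks excluded
        if (wp_bb >> sq) & 1:
            inputs[64 + (sq - 8)] = 1
    for sq in range(64):                 # black King occupancy
        if (bk_bb >> sq) & 1:
            inputs[112 + sq] = 1
    for sq in range(8, 56):              # black Pawns, promotion ranks excluded
        if (bp_bb >> sq) & 1:
            inputs[176 + (sq - 8)] = 1
    return inputs
```

Because only King and Pawn moves can flip any of these inputs, the first-layer accumulator can be updated incrementally as described, rather than recomputed from scratch each node.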
The training code for these networks is a private implementation using the PyTorch framework. The goal of the trainer is to train to output a centipawn value, which is then put through a sigmoid after being offset by a static evaluation of the position.
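The objective above can be sketched in a few lines of plain Python: the network's centipawn output is offset by the static evaluation, squashed through a sigmoid to a predicted win rate, and compared against the observed result. The scaling constant K is a hypothetical value of mine; the actual trainer and its constants are private.

```python
import math

K = 400.0  # hypothetical centipawn-to-probability scale, not from the trainer

def predicted_win_rate(nn_output_cp, static_eval_cp):
    """Sigmoid of the NN output offset by the static evaluation (centipawns)."""
    return 1.0 / (1.0 + math.exp(-(nn_output_cp + static_eval_cp) / K))

def loss(nn_output_cp, static_eval_cp, result):
    """Squared error against the game result: 1.0 win, 0.5 draw, 0.0 loss."""
    return (predicted_win_rate(nn_output_cp, static_eval_cp) - result) ** 2
```

Training the network on the residual after the static evaluation, rather than on the raw result, is what lets it augment the hand-crafted terms instead of replacing them.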
This work could not have been done without the work of @KierenP , the author of https://github.com/KierenP/Halogen . Halogen offers some generic NN structure code which made trying out new nets quick and easy. While developing the networks, I used a c++ fork of Ethereal which contained Halogen's NN code.
The final code does not contain anything from Halogen, nor does it contain anything from Stockfish or the Leela projects. The code is entirely new, and is not based in full or in part upon the work of the Stockfish project's NNUE. This network _augments_ the existing evaluation. This network does _not_ replace the existing evaluation in any way, shape, or form.
ELO | 25.81 +- 10.38 (95%)
SPRT | 10.0+0.1s Threads=1 Hash=8MB
LLR | 2.95 (-2.94, 2.94) [0.00, 5.00]
Games | N: 1888 W: 484 L: 344 D: 1060
http://chess.grantnet.us/test/7422/
ELO | 23.39 +- 8.71 (95%)
SPRT | 60.0+0.6s Threads=1 Hash=64MB
LLR | 2.95 (-2.94, 2.94) [0.00, 5.00]
Games | N: 1904 W: 362 L: 234 D: 1308
http://chess.grantnet.us/test/7423/
BENCH : 4,679,412