Ethereal Pawn-King NN

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

AndrewGrant
Posts: 1756
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Ethereal Pawn-King NN

Post by AndrewGrant »

So this is not an official release, but I want to share that I have pushed another commit into Ethereal, Ethereal 12.58, which implements a [224, 32, 1] NN used to adjust the evaluation of positions by looking at the placement of Pawns & Kings. The computation is extremely well hashed -- hitting upwards of a 92% hit-rate in the main PawnKing Cache, another 14% hit-rate in the general EvalCache, and then getting the usual benefit of being stored by the Transposition Table. The incremental update portion accounts for a ~2.5% slowdown, and the evaluation itself is a <1.0% slowdown. Overall, the program is essentially just as fast, at least on modern architectures where the compiler plays nicely.

I've applied the same regression testing that I always do for releases. The last release clocked in at ~+18.XX. This version is already coming in at +26.08. It will be a while until I make my way up to Ethereal 12.75 for a new release, and hopefully by that time I will have trained some better networks and added more into the evaluation. The training code is built using PyTorch, and works by mapping (NN + static_eval) => win rates. I will not be sharing that code at this time.

If you are someone who uses Ethereal, and are used to building your own, I suggest you go and update. This is the single largest patch committed into Ethereal in the last two years. It massively reduces the evaluation inaccuracy.

The Network is embedded (not like SF, as my networks are only ~140kb). At this time there is no option to swap out the file, nor is there an option to disable the Network. Anyone interested can easily figure out the format by taking a look at weights/pknet_224x32x1.net.

If there is interest, I can generate some compilations for Windows / Android users. I will say, though, that I have not seen what kind of performance hit weaker architectures take, but I expect it to be very low.

The commit:
https://github.com/AndyGrant/Ethereal/c ... f359399567

The commit message:
Add a [224, 32, 1] NN to augment the existing Pawn King evaluation.

This network has 224 inputs, mapped to the white King bb, the white Pawn bb (minus the promotion ranks), the black King bb, and the black Pawn bb (minus the promotion ranks). We incrementally update the 1st layer of Neurons. Due to Pawn Hashing, Eval Caching, and the TT, we very rarely actually have to perform the computation to move from the 1st layer to the output Neuron.

The training code for these networks is a private implementation using the PyTorch framework. The goal of the trainer is to train to output a centipawn value, which is then put through a sigmoid after being offset by a static evaluation of the position.

This work could not have been done without the work of @KierenP , the author of https://github.com/KierenP/Halogen . Halogen offers some generic NN structure code which made trying out new nets quick and easy. While developing the networks, I used a C++ fork of Ethereal which contained Halogen's NN code.

The final code does not contain anything from Halogen, nor anything from Stockfish or the Leela projects. The code is entirely new, and is not based in full or in part upon the Stockfish project's NNUE. This network _augments_ the existing evaluation. It does _not_ replace the existing evaluation in any way, shape, or form.

ELO | 25.81 +- 10.38 (95%)
SPRT | 10.0+0.1s Threads=1 Hash=8MB
LLR | 2.95 (-2.94, 2.94) [0.00, 5.00]
Games | N: 1888 W: 484 L: 344 D: 1060
http://chess.grantnet.us/test/7422/

ELO | 23.39 +- 8.71 (95%)
SPRT | 60.0+0.6s Threads=1 Hash=64MB
LLR | 2.95 (-2.94, 2.94) [0.00, 5.00]
Games | N: 1904 W: 362 L: 234 D: 1308
http://chess.grantnet.us/test/7423/

BENCH : 4,679,412
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Ethereal Pawn-King NN

Post by D Sceviour »

Schooner has been doing that for some time with a 256 position map. This [224, 32, 1] map looks interesting and I will put it on the agenda of things to look at. I wonder how many unique positions can be obtained from training data sets?
peter
Posts: 3186
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: Ethereal Pawn-King NN

Post by peter »

AndrewGrant wrote: Sat Sep 19, 2020 7:34 pm If there is interest, I can generate some compilations for Windows / Android users.
Great news, thanks!
Windows SSE4.1-popcnt would be great.

Looking forward, regards
Peter.
User avatar
Guenther
Posts: 4607
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Ethereal Pawn-King NN

Post by Guenther »

AndrewGrant wrote: Sat Sep 19, 2020 7:34 pm So this is not an official release, but I want to share that I have pushed another commit into Ethereal, Ethereal 12.58, which implements a [224, 32, 1] NN used to adjust the evaluation of positions by looking at the placement of Pawns & Kings. The computation is extremely well hashed -- hitting upwards of a 92% hit-rate in the main PawnKing Cache, another 14% hit-rate in the general EvalCache, and then getting the usual benefit of being stored by the Transposition Table. The incremental update portion accounts for a ~2.5% slowdown, and the evaluation itself is a <1.0% slowdown. Overall, the program is essentially just as fast, at least on modern architectures where the compiler plays nicely.

I've applied the same regression testing that I always do for releases. The last release clocked in at ~+18.XX. This version is already coming in at +26.08. It will be a while until I make my way up to Ethereal 12.75 for a new release, and hopefully by that time I will have trained some better networks and added more into the evaluation. The training code is built using PyTorch, and works by mapping (NN + static_eval) => win rates. I will not be sharing that code at this time.

If you are someone who uses Ethereal, and are used to building your own, I suggest you go and update. This is the single largest patch committed into Ethereal in the last two years. It massively reduces the evaluation inaccuracy.

...
This sounds interesting. What would be the best way to measure the 'unknown' speed loss due to weak hardware (mine)
with the PK-NN? I guess a startpos nps average wouldn't be sufficient?

BTW I get an overflow warning coming from the new network.c file

Code: Select all

gcc -O3 -std=gnu11 -Wall -Wextra -Wshadow -static -DNDEBUG -flto -march=native *.c pyrrhic/tbprobe.c -lpthread -lm -o ../dist/Ethereal-x64-nopopcnt.exe
In function 'initPKNetwork',
    inlined from 'main' at uci.c:70:5:
network.c:85:9: warning: '__builtin_memcpy' writing 624 bytes into a region of size 623 [-Wstringop-overflow=]
   85 |         strcpy(weights, PKWeights[i + PKNETWORK_LAYER1]);
      |         ^
uci.c: In function 'main':
network.c:84:14: note: at offset 0 to an object with size 623 declared here
   84 |         char weights[strlen(PKWeights[i + PKNETWORK_LAYER1])];
      |              ^
https://rwbc-chess.de

trollwatch:
Talkchess nowadays is a joke - it is full of trolls/idiots/people stuck in the pleistocene > 80% of the posts fall into this category...
AndrewGrant
Posts: 1756
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Ethereal Pawn-King NN

Post by AndrewGrant »

Guenther wrote: Sat Sep 19, 2020 8:48 pm This sounds interesting. What would be the best way to measure the 'unknown' speed loss due to weak hardware (mine)
with the PK-NN? I guess a startpos nps average wouldn't be sufficient?

BTW I get an overflow warning coming from the new network.c file

Code: Select all

gcc -O3 -std=gnu11 -Wall -Wextra -Wshadow -static -DNDEBUG -flto -march=native *.c pyrrhic/tbprobe.c -lpthread -lm -o ../dist/Ethereal-x64-nopopcnt.exe
In function 'initPKNetwork',
    inlined from 'main' at uci.c:70:5:
network.c:85:9: warning: '__builtin_memcpy' writing 624 bytes into a region of size 623 [-Wstringop-overflow=]
   85 |         strcpy(weights, PKWeights[i + PKNETWORK_LAYER1]);
      |         ^
uci.c: In function 'main':
network.c:84:14: note: at offset 0 to an object with size 623 declared here
   84 |         char weights[strlen(PKWeights[i + PKNETWORK_LAYER1])];
      |              ^
So the amount of effort in the NN is proportional to the percentage of moves on the board that involve the Pawns or Kings. The start position is fairly fast, as there are many other development options. Blockaded positions are fast because the Pawns are static. King and Pawn endgames are the worst case, because almost all moves involve updating the NN computation.

Those warnings were pointed out to me. I'm pretty sure the solution will be to add 1, i.e. 1 + strlen(....), but I'll need to look at it closer and get a compiler version which reports the warning, as none of mine currently do.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
Tony P.
Posts: 216
Joined: Sun Jan 22, 2017 8:30 pm
Location: Russia

Re: Ethereal Pawn-King NN

Post by Tony P. »

This looks like a September edition of April Fools' :wink:
Tony P.
Posts: 216
Joined: Sun Jan 22, 2017 8:30 pm
Location: Russia

Re: Ethereal Pawn-King NN

Post by Tony P. »

Simply adding the NN output to the static eval looks too primitive to me... I'd be convinced if the NN output a king-pawn-configuration-dependent PST or even a whole set of eval params to be then plugged into the classical static eval.
AndrewGrant
Posts: 1756
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Ethereal Pawn-King NN

Post by AndrewGrant »

Tony P. wrote: Sat Sep 19, 2020 11:01 pm Simply adding the NN output to the static eval looks too primitive to me... I'd be convinced if the NN output a king-pawn-configuration-dependent PST or even a whole set of eval constants to be then plugged into the static eval.
Primitive was the objective. Maintain the uniqueness of the evaluation, but allow NNs to slightly correct the evaluation to reduce loss.
Note that what I'm doing is akin to outputting a PST for Pawns + Kings, and then summing it up. I'm just skipping the steps.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Ethereal Pawn-King NN

Post by dkappe »

AndrewGrant wrote: Sat Sep 19, 2020 11:03 pm
Tony P. wrote: Sat Sep 19, 2020 11:01 pm Simply adding the NN output to the static eval looks too primitive to me... I'd be convinced if the NN output a king-pawn-configuration-dependent PST or even a whole set of eval constants to be then plugged into the static eval.
Primitive was the objective. Maintain the uniqueness of the evaluation, but allow NNs to slightly correct the evaluation to reduce loss.
Note that what I'm doing is akin to outputting a PST for Pawns + Kings, and then summing it up. I'm just skipping the steps.
Nicely done.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
Tony P.
Posts: 216
Joined: Sun Jan 22, 2017 8:30 pm
Location: Russia

Re: Ethereal Pawn-King NN

Post by Tony P. »

I meant PSTs for the rest of the pieces (so a [224, 32, 512] net if all the piece types are present, possibly not always updated in QS). But yeah, they'd be inefficient, because most of the piece-square occupancies don't occur in search within the same KP structure. And I'm having a bad hair day, sorry.