Amoeba : A Modern TSCP Type Engine


catugocatugocatugo
Posts: 32
Joined: Thu Feb 16, 2023 12:56 pm
Full name: Florea Aurelian

Re: Amoeba : A Modern TSCP Type Engine

Post by catugocatugocatugo »

I have reread what the chessprogramming wiki has to say about NNUE (https://www.chessprogramming.org/NNUE & https://www.chessprogramming.org/Stockfish_NNUE).
I'd guess that the general NNUE approach is what we would use in our program, whereas Stockfish NNUE is tailored to that specific program. It is probably still worth keeping an eye on, though.
I have some very basic questions about NNUE. Why only one hidden layer? From theory we know that we need two layers to approximate any function.
Why not a more conventional activation function, like say tanh? Is that because of the binary input?
Also (per my earlier question), I have noticed that a binary input layer makes an RBFNN not that usable, unless I have missed something.
Even though one layer may suffice for orthodox chess, maybe for fairy chess it would not.
hgm
Posts: 28321
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Amoeba : A Modern TSCP Type Engine

Post by hgm »

I don't follow Stockfish development, and IIRC the original Shogi NNUE had two hidden layers. So I suppose that for Chess one of these layers could be optimized away.

I think ReLU activation functions are used for efficiency reasons. They only require the sum of the inputs to be clipped to a smaller number of bits, in a saturating way. Intel-architecture CPUs have instructions that make it possible to do this for many integers packed in a single word simultaneously. They don't have instructions that calculate a large number of tanh functions simultaneously in a single clock cycle.

I don't think the input layer applies any activation function; it is just a fancy name for the data fed to the network. It is binary because chess allows only 0 or 1 piece on each square. The first layer of weights produces a piece-square sum, which can assume many values over a wide range, so applying different activation functions to that sum really makes a difference.
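
To illustrate the point, here is a minimal AVX2 sketch (my own illustration, not code from any particular engine; the [0,255] clipping range is an assumed quantization choice): sixteen 16-bit sums get clamped per iteration with two instructions, something no CPU offers for tanh.

Code: Select all

#include <immintrin.h>
#include <cstdint>

// Clipped ReLU over packed 16-bit accumulator sums, 16 values at a time.
// Assumes n is a multiple of 16; in/out need not be aligned.
void clipped_relu16(const int16_t* in, int16_t* out, int n) {
    const __m256i zero = _mm256_setzero_si256();
    const __m256i cap  = _mm256_set1_epi16(255);  // assumed clipping range
    for (int i = 0; i < n; i += 16) {
        __m256i v = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(in + i));
        v = _mm256_max_epi16(v, zero);  // clamp below at 0
        v = _mm256_min_epi16(v, cap);   // clamp above at 255
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i), v);
    }
}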
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Amoeba : A Modern TSCP Type Engine

Post by Mike Sherwin »

HGM, is this project still in active development?
hgm
Posts: 28321
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Amoeba : A Modern TSCP Type Engine

Post by hgm »

In principle it is, but I have not been active in the area of chess programming these past weeks, because of other pressing business. I hope to resume it soon; I was already close to having a version that would be able to play a game of chess using PST-only evaluation.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Amoeba : A Modern TSCP Type Engine

Post by Mike Sherwin »

hgm wrote: Tue Mar 18, 2025 7:25 pm In principle it is, but I have not been active in the area of chess programming these past weeks, because of other pressing business. I hope to resume it soon; I was already close to having a version that would be able to play a game of chess using PST-only evaluation.
:D
lithander
Posts: 912
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Amoeba : A Modern TSCP Type Engine

Post by lithander »

catugocatugocatugo wrote: Tue Mar 11, 2025 9:46 am I have reread what the chessprogramming wiki has to say about NNUE (https://www.chessprogramming.org/NNUE & https://www.chessprogramming.org/Stockfish_NNUE).
I'd guess that the general NNUE approach is what we would use in our program, whereas Stockfish NNUE is tailored to that specific program. It is probably still worth keeping an eye on, though.
I have some very basic questions about NNUE. Why only one hidden layer? From theory we know that we need two layers to approximate any function.
Why not a more conventional activation function, like say tanh? Is that because of the binary input?
Also (per my earlier question), I have noticed that a binary input layer makes an RBFNN not that usable, unless I have missed something.
Even though one layer may suffice for orthodox chess, maybe for fairy chess it would not.
You need to be able to run millions of evals per second. Everything that speeds up the eval may be worth a sacrifice in accuracy. If you consider how well PSQTs work already, and that each neuron in the hidden layer is basically its own set of PSQTs, it's no surprise that hundreds of them can already provide quite a good eval for chess.

With flexible trainers like bullet you can relatively easily experiment with different network architectures and try a 2nd layer if you want.

What seems to be more promising is to use buckets, e.g. have multiples of the 768 inputs. But for a learning engine I would keep it simple, like (768->256)x2->1; a rough sketch of that follows below.
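
To make that concrete, here is what the forward pass of such a (768->256)x2->1 net could look like; the names and the int16 quantization constants are just illustrative assumptions, not Leorik's actual code:

Code: Select all

#include <algorithm>
#include <array>
#include <cstdint>

constexpr int INPUTS = 768;   // 12 piece types x 64 squares
constexpr int HIDDEN = 256;   // accumulator size per perspective
constexpr int QA     = 255;   // assumed activation quantization range

struct Network {
    std::array<int16_t, INPUTS * HIDDEN> l1_weights;  // shared by both perspectives
    std::array<int16_t, HIDDEN>          l1_bias;
    std::array<int16_t, 2 * HIDDEN>      l2_weights;  // output layer
    int32_t                              l2_bias;
};

// One accumulator per side; a real engine keeps these incrementally updated.
using Accumulator = std::array<int16_t, HIDDEN>;

int32_t evaluate(const Network& net, const Accumulator& us, const Accumulator& them) {
    int32_t sum = net.l2_bias;
    for (int i = 0; i < HIDDEN; ++i) {
        int32_t a = std::clamp<int32_t>(us[i], 0, QA);    // clipped ReLU
        int32_t b = std::clamp<int32_t>(them[i], 0, QA);
        sum += a * net.l2_weights[i];
        sum += b * net.l2_weights[HIDDEN + i];
    }
    return sum;  // still has to be scaled back to centipawns
}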

Another thing that is done for the sake of speed at the cost of accuracy is quantization. So instead of using floats you map them to [0..255] (8 bits) and then use shorts (16 bits) as much as possible while accumulating. To get some accuracy back, SCReLU (squared, clipped ReLU) is actually better than CReLU, as it basically allocates more precision (bits) to smaller values. When you switch from CReLU to SCReLU you can gain some 20 Elo, but only if you take extra care that you don't have to widen the shorts to integers too early and halve your throughput. You can get around this by ordering the required operations carefully. (See Lizard's optimization described in the wiki.)
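
In scalar form the CReLU/SCReLU difference looks like this (a sketch; QA = 255 and the weight layout are assumptions, and the comments point at the ordering trick that matters in the vectorized version):

Code: Select all

#include <algorithm>
#include <cstdint>

constexpr int32_t QA = 255;  // assumed activation range

inline int32_t crelu(int32_t x) { return std::clamp<int32_t>(x, 0, QA); }

// SCReLU squares the clipped value, spending more of the available
// precision on small activations.
inline int32_t screlu(int32_t x) {
    int32_t c = std::clamp<int32_t>(x, 0, QA);
    return c * c;
}

// Output-layer dot product with SCReLU. In SIMD code the point is to
// compute (c * w) first, which still fits in 16 bits when the weights
// are small enough, and to widen to 32 bits only in the final
// multiply-add (e.g. madd) instead of widening everything up front.
int32_t screlu_dot(const int16_t* acc, const int16_t* w, int n) {
    int32_t sum = 0;
    for (int i = 0; i < n; ++i) {
        int32_t c = crelu(acc[i]);
        sum += (c * w[i]) * c;  // == screlu(acc[i]) * w[i]
    }
    return sum;
}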
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
catugocatugocatugo
Posts: 32
Joined: Thu Feb 16, 2023 12:56 pm
Full name: Florea Aurelian

Re: Amoeba : A Modern TSCP Type Engine

Post by catugocatugocatugo »

Just a small caveat, mister lithander: this engine is meant to also play some simple, or maybe even more complex, fairy chess. That is actually my main interest here. But I do think that your advice still holds; it is just that maybe the nets should be bigger. Thanks for your help!
catugocatugocatugo
Posts: 32
Joined: Thu Feb 16, 2023 12:56 pm
Full name: Florea Aurelian

Re: Amoeba : A Modern TSCP Type Engine

Post by catugocatugocatugo »

So if I understand correctly, piece values + PST + NNUE is enough for a good engine's AI. I am not sure, though, how the NNUE will adapt for fairy chess games, where not much work has been done on establishing piece values. I am not talking about gross errors, but about errors of less than 1.5 pawns, as I think I can approximate the value of any piece to within 1.5 pawns. The same goes for the PST, where the discussion is even more complex.
hgm
Posts: 28321
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Amoeba : A Modern TSCP Type Engine

Post by hgm »

I don't think NNUE engines use piece values or PST. Just NNUE. It will learn the piece values and their average dependence on location during training.

Problem with variants could be the amount of training required. To grasp a simple concept like piece values the training material should contain a large-enough representative selection of each material combination. (Large enough to average out all positional factors.) With only 5 piece types (the King is always present), that is not too hard. Now try to extend that to, say, Chu Shogi, where you have some 36 different piece types (distinguishing promoted and promotable types). How many more material combinations will have to be learned now?
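
A back-of-the-envelope count makes the problem visible; the per-type maxima below are just an assumption for illustration, not the exact rules of either game:

Code: Select all

#include <cmath>
#include <cstdio>

int main() {
    // If each of P piece types can occur 0..2 times per side, the number
    // of material signatures grows as 3^(2P).
    double chess = std::pow(3.0, 2 * 5);   // 5 non-king types: ~59,000
    double chu   = std::pow(3.0, 2 * 36);  // 36 types: ~2e34
    std::printf("chess: %.0f, Chu Shogi: %.2e\n", chess, chu);
    return 0;
}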
catugocatugocatugo
Posts: 32
Joined: Thu Feb 16, 2023 12:56 pm
Full name: Florea Aurelian

Re: Amoeba : A Modern TSCP Type Engine

Post by catugocatugocatugo »

hgm wrote: Thu Apr 03, 2025 8:30 am I don't think NNUE engines use piece values or PST. Just NNUE. It will learn the piece values and their average dependence on location during training.

Problem with variants could be the amount of training required. To grasp a simple concept like piece values the training material should contain a large-enough representative selection of each material combination. (Large enough to average out all positional factors.) With only 5 piece types (the King is always present), that is not too hard. Now try to extend that to, say, Chu Shogi, where you have some 36 different piece types (distinguishing promoted and promotable types). How many more material combinations will have to be learned now?
Please pardon my misunderstanding with the piece values and PST.
Anyway, it is not just the large sample that is needed, but also a larger hidden layer (accumulator). The two games I am working on both have 14 piece types and 100 squares, which means an input layer of 2800 values, so I'd say the accumulator for this should be at least 4096. The question I'm pondering is whether there should be another hidden layer, say a second hidden layer of 64 neurons. Probably not initially, but for a stronger engine this is certainly required, especially if it turns out that there are some nested concepts in the game.
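
To make the sizes concrete, here is how I picture the incremental accumulator update for such a net (a sketch under my own assumptions about the feature layout; the 4096 width is the figure from above):

Code: Select all

#include <array>
#include <cstdint>

constexpr int PIECE_TYPES = 14;
constexpr int SQUARES     = 100;  // 10x10 board
constexpr int INPUTS      = 2 * PIECE_TYPES * SQUARES;  // 2800
constexpr int HIDDEN      = 4096;                       // accumulator width

using Accumulator = std::array<int16_t, HIDDEN>;

// Index of the weight column for one (colour, piece, square) feature.
inline int feature(int colour, int piece, int square) {
    return (colour * PIECE_TYPES + piece) * SQUARES + square;
}

// Moving a piece only subtracts one weight column and adds another,
// so the cost is O(HIDDEN), independent of the 2800 inputs.
void move_piece(Accumulator& acc, const int16_t* weights,
                int colour, int piece, int from, int to) {
    const int16_t* sub = weights + feature(colour, piece, from) * HIDDEN;
    const int16_t* add = weights + feature(colour, piece, to) * HIDDEN;
    for (int i = 0; i < HIDDEN; ++i)
        acc[i] += static_cast<int16_t>(add[i] - sub[i]);
}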