NNUE outer product vs tensor product

Madeleine Birchfield
Posts: 296
Joined: Tue Sep 29, 2020 2:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

NNUE outer product vs tensor product

Post by Madeleine Birchfield » Mon Nov 02, 2020 6:44 pm

Stockfish's NNUE implementation, if I recall correctly, used a tensor product to express the half king-piece relationship for the input features, but Seer's NNUE implementation used an outer product instead of a tensor product for the input features. What differences are there between using a tensor product and using an outer product in NNUE, and is there any advantage to using one or the other?

connor_mcmonigle
Posts: 125
Joined: Sun Sep 06, 2020 2:40 am
Full name: Connor McMonigle

Re: NNUE outer product vs tensor product

Post by connor_mcmonigle » Mon Nov 02, 2020 8:24 pm

The tensor product is a generalization of the notion of an outer product (read: in this context they mean the same thing). Consequently, the question is a bit meaningless. I've not seen halfkp features described by a tensor product before, but one could consider it a tensor product of piece and king features if they wanted to.
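To make the "they mean the same thing" point concrete, here is a minimal NumPy sketch. The vectors a and b are made-up toy data, not real features. It checks that, for two plain vectors, the outer product, the order-0 tensordot (one common way to compute a tensor product of arrays) and the flattened Kronecker product all hold the same numbers.

```python
import numpy as np

# Toy binary vectors (hypothetical data, just for illustration).
a = np.array([1, 0, 1])
b = np.array([0, 1, 1, 0])

# Outer product: a (3, 4) matrix with entries a[i] * b[j].
outer = np.outer(a, b)

# Tensor product of the two arrays, computed with tensordot and axes=0.
tensor = np.tensordot(a, b, axes=0)

# Kronecker product of 1-D arrays: the same entries, already flattened.
kron = np.kron(a, b)

same_as_tensor = np.array_equal(outer, tensor)
same_as_kron = np.array_equal(outer.ravel(), kron)
print(same_as_tensor, same_as_kron)
```

For higher-rank inputs the tensor product keeps more axes than a matrix has, which is the sense in which it generalizes the outer product.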

If we let vec(x) denote flattening x, let a X b denote the outer product of a and b, let B in {0, 1}^(12x8x8) be the board features (a boolean "tensor"), and let B[0, :, :] and B[1, :, :] (using slice notation) denote the white and black king planes respectively, then Seer's input is (vec(vec(B[0, :, :]) X vec(B)), vec(vec(B[1, :, :]) X vec(B))). This differs from the input features currently used in SF networks, which pair each king plane with the non-king piece planes only: (vec(vec(B[0, :, :]) X vec(B[2:, :, :])), vec(vec(B[1, :, :]) X vec(B[2:, :, :]))).
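The construction above can be sketched in NumPy as follows. This is only an illustration of the formulas, not code from Seer or Stockfish. The helper names (vec, half_features, halfka, halfkp) are hypothetical, and the plane ordering beyond planes 0 and 1 is an assumption. In particular, the non-king planes are taken to be B[2:, :, :], in line with the name "king X piece".

```python
import numpy as np

def vec(x):
    """Flatten an array into a 1-D vector."""
    return np.asarray(x).reshape(-1)

def half_features(B, king_plane, piece_planes):
    """vec(vec(B[king_plane]) X vec(B[piece_planes])) for one side's king."""
    king = vec(B[king_plane])      # 64 entries, one per king square
    pieces = vec(B[piece_planes])  # 64 entries per selected plane
    return vec(np.outer(king, pieces))

def halfka(B):
    """King X all: each king plane paired with every plane of B."""
    return (half_features(B, 0, slice(None)),
            half_features(B, 1, slice(None)))

def halfkp(B):
    """King X piece: each king plane paired with the non-king planes (assumed B[2:])."""
    return (half_features(B, 0, slice(2, None)),
            half_features(B, 1, slice(2, None)))

# Toy position: two kings and one other piece on an assumed plane index.
B = np.zeros((12, 8, 8), dtype=np.uint8)
B[0, 0, 4] = 1  # white king
B[1, 7, 4] = 1  # black king
B[2, 1, 0] = 1  # some non-king piece (hypothetical plane assignment)

ka_white, ka_black = halfka(B)
kp_white, kp_black = halfkp(B)
print(ka_white.shape, kp_white.shape)
```

Under these assumptions each halfka half has 64 * 768 entries and each halfkp half has 64 * 640. Real engines avoid materializing these sparse vectors and work with the indices of the active entries instead.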

Current Stockfish input features are referred to by the name "halfkp" (king X piece). AFAIK, some individuals experimenting with switching to Seer-style input for Stockfish networks have coined the name "halfka" (king X all). Also notable is that, in Stockfish, the (black king) X (piece) features are rotated (though the rotating is an accidental, possibly Elo-gaining, bug, as mirroring was intended). Seer doesn't use mirroring/rotating, as it uses separate affine transforms ("feature transformers") for the white and black halves of the halfka input.
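The difference between rotating and mirroring is easy to see on square indices. The sketch below assumes squares are numbered 0..63 as rank * 8 + file (a1 = 0, h8 = 63) and that "mirroring" means flipping ranks. Both are assumptions for illustration, and the function names are hypothetical rather than taken from either engine.

```python
def rotate180(sq: int) -> int:
    """Rotate the board by 180 degrees: flips both rank and file."""
    return 63 - sq

def mirror_ranks(sq: int) -> int:
    """Mirror the board top to bottom: flips the rank, keeps the file."""
    return sq ^ 56

def square_name(sq: int) -> str:
    """Algebraic name for a square index, assuming a1 = 0 and h8 = 63."""
    return "abcdefgh"[sq % 8] + str(sq // 8 + 1)

e1 = 4  # rank 0, file 4 under the assumed numbering
rotated = square_name(rotate180(e1))
mirrored = square_name(mirror_ranks(e1))
print(rotated, mirrored)
```

Under this numbering a king on e1 maps to d8 when rotated but to e8 when mirrored, so the two transforms agree on the rank and differ on the file.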

Re: NNUE outer product vs tensor product

Post by Madeleine Birchfield » Mon Nov 02, 2020 8:47 pm

connor_mcmonigle wrote:
Mon Nov 02, 2020 8:24 pm
The tensor product is a generalization of the notion of an outer product (read: in this context they mean the same thing). Consequently, the question is a bit meaningless. I've not seen halfkp features described by a tensor product before, but one could consider it a tensor product of piece and king features if they wanted to.
Thanks. All the explanations I have received until now referred to king-piece input features as a tensor product of the piece and king features. I know that the outer product is a more specific version of the tensor product, so I was wondering whether the additional properties of the outer product meant slightly different behaviour from general tensor-product-based features, but it would seem not at all.
