
NNUE outer product vs tensor product

Posted: Mon Nov 02, 2020 7:44 pm
by Madeleine Birchfield
Stockfish's NNUE implementation, if I recall correctly, used a tensor product to express the half king piece (halfkp) relationship for the input features, but Seer's NNUE implementation used an outer product instead of a tensor product for the input features. What differences are there between using a tensor product and using an outer product in NNUE, and is there any advantage to using one or the other?

Re: NNUE outer product vs tensor product

Posted: Mon Nov 02, 2020 9:24 pm
by connor_mcmonigle
The tensor product is a generalization of the notion of an outer product (read: in this context they mean the same thing). Consequently, the question is a bit meaningless. I've not seen halfkp features described by a tensor product before, but one could consider it a tensor product of piece and king features if they wanted to.
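As a small illustration of why the two terms coincide here, the sketch below (plain Python, with made-up example vectors and helper names that are not from either engine) computes the outer product of two vectors as a matrix and the tensor (Kronecker) product as a flat coordinate list, then checks that flattening the first gives the second.

```python
def outer(a, b):
    """Outer product of two vectors: a matrix with entries a[i] * b[j]."""
    return [[x * y for y in b] for x in a]


def kron(a, b):
    """Tensor (Kronecker) product of two vectors as a flat coordinate list."""
    return [x * y for x in a for y in b]


def vec(matrix):
    """Flatten a matrix row by row."""
    return [value for row in matrix for value in row]


# Arbitrary example vectors (hypothetical values, chosen for illustration).
a = [1, 0, 2]
b = [3, 4]

outer_ab = outer(a, b)   # 3x2 matrix
kron_ab = kron(a, b)     # 6 coordinates

# For plain vectors the two constructions hold the same numbers.
same = vec(outer_ab) == kron_ab
```

For vectors, then, the only difference is whether the result is viewed as a matrix or as a flat list of coordinates.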

If we let vec(x) denote flattening x, let a X b denote the outer product of a and b, let B in {0, 1}^(12x8x8) be the board features (a boolean "tensor"), and let B[0, :, :] and B[1, :, :] (using slice notation) denote the white and black king planes respectively, then Seer's input is (vec(vec(B[0, :, :]) X vec(B)), vec(vec(B[1, :, :]) X vec(B))). This differs from the input features currently used in SF networks, which pair each king plane with only the ten non-king planes: (vec(vec(B[0, :, :]) X vec(B[2:, :, :])), vec(vec(B[1, :, :]) X vec(B[2:, :, :]))).
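The construction above can be sketched in plain Python. This is only an illustration under stated assumptions: the plane order beyond the two king planes, the a1 = 0 .. h8 = 63 square numbering, the example position, and all helper names are hypothetical, and "piece" is taken to mean the ten non-king planes. It ignores engine details such as rotation or any extra padding features.

```python
NUM_PLANES, NUM_SQUARES = 12, 64


def vec(planes):
    """Flatten a list of 64-entry planes into one flat 0/1 list."""
    return [bit for plane in planes for bit in plane]


def outer_flat(a, b):
    """vec(a X b): flattened outer product of two flat vectors."""
    return [x * y for x in a for y in b]


def make_board(pieces):
    """Build 12 planes of 64 bits from (plane, square) pairs."""
    board = [[0] * NUM_SQUARES for _ in range(NUM_PLANES)]
    for plane, square in pieces:
        board[plane][square] = 1
    return board


# Hypothetical position. Plane 0 = white king, plane 1 = black king (as in
# the post); planes 2 and 3 are assumed to be white pawn and black rook.
B = make_board([(0, 4), (1, 60), (2, 12), (3, 56)])
white_king, black_king = B[0], B[1]

# "halfka" (Seer style): each king plane X all 12 planes.
halfka = (outer_flat(white_king, vec(B)), outer_flat(black_king, vec(B)))

# "halfkp" (Stockfish style): each king plane X the 10 non-king planes.
halfkp = (outer_flat(white_king, vec(B[2:])), outer_flat(black_king, vec(B[2:])))

halfka_sizes = [len(half) for half in halfka]
halfkp_sizes = [len(half) for half in halfkp]
halfka_active = [sum(half) for half in halfka]
halfkp_active = [sum(half) for half in halfkp]
```

Because each king plane is one-hot, every half has exactly as many active features as there are pieces in the second factor, which is what makes the input sparse and cheap to update incrementally.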

Current Stockfish input features are referred to by the name "halfkp" (king X piece). AFAIK, some individuals experimenting with Seer-style input for Stockfish networks have coined the name "halfka" (king X all). Also notable is that, in Stockfish, the (black king) X (piece) features are rotated (though the rotation is an accidental, possibly Elo-gaining, bug, as mirroring was intended). Seer doesn't use mirroring or rotation because it uses separate affine transforms ("feature transformers") for the white and black halves of the halfka input.
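To make the rotate versus mirror distinction concrete, here is a minimal sketch assuming squares are numbered a1 = 0 .. h8 = 63. The function names and XOR constants illustrate the two transforms on that numbering; they are not quoted from the Stockfish source.

```python
def rotate(square):
    """Rotate the board 180 degrees: flips both rank and file (a1 <-> h8)."""
    return square ^ 63


def mirror(square):
    """Flip the board vertically: flips rank only, keeps the file (a1 <-> a8)."""
    return square ^ 56


def name(square):
    """Algebraic name of a square index, e.g. 60 -> 'e8'."""
    return "abcdefgh"[square % 8] + str(square // 8 + 1)


black_king = 60  # e8 in the assumed numbering

rotated = name(rotate(black_king))    # 'd1'
mirrored = name(mirror(black_king))   # 'e1'
```

Under mirroring, a black king on e8 is seen from black's side exactly as a white king on e1 is seen from white's side. Under rotation it lands on d1 instead, so the two perspectives are no longer file-symmetric.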

Re: NNUE outer product vs tensor product

Posted: Mon Nov 02, 2020 9:47 pm
by Madeleine Birchfield
connor_mcmonigle wrote: Mon Nov 02, 2020 9:24 pm The tensor product is a generalization of the notion of an outer product (read: in this context they mean the same thing). Consequently, the question is a bit meaningless. I've not seen halfkp features described by a tensor product before, but one could consider it a tensor product of piece and king features if they wanted to.
Thanks. I asked because all the explanations I have received up until now referred to king-piece input features as a tensor product of the piece and king features. I know that the outer product is a more specific version of the tensor product, so I was wondering whether the additional properties of the outer product meant slightly different behaviour from general tensor-product-based features, but it would seem not at all.