NNUE outer product vs tensor product
Moderators: hgm, Dann Corbit, Harvey Williamson

 Posts: 296
 Joined: Tue Sep 29, 2020 2:29 pm
 Location: Dublin, Ireland
 Full name: Madeleine Birchfield
NNUE outer product vs tensor product
Stockfish's NNUE implementation, if I recall correctly, uses a tensor product to express the half king-piece (halfkp) relationship for the input features, but Seer's NNUE implementation uses an outer product instead of a tensor product for the input features. What differences arise between using a tensor product and using an outer product in NNUE, and is there any advantage to using one or the other?

 Posts: 125
 Joined: Sun Sep 06, 2020 2:40 am
 Full name: Connor McMonigle
Re: NNUE outer product vs tensor product
The tensor product is a generalization of the notion of an outer product (read: in this context they mean the same thing). Consequently, the question is a bit meaningless. I've not seen halfkp features described by a tensor product before, but one could consider it a tensor product of piece and king features if they wanted to.
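To make the "they mean the same thing" point concrete, here is a small NumPy sketch (the vectors a and b are made-up illustration values, not anything from Stockfish or Seer). For plain 1-D vectors, the outer product, the rank-2 tensor product, and the flattened Kronecker product all hold the same numbers:

```python
import numpy as np

# Two arbitrary example vectors (illustrative values only).
a = np.array([1, 0, 2])
b = np.array([3, 4])

# Outer product: entry [i, j] is a[i] * b[j].
outer = np.outer(a, b)                # shape (3, 2)

# Tensor product of two rank-1 tensors (axes=0 means "contract nothing").
tensor = np.tensordot(a, b, axes=0)   # shape (3, 2)

# Kronecker product of 1-D arrays: the same entries, already flattened.
kron = np.kron(a, b)                  # shape (6,)

same_as_tensor = np.array_equal(outer, tensor)
same_as_kron = np.array_equal(outer.ravel(), kron)
print(same_as_tensor, same_as_kron)
```

Only the shape bookkeeping differs between the three, which is why the choice of name says nothing about how the network behaves.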
If we let vec(x) denote flattening x, a X b represent the outer product of a and b, B represent the board features in {0, 1}^(12x8x8) (a boolean "tensor"), and let B[0, :, :] and B[1, :, :] (using slice notation) denote the white and black king planes respectively, then Seer's input is (vec(vec(B[0, :, :]) X vec(B)), vec(vec(B[1, :, :]) X vec(B))). This differs from the input features currently used in SF networks, which pair each king only with the ten non-king planes: (vec(vec(B[0, :, :]) X vec(B[2:, :, :])), vec(vec(B[1, :, :]) X vec(B[2:, :, :]))).
Current Stockfish input features are referred to by the name "halfkp" (king X piece). AFAIK, some individuals experimenting with switching Stockfish networks to Seer-style input have coined the name "halfka" (king X all). Also notable is that, in Stockfish, the (black king) X (piece) features are rotated (though the rotation is an accidental, possibly Elo-gaining, bug, as mirroring was intended). Seer doesn't use mirroring/rotating, as it uses separate affine transforms ("feature transformers") for the white and black halves of the halfka input.
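As a rough illustration of the two constructions, here is a NumPy sketch. The names vec, king_x, halfka and halfkp are hypothetical helpers, not code from either engine. It assumes planes 0 and 1 are the white and black kings and that the ten non-king planes are B[2:, :, :]. The order of the non-king planes and the toy position are made up. Stockfish's rotation of the black half is not modelled.

```python
import numpy as np

def vec(x):
    """Flatten x to 1-D (row-major)."""
    return np.asarray(x).ravel()

def king_x(B, king_plane, pieces):
    """vec(vec(B[king_plane]) X vec(pieces)) for one king."""
    return vec(np.outer(vec(B[king_plane]), vec(pieces)))

def halfka(B):
    """Seer-style input: each king paired with all 12 planes."""
    return king_x(B, 0, B), king_x(B, 1, B)

def halfkp(B):
    """halfkp-style input: each king paired with the non-king planes only
    (assumed here to be planes 2..11)."""
    return king_x(B, 0, B[2:]), king_x(B, 1, B[2:])

# Toy position: B[plane, rank, file]; plane meanings beyond 0 and 1 are assumed.
B = np.zeros((12, 8, 8), dtype=np.uint8)
B[0, 0, 4] = 1   # white king
B[1, 7, 4] = 1   # black king
B[2, 1, 0] = 1   # one non-king piece
B[3, 6, 0] = 1   # another non-king piece

ka_white, ka_black = halfka(B)
kp_white, kp_black = halfkp(B)

print(ka_white.size, int(ka_white.sum()))   # 64 * 768 inputs per half
print(kp_white.size, int(kp_white.sum()))   # 64 * 640 inputs per half
```

In both cases each half is extremely sparse: the only active inputs are (king square, occupied piece square) pairs, which is what makes incremental updates of the first layer cheap.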

 Posts: 296
 Joined: Tue Sep 29, 2020 2:29 pm
 Location: Dublin, Ireland
 Full name: Madeleine Birchfield
Re: NNUE outer product vs tensor product
connor_mcmonigle wrote: ↑Mon Nov 02, 2020 8:24 pm
The tensor product is a generalization of the notion of an outer product (read: in this context they mean the same thing). Consequently, the question is a bit meaningless. I've not seen halfkp features described by a tensor product before, but one could consider it a tensor product of piece and king features if they wanted to.
Thanks, because all explanations I have received up until now have referred to kingpiece input features as a tensor product of the piece and king features. I know that the outer product is a more specific version of the tensor product, so was wondering if the additional properties of the outer product meant slightly different behaviour than general tensor product based features, but it would seem not at all.