pst_op [7][64] = {.....}
pst_eg [7][64]
One array for all pièces
Or
pst_pawn_op [64]
pst_knight_op [64]
...
pst_knight_eg [64]
...
Etc..
One array for each pièces (white , find the black values with a flip [64] array just like in TSCP code)
It's always a choice. Write short slow and less error prone generic code or efficient code. Losing a factor two is only 70 elo.
Choice is easy. Never write complicated efficient code unless you have too. By the way looks like I have to write many implementations before I get the right one.
Also Skippers speed increased with a factor 5 but it plays even worse so I regret latest changes.
Last edited by Henk on Wed Aug 26, 2015 11:23 am, edited 2 times in total.
If you define pst[7][64], writing something like pst[KNIGHT][sq] with KNIGHT a predefined constant produces exactly the same code as defineing pst_knight[64] and writing pst_knight[sq]. The compiler just considers an array element with an index known at compile time as if it was a scalar variable of the same type.
The problem with pst_knight[64] is that you have to write ugly code like
sje wrote:It is faster to flip a square index by an inline exclusive-or calculation than by a table look-up.
Don't quite follow that. It's not XOR vs table look-up.
It's XOR + table look-up vs table-lookup. Obviously not doing the XOR is faster.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
matthewlai wrote:It's XOR + table look-up vs table-lookup. Obviously not doing the XOR is faster.
It is not so obvious. Bigger table potentially lead to more cache misses. But XOR may be completely free in many cases due computational parallelism in modern processors.
matthewlai wrote:It's XOR + table look-up vs table-lookup. Obviously not doing the XOR is faster.
It is not so obvious. Bigger table potentially lead to more cache misses. But XOR may be completely free in many cases due computational parallelism in modern processors.
That is true. However, tables this size take up negligible L1D cache, and will stay there regardless.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
In my new engine Simple I even store the eg and op values in the same table (pst[7][64]) of 32-bit integers, as (op-eg/3)*N + eg, where N is the constant 2^16. I have a game phase that runs from 0 (eg) to 24 (op). In the evaluation I then calculate (8*N+gamePhase)*pstEval>>19. Some algebra shows that this is equal to
The part in braces contains a factor 2^32, and is thus clipped off the 32-bit result by overflow. Shifting right by 16 bits pushes the term gamePhase*eg (which is smaller than 2^16, eg running only upto 4000 or so even if the piece ase value is included in the PST) out of the int, and undoes the multiplication by N on the remaining term, so that it would leave you with
gamePhase/3 * (op-eg) + 8*eg
(give or take some rounding), where gamePhase/3 runs from 0 to 8 (so let us call that 8*phi). The result is then
8*(phi*op + (1-phi)*eg)
which is 8 times the sought interpolated result. Currently the factor 8 is also removed by right-shifting 3 bits extra (19 instead of 16), but it occurred to me that this is really counterproductive, and that it would be better to let the engine use units of 1/800 or 1/400 Pawn internally than have it use centiPawn. Then I can more easily reduce drawish positions (like KNKP) to sub-Pawn values by dividing the score by 2, 4 or 8 without completely losing the smaller evaluation terms like centralization to rounding, so that the engine will continue to prefer good moves in the hope for an opponent mistake, rather than throwing in the towel and running his King towards a corner to sulk, because it no longer sees the corner is worse than the center.
The evaluation parameters other than the PST (like isolated-Pawn penalty, open-file bonus for Rooks) are also defined in this (op-eg)/2*N+eg format, so that I can add them all to the pstEval value first, and then use the formula above for interpolating to the current game phase.