GothicChessInventor wrote: By the way, just "playing the optimization game", I noticed this makes crafty a tiny bit faster on the 2 systems on which I tested it:
Code:
#define LOWER_ITERATION_BOUND ((iteration_depth) << 1)
#define UPPER_ITERATION_BOUND ((iteration_depth) << 2)
#define NEW_EXTENDED(x,y) ((((x) << 2) - (y)) / ((y) << 1))
#define LimitExtensions(extended, ply) \
extended = Min(extended, INCPLY); \
if(ply > LOWER_ITERATION_BOUND) { \
if(ply <= UPPER_ITERATION_BOUND) \
extended = NEW_EXTENDED(iteration_depth, ply); \
else \
extended = 0; \
}
Apart from the fact that your macros obfuscate the code and make it harder to follow, you changed the semantics of the original
Code:
#define LimitExtensions(extended,ply)\
extended = Min(extended,ply); \
if(ply > 2 * iteration_depth) { \
if(ply < 4 * iteration_depth) \
extended = extended*(4*iteration_depth - ply)/(4*iteration_depth); \
else \
extended = 0; \
}
in a significant way, which is likely to have more effect on the search than replacing lea or add with a shift. If it makes Crafty stronger, why not? It would not be the first time an improvement was found by accident, or even through a bug.
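To make the difference concrete, here is a minimal sketch that rewrites both macros quoted above as plain C functions. The function names and the value INCPLY = 4 are assumptions made for illustration only; they are not taken from Crafty. With integer division as written, the proposed version seems to yield 0 for every ply above 2 * iteration_depth, because the numerator 4 * iteration_depth - ply is then always smaller than the divisor 2 * ply, and in that range it ignores the incoming extended value altogether.

```c
#include <assert.h>

/* Assumed value, chosen only to make the arithmetic concrete. */
#define INCPLY 4
#define Min(a, b) (((a) < (b)) ? (a) : (b))

/* Macro as quoted in the reply: scales the incoming extension. */
int limit_original(int extended, int ply, int iteration_depth) {
    extended = Min(extended, ply);
    if (ply > 2 * iteration_depth) {
        if (ply < 4 * iteration_depth)
            extended = extended * (4 * iteration_depth - ply)
                       / (4 * iteration_depth);
        else
            extended = 0;
    }
    return extended;
}

/* Proposed macro: overwrites the extension instead of scaling it. */
int limit_proposed(int extended, int ply, int iteration_depth) {
    extended = Min(extended, INCPLY);
    if (ply > (iteration_depth << 1)) {
        if (ply <= (iteration_depth << 2))
            extended = ((iteration_depth << 2) - ply) / (ply << 1);
        else
            extended = 0;
    }
    return extended;
}

/* Hypothetical self-check: the two versions disagree on the same inputs. */
void check_limit_examples(void) {
    assert(limit_original(4, 25, 10) == 1);  /* 4 * 15 / 40 */
    assert(limit_proposed(4, 25, 10) == 0);  /* 15 / 50 */
    assert(limit_original(8, 30, 10) == 2);  /* 8 * 10 / 40 */
    assert(limit_proposed(8, 30, 10) == 0);  /* 10 / 60 */
}
```

Below 2 * iteration_depth the two versions also differ, since one clamps against ply and the other against INCPLY.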
c = a*{1,2,4,8} + b + const may translate to one x86 lea instruction.
add reg, reg is also likely the better replacement (or at least no worse) for a shift left by one on some architectures, especially on NetBurst. Nowadays I prefer multiplication (or even division) by a constant in the C source and leave it to the compiler to generate optimal code for the specific target platform.
VC2005 at least, as far as I am aware, does this quite well and implements what is suggested in the Software Optimization Guide for AMD Family 10h Processors (the same applies to K8, by the way):
http://www.amd.com/us-en/assets/content ... /40546.pdf
Chapter 8 Integer Optimizations
8.2 Alternative Code for Multiplying by a Constant.
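As a small sketch of the point about leaving constant multiplication to the compiler: the helpers below (hypothetical names, not from Crafty) write the arithmetic as plain multiplications. A compiler targeting x86 may turn the first one into a single lea and the others into add, shift or lea as it sees fit; none of that is guaranteed, and the generated code is best checked in the compiler's assembly listing. The shift variants are included only to show that the results are the same for small non-negative values.

```c
#include <assert.h>

/* a*4 + b + 12 fits the a*{1,2,4,8} + b + const pattern,
   so it is a candidate for one lea on x86. */
int scaled_sum(int a, int b) { return a * 4 + b + 12; }

/* Plain multiplication by a constant; instruction choice is left
   to the compiler. */
int times_two(int x)  { return x * 2; }
int times_four(int x) { return x * 4; }

/* Hand-written shift forms, for comparison only. */
int times_two_shift(int x)  { return x << 1; }
int times_four_shift(int x) { return x << 2; }

/* Hypothetical self-check over a small non-negative range. */
void check_constant_multiply(void) {
    for (int x = 0; x < 1000; ++x) {
        assert(times_two(x) == times_two_shift(x));
        assert(times_four(x) == times_four_shift(x));
    }
    assert(scaled_sum(3, 5) == 29);  /* 3*4 + 5 + 12 */
}
```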