Re: Incremental Zobrist - slow?
Posted: Sat Jun 20, 2009 11:18 pm
Perhaps the side-to-move or castling keys, which are est not done differentially, are not handled correctly?
side to move keys particularly when you do a null move can be a gotcha.hgm wrote:Perhaps the side-to-move or castling keys, which are est not done differentially, are not handled correctly?
There's nothing inherently bad about movsx/movzx, x86 compilers generate those any time you load a small (8-bit or 16-bit) value from memory into a 32-bit or 64-bit register. In some cases (like arrays of small values), its better to use 8-bit or 16-bit values so they can fit into the cache more easily. But for a one-off global or temporary variable on the stack, it probably makes no difference. movsx is used to sign-extend a signed 8- or 16-bit value into the full register (i.e. for signed types), and movzx is used to zero-extend it (for unsigned types). On modern chips these instructions don't really cost more than a normal 32-bit load, and extending the value into the full register is important to avoid either partial register stalls, or false dependences on previous values of the register.bob wrote:BTW if that "structure" is a group of bit fields, get rid of that for certain. You should do dyour own and/or/etc stuff. Whenever I see movsx/movzx type instructions, that's a warning that things are not as efficiently laid out as possible.
I was thinking of the case where you use 8-bit or 16-bit values and do some computations, then extend to 32/64 bits for the subscripting. The movsx becomes redundant had you used the correct size initially. I've seen expressions where there were several such length conversions when it could have been avoided. And there's certainly no benefit to shorter values except for cache utilization issues.wgarvin wrote:There's nothing inherently bad about movsx/movzx, x86 compilers generate those any time you load a small (8-bit or 16-bit) value from memory into a 32-bit or 64-bit register. In some cases (like arrays of small values), its better to use 8-bit or 16-bit values so they can fit into the cache more easily. But for a one-off global or temporary variable on the stack, it probably makes no difference. movsx is used to sign-extend a signed 8- or 16-bit value into the full register (i.e. for signed types), and movzx is used to zero-extend it (for unsigned types). On modern chips these instructions don't really cost more than a normal 32-bit load, and extending the value into the full register is important to avoid either partial register stalls, or false dependences on previous values of the register.bob wrote:BTW if that "structure" is a group of bit fields, get rid of that for certain. You should do dyour own and/or/etc stuff. Whenever I see movsx/movzx type instructions, that's a warning that things are not as efficiently laid out as possible.
But I agree 100% that using bitfields for anything where performance is important, is a bad idea. Compilers usually generate lousy code for bitfields. If you need to pack multiple values into a register, you should write the and/or/shifts yourself (maybe in an inline accessor method, or macro).
Ah, I see what you mean, I forgot about the register-to-register case. And I notice now that he has a couple of them near the bottom of his code snippet.bob wrote:I was thinking of the case where you use 8-bit or 16-bit values and do some computations, then extend to 32/64 bits for the subscripting. The movsx becomes redundant had you used the correct size initially. I've seen expressions where there were several such length conversions when it could have been avoided. And there's certainly no benefit to shorter values except for cache utilization issues.wgarvin wrote:There's nothing inherently bad about movsx/movzx, x86 compilers generate those any time you load a small (8-bit or 16-bit) value from memory into a 32-bit or 64-bit register. In some cases (like arrays of small values), its better to use 8-bit or 16-bit values so they can fit into the cache more easily. But for a one-off global or temporary variable on the stack, it probably makes no difference. movsx is used to sign-extend a signed 8- or 16-bit value into the full register (i.e. for signed types), and movzx is used to zero-extend it (for unsigned types). On modern chips these instructions don't really cost more than a normal 32-bit load, and extending the value into the full register is important to avoid either partial register stalls, or false dependences on previous values of the register.bob wrote:BTW if that "structure" is a group of bit fields, get rid of that for certain. You should do dyour own and/or/etc stuff. Whenever I see movsx/movzx type instructions, that's a warning that things are not as efficiently laid out as possible.
But I agree 100% that using bitfields for anything where performance is important, is a bad idea. Compilers usually generate lousy code for bitfields. If you need to pack multiple values into a register, you should write the and/or/shifts yourself (maybe in an inline accessor method, or macro).
Code: Select all
000000014000C8BA movzx eax,r14b
000000014000C8BE movzx ecx,r13b
What about chars? If you use those, the math will be done as bytes, and then have to be extended to a longer value for use as a subscript or address offset.vladstamate wrote:Strange thing is I do not have any bitfields anywhere in my code. Only normal structures. I have no idea why would the compiler generate code like that.
Regards,
Vlad.
Ah yes! I do have lots of those in the Move structure. I figured I save space. But if the compiler has to generate that kind of code to deal with them I might as well then use unsigned ints and do my own shifts and mask, like you suggested.bob wrote: What about chars? If you use those, the math will be done as bytes, and then have to be extended to a longer value for use as a subscript or address offset.
You want to havevladstamate wrote:Hi,
Yes you were both right. Null-move was definitely a place that I had forgotten about. Also the castling rights are quite tricky to follow.
However I have now fixed all the bugs and the search really got a speed boost. I went from around 1.1knps to 1.6knps, so almost 50% increase.
Thank you all for your advice!
Regards,
Vlad.