couple of questions about stockfish code ?

syzygy · Post by **syzygy** » Sat Oct 29, 2016 2:06 am

kbhearn wrote:Knew i shouldn't have tried extracting extraneous brackets... one of them was needed, corrected below:

typedef uint32_t Score; 

inline Score make_score&#40;unsigned int mg, unsigned int eg&#41; &#123; // implicitly converting signed inputs
    return &#40;eg << 16&#41; + mg;
&#125;

// mask and manual sign extension
inline Value mg_Value&#40;Score s&#41; &#123;
    static const uint32_t mask = 0xFFFFU;
    static const int sign = 0x8000;
    return (&#40;int&#41;&#40;s & mask&#41; ^ sign&#41; - sign;
&#125;

// if lower 16 bits are negative add 0x8000 to propagate up the borrowed 1 from upper 16 bits, then shift and sign extend
inline Value eg_Value&#40;Score s&#41; &#123;
    static const uint32_t borrow = 0x8000U;
    static const int sign = 0x8000;

    return (&#40;int&#41;(&#40;s + borrow&#41; >> 16&#41; ^ sign&#41; - sign;
&#125;

https://github.com/syzygy1/Stockfish/co ... e_no_union

I think Score has to remain an enum for the division operator overload to work (not 100% sure). But that should not have any impact.

I measure 0.15% slow down, which, even if reliable, is well within normal compiler noise. For other or older compilers it might be different.

Of course the most important thing is to get addition/subtraction (and multiplication by an integer) fast. Extraction of eg and mg is a far less common operation.

Regarding readability... I'm not so sure manually handling the sign is "more readable" than a series of casts. But whatever solution ones uses to wrap two signed 16-bit ints into one 32-bit int, it is going to involve a bit of arithmetic trickery. It's not as easy as "just use clean and simple shift instructions and you are done". (However, it is certainly better than the solution shown in the opening post, if only because it does not depend on endianness.)

It's still not fully clear to me whether SF's current implementation gives undefined behaviour according to the C++11 standard. But it is probably guaranteed to work on the major compilers.

It seems that whatever int-wrapping solution one chooses, there will be some (very mild) reliance on implementation-defined behaviour.

syzygy · Post by **syzygy** » Sat Oct 29, 2016 3:43 am

https://github.com/syzygy1/Stockfish/co ... no_union_2

Code: Select all

enum Score &#58; uint32_t &#123; SCORE_ZERO &#125;;

inline Score make_score&#40;int mg, int eg&#41; &#123;
    return Score&#40;&#40;unsigned&#40;eg&#41; << 16&#41; + mg&#41;;
&#125;

inline Value mg_value&#40;Score s&#41; &#123;
    return Value&#40;int16_t&#40;uint16_t&#40;s&#41;));
&#125;

inline Value eg_value&#40;Score s&#41; &#123;
    return Value&#40;int16_t&#40;uint16_t&#40;&#40;s + 0x8000&#41; >> 16&#41;));
&#125;

This seems to be reasonable clean and simple and free of UB.

It is essentially equivalent to SF's current union solution, but it seems to affect code generation quite a lot. I measure no speed difference, though.

(Unfortunately it is difficult to compare the generated code produce with LTO. At least I don't know of a way to get gcc to produce readable assembly code in LTO mode, so code that is fully equivalent to the final object code.)

kbhearn · Post by **kbhearn** » Sat Oct 29, 2016 4:00 am

i'm not sure that that int16_t(...) cast is well defined - i believe it's implementation defined and what SF's union is trying to dodge since it's a cast of a value out-of-range of the target type casted to a signed type (which all would-be negative values would be out of range positive values instead).

I did read something about the intXX_t types being strictly 2's complement though as opposed to regular signed ints being implementation defined, so maybe they also bring it back to well-defined casts from unsigned values out of range but one would have to go searching the standard for the relevant passages on them and see if the guarantee is there.

and if it's well defined you should be able to directly cast to the int16_t without the stop at uint16_t (certainly technically works on gcc and i imagine other compilers - i just don't know if it's rely on implementation defined behavior - i think it is but maybe it's not)

BeyondCritics · Post by **BeyondCritics** » Sat Oct 29, 2016 4:10 am

Code: Select all

/// A Score object encodes a middle game and an end game Value, such that integer
/// vector operations &#40;see ENABLE_FULL_OPERATORS_ON&#41; are possible,
/// as long as there is no integer overflow/underflow in any component.
enum Score &#58; int &#123; SCORE_ZERO &#125;;

/// mg, eg must be valid Values
inline Score make_score&#40;int mg, int eg&#41; &#123;
  static const int val_bits = 16;

  return Score&#40;&#40;eg << val_bits&#41; + mg&#41;;
&#125;

/// Extract the end game value from a score
inline Value eg_value&#40;Score s&#41; &#123;
  static const int val_bits = 16;
  static const int min_val = SHRT_MIN;

  /// Requires arithmetic right shift properly implementing floor function
  return Value&#40;&#40;int&#40;s&#41; - min_val&#41; >> val_bits&#41;;
&#125;

/// Extract the middle game value from a score
inline Value mg_value&#40;Score s&#41; &#123;
  static const int val_bits = 16;
  static const int val_mask = &#40;1 << val_bits&#41; - 1;
  static const int min_val = SHRT_MIN;

  /// Requires 2^s complement arithmetic.
  return Value&#40;(&#40;int&#40;s&#41; - min_val&#41; & val_mask&#41; + min_val&#41;;
&#125;

This implementation is derived from reverse engineering of the current implementation.
I know it is "unfair" to use arithmetic right shift and 2^s complement, but i don't require more from my environment than any other modern language like java, c#, rust, etc..
I have tested this at home and it seems to work.

kbhearn · Post by **kbhearn** » Sat Oct 29, 2016 4:17 am

http://stackoverflow.com/questions/9220 ... onversions

seems to indicate that indeed int16_t conversions from uint are still implementation defined behavior if the unsigned value of what you're converting can't convert directly to a positive value represented in the signed integer. so casting to int32_t and doing the sign extension yourself is the 'compliant' option for converting a 16 bit unsigned integer back to a signed integer and then you could further cast it down to a 16 bit int if you wanted since now the value it represents would be in range...

kbhearn · Post by **kbhearn** » Sat Oct 29, 2016 4:46 am

BeyondCritics wrote:

Code: Select all

/// A Score object encodes a middle game and an end game Value, such that integer
/// vector operations &#40;see ENABLE_FULL_OPERATORS_ON&#41; are possible,
/// as long as there is no integer overflow/underflow in any component.
enum Score &#58; int &#123; SCORE_ZERO &#125;;

/// mg, eg must be valid Values
inline Score make_score&#40;int mg, int eg&#41; &#123;
  static const int val_bits = 16;

  return Score&#40;&#40;eg << val_bits&#41; + mg&#41;;
&#125;

/// Extract the end game value from a score
inline Value eg_value&#40;Score s&#41; &#123;
  static const int val_bits = 16;
  static const int min_val = SHRT_MIN;

  /// Requires arithmetic right shift properly implementing floor function
  return Value&#40;&#40;int&#40;s&#41; - min_val&#41; >> val_bits&#41;;
&#125;

/// Extract the middle game value from a score
inline Value mg_value&#40;Score s&#41; &#123;
  static const int val_bits = 16;
  static const int val_mask = &#40;1 << val_bits&#41; - 1;
  static const int min_val = SHRT_MIN;

  /// Requires 2^s complement arithmetic.
  return Value&#40;(&#40;int&#40;s&#41; - min_val&#41; & val_mask&#41; + min_val&#41;;
&#125;

This implementation is derived from reverse engineering of the current implementation.
I know it is "unfair" to use arithmetic right shift and 2^s complement, but i don't require more from my environment than any other modern language like java, c#, rust, etc..
I have tested this at home and it seems to work.

and that arithmetic shift is the other thing the union was trying to avoid as implementation defined behavior; such aggravating barriers to conveying such a simple task :)

BeyondCritics · Post by **BeyondCritics** » Sat Oct 29, 2016 4:54 am

Ehm, is it true that your implementation uses 2^s complement essentially?
If so, would you really believe not to have a working right shift in your system? I really think you shouldn't, therefore i proposed to use it, since it makes absolutely sense.

kbhearn · Post by **kbhearn** » Sat Oct 29, 2016 5:57 am

BeyondCritics wrote:Ehm, is it true that your implementation uses 2^s complement essentially?
If so, would you really believe not to have a working right shift in your system? I really think you shouldn't, therefore i proposed to use it, since it makes absolutely sense.

Technically the trick i used should work on 1's complement and sign and magnitude signed integers as well, twiddling the 16th bit and then subtracting its magnitude should allow the implementation to create the right negative number. i.e. regardless of implementation 0000FFFF from the unsigned intends to represent -1 but for the moment is 65535. after the bit twiddle it becomes 00007FFF = 32767. subtracting 00008000 = 32768 it becomes -1 however -1 is represented. Unless the range 0000000-0000FFFF are not guaranteed to represent contiguous positive numbers in all integer representations, it should just work. If that guarantee isn't there (and maybe it's not), then a conditional assignment using an unsigned comparison would be a way through.

Honestly in my own project I'd probably just write to my compiler's implementation instead of the standard and use an (int16_t) cast and forget about it but i can understand why a large open source program like stockfish would prefer to write to the standard and avoid implementation-defined behavior in the event someone decides to use an oddball console compiler on the stockfish code for instance.

BeyondCritics · Post by **BeyondCritics** » Sat Oct 29, 2016 7:12 am

kbhearn wrote: ...
Technically the trick i used should work on 1's complement and sign and magnitude signed integers as well, twiddling the 16th bit and then subtracting its magnitude should allow the implementation to create the right negative number. i.e. regardless of implementation 0000FFFF from the unsigned intends to represent -1 but for the moment is 65535. after the bit twiddle it becomes 00007FFF = 32767. subtracting 00008000 = 32768 it becomes -1 however -1 is represented. Unless the range 0000000-0000FFFF are not guaranteed to represent contiguous positive numbers in all integer representations, it should just work. If that guarantee isn't there (and maybe it's not), then a conditional assignment using an unsigned comparison would be a way through.

Sorry, you messed it up completely. True is e.g., that regardless of implementation

(signed short)(unsigned short)0x0000FFFF == -1

yields the value true, but it does not mean that in general (unsigned short)0x0000FFFF is the same bit pattern as (signed short)(-1). This is true if you have a 2^s complement representation and untrue if you have 1's complement and sign magnitude representation. It is not difficult to construct counterexamples for your implementation in the latter cases.

kbhearn wrote: Honestly in my own project I'd probably just write to my compiler's implementation instead of the standard and use an (int16_t) cast and forget about it but i can understand why a large open source program like stockfish would prefer to write to the standard and avoid implementation-defined behavior in the event someone decides to use an oddball console compiler on the stockfish code for instance.

I don't believe Stockfish is there for kids playing around with"oddball compilers". I don't honestly believe there is such a compiler. Which compiler did you have in mind? "Stockfish" shall be compiled exactly as the maintainer specifies it in the makefile. Stockfish is an ultra high performance program, going very great length to achieve the best performance possible and it should be treated like that.

kbhearn · Post by **kbhearn** » Sat Oct 29, 2016 7:42 am

BeyondCritics wrote:
kbhearn wrote: ...
Technically the trick i used should work on 1's complement and sign and magnitude signed integers as well, twiddling the 16th bit and then subtracting its magnitude should allow the implementation to create the right negative number. i.e. regardless of implementation 0000FFFF from the unsigned intends to represent -1 but for the moment is 65535. after the bit twiddle it becomes 00007FFF = 32767. subtracting 00008000 = 32768 it becomes -1 however -1 is represented. Unless the range 0000000-0000FFFF are not guaranteed to represent contiguous positive numbers in all integer representations, it should just work. If that guarantee isn't there (and maybe it's not), then a conditional assignment using an unsigned comparison would be a way through.
Sorry, you messed it up completely. True is e.g., that regardless of implementation
(signed short)(unsigned short)0x0000FFFF == -1
yields the value true, but it does not mean that in general (unsigned short)0x0000FFFF is the same bit pattern as (signed short)(-1). This is true if you have a 2^s complement representation and untrue if you have 1's complement and sign magnitude representation. It is not difficult to construct counterexamples for your implementation in the latter cases.

let me try again clearer - we have in our uint32_t which behaves as 2's complement because 2's complement arithmetic is the same as unsigned modulo 2^n arithmetic:

0xFFFF0000 say and want to extract the eg value.

so we add the 8000 shift it right and get

0x0000FFFF

now we cast that to an int32_t and the value it represents fits so it stays

0x0000FFFF whether int is 1's complement, 2's complement, or sign+magnitude. some odd representations of signed integers the bit pattern would change, i'm not sure if the standard allows them or not (but int32_t is apparently not supposed to be defined if it's not 2's complement so there's that... but let's just continue pretending we don't know the precise bit format of a signed int) - now we xor off 0x00008000 and get 0x00007FFF. the next operation we do is not a bit operation, but an arithmetic one, so we convert these to the values they actually represent and find that the subraction results in -1. that -1 could be 0xFFFFFFFF, 0x80000001, or 0xFFFFFFFE - but that's no longer important to us, we got our -1 out :)

As for SF's goals, that's not for me to say - I just could understand avoiding implementation defined behavior if the performance hit isn't measurable.

couple of questions about stockfish code ?

Re: couple of questions about stockfish code ?

Re: couple of questions about stockfish code ?

Re: couple of questions about stockfish code ?

Proposal requiring arithmetic right shift + 2^s complement

Re: couple of questions about stockfish code ?

Re: Proposal requiring arithmetic right shift + 2^s compleme

Re: Proposal requiring arithmetic right shift + 2^s compleme

Re: Proposal requiring arithmetic right shift + 2^s compleme

Re: Proposal requiring arithmetic right shift + 2^s compleme

Re: Proposal requiring arithmetic right shift + 2^s compleme