strcpy() revisited

syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: strcpy() revisited

Post by syzygy »

bob wrote:
syzygy wrote:I don't know how helpful and complete it will be, but gcc-4.9 will have an UB detector:
UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior detector, has been added and can be enabled via -fsanitize=undefined. Various computations will be instrumented to detect undefined behavior at runtime. UndefinedBehaviorSanitizer is currently available for the C and C++ languages.
http://gcc.gnu.org/gcc-4.9/changes.html
will it catch this one:

a = b + c;

???

with some values of b and c that is undefined. For other values, it is perfectly correct. What WILL it do I wonder? Surely not a warning "this might produce integer overflow which is undefined behavior." Crafty would only produce 20K of those warnings or so...
"at runtime"
Codesquid
Posts: 138
Joined: Tue Aug 23, 2011 10:25 pm
Location: Germany

Re: strcpy() revisited

Post by Codesquid »

bob wrote:
syzygy wrote:I don't know how helpful and complete it will be, but gcc-4.9 will have an UB detector:
UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior detector, has been added and can be enabled via -fsanitize=undefined. Various computations will be instrumented to detect undefined behavior at runtime. UndefinedBehaviorSanitizer is currently available for the C and C++ languages.
http://gcc.gnu.org/gcc-4.9/changes.html
will it catch this one:

a = b + c;

???

with some values of b and c that is undefined.
If it's anything like the undefined behavior sanitizer in Clang, it's working at runtime. If you actually call at runtime a=b+c with values that cause an overflow, it'll print a message at runtime.
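
To make "at runtime" concrete, here is a minimal hand-written sketch of the kind of test such instrumentation amounts to. It is not ubsan's actual generated code; `checked_add` is a hypothetical helper name, and the check is the standard portable pre-check against `INT_MAX` and `INT_MIN`.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores b + c in *out and returns true if the sum
 * fits in an int; returns false (leaving *out untouched) if it would
 * overflow. The comparison itself never overflows. */
static bool checked_add(int b, int c, int *out)
{
    if ((c > 0 && b > INT_MAX - c) || (c < 0 && b < INT_MIN - c))
        return false;   /* b + c would be undefined behavior */
    *out = b + c;       /* safe: the result is representable */
    return true;
}
```

Nothing is reported for values of b and c whose sum is representable; only a call such as `checked_add(INT_MAX, 1, &a)` takes the failure path.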
nanos gigantium humeris insidentes
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: strcpy() revisited

Post by syzygy »

I've just compiled a gcc-4.9 snapshot.
Executing the loop that was optimised to while (1) now gives:

Code:

over.c:7:24: runtime error: signed integer overflow: 1073741824 + 1073741824 cannot be represented in type 'int'
I've briefly tested crafty. Analyzing the opening position does not seem to trigger any (detectable) UB.

My private engine running single threaded on the opening position did not give errors, either. However, running it with 6 threads:

Code:

smpsearch.c:179:30: runtime error: left shift of negative value -2
Let's see:

Code:

  long64 split_mask = (-2LL) << waiting_node->ply;
long64 is unsigned. I tried to be too clever here and should replace -2LL by 0xfffffffffffffffeULL.

It only prints the message once, which is nice.
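
A minimal sketch of the fix described above, assuming `long64` is a typedef for an unsigned 64-bit type (the post only says it is unsigned) and using a hypothetical helper name `split_mask_for_ply`:

```c
#include <stdint.h>

/* Assumption: long64 is an unsigned 64-bit type, as stated in the post. */
typedef uint64_t long64;

/* Build a mask with the low (ply + 1) bits cleared.
 * The shifted operand is unsigned, so the shift is well defined
 * for any ply in the range 0..63. */
static long64 split_mask_for_ply(unsigned ply)
{
    return (long64)0xfffffffffffffffeULL << ply;
}
```

Writing the constant as `~1ULL` would give the same bit pattern; either way the point is that the left operand of `<<` is never negative.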
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: strcpy() revisited

Post by bob »

syzygy wrote:
bob wrote:
syzygy wrote:I don't know how helpful and complete it will be, but gcc-4.9 will have an UB detector:
UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior detector, has been added and can be enabled via -fsanitize=undefined. Various computations will be instrumented to detect undefined behavior at runtime. UndefinedBehaviorSanitizer is currently available for the C and C++ languages.
http://gcc.gnu.org/gcc-4.9/changes.html
will it catch this one:

a = b + c;

???

with some values of b and c that is undefined. For other values, it is perfectly correct. What WILL it do I wonder? Surely not a warning "this might produce integer overflow which is undefined behavior." Crafty would only produce 20K of those warnings or so...
"at runtime"
Seems like an interesting challenge. The overflow is "undefined" according to the compiler, so how exactly is it going to catch it? And is it REALLY going to add a "jo error" after EVERY add? That ought to make everyone happy, seeing performance drop by a factor of 2.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: strcpy() revisited

Post by bob »

Codesquid wrote:
bob wrote:
syzygy wrote:I don't know how helpful and complete it will be, but gcc-4.9 will have an UB detector:
UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior detector, has been added and can be enabled via -fsanitize=undefined. Various computations will be instrumented to detect undefined behavior at runtime. UndefinedBehaviorSanitizer is currently available for the C and C++ languages.
http://gcc.gnu.org/gcc-4.9/changes.html
will it catch this one:

a = b + c;

???

with some values of b and c that is undefined.
If it's anything like the undefined behavior sanitizer in Clang, it's working at runtime. If you actually call at runtime a=b+c with values that cause an overflow, it'll print a message at runtime.
Which on x86 means a "jo error" after EVERY integer add, subtract, multiply? Not to mention shifts and such? Another "anti-optimization" that is going to slow things way down...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: strcpy() revisited

Post by bob »

syzygy wrote:I've just compiled a gcc-4.9 snapshot.
Executing the loop that was optimised to while (1) now gives:

Code:

over.c:7:24: runtime error: signed integer overflow: 1073741824 + 1073741824 cannot be represented in type 'int'
I've briefly tested crafty. Analyzing the opening position does not seem to trigger any (detectable) UB.

My private engine running single threaded on the opening position did not give errors, either. However, running it with 6 threads:

Code:

smpsearch.c:179:30: runtime error: left shift of negative value -2
Let's see:

Code:

  long64 split_mask = (-2LL) << waiting_node->ply;
long64 is unsigned. I tried to be too clever here and should replace -2LL by 0xfffffffffffffffeULL.

It only prints the message once, which is nice.
So a shift left can't be done on a negative number even though the hardware will do what you are asking perfectly? -2 << 1 is -4.

This ought to be pretty funny because right now with gcc 4.7.3, if you use the expression a * 256 where a is a signed int, the compiler cheerfully emits the instruction sall $8, %edx.

Classic optimization to turn a multiply by a power of 2 into a shift. Which it will then no doubt bitch about at runtime? :)

And it still has a few brain-dead extra register moves lying around just for fun...
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: strcpy() revisited

Post by syzygy »

bob wrote:
syzygy wrote:I've just compiled a gcc-4.9 snapshot.
Executing the loop that was optimised to while (1) now gives:

Code:

over.c:7:24: runtime error: signed integer overflow: 1073741824 + 1073741824 cannot be represented in type 'int'
I've briefly tested crafty. Analyzing the opening position does not seem to trigger any (detectable) UB.

My private engine running single threaded on the opening position did not give errors, either. However, running it with 6 threads:

Code:

smpsearch.c:179:30: runtime error: left shift of negative value -2
Let's see:

Code:

  long64 split_mask = (-2LL) << waiting_node->ply;
long64 is unsigned. I tried to be too clever here and should replace -2LL by 0xfffffffffffffffeULL.

It only prints the message once, which is nice.
So a shift left can't be done on a negative number even though the hardware will do what you are asking perfectly? -2 << 1 is -4.

This ought to be pretty funny because right now with gcc 4.7.3, if you use the expression a * 256 where a is a signed int, the compiler cheerfully emits the instruction sall $8, %edx.

Classic optimization to turn a multiply by a power of 2 into a shift. Which it will then no doubt bitch about at runtime? :)

And it still has a few brain-dead extra register moves lying around just for fun...
Not much of what we've been discussing for 2 weeks now has gotten through to you, has it?
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: strcpy() revisited

Post by Rein Halbersma »

bob wrote: So a shift left can't be done on a negative number even though the hardware will do what you are asking perfectly? -2 << 1 is -4.

This ought to be pretty funny because right now with gcc 4.7.3, if you use the expression a * 256 where a is a signed int, the compiler cheerfully emits the instruction sall $8, %edx.

Classic optimization to turn a multiply by a power of 2 into a shift. Which it will then no doubt bitch about at runtime? :)

And it still has a few brain-dead extra register moves lying around just for fun...
Here I have to sympathize with Bob. In Kernighan & Ritchie 2nd Ed., left-shifting was always done by shifting the bit pattern, right-filling with 0s. Right-shifting a signed integer was implementation-defined (left-filling with either 0 or the sign-bit). However, somewhere along the line, the C Standard settled on defining left-shifting negative integers as undefined behavior. You can try gcc -traditional to emulate the old semantics.
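
A short sketch of the distinction, with hypothetical function names. Multiplying a negative signed value by a power of two is well defined in C as long as the result fits, and the compiler is free to implement it with a shift instruction. It is only the source-level `<<` on a negative signed operand that the standard leaves undefined. When the bit pattern itself is what matters, shifting an unsigned value avoids the problem.

```c
#include <stdint.h>

/* Signed arithmetic: defined whenever a * 256 fits in an int,
 * including for negative a. */
static int scale_by_256(int a)
{
    return a * 256;
}

/* Bit manipulation: shift an unsigned value instead.
 * Bits shifted out at the top are simply discarded.
 * Requires n < 32. */
static uint32_t shift_bits_left(uint32_t bits, unsigned n)
{
    return bits << n;
}
```

With these, `scale_by_256(-2)` is -512, and `shift_bits_left(0xFFFFFFFEu, 1)` is 0xFFFFFFFC, the same bit pattern that -4 has in 32-bit two's complement.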
Codesquid
Posts: 138
Joined: Tue Aug 23, 2011 10:25 pm
Location: Germany

Re: strcpy() revisited

Post by Codesquid »

bob wrote:
a = b + c
Codesquid wrote:If it's anything like the undefined behavior sanitizer in Clang, it's working at runtime. If you actually call at runtime a=b+c with values that cause an overflow, it'll print a message at runtime.
Which on x86 means a "jo error" after EVERY integer add, subtract, multiply? Not to mention shifts and such?
No, it only reports signed integer overflows that actually happen at runtime.
Another "anti-optimization" that is going to slow things way down...
It's a debugging feature; it's not supposed to be active in production builds of your code.
nanos gigantium humeris insidentes
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: strcpy() revisited

Post by syzygy »

I found another occurrence of UB in my tablebase generator:

Code:

int myrand(void)
{
  static int tbl[31] = {
    -1726662223, 379960547, 1735697613, 1040273694, 1313901226,
    1627687941, -179304937, -2073333483, 1780058412, -1989503057,
    -615974602, 344556628, 939512070, -1249116260, 1507946756,
    -812545463, 154635395, 1388815473, -1926676823, 525320961,
    -1009028674, 968117788, -123449607, 1284210865, 435012392,
    -2017506339, -911064859, -370259173, 1132637927, 1398500161,
    -205601318,
  };

  static int f = 3;
  static int r = 0;

  int result;

  tbl[f] += tbl[r];
  result = (tbl[f] >> 1) & 0x7fffffff;

  f++;
  if (f >= 31) f = 0;
  r++;
  if (r >= 31) r = 0;

  return result;
}
The statement "tbl[f] += tbl[r]" unsurprisingly leads to signed overflow.

What is funny is that I "stole" this code from glibc. I realised too late I did not want the generated random numbers to vary from platform to platform (or from glibc version to glibc version), so I had to hardwire the random number generator I had already used to generate my own set of tables.

From glibc, random_r.c:

Code:

      int32_t *fptr = buf->fptr;
      int32_t *rptr = buf->rptr;
      int32_t *end_ptr = buf->end_ptr;
      int32_t val;

      val = *fptr += *rptr;
int32_t is signed, so this code also leads to signed overflow, i.e. to UB.

Seems to be a bug (which admittedly is unlikely to ever lead to unexpected results, at least not on x86), unless glibc is compiled with -fwrapv.
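
One way to get the same wraparound without relying on -fwrapv is to do the addition in unsigned arithmetic, which C defines to wrap modulo 2^32. The following is only a sketch of that idea, not the glibc code or a drop-in patch; `wrapping_add_u32` and `additive_step` are hypothetical names, and the table is assumed to be stored as `uint32_t`.

```c
#include <stdint.h>

/* Unsigned addition wraps modulo 2^32 by definition, so this is never UB. */
static uint32_t wrapping_add_u32(uint32_t a, uint32_t b)
{
    return a + b;
}

/* One step of the additive-feedback update from myrand(), on an unsigned
 * table. f and r are the front and rear indices; advancing them is left
 * to the caller. Returns a 31-bit result. */
static uint32_t additive_step(uint32_t *tbl, int f, int r)
{
    tbl[f] = wrapping_add_u32(tbl[f], tbl[r]);
    return (tbl[f] >> 1) & 0x7fffffff;
}
```

On a platform where signed addition happens to wrap and signed right shift is arithmetic, this should produce the same values as the signed version, since the mask discards the top bit either way.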