strcpy() revisited

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: strcpy() revisited

Post by bob »

syzygy wrote:
bob wrote:
syzygy wrote:I've just compiled a gcc-4.9 snapshot.
Executing the loop that was optimised to while (1) now gives:

Code:

over.c:7:24: runtime error: signed integer overflow: 1073741824 + 1073741824 cannot be represented in type 'int'
I've briefly tested crafty. Analyzing the opening position does not seem to trigger any (detectable) UB.

My private engine running single threaded on the opening position did not give errors, either. However, running it with 6 threads:

Code:

smpsearch.c:179:30: runtime error: left shift of negative value -2
Let's see:

Code:

  long64 split_mask = (-2LL) << waiting_node->ply;
long64 is unsigned. I tried to be too clever here and should replace -2LL by 0xfffffffffffffffeULL.

It only prints the message once, which is nice.
So a shift left can't be done on a negative number even though the hardware will do what you are asking perfectly? -2 << 1 is -4.

This ought to be pretty funny because right now with gcc 4.7.3, if you use the expression a * 256 where a is a signed int, the compiler cheerfully emits the instruction sall $8, %edx.

Classic optimization to turn a multiply by a power of 2 into a shift. Which it will then no doubt bitch about at runtime? :)

And it still has a few brain-dead extra register moves lying around just for fun...
Not much of what we've been discussing for 2 weeks now has gotten through to you, has it?
What? That a compiler can ignore integer overflow and handle it normally, except for when it doesn't want to and chooses to abort your code. EVEN if it is the one that causes the problem (shifting negative value left instead of multiplying)??? I get it. I just don't think it makes one bit of sense...
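
For reference, the shift that triggered the "left shift of negative value" report can be written with an unsigned constant, as syzygy suggests above. This is only a minimal sketch: it assumes long64 is a 64-bit unsigned type, and split_mask_for is a hypothetical name that does not come from either engine.

```c
#include <stdint.h>

/* Hypothetical helper: all bits above bit `ply` set, lower bits clear.
 * ~UINT64_C(1) has the same bit pattern as -2 in two's complement
 * (0xfffffffffffffffe), but because the operand is unsigned the left
 * shift is well-defined for 0 <= ply < 64. */
static uint64_t split_mask_for(unsigned ply)
{
    return ~UINT64_C(1) << ply;
}
```

With ply = 3 this yields 0xfffffffffffffff0, the same bits the signed version produces on x86, without relying on behaviour the standard leaves undefined.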
bob

Re: strcpy() revisited

Post by bob »

syzygy wrote:I found another occurrence of UB in my tablebase generator:

Code:

int myrand(void)
{
  static int tbl[31] = {
    -1726662223, 379960547, 1735697613, 1040273694, 1313901226,
    1627687941, -179304937, -2073333483, 1780058412, -1989503057,
    -615974602, 344556628, 939512070, -1249116260, 1507946756,
    -812545463, 154635395, 1388815473, -1926676823, 525320961,
    -1009028674, 968117788, -123449607, 1284210865, 435012392,
    -2017506339, -911064859, -370259173, 1132637927, 1398500161,
    -205601318,
  };

  static int f = 3;
  static int r = 0;

  int result;

  tbl[f] += tbl[r];
  result = (tbl[f] >> 1) & 0x7fffffff;

  f++;
  if (f >= 31) f = 0;
  r++;
  if (r >= 31) r = 0;

  return result;
}
The statement "tbl[f] += tbl[r]" unsurprisingly leads to signed overflow.

What is funny is that I "stole" this code from glibc. I realised too late I did not want the generated random numbers to vary from platform to platform (or from glibc version to glibc version), so I had to hardwire the random number generator I had already used to generate my own set of tables.

From glibc, random_r.c:

Code:

      int32_t *fptr = buf->fptr;
      int32_t *rptr = buf->rptr;
      int32_t *end_ptr = buf->end_ptr;
      int32_t val;

      val = *fptr += *rptr;
int32_t is signed, so this code also leads to signed overflow, i.e. to UB.

Seems to be a bug (which admittedly is unlikely to ever lead to unexpected results, at least not on x86), unless glibc is compiled with -fwrapv.
Don't you mean the other way around? -fwrapv is supposed to make integer overflow work as the hardware intends...

There are dozens of PRNGs that use overflow (ignoring it, actually). No big surprise that another big batch of programs will be broken assuming they actually start to actively check for overflow, something I still find hard to believe. They could give up the unsafe optimizations that cause problems with overflow and lose far less performance than testing the overflow flag after every arithmetic operation, something that I still don't believe they will actually do.
bob

Re: strcpy() revisited

Post by bob »

Codesquid wrote:
bob wrote: a = b + c
Codesquid wrote: If it's anything like the undefined behavior sanitizer in Clang, it's working at runtime. If you actually execute a=b+c at runtime with values that cause an overflow, it'll print a message at runtime.
bob wrote: Which on x86 means a "jo error" after EVERY integer add, subtract, multiply? Not to mention shifts and such?
Codesquid wrote: No, only the signed integer overflows that actually happen at runtime are reported.
bob wrote: Another "anti-optimization" that is going to slow things way down...
Codesquid wrote: It's a debugging feature; it's not supposed to be active in production builds of your code.
So they aren't REALLY catching overflows, just a half-assed attempt to catch some. Maybe they can generate a random number and if > .9, they catch the overflow, otherwise ignore.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: strcpy() revisited

Post by wgarvin »

bob wrote: So they aren't REALLY catching overflows, just a half-assed attempt to catch some. Maybe they can generate a random number and if > .9, they catch the overflow, otherwise ignore.
I think maybe you misunderstood. The feature they're talking about is a 'sanitizer option': you add -fsanitize=integer or something similar to the compiler options, and it generates extra runtime checks in the compiled code. It will probably be pretty comprehensive, and won't completely destroy the performance, although it certainly will have a cost. (Maybe one conditional branch over a call instruction for each integer op that might overflow?) I think the one in clang was made by Regehr and/or his students, and it can detect pretty much any UB from integer overflow.

I guess the main weakness of this type of tool is that for it to recognize that you have a problem, you have to actually exercise that code path and trigger the UB in your program, which might be difficult. Used alone, an absence of diagnostics from the sanitizer doesn't really mean much. Used together with fuzzing and code coverage tools, it could probably be pretty powerful. But probably few developers will go to that level of effort.
Last edited by wgarvin on Tue Dec 10, 2013 9:28 am, edited 1 time in total.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: strcpy() revisited

Post by syzygy »

bob wrote:
So they aren't REALLY catching overflows, just a half-assed attempt to catch some. Maybe they can generate a random number and if > .9, they catch the overflow, otherwise ignore.
Are you really not getting the picture, still?

The -fsanitize=undefined feature simply inserts runtime checks for various kinds of UB (in the sense of the standard). Every signed integer addition that overflows at runtime will be caught. I suppose it also detects the use of uninitialised variables, and so on.

Why is this useful? Simple: if the programmer intended to write a standard C program, he cannot have intended any UB to happen. So if it happens, that points to a bug.

So it's a debugging feature.
The feature will not be used for production builds.
The feature will not be used by the compiler itself to look for optimisation possibilities.

This is really not difficult. Just get things a bit straight in your mind and you might see it.
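
To make the idea of a runtime check concrete, here is a rough sketch of the kind of test such a check amounts to for a signed addition. It is not what the sanitizer literally emits; checked_add is a hypothetical helper written in portable C, and it tests the operands before adding so that the check itself never overflows.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: store a + b in *out and return true if the sum is
 * representable in int; return false (leaving *out untouched) otherwise.
 * The comparisons use only INT_MAX - b and INT_MIN - b, which cannot
 * overflow for the sign of b they are guarded by. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

On a platform with 32-bit int, checked_add(1073741824, 1073741824, &r) returns false, which corresponds to the overflow reported earlier in the thread. A sanitizer build inserts comparable checks automatically; a normal optimised build contains none of them.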
wgarvin

Re: strcpy() revisited

Post by wgarvin »

Catching Integer Errors with Clang:
regehr's blog wrote: Peng Li and I at Utah, along with our collaborators Will Dietz and Vikram Adve at UIUC, wrote an integer overflow checker for Clang which has found problems in most C/C++ codes that we have looked at. Do you remember how pervasive memory safety errors were before Valgrind came out? Integer overflows are that way right now.

...

One thing we realized very early on in this work is that integer overflows are surprisingly difficult to understand, particularly when they occur in the middle of complex expressions. For example, see this somewhat undignified interaction between me and the main PHP guy. As a result, we put a lot of work into emitting good error messages.
Their sanitizer is apparently part of clang version 3.3.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: strcpy() revisited

Post by AlvaroBegue »

bob wrote:There are dozens of PRNGs that use overflow (ignoring it, actually).
Then there are dozens of PRNGs that were written sloppily. Those same PRNGs should be written using unsigned integer types, whose overflow behavior is guaranteed by the standard.

I checked a first edition of K&R (published in 1978) and this distinction between overflow of signed and unsigned integer types was already there.
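
As an illustration of the unsigned approach, here is a sketch of the additive generator quoted earlier in the thread, reworked to use uint32_t. The struct and function names (lagfib, lagfib_next) are made up for this example, and the state is passed in by the caller rather than hardwired, so this is not the glibc code or syzygy's generator. On a two's-complement machine it should produce the same bit patterns as the signed version, but that is an assumption rather than something verified here.

```c
#include <stdint.h>

#define LAG_SIZE 31

/* Hypothetical state for an additive lagged generator. */
struct lagfib {
    uint32_t tbl[LAG_SIZE];
    int f;  /* front index, starts LAG_SIZE-28 = 3 in the quoted code */
    int r;  /* rear index, starts at 0 in the quoted code */
};

static uint32_t lagfib_next(struct lagfib *s)
{
    /* Unsigned addition wraps modulo 2^32 by definition, so there is
     * no undefined behaviour here even when the sum exceeds 32 bits. */
    s->tbl[s->f] += s->tbl[s->r];

    /* A right shift of an unsigned value is a logical shift, so the
     * top bit of the result is already 0 and no extra mask is needed. */
    uint32_t result = s->tbl[s->f] >> 1;

    if (++s->f >= LAG_SIZE) s->f = 0;
    if (++s->r >= LAG_SIZE) s->r = 0;

    return result;
}
```

The only substantive change from the signed version is the element type; the wraparound that the signed code relied on implicitly is now guaranteed by the language.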
bob

Re: strcpy() revisited

Post by bob »

syzygy wrote:
Are you really not getting the picture, still?

The -fsanitize=undefined feature simply inserts runtime checks for various kinds of UB (in the sense of the standard). Every signed integer addition that overflows at runtime will be caught. I suppose it also detects the use of uninitialised variables, and so on.

Why is this useful? Simple: if the programmer intended to write a standard C program, he cannot have intended any UB to happen. So if it happens, that points to a bug.

So it's a debugging feature.
The feature will not be used for production builds.
The feature will not be used by the compiler itself to look for optimisation possibilities.

This is really not difficult. Just get things a bit straight in your mind and you might see it.
So, as I have stated, multiple times now, "It really does NOT eliminate overflows". It really does NOT detect overflows. ONLY in a debug environment. Which is way less than 1% of a program's total execution life.

I'll repeat. It does nothing useful to the current discussion.

The compiler optimizes for production builds. This is used for debugging. No connection I can see. Which leaves us in never-never land. If you cause overflow by using constants, the compiler will see it, and optimize it away. If you don't, the program might crash should you trigger that code path during debugging, otherwise it just "does the right thing". Unlike when it optimizes overflow away, which is certainly "the wrong thing."
syzygy

Re: strcpy() revisited

Post by syzygy »

There is nothing I can do for you.
bob

Re: strcpy() revisited

Post by bob »

syzygy wrote:
There is nothing I can do for you.
Only because there is nothing I need done.