How could a compiler break the lockless hashing method?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: How could a compiler break the lockless hashing method?

Post by rbarreira »

bob wrote: I've been programming since 1968. I've used integer overflow so many times (ever seen a linear congruential PRNG as just one example) and they have worked flawlessly. Now some compiler guys have decided that since it is undefined, even though it is an established and working programming paradigm, they can break it in any way they choose; throw the code out, make it abort, write an error warning out, or make demons fly out of your nose. All with something that has been used for so long it is not funny, and something that would work PERFECTLY still if the compiler guys had just left it alone.

This sounds like a doctor that says, "hmm. splinter in finger, that's undefined in my medical books, so I'm removing your arm. or worse." When a band-aid and a little topical antibiotic would do just fine. Now I'm walking down the street thinking "holy crap, where did my arm go? Last time I looked I just had a splinter in one finger."

I STILL call that malicious compiler/library development.
Now you're gonna make me get out the K&R C programming book from 1988 (2nd edition):

[two scanned pages from K&R, 2nd edition]

So the creators of the C programming language have been warning us for at least 25 years not to rely on int overflow.
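(As an aside, the linear congruential PRNG case doesn't actually need signed overflow: unsigned arithmetic is defined to wrap, so the same kind of generator can be written with no UB at all. A minimal sketch, using the well-known Numerical Recipes constants purely for illustration, not any particular engine's generator:)

#include <stdint.h>

/* Minimal LCG sketch: unsigned arithmetic wraps modulo 2^32 by definition,
   so nothing here is undefined behavior. The constants are the standard
   Numerical Recipes values, used only as an illustration. */
static uint32_t lcg_state = 1u;

uint32_t lcg_next(void) {
    lcg_state = lcg_state * 1664525u + 1013904223u;   /* wraps mod 2^32 */
    return lcg_state;
}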
Last edited by rbarreira on Wed Dec 11, 2013 1:42 am, edited 1 time in total.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

syzygy wrote:
bob wrote:
syzygy wrote:
bob wrote:The problem I keep coming back to is that they optimize as if no UB happens, but they do not know.
Exactly, they optimise as if no UB happens.
There is no need to detect UB in order to optimise as if no UB happens.
So what does it matter whether they know there is UB or not?

It matters in the sense that they can't warn you at compile-time that your buggy code might break. But you have been warned by us.
If they optimize as if no UB happens, they might well just delete that block of code. That breaks things unnecessarily. One CAN optimize without making lots of invalid assumptions. I.e., I can get you to the moon in 10 seconds. You might not survive the initial acceleration, but I assume you MIGHT. You only need to average about 24,000 miles per second, just under 1/7th the speed of light. To get to 60 mph in 1 sec you need a little over 2 G's. Good luck surviving that trip...
Ok you're turning and twisting again, so I guess you have now gotten the point that they don't need to know that UB happens in order to optimise as if no UB happens? You deserve a round of applause.

I hope you now stop "keep coming back" to this same problem (but of course you will do just that or your name wouldn't be Bob).
They do have to know to optimize away the code being discussed. Or are you changing your position about "demons..."? Certainly races are undefined, but if they optimize as if no UB happens, that is what I call "doing the right thing" for races. The UB will work just fine for me. Turning a + 1 > a into "1" is an unsafe optimization. One that doesn't help much (I compiled/ran Crafty with it disabled and enabled, with zero effect on speed) and one which can certainly lead to unexpected results, à la the linear congruential PRNGs I mentioned. There is no need to break those just to trigger a very minimal optimization and break a program in a much more serious way.
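(A concrete sketch of that a + 1 > a case, assuming gcc or clang at -O2: because a conforming program can never overflow a signed int, the compiler is allowed to fold the test to "true". The second function shows an equivalent check that stays inside defined behavior.)

#include <limits.h>

/* Fragile: relies on signed overflow, which is undefined behavior.
   An optimizing compiler may fold this whole expression to 1. */
int has_room_fragile(int a) {
    return a + 1 > a;
}

/* Defined: compare against INT_MAX before doing the addition. */
int has_room(int a) {
    return a < INT_MAX;
}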

BTW your statement leaves a lot to be desired, accuracy-wise. The compilers do not optimize "as if no UB occurs." If they can prove it occurs, they break the code by optimizing the test away entirely. That's a bit different from "optimizing as if no UB occurs." And that is the case that causes the most grief and should be avoided.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

rbarreira wrote:
bob wrote: I've been programming since 1968. I've used integer overflow so many times (ever seen a linear congruential PRNG as just one example) and they have worked flawlessly. Now some compiler guys have decided that since it is undefined, even though it is an established and working programming paradigm, they can break it in any way they choose; throw the code out, make it abort, write an error warning out, or make demons fly out of your nose. All with something that has been used for so long it is not funny, and something that would work PERFECTLY still if the compiler guys had just left it alone.

This sounds like a doctor that says, "hmm. splinter in finger, that's undefined in my medical books, so I'm removing your arm. or worse." When a band-aid and a little topical antibiotic would do just fine. Now I'm walking down the street thinking "holy crap, where did my arm go? Last time I looked I just had a splinter in one finger."

I STILL call that malicious compiler/library development.
Now you're gonna make me get out the K&R C programming book from 1988 (2nd edition):

[two scanned pages from K&R, 2nd edition]

So the creators of the C programming language have been warning us for at least 25 years not to rely on int overflow.
addendum:

and for 25+ years it has worked flawlessly, because all current machines use 2's complement and none of the other nonsense the standards committee decided needed to be accommodated. I don't see why they were so afraid of considering actual hardware, rather than what some moron MIGHT possibly want to design in the future (which no one has done or is likely to do)...
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: How could a compiler break the lockless hashing method?

Post by syzygy »

bob wrote:BTW your statement leaves a lot to be desired, accuracy-wise. The compilers to not optimize "as if no UB occurs."
Well, the words were yours. Check.
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: How could a compiler break the lockless hashing method?

Post by mvk »

bob wrote:I don't see why they were so afraid of considering actual hardware, rather than what some moron MIGHT possibly want to design in the future (which no one has or is likely to do either)..
Most likely because Unisys was on the committee, and they wanted to be able to support standard C up to the Clearpath IX series (1996). Java did this (signed overflow) a lot better IMHO.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: How could a compiler break the lockless hashing method?

Post by wgarvin »

For some optimizations they don't have to know it occurs. When there's a program construct for which some possible executions have defined behavior and others have UB, they can ignore the UB possibility and just translate it into something that works correctly for the defined cases. If they always had to handle the undefined cases too they would often be forced to generate less efficient code. e.g. for a signed int X, they will translate "X*2 / 2" into "X" even though this will change the value of the expression if integer overflow occurred during the multiply expression X*2.

This was one of a bunch of examples given in this blog post by Chris Lattner that I linked, and which you claimed to have read:
Signed integer overflow: If arithmetic on an 'int' type (for example) overflows, the result is undefined. One example is that "INT_MAX+1" is not guaranteed to be INT_MIN. This behavior enables certain classes of optimizations that are important for some code. For example, knowing that INT_MAX+1 is undefined allows optimizing "X+1 > X" to "true". Knowing the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing "X*2/2" to "X". While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for "<=" loops like this:

for (i = 0; i <= N; ++i) { ... }

In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses "int" as induction variables.

It is worth noting that unsigned overflow is guaranteed to be defined as 2's complement (wrapping) overflow, so you can always use them. The cost to making signed integer overflow defined is that these sorts of optimizations are simply lost (for example, a common symptom is a ton of sign extensions inside of loops on 64-bit targets). Both Clang and GCC accept the "-fwrapv" flag which forces the compiler to treat signed integer overflow as defined (other than divide of INT_MIN by -1).
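(A sketch of the "<=" loop hazard described in the quote: with signed overflow undefined, the compiler may assume the loop body runs exactly N+1 times; if overflow were instead defined to wrap, as with -fwrapv, the N == INT_MAX case would make the loop infinite, because ++i wraps from INT_MAX to INT_MIN and i <= N stays true.)

/* Illustrative only. With overflow undefined, the trip count is provably
   N + 1; with wrapping semantics the loop never terminates when
   N == INT_MAX, so the compiler must allow for an infinite loop. */
long trip_count(int N) {
    long count = 0;
    for (int i = 0; i <= N; ++i)
        count++;
    return count;
}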
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

wgarvin wrote:For some optimizations they don't have to know it occurs. When there's a program construct for which some possible executions have defined behavior and others have UB, they can ignore the UB possibility and just translate it into something that works correctly for the defined cases. If they always had to handle the undefined cases too they would often be forced to generate less efficient code. e.g. for a signed int X, they will translate "X*2 / 2" into "X" even though this will change the value of the expression if integer overflow occurred during the multiply expression X*2.

This was one of a bunch of examples given in this blog post by Chris Lattner that I linked, and which you claimed to have read:
Signed integer overflow: If arithmetic on an 'int' type (for example) overflows, the result is undefined. One example is that "INT_MAX+1" is not guaranteed to be INT_MIN. This behavior enables certain classes of optimizations that are important for some code. For example, knowing that INT_MAX+1 is undefined allows optimizing "X+1 > X" to "true". Knowing the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing "X*2/2" to "X". While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for "<=" loops like this:

for (i = 0; i <= N; ++i) { ... }

In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses "int" as induction variables.

It is worth noting that unsigned overflow is guaranteed to be defined as 2's complement (wrapping) overflow, so you can always use them. The cost to making signed integer overflow defined is that these sorts of optimizations are simply lost (for example, a common symptom is a ton of sign extensions inside of loops on 64-bit targets). Both Clang and GCC accept the "-fwrapv" flag which forces the compiler to treat signed integer overflow as defined (other than divide of INT_MIN by -1).
I did read that. Had read it previously in fact.

But as an example, let's take our good friend x86.

Done the "normal way" x * 2 / 2 will never overflow. It is trivial to prevent.

mov  eax, x        ; load x into eax
mov  ecx, 2
imul ecx           ; edx:eax = full 64-bit product x * 2
idiv ecx           ; eax = edx:eax / 2

end up with x, thanks to imul putting the low-order 32 bits of the product in eax and the high-order bits in edx. The idiv takes it right back down to x.

Of course, if they choose to get "cute" and do the sloppier form that doesn't use a register pair to prevent overflow, then they just made a poor choice. By doing the right thing, they get the right answer. If they assume x * 2 / 2 = x, they are exactly right. And all is well.

Their assumption might break on a machine that doesn't do the two-register product, but most machines over the years have had that as an option. VAX used even/odd registers like that if you wanted. Etc. But again, if they do the right thing, we all will get what we expect. If my target machine(s) can only do 32-bit multiplies, I am ALREADY constantly thinking about the possible effects of multiplying two values, overflowing, and getting the wrong answer. They seem to take the approach of "let's define this stuff for the most brain-dead machine around, and that is the 'smartest' machine we will use for our code output templates. If some machines offer greater capability for dealing with such things, too bad, we are not going to consider using it."
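(The same effect, keeping the full product so the divide restores x, can also be expressed in standard C by widening before the multiply; a sketch, not anything from Crafty:)

#include <stdint.h>

/* Widen to 64 bits before multiplying: the product of any 32-bit x by 2
   fits in int64_t, and the divide brings the value back into int32_t range.
   This says in portable C what the edx:eax register pair does on x86. */
int32_t times2_div2(int32_t x) {
    return (int32_t)(((int64_t)x * 2) / 2);
}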
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: How could a compiler break the lockless hashing method?

Post by wgarvin »

bob wrote: Of course, if they choose to get "cute" and do the sloppier form that doesn't use a register pair to prevent overflow, then they just made a poor choice. By doing the right thing, they get the right answer. If they assume x * 2 / 2 = x, they are exactly right. And all is well.

Their assumption might break on a machine that doesn't do the two-register product, but most machines over the years have had that as an option. VAX used even/odd registers like that if you wanted. Etc. But again, if they do the right thing, we all will get what we expect.
See, here we have the problem. None of the things you're talking about have anything to do with the C language.

In C, the product of a signed int with a signed int is again a signed int. See, with that 32-bit-int compiler you have there, if the result doesn't fit in a 32-bit signed integer, that's called integer overflow, and it's undefined behavior, which means you are never allowed to do it if you want your C program to have any predictable meaning according to the language standard. So if you happen to "get what you expect" from that construct, either your C compiler has nicely extended the language semantics to give it to you, or you were just lucky that it happened to do what you wanted. There's nothing in the C language requiring it to do the same thing next time.

You're a professor of computer science, you're supposed to know this stuff. You keep claiming that you do, and then you go on to ramble about how some ancient piece of hardware worked or how a compiler should just "do the right thing" when you feed it code with undefined semantics. I'm just shaking my head in confusion and dismay over here.


CERT has a nice page explaining some of the risks with integer overflow.
Signed integer overflow is undefined behavior 36. Consequently, implementations have considerable latitude in how they deal with signed integer overflow (see MSC15-C. Do not depend on undefined behavior). An implementation that defines signed integer types as being modulo, for example, need not detect integer overflow. Implementations may also trap on signed arithmetic overflows, or simply assume that overflows will never happen and generate object code accordingly. It is also possible for the same conforming implementation to emit code that exhibits different behavior in different contexts. For example, an implementation may determine that a signed integer loop control variable declared in a local scope cannot overflow and may emit efficient code on the basis of that determination, while the same implementation may determine that a global variable used in as [sic] similar context will wrap.
Edit: I also like their page MSC15-C. Do not depend on undefined behavior.

It says all the same things Rein, Ronald, I and others have been saying for the past two weeks. It rates the severity of this issue as 'High'.
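(For reference, the style of check that guidance points toward is a precondition test: decide whether the operation would overflow before performing it, using only comparisons that are themselves well defined. A minimal sketch for addition, not code taken from the CERT pages:)

#include <limits.h>

/* Would a + b overflow a signed int? Each comparison here is on values
   that are already in range, so the check itself has no UB. */
int add_would_overflow(int a, int b) {
    if (b > 0 && a > INT_MAX - b) return 1;   /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b) return 1;   /* would go below INT_MIN */
    return 0;
}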
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

wgarvin wrote:
bob wrote: Of course, if they choose to get "cute" and do the sloppier form that doesn't use a register pair to prevent overflow, then they just made a poor choice. By doing the right thing, they get the right answer. If they assume x * 2 / 2 = x, they are exactly right. And all is well.

Their assumption might break on a machine that doesn't do the two-register product, but most machines over the years have had that as an option. VAX used even/odd registers like that if you wanted. Etc. But again, if they do the right thing, we all will get what we expect.
See, here we have the problem. None of the things you're talking about have anything to do with the C language.

In C, the product of a signed int with a signed int is again a signed int. See, with that 32-bit-int compiler you have there, if the result doesn't fit in a 32-bit signed integer, that's called integer overflow, and it's undefined behavior, which means you are never allowed to do it if you want your C program to have any predictable meaning according to the language standard. So if you happen to "get what you expect" from that construct, either your C compiler has nicely extended the language semantics to give it to you, or you were just lucky that it happened to do what you wanted. There's nothing in the C language requiring it to do the same thing next time.

You're a professor of computer science, you're supposed to know this stuff. You keep claiming that you do, and then you go on to ramble about how some ancient piece of hardware worked or how a compiler should just "do the right thing" when you feed it code with undefined semantics. I'm just shaking my head in confusion and dismay over here.
X86 is ancient hardware? X86_64 is ancient hardware? I miss exactly what you are talking about. X86/X86_64 is about 99.99999% of the computer processors on planet earth. They can do this flawlessly. So why can't C? I don't care about the underlying "abstract machine". At SOME point the compiler actually has to produce code for a specific machine. Why would it not produce code that would make that "undefined result" go away at least for multiplies using the double-reg trick, or for add/sub using normal x86 instructions? Given the choice of a more restrictive instruction that both loses part of the power of the actual processor and which also causes the "undefined behavior" for the multiply example, why not emit instructions that eliminate a few of those? Just because they want to chant "undefined behavior, must have undefined behavior" over and over???

I understand the issues. I understand the architecture. And I understand the "right thing" to do when converting C to asm, just as surely as when I hand-code something in asm I use whatever the hardware offers to make the results as accurate/fast as possible. The current compiler guys COULD do more of that. But they take "undefined behavior" as an "unlimited license to mangle programs" in an effort to inflict an optimization that is barely worthwhile, while breaking programs that are written by programmers who know what they are doing.

I don't see a THING wrong with writing a C program that is targeted specifically to X86_64. I don't care if there are things on X86_64 that won't work on x86 (64 bit adds and such). I don't care if there are things on X86_64 that won't work on a sun Sparc, or a MIPS, or a Power PC, or whatever. Portability is not a necessity for a program, unless the author wants that as a goal. The original Cray Blitz would not come close to running on a PC. No "vector merge" instruction, no "gather/scatter" indirect array references. We didn't care, we wrote Cray Blitz for the Cray architecture and it worked perfectly. Crafty is targeted for the X86. Nice that it works on others but that is not MY requirement.

So just compile the source to do as closely as possible what I ask for, and forget all the cute unsafe optimizations that run afoul of the "undefined behavior" ghost. Certainly integer overflow has no undefined behavior on the X86 target architecture. Yet because it is a problem on an old Univac that used a bizarro binary representation, they insist that it won't work on the PC either. That is, to this "old computer scientist", the exact opposite of what I want. My compilers would map the source syntax into the best/most efficient/most accurate target machine instructions possible.

I get the purist point of view with the "abstract machine" and "crappy C specification" and such. But that does not FORCE a compiler writer to break programs that know how to use overflow on their target architecture. Yet they certainly act like it does. As far as your notes below, what does the X86 architecture say about signed integer overflow in the processor reference manual? It works perfectly, using the same behavior as unsigned (wrapping). But the compiler guys pretend the hardware can't do that and make it misbehave intentionally with unsafe optimizations. Do that for machines that actually can't handle overflow. But if the machine can deal with it, let it. Don't break it INTENTIONALLY. Yet that is what is being done.

CERT has a nice page explaining some of the risks with integer overflow.
Signed integer overflow is undefined behavior 36. Consequently, implementations have considerable latitude in how they deal with signed integer overflow (see MSC15-C. Do not depend on undefined behavior). An implementation that defines signed integer types as being modulo, for example, need not detect integer overflow. Implementations may also trap on signed arithmetic overflows, or simply assume that overflows will never happen and generate object code accordingly. It is also possible for the same conforming implementation to emit code that exhibits different behavior in different contexts. For example, an implementation may determine that a signed integer loop control variable declared in a local scope cannot overflow and may emit efficient code on the basis of that determination, while the same implementation may determine that a global variable used in as [sic] similar context will wrap.
Edit: I also like their page MSC15-C. Do not depend on undefined behavior.

It says all the same things Rein, Ronald, I and others have been saying for the past two weeks. It rates the severity of this issue as 'High'.
Choose a decent context: "a programmer that understands the underlying architecture, a programmer that is proficient in C." NOW what is the danger of signed overflow? Unsigned numbers wrap and produce wrong (but predictable / modulo) answers. But signed overflow is a severe bug? Only if you don't know what you are doing...

I chose to learn C a LONG time ago, because it did not try to be a Pascal-like strongly typed, highly restrictive language. It seems to be losing a bit of that "edge" in recent compilers. It was never intended to be a baby-programmable language. We even used the "register" modifier to specifically request that certain variables be allocated in a register if possible. That was its claim to fame: close to the hardware, but high-level. Not so close to the hardware any longer, it would seem. I can certainly deal with overflows perfectly in asm on X86. But the compiler guys are beginning to get in the way in some cases by doing optimizations that don't help speed enough to measure, yet pry a large wedge between the programmer and the underlying target machine...

I never wanted to see "C for dummies". That's what Pascal and Ada (among others) are for.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: How could a compiler break the lockless hashing method?

Post by wgarvin »

Fine. You win the discussion, cuz I'm tired of going in circles over the same ground, over and over.

It's been a lot of fun to discuss, but I'll probably stop reading these threads so I don't get nerd rage after reading something nonsensical and feel obligated to respond. That has happened to me several times lately, and I've been using entirely too much emphasis in some of those posts, and it just wears me out.

I'll try not to forget that Crafty is officially unsupported on everything except x86/x64. :roll:

Hopefully there will always be a compiler around that can mostly do what you want, even if most of it is not part of the C language.

I do share your wistful dream of a better future where the compiler just faithfully translates anything I write to the underlying machine model AND gives me spectacularly good optimized code performance, for which I should certainly not have to give up my own unsound programming habits. (Long live -fno-strict-aliasing, and they can pry reinterpret_cast<> from my cold, dead fingers! :lol:) I don't know how we get there from here, though.