How could a compiler break the lockless hashing method?
Posted: Sun Dec 08, 2013 7:46 pm
Over in the strcpy() thread, Hyatt's lockless hashing method was discussed. It is my understanding that this XOR trick, beautiful and innovative as it is, exhibits a data race and therefore undefined behavior as defined by the C Standard. Furthermore, as the classic paper on Double-Checked Locking by Alexandrescu and Meyers shows, modern compilers go out of their way to optimize programs, and the transformations they apply can expose such data races even when the offending code is spread over separately compiled modules. Finally, Ronald de Man posted a paper by Hans Boehm showing how modern compilers, in the face of "benign" data races, can produce the most stunning effects. E.g. two threads simultaneously writing the same value to a global variable can have the effect that this variable stays in its initial state!

Bob, while not conceding that his lockless hashing even exhibits undefined behavior, or that a compiler could even detect it, challenged me to show how a compiler could actually break the XOR trick. I am not knowledgeable enough about low-level compiler optimizations, and after doing some thinking on it, I came up empty.
So I am posing this question as a challenge to this forum: using compiler optimizations as shown in the Alexandrescu/Meyers and Boehm papers, how could the lockless hashing be broken by an optimizing compiler that assumes that a conforming program does not exhibit undefined behavior?
NOTE: please post only serious attempts to answer this technical question. Please, no debate here on whether it is undefined behavior, or whether it could be detected. The Standard and the quoted papers IMO leave no doubt about that. And even if you don't agree, consider that point settled for the purposes of this thread. I want to find an answer to Bob's challenge because it has merit.