How could a compiler break the lockless hashing method?

Rein Halbersma · Post by **Rein Halbersma** » Sun Dec 08, 2013 9:24 pm

syzygy wrote:Unless I'm mistaken, the C standard allows the compiler to load a value multiple times. Since the C99 standard does not know about threads, it may load the hash entry, do the xor-trick to verify that the entry can be used, and load the hash entry again. Between the two loads the hash entry may change its content due to other threads. There you go.

I agree, but an extra load is something I would hope an optimizing compiler would avoid. Therefore I prefer Ricardo's XOR elimination. In any case, this kind of reasoning is really surprising to me (thanks again for posting that Boehm paper!) but I can see that it is very powerful and that in multithreaded code nothing is like it seems.

mar · Post by **mar** » Sun Dec 08, 2013 9:26 pm

Rein Halbersma wrote: I think it's easy enough to trigger, assuming something like
Code: Select all
void insert&#40;Entry e&#41;
&#123;
    hash&#91;e&#93;.key_value = e.key ^ e.value;
    hash&#91;e&#93;.value = e.value;
&#125;

bool matching_signature&#40;Entry e&#41;
&#123;
    return e.key == &#40;hash&#91;e&#93;.key_value ^ hash&#91;e&#93;.value&#41;;
&#125;
a code fragment like this
Code: Select all
insert&#40;e&#41;;
// ... some non hash related stuff in between
if &#40;matching_signature&#40;e&#41;) // optimized to true as if in single-threaded code
   read_and_process&#40;e&#41;;    // data race if other thread has written
would only need simple inlining and common expression elimination to make matching_signature always return true in that fragment. A write from another thread in the same hash entry would then corrupt it.

The problem with this is that the compiler would write different values into memory. In order to be able to that it must ensure that there is no other piece of code that relies on that.
It doesn't know what the code is supposed to do.
The compiler would optimize A=>B into A=>C, because B and C (output, in this case hashtable) are now different after the optimization.
What if I have another function like this:

Code: Select all

void verify_checksum&#40;)
&#123;
    compute checksum over entries
    if ( checksum != some_magic ) erase_harddrive&#40;);
&#125;

Of course it's nonsense but it can be present. The compiler must detect and change this magic value too. But it can't be done (assume magic isn't known at compile time). So it would be required to "un-optimize" checksum from
checksum += key
to
checksum += key ^ value

because it's nice that A^B + C^D + ... = E
A + C + ... = F, how do you transform E into F if B, D, ... can be arbitrary values? (I only hope I'm not wrong on this one

wgarvin · Post by **wgarvin** » Sun Dec 08, 2013 9:28 pm

syzygy wrote:Unless I'm mistaken, the C standard allows the compiler to load a value multiple times. Since the C99 standard does not know about threads, it may load the hash entry, do the xor-trick to verify that the entry can be used, and load the hash entry again. Between the two loads the hash entry may change its content due to other threads. There you go.

Yes. Even copying the hash entry into a local variable isn't sufficient, because if it knows that local contains "the same value" as the original hash entry from its (non-threaded) dataflow analysis, it might choose to spill the local and then rematerialize it in a register by reading it again from the original hash entry.

In general we can get away with stuff like this today in C / C++, because the compilers aren't sufficiently clever to figure out how to break it. But there's always the risk that someday they might be.

kbhearn · Post by **kbhearn** » Sun Dec 08, 2013 9:28 pm

The suggestions so far seem to be fixed by marking the hash table as volatile in order to prevent such assumptions as the memory address still stores the same value between the time of a store and a load or two loads. Does marking the hash as volatile also have undesirable side effects?

Rein Halbersma · Post by **Rein Halbersma** » Sun Dec 08, 2013 9:39 pm

mar wrote: Of course it's nonsense but it can be present. The compiler must detect and change this magic value too. But it can't be done (assume magic isn't known at compile time). So it would be required to "un-optimize" checksum from
checksum += key
to
checksum += key ^ value

because it's nice that A^B + C^D + ... = E
A + C + ... = F, how do you transform E into F if B, D, ... can be arbitrary values? (I only hope I'm not wrong on this one

The optimization here is entirely locally, say within the search() function. You are right that it could not perform that optimization if it detected other code also accessing the hash table. But that doesn't change the fact that it can do the optimization locally under these circumstances.

Remember I am talking about possibilities of breaking the lockless hashing, I am not at all saying it will break it for sure under any circumstance. However, as it is undefined behavior, even your example is not a sure way of guaranteeing safety.

rbarreira · Post by **rbarreira** » Sun Dec 08, 2013 9:50 pm

Rein Halbersma wrote:
syzygy wrote:Unless I'm mistaken, the C standard allows the compiler to load a value multiple times. Since the C99 standard does not know about threads, it may load the hash entry, do the xor-trick to verify that the entry can be used, and load the hash entry again. Between the two loads the hash entry may change its content due to other threads. There you go.
I agree, but an extra load is something I would hope an optimizing compiler would avoid. Therefore I prefer Ricardo's XOR elimination. In any case, this kind of reasoning is really surprising to me (thanks again for posting that Boehm paper!) but I can see that it is very powerful and that in multithreaded code nothing is like it seems.

But do note that volatile should avoid the scenario I posted, because then the compiler cannot assume the values loaded were written by the program itself.

kbhearn wrote:The suggestions so far seem to be fixed by marking the hash table as volatile in order to prevent such assumptions as the memory address still stores the same value between the time of a store and a load or two loads. Does marking the hash as volatile also have undesirable side effects?

volatile prevents some optimizations, but mostly it prevents the exact optimizations you don't want it to do. Other than that I don't know of any undesirable side effects.

Henk · Post by **Henk** » Sun Dec 08, 2013 10:00 pm

Never let multiple threads write to and read from the same address simultaneously or you get unpredicted behavior. So each thread should first ask for permission to write to an address. If he gets it, he locks the address and others threads get no permission to write or read on the locked address until the former thread is ready and releases the read/write lock.

I don't understand the other trick. My chess program does not support multiple threads so I don't have experience with locks and chess programs.

mar · Post by **mar** » Sun Dec 08, 2013 10:05 pm

Rein Halbersma wrote:The optimization here is entirely locally, say within the search() function. You are right that it could not perform that optimization if it detected other code also accessing the hash table. But that doesn't change the fact that it can do the optimization locally under these circumstances.

Remember I am talking about possibilities of breaking the lockless hashing, I am not at all saying it will break it for sure under any circumstance. However, as it is undefined behavior, even your example is not a sure way of guaranteeing safety.

That example was not about safety, it was about the fact that the compiler must check for all possibilites and thus verify that something like that doesn't happen in the whole program, namely that other pieces of code don't rely on particular values stored in memory.
It must ensure that there is absolutely no possibility that turning input A into output C instead of B won't break things, which would be expensive.
These would be very high level optimizations and I doubt present compilers are anywhere near that.
What if you want to load that data from a file for example, generated by another program that doesn't have such clever compiler?
You would have to detect that too.
I'm not sure I want a compiler doing that.
Compiler should IMHO really only optimize code in a way that it produces the same output given the same input.

kbhearn · Post by **kbhearn** » Sun Dec 08, 2013 10:09 pm

The other trick being discussed is accepting unpredicted behavior occasionally under the condition that it can be detected (and ignored) when you go to use a corrupted value. which works with quite high probability assuming the compiler does what you want it to.

Henk · Post by **Henk** » Sun Dec 08, 2013 10:16 pm

Maybe add a piece of assembler code that implements the trick only.

How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?