How could a compiler break the lockless hashing method?

syzygy · Post by **syzygy** » Mon Dec 09, 2013 12:10 am

Rein Halbersma wrote:
syzygy wrote:
5.1.2.3 ad 2 wrote:Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)
So I don't think eliminating the xor-operations is allowed by the C standard.
It doesn't have to eliminate the XOR when storing into the hash table (which indeed is a side effect), all it needs to do is to cancel this XOR against the other XOR when computing the expression inside matching_signature. In other words, the compiler could optimize matching_signature(e) to always return true, even without touching any of the side effects.

Ok, but in the case of Crafty's HashProbe signatures will usually not match even in the single threaded case, because you will usually not have a hash hit.

syzygy · Post by **syzygy** » Mon Dec 09, 2013 12:32 am

syzygy wrote:
bob wrote:
syzygy wrote:Unless I'm mistaken, the C standard allows the compiler to load a value multiple times. Since the C99 standard does not know about threads, it may load the hash entry, do the xor-trick to verify that the entry can be used, and load the hash entry again. Between the two loads the hash entry may change its content due to other threads. There you go.
Doesn't break a thing, however.

Think about it.
I can only assume you did not read.

A compiler can legally rewrite:
Code: Select all
    word1 = htable->word1;
    word2 = htable->word2 ^ word1;
    if &#40;word2 == temp_hashkey&#41;
      break;
as
Code: Select all
  word2 = htable->word2 ^ htable->word1;
  if &#40;word2 == temp_hashkey&#41;
    break;
  word1 = htable->word1;
See the problem?

It can also choose to read htable->word1 at a later point in HashProbe() instead of accessing the local variable word1.

Oops, my rewritten code is incorrect. The "word1 = htable->word1;" statement should be the first statement immediately after if (entry < 4) {.

It could also come in front of if (word2 == temp_hashkey).

rbarreira · Post by **rbarreira** » Mon Dec 09, 2013 1:09 am

hgm wrote:I don't think it can optimize anything away if Search() is in a separate file search.c. The hash table is a global variable, and it has no idea whether after the search some other routine will start reading the hash table. So it must leave the values there that the program specifies.

If the hash table is accessed through a pointer (as is usual for allocated memory), it cannot even know if the pointer points to memory that is shared with other processes. Therefore it can never mess with the stores, and store anything other than the exact thing that the program specified.

You are forgetting that many modern compilers use whole-program optimization. Even if they didn't it wouldn't matter for this discussion, as we're talking about what a compiler could do, as opposed to what current compilers do.

Sharing the memory with other processes can only be done correctly by using volatile variables (which have already been mentioned in the first page of this thread). If a memory access is not volatile, the compiler is free to do optimizations which assume no one else will touch that memory.

rbarreira · Post by **rbarreira** » Mon Dec 09, 2013 1:34 am

syzygy wrote: The C standard in fact agrees with you:
5.1.2.3 ad 2 wrote:Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)
So I don't think eliminating the xor-operations is allowed by the C standard.

What does "modifying an object" mean exactly?

Look at the following transcript of a shell session I just did. Notice that even though I write two values to a, only one of them is actually written.

Code: Select all

$ cat a.c
#include <stdio.h>

static int a;

int main &#40;void&#41;
&#123;
        a = 0;
        a = 1;
        return a;
&#125;

$ gcc -O3 a.c -o a

$ gdb a
GNU gdb 6.8-debian
Copyright &#40;C&#41; 2008 Free Software Foundation, Inc.
License GPLv3+&#58; GNU GPL version 3 or later <http&#58;//gnu.org/licenses/gpl.html>
This is free software&#58; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
&#40;no debugging symbols found&#41;
&#40;gdb&#41; disassemble main
Dump of assembler code for function main&#58;
0x08048380 <main+0>&#58;    lea    0x4&#40;%esp&#41;,%ecx
0x08048384 <main+4>&#58;    and    $0xfffffff0,%esp
0x08048387 <main+7>&#58;    pushl  -0x4&#40;%ecx&#41;
0x0804838a <main+10>&#58;   mov    $0x1,%eax
0x0804838f <main+15>&#58;   push   %ebp
0x08048390 <main+16>&#58;   mov    %esp,%ebp
0x08048392 <main+18>&#58;   push   %ecx
0x08048393 <main+19>&#58;   movl   $0x1,0x8049580
0x0804839d <main+29>&#58;   pop    %ecx
0x0804839e <main+30>&#58;   pop    %ebp
0x0804839f <main+31>&#58;   lea    -0x4&#40;%ecx&#41;,%esp
0x080483a2 <main+34>&#58;   ret
End of assembler dump.
&#40;gdb&#41;

syzygy · Post by **syzygy** » Mon Dec 09, 2013 2:17 am

rbarreira wrote:
syzygy wrote: The C standard in fact agrees with you:
5.1.2.3 ad 2 wrote:Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)
So I don't think eliminating the xor-operations is allowed by the C standard.
What does "modifying an object" mean exactly?

An object is an area of memory. It's somewhere in the standard.

Look at the following transcript of a shell session I just did. Notice that even though I write two values to a, only one of them is actually written.

There is no sequence point between the two stores.

rbarreira · Post by **rbarreira** » Mon Dec 09, 2013 2:24 am

syzygy wrote:
rbarreira wrote:
syzygy wrote: The C standard in fact agrees with you:
5.1.2.3 ad 2 wrote:Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)
So I don't think eliminating the xor-operations is allowed by the C standard.
What does "modifying an object" mean exactly?
An object is an area of memory. It's somewhere in the standard.

Look at the following transcript of a shell session I just did. Notice that even though I write two values to a, only one of them is actually written.
There is no sequence point between the two stores.

Are you sure? Seems to me there is, according to what I've read on sequence points.

bob · Post by **bob** » Mon Dec 09, 2013 3:13 am

syzygy wrote:
bob wrote:I compile search.c, hash.c and other files SEPARATELY. How can the compiler be certain there are no locks around HashStore() when it is called from search (which it can't see when compiling hash.c) and around HashProbe() when it is called?
Apart from this being a non-argument given that compilers nowadays can easily optimise across compilation units, you have apparently forgotten about crafty.c:
Code: Select all
/* last modified 01/18/09 */
/*
 *******************************************************************************
 *                                                                             *
 *   This module is designed for the most efficient compiling, as it includes  *
 *   all of the source files into one large wad so that the compiler can see   *
 *   all function calls and inline whatever is appropriate.  The includes are  *
 *   loosely ordered so that the most common functions occur first, to help    *
 *   with cache layout when the code is actually loaded.                       *
 *                                                                             *
 *******************************************************************************
 */
#include "search.c"
#include "thread.c"
#include "repeat.c"
#include "next.c"
#include "killer.c"
#include "quiesce.c"
#include "evaluate.c"
#include "movgen.c"
#include "make.c"
#include "unmake.c"
#include "hash.c"
...

Please grow up. Can you do that to Linux? Give it a whirl and report back. 99% of my crafty compiles are individual files for obvious speed reasons.

Taking one example where the compiler CAN see everything does not prove this works for ALL examples. And in fact, I don't believe ANY existing compiler today will look at Crafty's source, trying to figure out what is done where, and have a chance to see that NextMove() could have a race, or HashProbe()/HashStore(). It would take hours to compile doing that, something no one would tolerate. This is borderline "path coverage" testing. with 1024 ifs you need 2^1024 test cases. intractable.

bob · Post by **bob** » Mon Dec 09, 2013 3:19 am

rbarreira wrote:
hgm wrote:I don't think it can optimize anything away if Search() is in a separate file search.c. The hash table is a global variable, and it has no idea whether after the search some other routine will start reading the hash table. So it must leave the values there that the program specifies.

If the hash table is accessed through a pointer (as is usual for allocated memory), it cannot even know if the pointer points to memory that is shared with other processes. Therefore it can never mess with the stores, and store anything other than the exact thing that the program specified.
You are forgetting that many modern compilers use whole-program optimization. Even if they didn't it wouldn't matter for this discussion, as we're talking about what a compiler could do, as opposed to what current compilers do.

Sharing the memory with other processes can only be done correctly by using volatile variables (which have already been mentioned in the first page of this thread). If a memory access is not volatile, the compiler is free to do optimizations which assume no one else will touch that memory.

That's false. The compiler can do optimizations BETWEEN points where that piece of code does something, but it can't pretend nobody else accesses that data, volatile or not. There are lots of ways to do shared memory. My program writes, the binary has to write. If the same procedure writes twice, and there is no escape between the two writes, the optimizer can reduce that to one write. But it MUST do one write.

bob · Post by **bob** » Mon Dec 09, 2013 3:22 am

Rein Halbersma wrote:
bob wrote: As a concluding remark, cite one reasonable example where optimizing an overflow away is a good idea.
Easy: http://www.airs.com/blog/archives/120
Code: Select all
int f&#40;int i&#41; &#123; int j, k = 0; for &#40;j = i; j < i + 10; ++j&#41; ++k; return k; &#125;
f() should return 10 for every value of i, agreed? Well, under the assumption that signed overflow cannot occur, the entire loop will be optimized and the constant 10 will be folded into the definition of f(). Not so with -fwrapv, because then the compiler has to emit special code for i = 0x7ffffff8.

Remember, compiler writers are worrying about generating the best possible code for the entire universe of correct programs, not for those programs that use undefined behavior and hope that their sins will never catch up with them.

I would rephrase that to be "they are trying to optimize those rare occurrences at the expense of all the other programs that don't do such silly loops.

bob · Post by **bob** » Mon Dec 09, 2013 3:23 am

syzygy wrote:
Rein Halbersma wrote:
syzygy wrote:
5.1.2.3 ad 2 wrote:Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)
So I don't think eliminating the xor-operations is allowed by the C standard.
It doesn't have to eliminate the XOR when storing into the hash table (which indeed is a side effect), all it needs to do is to cancel this XOR against the other XOR when computing the expression inside matching_signature. In other words, the compiler could optimize matching_signature(e) to always return true, even without touching any of the side effects.
Ok, but in the case of Crafty's HashProbe signatures will usually not match even in the single threaded case, because you will usually not have a hash hit.

Middlegame sig matches maybe 30% of the time, although draft is not sufficient to use the information in about 1/2 of those. In fine 70 or other king and pawn endgames, hash hits are enormous.

How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?

Re: How could a compiler break the lockless hashing method?