c++11 std::atomic and memory_order_relaxed

Discussion of chess software programming and technical issues.

Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: c++11 std::atomic and memory_order_relaxed

Post by Rein Halbersma »

bob wrote: Seems to DIRECTLY fix the issue... So I fail to see why it is "so hard to fix" when the above directly fixes it. The hardware can't possibly do the above by itself; it already understands the problem and solves it correctly. So we are left with the compiler specification being so poorly written that the above could be done by the compiler, producing a bogus result. The above change simply closes the door. It wasn't THAT hard either.
The Note you quote is only an illustration of the general problem of coming up with a satisfactory definition of dependency. It is not sufficient for the Standard to give a partial list of such examples.

Just read N3710: it explains in excruciating detail that while "normal" hardware like x86 and Itanium already enforces your intuition at the hardware level, this is not the case for ARM and NVidia GPU cards. The paper goes on to explain that the problem is coming up with a definition of dependency that prohibits "out-of-thin-air" results while still not imposing undue overhead on those other architectures. These overheads are needed because, without them, an optimizing compiler could reorder expressions unless it knows there is a dependency within the particular memory model. For memory_order_relaxed, such a dependency is just very hard to define.
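
For concreteness, the kind of example N3710 worries about looks roughly like the following minimal sketch (the thread functions and variable names here are mine, not taken from the paper): two threads copy a value between two relaxed atomics, and the formal rules alone do not rule out both loads observing a value, such as 42, that no thread ever computed.

#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};
int r1 = 0, r2 = 0;

void thread1() {
    r1 = x.load(std::memory_order_relaxed);  // A: read x
    y.store(r1, std::memory_order_relaxed);  // B: forward it into y
}

void thread2() {
    r2 = y.load(std::memory_order_relaxed);  // C: read y
    x.store(r2, std::memory_order_relaxed);  // D: forward it into x
}

int main() {
    std::thread t1(thread1), t2(thread2);
    t1.join();
    t2.join();
    // Intuitively the only possible outcome is r1 == r2 == 0, since a value
    // can only circulate after something actually wrote it.  The C++11
    // relaxed-ordering wording, however, does not by itself forbid the
    // self-justifying outcome r1 == r2 == 42: each load is allowed to observe
    // the other thread's store, and nothing formally prevents the stored
    // value from "coming from" the very load it feeds.  That is the
    // "out-of-thin-air" result.
    return 0;
}
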
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: c++11 std::atomic and memory_order_relaxed

Post by bob »

Rein Halbersma wrote:
bob wrote: Seems to DIRECTLY fix the issue... So I fail to see why it is "so hard to fix" when the above directly fixes it. The hardware can't possibly do the above by itself; it already understands the problem and solves it correctly. So we are left with the compiler specification being so poorly written that the above could be done by the compiler, producing a bogus result. The above change simply closes the door. It wasn't THAT hard either.
The Note you quote is only an illustration of the general problem of coming up with a satisfactory definition of dependency. It is not sufficient for the Standard to give a partial list of such examples.

Just read N3710: it explains in excruciating detail that while "normal" hardware like x86 and Itanium already enforces your intuition at the hardware level, this is not the case for ARM and NVidia GPU cards. The paper goes on to explain that the problem is coming up with a definition of dependency that prohibits "out-of-thin-air" results while still not imposing undue overhead on those other architectures. These overheads are needed because, without them, an optimizing compiler could reorder expressions unless it knows there is a dependency within the particular memory model. For memory_order_relaxed, such a dependency is just very hard to define.
I've programmed the ARM. I have never seen a case where, if I write something like this:

if (c)
    x = 42;

x ends up set to 42 when c is false. A compiler could break that, but I've not seen the ARM do so. In fact, I have NEVER seen a CPU that violates a basic control dependency, because doing so causes all sorts of problems even in a single-threaded program. Take

if (c)
    x = x / c;

as but one example.

Defining "control dependency" is not exactly hard. My Hennessy and Patterson architecture book does that quite concisely. I accept compilers modifying my code to make it faster, so long as it remains correct. I do not EVER accept compilers modifying my code to make it faster AND produce wrong/impossible results...

Their problem seems to be that they want to define control dependencies with a loophole where a compiler could violate the dependency if it "thinks" it is safe. Compilers don't really need to try to "think"...
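
To make the point of contention concrete: the troublesome case is not the single-threaded division above but its two-thread analogue under memory_order_relaxed, which is essentially the example the Standard's own note and N3710 discuss. The sketch below is mine (variable names assumed); on real hardware the control dependency keeps both stores from firing, and the hard part is wording the definition of "dependency" so the circular outcome is formally excluded without penalizing architectures like ARM.

#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};
int r1 = 0, r2 = 0;

void thread1() {
    r1 = x.load(std::memory_order_relaxed);
    if (r1 == 42)                                // control dependency
        y.store(42, std::memory_order_relaxed);
}

void thread2() {
    r2 = y.load(std::memory_order_relaxed);
    if (r2 == 42)                                // control dependency
        x.store(42, std::memory_order_relaxed);
}

int main() {
    std::thread t1(thread1), t2(thread2);
    t1.join();
    t2.join();
    // Every real CPU honors the control dependency here, so in practice
    // r1 == r2 == 0.  The open question in the Standard is how to word the
    // relaxed-ordering rules so that the self-justifying outcome
    // r1 == r2 == 42 is formally excluded as well.
    return 0;
}
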