c++11 std::atomic and memory_order_relaxed

kbhearn · Post by **kbhearn** » Thu Apr 03, 2014 7:37 pm

You have to be a damn lawyer to read these specs. OK, so I have now read the explanation of the earlier example on another site, where it is explained more clearly:

from http://en.cppreference.com/w/cpp/atomic/memory_order

Explanation
Relaxed ordering

Atomic operations tagged std::memory_order_relaxed are not synchronization operations, they do not order memory. They only guarantee atomicity and modification order consistency.

For example, with x and y initially zero,

// Thread 1:
r1 = y.load(memory_order_relaxed); // A
x.store(r1, memory_order_relaxed); // B
// Thread 2:
r2 = x.load(memory_order_relaxed); // C
y.store(42, memory_order_relaxed); // D

is allowed to produce r1 == r2 == 42 because, although A is sequenced-before B and C is sequenced-before D, nothing prevents D from appearing before A in the modification order of y, and B from appearing before C in the modification order of x.
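
For anyone who wants to poke at this, here is a minimal sketch of the quoted example as a function you can call in a loop. The name run_once and the use of std::thread and std::pair are my own assumptions, not part of the cppreference text. Whether the r1 == r2 == 42 outcome ever shows up depends on the compiler and hardware; the standard only says it is permitted, so this sketch does not claim any particular result.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Hypothetical harness: runs the two threads of the quoted example once
// and returns the pair (r1, r2) that was observed.
std::pair<int, int> run_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;

    std::thread t1([&] {
        r1 = y.load(std::memory_order_relaxed);   // A
        x.store(r1, std::memory_order_relaxed);   // B
    });
    std::thread t2([&] {
        r2 = x.load(std::memory_order_relaxed);   // C
        y.store(42, std::memory_order_relaxed);   // D
    });

    // join() makes the writes to r1 and r2 visible to the caller.
    t1.join();
    t2.join();
    return {r1, r2};
}
```

Each of r1 and r2 can only ever be 0 or 42 here; which combinations you actually see is up to the implementation.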

Typical use for relaxed memory ordering is updating counters, such as the reference counters of std::shared_ptr, since this only requires atomicity, but not ordering or synchronization.
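
The counter use case mentioned above can be sketched roughly as follows. The function name count_relaxed and its parameters are hypothetical, chosen only for illustration. The point is that fetch_add with memory_order_relaxed is still an atomic read-modify-write, so no increments are lost, even though it imposes no ordering on any other memory access.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` workers each bump a shared counter
// `per_thread` times with relaxed increments; the final total is returned.
long count_relaxed(int threads, int per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;

    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic increment; no ordering with respect to other memory.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // Joining the workers is what makes the final value safe to read here.
    for (auto& w : workers) {
        w.join();
    }
    return counter.load(std::memory_order_relaxed);
}
```

Calling count_relaxed(4, 10000) should give 40000, since atomicity alone is enough for a pure counter.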

So marking up the first example:
// Thread 1:
r1 = x.load(memory_order_relaxed); // A
if (r1 == 42) y.store(r1, memory_order_relaxed); // B
// Thread 2:
r2 = y.load(memory_order_relaxed); // C
if (r2 == 42) x.store(42, memory_order_relaxed); // D

The same explanation still holds as the single-thread sequencing was apparently already established without the conditional. Absent a total order, the loads into r1 and r2 are apparently allowed to see the future of the other thread (just not their own) and thus contain 42. I think that is ridiculous, but apparently the only fallacy here with memory_order_relaxed and a malicious compiler is that two different memory locations are used that depend on each other, and synchronisation would be needed to cover interactions between them in this manner.

hgm · Post by **hgm** » Thu Apr 03, 2014 8:19 pm

kbhearn wrote:the same explanation still holds as the single-thread sequencing was apparently already established without the conditional,

To me that is pure nonsense. The standard might allow reordering of memory operations, but you surely cannot allow swapping the order of testing a condition with execution of the code to be executed only when the condition is true. That would alter the semantics of any code. So the store of 42 must certainly occur after r2 == 42 evaluates as true. And this is obviously not possible, as it would require a 42 to appear out of thin air.

Are you sure the note on this in the first reference is even meant seriously? Other notes (like "atomic variables are neither active, nor radio-active" and "atomic variables do not decay") make me sort of suspicious...

kbhearn · Post by **kbhearn** » Thu Apr 03, 2014 8:52 pm

The first reference is linked from gcc's C++11 support page for the atomics feature. I do find the note that atomic variables are not radioactive a little out of place, considering this is apparently an official proposal that was accepted for atomic variable support under the standard. Perhaps lawyers were involved :p

kbhearn · Post by **kbhearn** » Thu Apr 03, 2014 9:47 pm

One further note I found:

https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

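So on x86, for plain loads and stores, the expensive instruction (a lock-prefixed xchg, or a mov followed by mfence) only appears to be generated for stores in the default seq_cst mode; acquire loads and release stores map to an ordinary mov. Merely dropping down to release and acquire seems to be good enough to not kill performance, and strong enough not to worry about what the compiler might do for hypothetical optimisations.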
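
As a rough illustration of what dropping down to release and acquire looks like in practice, here is a small message-passing sketch. The names pass_message, payload and ready are made up for the example. The release store paired with the acquire load is what guarantees that the consumer sees the plain write to payload once it has seen the flag.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: one thread publishes a plain int through an atomic
// flag, the other waits for the flag and then reads the int.
int pass_message() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // flag used to publish the data
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                   // plain write
        ready.store(true, std::memory_order_release);   // publish
    });
    std::thread consumer([&] {
        // Spin until the flag is observed; acquire pairs with the release.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;   // guaranteed to read the value written above
    });

    producer.join();
    consumer.join();
    return seen;
}
```

With memory_order_relaxed on both the store and the load of the flag, reading payload would be a data race under the C++11 rules, which is exactly the kind of cross-location dependence discussed earlier in the thread.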

syzygy · Post by **syzygy** » Thu Apr 03, 2014 11:28 pm

hgm wrote:
syzygy wrote:Speculation ends once the condition has been evaluated. If it is true, the store is committed. If it is false, the store is rolled back.

It works fine.
The condition cannot be evaluated if the operands are not yet known because they are based on a speculatively executed load.

You cannot have it both ways. Either reading a speculatively written value from another core puts you in the speculative state (and then that state would only be resolved when that write was confirmed or retracted in that other core, and not because you used it in some other instruction), or you continue 'business as usual' and treat the compromised value you read as if it were real.

In the case we have here the two threads must synchronise to see whether both conditions hold true. This is a bit annoying and complicating matters, but not a theoretical obstacle.

Your code example is not a problem since the compiler (or processor) can simply decide not to attempt to execute it speculatively.

syzygy · Post by **syzygy** » Thu Apr 03, 2014 11:31 pm

bob wrote:There's one thing wrong with this. "memory order relaxed" does NOT cover what is known as a "control dependency". if (c) a is a classic control dependency where a is not executed unless c is true. The alpha is a classic "relaxed memory order" architecture. but it does NOT include violating control dependencies. To do so violates the basic premise of programming, in fact, that the program ALWAYS produces the same results as when it is executed one instruction at a time, in the order they are written. It would seem this might even produce a WAR hazard, where the write gets speculatively done before a previous read, which would kill most any program on the planet.

This really is NOT going to happen architecturally.

I agree this single threaded execution will always behave as one would expect it to behave, but of course only from the point of view of that single thread. This principle is not violated.

Also note that it could be the compiler that is inserting the speculative store (knowing that it will not affect single-threaded execution AND that it will not violate the requirements on multithreaded execution imposed by the C++11 standard).

syzygy · Post by **syzygy** » Thu Apr 03, 2014 11:43 pm

hgm wrote:
kbhearn wrote:the same explanation still holds as the single-thread sequencing was apparently already established without the conditional,
To me that is pure nonsense. The standard might allow reordering of memory operations, but you surely cannot allow swapping the order of testing a condition with execution of the code to be executed only when the condition is true. That would alter the semantics of any code.

It does not alter single-threaded semantics.
It does not violate multithreaded semantics as defined by the C11/C++11 standard, which simply allows a lot of possible executions.

There is no other "semantics" of multithreaded code than that defined by the C11/C++11 standard (if we're talking C11/C++11).

bob · Post by **bob** » Fri Apr 04, 2014 2:30 am

syzygy wrote:
bob wrote:There's one thing wrong with this. "memory order relaxed" does NOT cover what is known as a "control dependency". if (c) a is a classic control dependency where a is not executed unless c is true. The alpha is a classic "relaxed memory order" architecture. but it does NOT include violating control dependencies. To do so violates the basic premise of programming, in fact, that the program ALWAYS produces the same results as when it is executed one instruction at a time, in the order they are written. It would seem this might even produce a WAR hazard, where the write gets speculatively done before a previous read, which would kill most any program on the planet.

This really is NOT going to happen architecturally.
I agree this single threaded execution will always behave as one would expect it to behave, but of course only from the point of view of that single thread. This principle is not violated.

Also note that it could be the compiler that is inserting the speculative store (knowing that it will not affect single-threaded execution AND that it will not violate the requirements on multithreaded execution imposed by the C++11 standard).

That was my original guess, if you recall. I don't see how hardware will EVER try to do this. Once something is written to cache it is over, you now have every level of cache AND main memory that might need "rolling back".

But the compiler can do pretty much whatever the standard allows, which includes a few things I would personally not do if I were in charge of that particular compiler project. I think it is a pretty poor "standard" that allows such nonsensical compiler behavior, but then the C standard has been lousy since it left K&R's private lab, and became a "public language governed by a committee." In general committees do far more harm than good.

bob · Post by **bob** » Fri Apr 04, 2014 2:33 am

syzygy wrote:
hgm wrote:
kbhearn wrote:the same explanation still holds as the single-thread sequencing was apparently already established without the conditional,
To me that is pure nonsense. The standard might allow reordering of memory operations, but you surely cannot allow swapping the order of testing a condition with execution of the code to be executed only when the condition is true. That would alter the semantics of any code.
It does not alter single-threaded semantics.
It does not violate multithreaded semantics as defined by the C11/C++11 standard, which simply allows a lot of possible executions.

There is no other "semantics" of multithreaded code than that defined by the C11/C++11 standard (if we're talking C11/C++11).

However, the NORMAL purpose of a standard is to explicitly disambiguate features of a language so that there is only one reasonable interpretation producing one reasonable result. This appears to be a standard that explicitly includes ambiguity, for reasons unknown to me as a rational programmer.

syzygy · Post by **syzygy** » Fri Apr 04, 2014 8:46 am

bob wrote:
syzygy wrote:
bob wrote:There's one thing wrong with this. "memory order relaxed" does NOT cover what is known as a "control dependency". if (c) a is a classic control dependency where a is not executed unless c is true. The alpha is a classic "relaxed memory order" architecture. but it does NOT include violating control dependencies. To do so violates the basic premise of programming, in fact, that the program ALWAYS produces the same results as when it is executed one instruction at a time, in the order they are written. It would seem this might even produce a WAR hazard, where the write gets speculatively done before a previous read, which would kill most any program on the planet.

This really is NOT going to happen architecturally.
I agree this single threaded execution will always behave as one would expect it to behave, but of course only from the point of view of that single thread. This principle is not violated.

Also note that it could be the compiler that is inserting the speculative store (knowing that it will not affect single-threaded execution AND that it will not violate the requirements on multithreaded execution imposed by the C++11 standard).
That was my original guess, if you recall.

Not really. This was about compiler/hardware combinations doing stuff all along. To do the actual speculation you need hardware support. You said hardware won't do that, now you suddenly insert "try to". But this is just getting tiring again.

About semantics of multithreaded programming: it is completely irrational to expect a language to allow only one result from a multithreaded program. It would mean you can't efficiently program a multithreaded alpha-beta anymore.
