syzygy wrote: bob wrote: syzygy wrote: bob wrote:
There's one thing wrong with this. "memory order relaxed" does NOT cover what is known as a "control dependency". In "if (c) a", a is not executed unless c is true; that is a classic control dependency. The alpha is a classic "relaxed memory order" architecture, but it does NOT include violating control dependencies. To do so violates the basic premise of programming, in fact, that the program ALWAYS produces the same results as when it is executed one instruction at a time, in the order they are written. It would seem this might even produce a WAR (write-after-read) hazard, where the write gets speculatively done before a previous read, which would kill most any program on the planet.
This really is NOT going to happen architecturally.
I agree this single threaded execution will always behave as one would expect it to behave, but of course only from the point of view of that single thread. This principle is not violated.
Also note that it could be the compiler that is inserting the speculative store (knowing that it will not affect single-threaded execution AND that it will not violate the requirements on multithreaded execution imposed by the C++11 standard).
That was my original guess, if you recall.
Not really. This was about compiler/hardware combinations doing stuff all along. To do the actual speculation you need hardware support. You said hardware won't do that, now you suddenly insert "try to". But this is just getting tiring again.
About semantics of multithreaded programming: it is completely irrational to expect a language to allow only one result from a multithreaded program. It would mean you can't efficiently program a multithreaded alpha-beta anymore.
You don't need hardware support to do speculation. A compiler can change this:
if (c) x = y;
else x = 0;
If profile-guided optimization (PGO) shows that c is true most of the time, the compiler could well choose to do this instead:
x = y;
if (!c) x = 0;
That's software speculation. That was not uncommon when static branch prediction was used. Speculatively make the assignment then undo it if wrong.
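To make the difference concrete, here is a minimal single-threaded sketch of the two forms. TracedInt, assign_branchy and assign_speculative are hypothetical names invented for this illustration; the recorded store list just stands in for what another thread could, in principle, observe in memory. It is not a claim about what any particular compiler actually emits.

```cpp
#include <vector>

// Hypothetical helper: records every value stored to x, in order,
// as a stand-in for what another observer of memory could see.
struct TracedInt {
    int value = 0;
    std::vector<int> stores;
    void store(int v) { value = v; stores.push_back(v); }
};

// The code as written: exactly one store to x on either path.
void assign_branchy(TracedInt& x, bool c, int y) {
    if (c) x.store(y);
    else   x.store(0);
}

// The transformed code: store y unconditionally, then undo it if c is false.
void assign_speculative(TracedInt& x, bool c, int y) {
    x.store(y);
    if (!c) x.store(0);
}
```

Both functions leave x with the same final value, so a single thread cannot tell them apart. The only difference is the extra intermediate store of y when c is false, which is exactly the part that matters once a second thread is reading x.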
So, I did NOT say hardware speculation was needed. In fact, I said, and still maintain, that speculation that would do a write to L1 cache is badly broken and no program will work correctly there, nor will it be possible to "roll back". The instant a speculative store hits L1, it can be forwarded to another L1, and one cycle later that cache might make the decision to replace that block in cache, which means it gets written back to memory. There is no practical way to unroll any of that once the speculative store hits L1. So it won't happen.
A compiler is free to move things around and do all the speculative stores it wants, knowing that other threads will see those speculative stores instantly.
My take is that the C++11 standard is simply broken. Again, the purpose is to REMOVE ambiguities, NOT add 'em.
Here is my FIRST response on the topic, since you don't seem to remember it:
I think your first guess was correct, that r1=r2=42 is legal, even if not possible, given existing (or future) hardware.
I suppose a compiler could change this (I am using normal memory reads to avoid excessive typing with the relaxed order stuff):
I think that is post #4 from the top of the thread. I snipped the compiler-modified source example to keep the above quote short. I said no hardware, present or future, would do that. I stick by that. A compiler can do most anything it wants, and if the standard is written ambiguously, the compiler writers will apparently be happy to hide behind "but the standard says this is OK", even if it really should NOT be.
I don't expect a compiler to produce the same results every time on a multi-threaded program. However, even if that were guaranteed, alpha/beta would still work just fine and the speedups would still be there; the compiler would just somehow have to magically eliminate all the races.
What I DO expect the compiler to do, is something rational. For example, given this simple bit of code:
x = 0;
if (c) x=1;
I expect x to be zero or one. NOTHING else. Yet the C++11 standard you are quoting says that this is too much to ask. I think that is ridiculous.
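For reference, a minimal sketch of how that expectation can be stated in C++11 terms, under the assumption that x is shared with another thread. If x is a plain int that another thread reads concurrently, the standard calls that a data race and leaves the behavior undefined. If x is declared std::atomic<int>, every load returns a value written by some store to x (or its initial value), so an observer can only see 0 or 1, even with relaxed ordering. The names writer, observer_saw_only_0_or_1 and run_once are hypothetical, made up for this example.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0};

// Writer: the two statements from the post, with explicit relaxed stores.
void writer(bool c) {
    x.store(0, std::memory_order_relaxed);
    if (c) x.store(1, std::memory_order_relaxed);
}

// Observer: samples x repeatedly and reports whether every sample was 0 or 1.
bool observer_saw_only_0_or_1(int samples) {
    for (int i = 0; i < samples; ++i) {
        int v = x.load(std::memory_order_relaxed);
        if (v != 0 && v != 1) return false;
    }
    return true;
}

// Runs the writer while a second thread is sampling x.
bool run_once(bool c) {
    x.store(0, std::memory_order_relaxed);
    bool ok = false;
    std::thread reader([&] { ok = observer_saw_only_0_or_1(100000); });
    writer(c);
    reader.join();  // join() makes the reader's write to ok visible here
    return ok;
}
```

A passing run does not prove anything about the standard, of course; the guarantee comes from the rules for atomic objects, and the sketch only shows where the "zero or one, nothing else" expectation is actually promised.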
BTW this argument IS ridiculous. You are taking a LOUSY standard, and trying to justify it by inventing hardware that will never exist. Better to just say "the standard is broken here".