c++11 std::atomic and memory_order_relaxed
-
- Posts: 27817
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: c++11 std::atomic and memory_order_relaxed
Not unless they are broken. Executing instructions that cannot be undone is not 'speculative execution'. It is just a broken architecture that doesn't work according to the specs of the machine language.
This story about speculative stores visible to other agents is absolute gobbledygook. If any architecture did that, it would be totally useless for SMP, because it is just coincidental that in the example the speculative execution depended on a branch testing something that was first read from memory. No CPU could be smart enough to know that. So if it did this kind of speculative store, it would be constantly flooding the caches of all other CPUs with totally invalid data that never should have been calculated, from all stores that never should have been executed but happened to be on the path of a branch misprediction concerning data that was totally local to that core.
Last edited by hgm on Wed Apr 02, 2014 11:07 pm, edited 1 time in total.
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: c++11 std::atomic and memory_order_relaxed
hgm wrote: Not unless they are broken. Executing instructions that cannot be undone is not 'speculative execution'. It is just a broken architecture that doesn't work according to the specs of the machine language.

Speculative implies that they can be undone.
They would need to be undone if the condition turns out to be false. If the condition turns out to be true, there is no need to undo them.
The complication is that if thread 2 is allowed to see the speculative store of thread 1 before it is finally committed, then thread 2 must also be executing speculatively. But why not, if both happen to get into speculative mode at the same time?
Everything would work according to the specs.
I don't know what you mean by (the specs of) "the machine language".
-
- Posts: 27817
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: c++11 std::atomic and memory_order_relaxed
Effects cannot be undone when other cores can have already seen, and acted on them.
Specs of the machine language define the effect of instructions. Like for instance that there is a flow of control that decides which instructions are executed and which not, and that branch instructions steer that flow of control one way, and not both ways.
An architecture where one core could cause other cores to see effects of store instructions that are not supposed to be executed would be totally useless. Even if you were running two independent pieces of code on the CPUs they would be constantly corrupting each other's memory accesses with invalid data written to unintended memory addresses.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: c++11 std::atomic and memory_order_relaxed
hgm wrote: Effects cannot be undone when other cores can have already seen, and acted on them.
Specs of the machine language define the effect of instructions. Like for instance that there is a flow of control that decides which instructions are executed and which not, and that branch instructions steer that flow of control one way, and not both ways.
An architecture where one core could cause other cores to see effects of store instructions that are not supposed to be executed would be totally useless. Even if you were running two independent pieces of code on the CPUs they would be constantly corrupting each other's memory accesses with invalid data written to unintended memory addresses.

I agree. This has gotten badly mangled within the Intel context of speculative memory writes. These are instructions like movd $1, x, where the instruction is executed but not retired, which means the actual write that makes it out to L1 is NOT done until the instruction is retired, if it ever is. It is retired only once the speculation has proven to be correct, and only then is the write to L1 done.
I don't call that "speculative write" at all since the write is not done so long as it remains speculative as to whether or not it should be done. If it makes it to L1, the world ends as we know it in computer architecture and programming, because once it gets there, there is no "changing your mind and undoing the write."
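To make the executed-but-not-retired distinction concrete, here is a minimal sketch of a toy software model. It is not a description of any real pipeline: the names Memory, Core, execute_store, retire and squash are all made up for illustration, and the only assumption it encodes is the one stated above, that a store stays in a core-private buffer until the instruction retires.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy model only (all names hypothetical): the memory that other cores can see.
using Memory = std::map<std::size_t, int>;

struct PendingStore {
    std::size_t addr;
    int value;
};

class Core {
public:
    explicit Core(Memory& shared) : shared_(shared) {}

    // "Execute" a store: it only enters this core's private buffer.
    void execute_store(std::size_t addr, int value) {
        buffer_.push_back({addr, value});
    }

    // Speculation confirmed: drain the buffer to shared memory in program order.
    void retire() {
        for (const PendingStore& s : buffer_) shared_[s.addr] = s.value;
        buffer_.clear();
    }

    // Misprediction: drop the buffered stores; nothing ever became visible.
    void squash() { buffer_.clear(); }

    // A core sees its own pending stores (newest first), then shared memory.
    // Addresses never written read as 0 in this model.
    int load(std::size_t addr) const {
        for (auto it = buffer_.rbegin(); it != buffer_.rend(); ++it)
            if (it->addr == addr) return it->value;
        auto found = shared_.find(addr);
        return found == shared_.end() ? 0 : found->second;
    }

private:
    Memory& shared_;
    std::vector<PendingStore> buffer_;
};
```

In this model a second Core sharing the same Memory cannot observe a store between execute_store and retire, and after squash there is nothing to undo on the other side, which is the property being argued for.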
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: c++11 std::atomic and memory_order_relaxed
hgm wrote: Effects cannot be undone when other cores can have already seen, and acted on them.

As I said, the other core would have to be in speculative execution mode as well and the results of it would have to be rolled back as well. The two cores would be in some kind of "joint speculative mode".
As I said, I don't know any hardware that supports this, and I don't see any clear benefit of hardware that would support it, but that does not mean there is no benefit (that I just can't think of right now) and it certainly does not mean there will never be such hardware. It is certainly quite feasible to construct such hardware.
hgm wrote: Specs of the machine language define the effect of instructions. Like for instance that there is a flow of control that decides which instructions are executed and which not, and that branch instructions steer that flow of control one way, and not both ways.

I know, but what machine language are you talking about that this architecture would violate the specs of....
hgm wrote: An architecture where one core could cause other cores to see effects of store instructions that are not supposed to be executed would be totally useless.

See above and see my earlier posts.
If the compiler somehow has predicted that the condition of an if-statement will almost always be true, then it may make sense to speculatively execute a store that is conditional on this condition being true before the condition is evaluated. If another core at the same time accesses the memory location being written to, it is not completely unreasonable to switch that core to speculative execution as well. After all, we are pretty sure that the condition will turn out to be true, so most likely everything will just turn out fine. Should the condition unexpectedly evaluate to false, then obviously these two cores/threads have both to be rolled back.
-
- Posts: 27817
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: c++11 std::atomic and memory_order_relaxed
syzygy wrote: As I said, the other core would have to be in speculative execution mode as well and the results of it would have to be rolled back as well. The two cores would be in some kind of "joint speculative mode".

If it would be rolled back in both of them, none of the memory variables could end up at 42.
syzygy wrote: I know, but what machine language are you talking about that this architecture would violate the specs of....

Any machine language that specifies that programs written in it have a defined effect, rather than always resulting in completely undefined behavior...
syzygy wrote: If the compiler somehow has predicted that the condition of an if-statement will almost always be true, then it may make sense to speculatively execute a store that is conditional on this condition being true before the condition is evaluated. If another core at the same time accesses the memory location being written to, it is not completely unreasonable to switch that core to speculative execution as well. After all, we are pretty sure that the condition will turn out to be true, so most likely everything will just turn out fine. Should the condition unexpectedly evaluate to false, then obviously these two cores/threads have both to be rolled back.

If reading a speculatively written value would bring a core in a speculative state, swapping the order of reads and speculative writes in the example (what you would need to get 42) would lead to a deadlock. The cores would never get out of the speculative state. Only a read of a non-speculative value could resolve the branch, and none would be scheduled anymore.
It just doesn't work.
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: c++11 std::atomic and memory_order_relaxed
syzygy wrote: As I said, the other core would have to be in speculative execution mode as well and the results of it would have to be rolled back as well. The two cores would be in some kind of "joint speculative mode".
hgm wrote: If it would be rolled back in both of them, none of the memory variables could end up at 42.

No, both conditions end up true so no need to roll back anything.
The speculation was conditional on (r1 == 42) and (r2 == 42) evaluating to true.
The loads and stores get reordered in a ridiculous way, but that's what memory_order_relaxed allows.
syzygy wrote: I know, but what machine language are you talking about that this architecture would violate the specs of....
hgm wrote: Any machine language that would specify programs written in it had a defined effect, rather than always resulting in completely undefined behavior...

Single-threaded execution is completely predictable. Useful multi-threaded execution is never completely predictable. Here the results are rather surprising, because they appear to violate causality.
hgm wrote: If reading a speculatively written value would bring a core in a speculative state, swapping the order of reads and speculative writes in the example (what you would need to get 42) would lead to a deadlock.

Speculation ends once the condition has been evaluated. If it is true, the store is committed. If it is false, the store is rolled back.
It works fine.
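For reference, the two-thread program being argued about can be written out roughly as below. This is a sketch reconstructed from the conditions (r1 == 42) and (r2 == 42) mentioned above. The variable names x and y, the Outcome struct and the helpers run_once and is_zero_or_42 are assumptions made for illustration, not taken from the thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical result record for one run of the example.
struct Outcome {
    int r1, r2, x, y;
};

// Runs the two threads once and reports what each of them read,
// plus the final values of both atomics.
Outcome run_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;

    std::thread t1([&] {
        r1 = x.load(std::memory_order_relaxed);
        if (r1 == 42) y.store(42, std::memory_order_relaxed);  // guarded store
    });
    std::thread t2([&] {
        r2 = y.load(std::memory_order_relaxed);
        if (r2 == 42) x.store(42, std::memory_order_relaxed);  // guarded store
    });
    t1.join();
    t2.join();
    return {r1, r2, x.load(), y.load()};
}

// The only values any reading of the example admits for these variables.
bool is_zero_or_42(int v) { return v == 0 || v == 42; }
```

The disputed question is whether r1 == r2 == 42 can ever come out of run_once. The sketch takes no side on that: it only fixes the program text, so that "the condition" and "the store" in the posts above refer to something concrete.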
-
- Posts: 27817
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: c++11 std::atomic and memory_order_relaxed
syzygy wrote: Speculation ends once the condition has been evaluated. If it is true, the store is committed. If it is false, the store is rolled back.
It works fine.

The condition cannot be evaluated if the operands are not yet known because they are based on a speculatively executed load.
You cannot have it both ways. Either reading a speculatively written value from another core puts you in the speculative state (and then that state would only be resolved when that write was confirmed or retracted in that other core, and not because you used it in some other instruction), or you continue 'business as usual' and treat the compromised value you read as if it were real. The latter leads to undefined behavior for every multithreaded program, which doesn't seem a very good idea. If one thread executes
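Code:
#include <stdlib.h>

int a[1<<24];

void f(void) {
    unsigned int i, j = 0;  /* j starts defined, then carries the previous random value */
    while(1) {
        i = random();
        if(i < (1<<24)) a[i] = j;  /* architecturally, the store happens only for in-range i */
        j = i;
    }
}
any memory read by any other thread, including its code fetches, could deliver a totally random value, as mispredictions of the if-statement will cause writes of any conceivable value in any conceivable memory location. And the other thread would treat them all as if they were real data.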
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: c++11 std::atomic and memory_order_relaxed
syzygy wrote: No, both conditions end up true so no need to roll back anything.
The speculation was conditional on (r1 == 42) and (r2 == 42) evaluating to true.
The loads and stores get reordered in a ridiculous way, but that's what memory_order_relaxed allows.
Single-threaded execution is completely predictable. Useful multi-threaded execution is never completely predictable. Here the results are rather surprising, because they appear to violate causality.
Speculation ends once the condition has been evaluated. If it is true, the store is committed. If it is false, the store is rolled back.
It works fine.

There's one thing wrong with this. "memory order relaxed" does NOT cover what is known as a "control dependency". if (c) a is a classic control dependency, where a is not executed unless c is true. The Alpha is a classic "relaxed memory order" architecture, but it does NOT include violating control dependencies. To do so violates the basic premise of programming, namely that the program ALWAYS produces the same results as when it is executed one instruction at a time, in the order they are written. It would seem this might even produce a WAR hazard, where the write gets speculatively done before a previous read, which would kill most any program on the planet.
This really is NOT going to happen architecturally.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: c++11 std::atomic and memory_order_relaxed
syzygy wrote: Speculation ends once the condition has been evaluated. If it is true, the store is committed. If it is false, the store is rolled back.
It works fine.
hgm wrote: The condition cannot be evaluated if the operands are not yet known because they are based on a speculatively executed load.
You cannot have it both ways. Either reading a speculatively written value from an other core puts you in the speculative state (and then that state would only be resolved when that write was confirmed or retracted in that other core, and not because you used it in some other instruction), or you continue 'business as usual' and treat the compromised value you read as if it were real. The latter leads to undefined behavior for every multithreaded program, which doesn't seem a very good idea. If one thread executes
Code:
int a[1<<24];
void f() {
    unsigned int i, j;
    while(1) {
        i = random();
        if(i < (1<<24)) a[i] = j;
        j = i;
    }
}
any memory read by any other thread, including its code fetches, could deliver a totally random value, as mispredictions of the if-statement will cause writes of any conceivable value in any conceivable memory location. And the other thread would treat them all as if they were real data.

I don't see how it could work either. A and B are both in speculative mode. How can BOTH get out? One has to get out first. In any case, discussing the idea of a complete core being in "speculative mode" is a pointless exercise since it will never happen.
I can see this kind of stuff happening if the compilers continue to forge ahead into optimizations that are not safe. I don't see the hardware EVER getting into this, there is nothing to gain and a zillion transistors of complexity to be avoided.
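The claim that a guarded store never leaks to another thread can at least be probed empirically. Below is a small sketch in the spirit of the a[i] = j loop quoted above, scaled down and bounded so that it terminates. Everything in it is an assumption for illustration: the sizes kValid and kTotal, the function name count_canary_hits, and the LCG used in place of random(). The upper half of the array is never legally written, so a reader thread that ever sees a nonzero value there would have observed a store from a mispredicted path.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical sizes for the sketch.
constexpr std::size_t kValid = 1u << 10;    // indices the writer may legally store to
constexpr std::size_t kTotal = 2 * kValid;  // the upper half acts as canaries

// Returns how many times a canary cell was observed to be nonzero.
std::size_t count_canary_hits(std::size_t iterations) {
    std::vector<std::atomic<unsigned>> a(kTotal);
    for (auto& cell : a) cell.store(0, std::memory_order_relaxed);
    std::atomic<bool> done{false};
    std::size_t hits = 0;

    std::thread writer([&] {
        std::uint32_t state = 12345u;  // simple LCG standing in for random()
        unsigned j = 1;
        for (std::size_t n = 0; n < iterations; ++n) {
            state = state * 1664525u + 1013904223u;
            std::size_t i = (state >> 16) % kTotal;  // out of range about half the time
            if (i < kValid) a[i].store(j, std::memory_order_relaxed);
            j = static_cast<unsigned>(i) + 1;  // never 0, so a stray store would show up
        }
        done.store(true, std::memory_order_release);
    });

    // Scans only the canary half; used by the reader and once more at the end.
    auto scan = [&] {
        for (std::size_t k = kValid; k < kTotal; ++k)
            if (a[k].load(std::memory_order_relaxed) != 0) ++hits;
    };

    std::thread reader([&] {
        while (!done.load(std::memory_order_acquire)) scan();
    });

    writer.join();
    reader.join();
    scan();  // final check after both threads have finished
    return hits;
}
```

A run that returns 0 proves nothing about hypothetical future hardware, of course. It only shows that on the machine at hand the out-of-range stores are never observed by the second thread.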