Move time and compiler optimization

bob · Post by **bob** » Thu Mar 27, 2014 9:20 pm

AlvaroBegue wrote:
bob wrote:
mar wrote:Hello Edmund,
as far as I know function calls (whether inline or not) are sequence points (as defined by the standard) so I wouldn't worry there (plus arguments are guaranteed to be evaluated before function body is entered).

A real-life example that happened to me some time ago:
Code: Select all
    	nodes&#91;res&#93;.index = addNode&#40; n->front );
where nodes is a vector that can be changed by subsequent call to recursive method addNode.
What happened is that the compiler cached pointer to nodes before calling addNode which pushes back some elements (sure I might have preallocated nodes in advance by using pre-traversal).
It's because the order of evaluation of assignment operator is not guaranteed (it's not a sequence point unlike && and || operators which are guaranteed to evaluate left to right),
so the compiler is free to evaluate left side first (read where to store the result, not the store itself).

Fixing it was simple enough:
Code: Select all
        int tmp = addNode&#40; n->front );
        nodes&#91;res&#93;.index = tmp;
One has to be careful sometimes
From what I've read sequence points are also automatically "inserted" at the end of each expression statement.

Hope it helps.
That statement leaves a lot to be desired. "each expression statement" means what when a compiler has taken the instructions from several consecutive source statements and interlaced them like a drunken programmer? The concept "sequence point" is extremely vague when you get down to actual programming and the compiler. Fortunately an accurate data flow graph (which any optimizing compiler produces as a natural work product) takes care of this funny stuff.
Didn't you use to teach this language in college? And you think there is something vague about sequence points? Good lord...

Vague in the context I mentioned. It is easy to follow looking at C source directly. But after the compiler has done its thing, it is not so clear any longer when you look at what it did, what it rearranged, etc. So saying "there is a sequence point at a function call might not be correct, since the function call might just go away...

Didn't say the concept of a sequence point itself was incorrect, just that it MIGHT not mean exactly what you think in some cases. IE after inlining, the compiler has no idea there IS a function call involved, it doesn't see it, just the final code with the function inlined.

mar · Post by **mar** » Thu Mar 27, 2014 9:30 pm

I think the point is that the compiler can do whatever optimizations it likes an long as the optimized program behaves exactly the same as if it was synchronized at sequence points.

syzygy · Post by **syzygy** » Fri Mar 28, 2014 1:40 am

bob wrote:Vague in the context I mentioned. It is easy to follow looking at C source directly. But after the compiler has done its thing, it is not so clear any longer when you look at what it did, what it rearranged, etc. So saying "there is a sequence point at a function call might not be correct, since the function call might just go away...

Don't you get it at all?

The C language standard defines that there is a "sequence point" between two full expressions.

But not everybody has a facility for abstraction.

mar wrote:I think the point is that the compiler can do whatever optimizations it likes an long as the optimized program behaves exactly the same as if it was synchronized at sequence points.

Correct, where "exactly the same" refers to the "side effects". And again the C standard defines what these "side effects" are.

Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression in general includes both value computations and initiation of side effects. Value computation for an lvalue expression includes determining the identity of the designated object.

Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread, which induces a partial order among those evaluations. Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. (Conversely, if A is sequenced before B, then B is sequenced after A.) If A is not sequenced before or after B, then A and B are unsequenced. Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which. 13) The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B. (A summary of the sequence points is given in annex C.)

The following are the sequence points described in 5.1.2.3:
(...)
— Between the evaluation of a full expression and the next full expression to be
evaluated. The following are full expressions: an initializer that is not part of a
compound literal (6.7.9); the expression in an expression statement (6.8.3); (...)
(...)

bob · Post by **bob** » Fri Mar 28, 2014 6:39 pm

mar wrote:I think the point is that the compiler can do whatever optimizations it likes an long as the optimized program behaves exactly the same as if it was synchronized at sequence points.

No argument with one exception. After the inlining stage of the compiler is done, some of the procedure calls are now missing. The compiler sees the inlined code for the actual compilation step, and it is no longer aware of a procedure call, nor of that specific sequence point. Certainly the inlined code will have its own set of sequence points. My comment was simply that the "sequence points" you see MIGHT not exist at all, in reality. Saying "there is a sequence point right here" can, therefore, be a bit inaccurate. In fact, in some cases, a procedure might be removed completely AND not inlined either, if it is found to be useless code. Hence my comment. Nothing more, nothing less.

bob · Post by **bob** » Fri Mar 28, 2014 6:41 pm

syzygy wrote:
bob wrote:Vague in the context I mentioned. It is easy to follow looking at C source directly. But after the compiler has done its thing, it is not so clear any longer when you look at what it did, what it rearranged, etc. So saying "there is a sequence point at a function call might not be correct, since the function call might just go away...
Don't you get it at all?

The C language standard defines that there is a "sequence point" between two full expressions.

I certainly "get it." But it is certainly possible that one of those expressions is completely removed. Making the concept a bit more vague. As I said. Some other sequence point will eventually "fill in" there, but not exactly the one visualized.

But not everybody has a facility for abstraction.

mar wrote:I think the point is that the compiler can do whatever optimizations it likes an long as the optimized program behaves exactly the same as if it was synchronized at sequence points.
Correct, where "exactly the same" refers to the "side effects". And again the C standard defines what these "side effects" are.
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression in general includes both value computations and initiation of side effects. Value computation for an lvalue expression includes determining the identity of the designated object.

Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread, which induces a partial order among those evaluations. Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. (Conversely, if A is sequenced before B, then B is sequenced after A.) If A is not sequenced before or after B, then A and B are unsequenced. Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which. 13) The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B. (A summary of the sequence points is given in annex C.)

The following are the sequence points described in 5.1.2.3:
(...)
— Between the evaluation of a full expression and the next full expression to be
evaluated. The following are full expressions: an initializer that is not part of a
compound literal (6.7.9); the expression in an expression statement (6.8.3); (...)
(...)

syzygy · Post by **syzygy** » Fri Mar 28, 2014 8:30 pm

bob wrote:
syzygy wrote:
bob wrote:Vague in the context I mentioned. It is easy to follow looking at C source directly. But after the compiler has done its thing, it is not so clear any longer when you look at what it did, what it rearranged, etc. So saying "there is a sequence point at a function call might not be correct, since the function call might just go away...
Don't you get it at all?

The C language standard defines that there is a "sequence point" between two full expressions.
I certainly "get it." But it is certainly possible that one of those expressions is completely removed.

What the compiler does has nothing to do with it. Sequence points are defined at the level of the C grammar.

They are used in defining requirements that the compiled code should comply with, i.e. in the definition of the contract between the programmer and the compiler. The compiler needs to ensure inter alia that side effects are handled correctly at sequence points. The programmer can then rely on this.

This is why the OP does not have to worry about scenario 1. The language specification simply forbids it. A compiler that would generate such code is simply buggy.

(Yeah buggy compilers exist blab balbla)

bob · Post by **bob** » Sat Mar 29, 2014 3:13 am

syzygy wrote:
bob wrote:
syzygy wrote:
bob wrote:Vague in the context I mentioned. It is easy to follow looking at C source directly. But after the compiler has done its thing, it is not so clear any longer when you look at what it did, what it rearranged, etc. So saying "there is a sequence point at a function call might not be correct, since the function call might just go away...
Don't you get it at all?

The C language standard defines that there is a "sequence point" between two full expressions.
I certainly "get it." But it is certainly possible that one of those expressions is completely removed.
What the compiler does has nothing to do with it. Sequence points are defined at the level of the C grammar.

They are used in defining requirements that the compiled code should comply with, i.e. in the definition of the contract between the programmer and the compiler. The compiler needs to ensure inter alia that side effects are handled correctly at sequence points. The programmer can then rely on this.

This is why the OP does not have to worry about scenario 1. The language specification simply forbids it. A compiler that would generate such code is simply buggy.

(Yeah buggy compilers exist blab balbla)

I never said compilers do it wrongly. I simply said "the concept of a sequence point is a bit vague, because the compiler transforms the code in many ways, so pointing to _RIGHT HERE_ at a source line might find you pointing at a point that doesn't even exist in a program. Trying to identify the "sequence" point in the asm is even more vague due to interlaced instructions and such." I would call a sequence point more of an abstract entity, you know where it is in the source, you know where it would be if you don't allow any compiler optimization or anything else, but in the actual thing that runs, it is a bit harder to pin down precisely.

You always are looking to make a mountain out of a molehill. I find it imprecise to say "there is a sequence point at this place in the program" when that point might not even exist in the final executable.

That was MY point, and it wasn't a "sequence point" just a "technical point."

Michel · Post by **Michel** » Sat Mar 29, 2014 3:20 am

I find it imprecise to say "there is a sequence point at this place in the program" when that point might not even exist in the final executable.

Can we please stop this. As several people have pointed out multiple times already: sequence points are defined by the C standard. The compiler has nothing to do with it.

mcostalba · Post by **mcostalba** » Sat Mar 29, 2014 9:05 am

bob wrote: I never said compilers do it wrongly. I simply said "the concept of a sequence point is a bit vague, because the compiler transforms the code in many ways, so pointing to _RIGHT HERE_ at a source line might find you pointing at a point that doesn't even exist in a program. Trying to identify the "sequence" point in the asm is even more vague due to interlaced instructions and such."

Compiler doesn't have to identify the "sequence" point in the asm.

Sequence points are associated with source code and compiler has to respect that: how it translates the sources to asm is not developer business. For the developer is enough to know that the compiler will respect this...and it will do otherwise is a compiler bug.

Not to beat a dead horse, but this is really amazingly easy to understand. The desperate aim to try to have always the last word sometime brings you to ridiculize yourself.

lucasart · Post by **lucasart** » Sat Mar 29, 2014 10:16 am

mar wrote: A real-life example that happened to me some time ago:
Code: Select all
    	nodes&#91;res&#93;.index = addNode&#40; n->front );
where nodes is a vector that can be changed by subsequent call to recursive method addNode.

I think there is a serious problem of design in code like that. How is anyone (besides you) going to understand that code? The addNode() function has a side effect on something that is not even in its argument list, which you then assign the result to!

While it can be useful to know the arcanes of the C standard, it is more useful to design code in a KISS-compliant manner, so that you never have to ask yourself such questions.

Rethinking the design to make it more clear would most likely avoid the problem completely IMO. I cannot say for sure, because I don't know what the context in your code is, and what you're doing with this. But I know I've never had to write anything like that in my life.

Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization