Move time and compiler optimization

mcostalba · Post by **mcostalba** » Sat Mar 29, 2014 10:38 am

lucasart wrote:
mar wrote: A real-life example that happened to me some time ago:
Code: Select all
    	nodes&#91;res&#93;.index = addNode&#40; n->front );
where nodes is a vector that can be changed by subsequent call to recursive method addNode.
I think there is a serious problem of design in code like that.

I fully second Lucas here, the correct fix is to change addNode signature to make it clear it is going to modify nodes array, so something like:

Code: Select all

addNode&#40; n->front, nodes, res );

mar · Post by **mar** » Sat Mar 29, 2014 12:04 pm

lucasart wrote: I cannot say for sure, because I don't know what the context in your code is, and what you're doing with this. But I know I've never had to write anything like that in my life.

Fair enough. The original code takes a BSP tree where nodes are linked with pointers and translates it into a more compact structure (where it's linked with indices).
It's done only once before the map is dumped to a file so I didn't feel the urge to come up with something clever (and more complicated).
The problem is given below (for simplicity I omit planes and leafs):
(note that this is how it looked like originally before I fixed the problem)

Code: Select all

struct BSPNode
&#123;
    BSPNode *front, *back;
&#125;;

struct MapNode
&#123;
    int front, back;
&#125;;

class Map
&#123;
    std&#58;&#58;vector< MapNode > nodes;
public&#58;
    int addNode&#40; const BSPNode *node );
&#125;;

int Map&#58;&#58;addNode&#40; const BSPNode *node )
&#123;
    if ( !node )
        return -1;
    int res = &#40;int&#41;nodes.size&#40;);
    nodes.push_back&#40;MapNode&#40;));
    nodes&#91;res&#93;.front = addNode&#40; node->front );
    nodes&#91;res&#93;.back  = addNode&#40; node->back );
    return res;
&#125;


// somewhere later&#58;
index = map.addNode&#40; root );

I guess it makes sense now that I've put it into context.
(nitpicks should imagine const * const and static_cast<int>)

bob · Post by **bob** » Sat Mar 29, 2014 6:38 pm

Michel wrote:
I find it imprecise to say "there is a sequence point at this place in the program" when that point might not even exist in the final executable.
Can we please stop this. As several people have pointed out multiple times already: sequence points are defined by the C standard. The compiler has nothing to do with it.

Yes, and as I said, they can be a bit vague or imprecise, because the code you point to might not actually exist in the final product.

That was ALL I wrote.

bob · Post by **bob** » Sat Mar 29, 2014 6:40 pm

mcostalba wrote:
bob wrote: I never said compilers do it wrongly. I simply said "the concept of a sequence point is a bit vague, because the compiler transforms the code in many ways, so pointing to _RIGHT HERE_ at a source line might find you pointing at a point that doesn't even exist in a program. Trying to identify the "sequence" point in the asm is even more vague due to interlaced instructions and such."
Compiler doesn't have to identify the "sequence" point in the asm.

Sequence points are associated with source code and compiler has to respect that: how it translates the sources to asm is not developer business. For the developer is enough to know that the compiler will respect this...and it will do otherwise is a compiler bug.

Not to beat a dead horse, but this is really amazingly easy to understand. The desperate aim to try to have always the last word sometime brings you to ridiculize yourself.

My statement was "sequence points are abstract at times". That is correct. Because you can look at the C standard, and at a piece of source code, and say "there is a sequence point here." I can look at that same piece of code and say "here doesn't even exist, the compiler removes it." That was my only point.

syzygy · Post by **syzygy** » Sat Mar 29, 2014 7:15 pm

bob wrote:
mcostalba wrote:
bob wrote: I never said compilers do it wrongly. I simply said "the concept of a sequence point is a bit vague, because the compiler transforms the code in many ways, so pointing to _RIGHT HERE_ at a source line might find you pointing at a point that doesn't even exist in a program. Trying to identify the "sequence" point in the asm is even more vague due to interlaced instructions and such."
Compiler doesn't have to identify the "sequence" point in the asm.

Sequence points are associated with source code and compiler has to respect that: how it translates the sources to asm is not developer business. For the developer is enough to know that the compiler will respect this...and it will do otherwise is a compiler bug.

Not to beat a dead horse, but this is really amazingly easy to understand. The desperate aim to try to have always the last word sometime brings you to ridiculize yourself.
My statement was "sequence points are abstract at times". That is correct. Because you can look at the C standard, and at a piece of source code, and say "there is a sequence point here." I can look at that same piece of code and say "here doesn't even exist, the compiler removes it." That was my only point.

What is "a bit vague" about this:

Sequence points are associated with source code and compiler has to respect that: how it translates the sources to asm is not developer business. For the developer is enough to know that the compiler will respect this...and it will do otherwise is a compiler bug.

Sequence points are a mental construct to define the semantics of source code. The compiler will produce code with the correct semantics. That's it. That's all. There's nothing more to it.

Talking about "sequence points in object code" is a fail.

wgarvin · Post by **wgarvin** » Sun Mar 30, 2014 12:40 am

bob wrote:
mar wrote:I think the point is that the compiler can do whatever optimizations it likes an long as the optimized program behaves exactly the same as if it was synchronized at sequence points.
No argument with one exception. After the inlining stage of the compiler is done, some of the procedure calls are now missing. The compiler sees the inlined code for the actual compilation step, and it is no longer aware of a procedure call, nor of that specific sequence point. Certainly the inlined code will have its own set of sequence points. My comment was simply that the "sequence points" you see MIGHT not exist at all, in reality. Saying "there is a sequence point right here" can, therefore, be a bit inaccurate. In fact, in some cases, a procedure might be removed completely AND not inlined either, if it is found to be useless code. Hence my comment. Nothing more, nothing less.

Sequence points are an abstract thing that exists in the original source program. The compiled program needs to behave "as if" any side effects from before a sequence point, occur before any side effects from after that sequence point. Of course the compiler can mix the instructions together for optimization purposes, as long as the partial ordering of side effects dictated by the sequence points is properly enforced (in single-threaded code, etc. etc.)

If a compiler performs inlining and then during subsequent optimizations, fails to preserve the semantics of the sequence points that originally existed in the source code, then that compiler is broken.

lucasart · Post by **lucasart** » Sun Mar 30, 2014 4:03 am

mar wrote:
lucasart wrote: I cannot say for sure, because I don't know what the context in your code is, and what you're doing with this. But I know I've never had to write anything like that in my life.
Fair enough. The original code takes a BSP tree where nodes are linked with pointers and translates it into a more compact structure (where it's linked with indices).
It's done only once before the map is dumped to a file so I didn't feel the urge to come up with something clever (and more complicated).
The problem is given below (for simplicity I omit planes and leafs):
(note that this is how it looked like originally before I fixed the problem)
Code: Select all
struct BSPNode
&#123;
    BSPNode *front, *back;
&#125;;

struct MapNode
&#123;
    int front, back;
&#125;;

class Map
&#123;
    std&#58;&#58;vector< MapNode > nodes;
public&#58;
    int addNode&#40; const BSPNode *node );
&#125;;

int Map&#58;&#58;addNode&#40; const BSPNode *node )
&#123;
    if ( !node )
        return -1;
    int res = &#40;int&#41;nodes.size&#40;);
    nodes.push_back&#40;MapNode&#40;));
    nodes&#91;res&#93;.front = addNode&#40; node->front );
    nodes&#91;res&#93;.back  = addNode&#40; node->back );
    return res;
&#125;


// somewhere later&#58;
index = map.addNode&#40; root );
I guess it makes sense now that I've put it into context.
(nitpicks should imagine const * const and static_cast<int>)

I see. But I maintain that only a wrong design can lead to this. addNode() should simply be a void member. Problem solved.

A function is something that takes arguments and returns a value. In terms of logic I don't think of this addNode() as a function. Your code is counter-intuitive for me. The way I think of it is that addNode() is a procedure (in Pascal terminology) that takes an argument and modifies the vector.

wgarvin · Post by **wgarvin** » Sun Mar 30, 2014 6:08 am

mcostalba wrote:
lucasart wrote:
mar wrote: A real-life example that happened to me some time ago:
Code: Select all
    	nodes&#91;res&#93;.index = addNode&#40; n->front );
where nodes is a vector that can be changed by subsequent call to recursive method addNode.
I think there is a serious problem of design in code like that.
I fully second Lucas here, the correct fix is to change addNode signature to make it clear it is going to modify nodes array, so something like:
Code: Select all
addNode&#40; n->front, nodes, res );

Okay, I'll play devil's advocate and disagree with you and Lucas.

This is an API method on a abstract data structure. Exposing the internal details of the vector in the signature of the API method is totally the wrong thing to do (plus it wouldn't fix this bug anyway).

The problem arose because the method calls itself recursively, and so its temporary reference can get invalidated. One solution is to not keep a temporary reference there, but just access the storage through operator[] each time. And the most important thing is to put a comment in the code reminding readers of why! Adding the vector to the signature of the API method would not capture this kind of nuance anyway--a comment is the best way to keep yourself (or anyone else) from coming across this code later and "optimizing" it into something incorrect.

lucasart · Post by **lucasart** » Sun Mar 30, 2014 6:52 am

wgarvin wrote: Okay, I'll play devil's advocate and disagree with you and Lucas. This is an API method on a abstract data structure. Exposing the internal details of the vector in the signature of the API method is totally the wrong thing to do (plus it wouldn't fix this bug anyway).

Please read my message again. It was Marco who suggested that.

And read my last message too. IMO, the correct solution is to make addNode a void, because does not represent a function (in a logical sense) but a procedure acting (recursively) on the internal data structure. No need to make things more complicated than they should be. The incorrect choice of function instead of procedure is how the programmer has managed to create a problem for himself, that essentially didn't exist.

mar · Post by **mar** » Sun Mar 30, 2014 9:26 am

lucasart wrote:I see. But I maintain that only a wrong design can lead to this. addNode() should simply be a void member. Problem solved.

A function is something that takes arguments and returns a value. In terms of logic I don't think of this addNode() as a function. Your code is counter-intuitive for me. The way I think of it is that addNode() is a procedure (in Pascal terminology) that takes an argument and modifies the vector.

We have a saying here: after the battle, everyone is a general

If you suggest that creating temporaries all over the place plus passing an extra parameter just to work around this is plain stupid.

Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization

Re: Move time and compiler optimization