forcing compilers to inline (or to not inline)

wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

forcing compilers to inline (or to not inline)

Post by wgarvin »

In the compiling for GCC thread, I wrote this:
wgarvin wrote:By the way, you can force MSVC to inline code with a compiler-specific extension:

#define FORCE_INLINE __forceinline
FORCE_INLINE void Foo() { }
Here is a similar incantation which should work for GCC:

#define FORCE_INLINE __inline__ __attribute__((always_inline))
FORCE_INLINE void Foo() { }
I usually put these macros in the same header as my base types and other simple platform- or compiler-specific declarations.

...Unfortunately, I don't know of a convenient way to force either compiler NOT to inline code, other than to put the function definition in another .cpp file which the compiler doesn't get to see before it sees the call site. So that's what I do when I need that behaviour, even though it's a bit inconvenient.
Just today I found out how to tell both MSVC and GCC "do not inline this". I have not tried it yet with either compiler. The Microsoft syntax might work for ICC too; I have no idea.

For Microsoft:

#define NO_INLINE __declspec(noinline)
For GCC (version 3.1 or newer?)

#define NO_INLINE __attribute__((noinline))
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: forcing compilers to inline (or to not inline)

Post by Gerd Isenberg »

wgarvin wrote: Just today I found out how to tell both MSVC and GCC "do not inline this". I have not tried it yet with either compiler. The Microsoft syntax might work for ICC too; I have no idea.

For Microsoft:

#define NO_INLINE __declspec(noinline)
For GCC (version 3.1 or newer?)

#define NO_INLINE __attribute__((noinline))
I wonder what would be a reason to overrule the compiler not to inline?
__declspec(noinline) is probably only useful for C++ member functions with Java-like on-the-fly implementations inside the class declaration. If one uses pointers to member functions, there is one non-inlined incarnation of the referenced function anyway, even if it is otherwise inlined.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: forcing compilers to inline (or to not inline)

Post by wgarvin »

Gerd Isenberg wrote:I wonder what would be a reason to overrule the compiler not to inline?
__declspec(noinline) is probably only useful for C++ member functions with Java-like on-the-fly implementations inside the class declaration. If one uses pointers to member functions, there is one non-inlined incarnation of the referenced function anyway, even if it is otherwise inlined.
In the other thread, Aleks wrote:
Aleks Peshkov wrote:I use -Wdisabled-optimization -Winline for GCC and it helps a lot to control the shape of the built code.
As another example: at work we had some code for the Cell SPU that was over 100 KB. It turned out that there were two constructors and two assignment operators that each expanded to about 500 bytes every time they were inlined. Moving those out of the header and into a separate .cpp file reduced the total code size to 90 KB. That really helps, because the SPU has only 256 KB of memory.
Aleks Peshkov
Posts: 894
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: forcing compilers to inline (or to not inline)

Post by Aleks Peshkov »

Gerd Isenberg wrote:I wonder what would be a reason to overrule the compiler not to inline?
For Visual C++ Express Edition there is little reason, but when targeting GCC it is very important. I need to tell it not to inline the base function of a chain of recursive nested calls, because GCC goes mad with endless inlining of already-inlined code. :)

In one of the earlier versions of my perft code, GCC inlined the recursive calls of a search function with many local variables into a single iterative function!
Bo Persson
Posts: 257
Joined: Sat Mar 11, 2006 8:31 am
Location: Malmö, Sweden
Full name: Bo Persson

Re: forcing compilers to inline (or to not inline)

Post by Bo Persson »

Gerd Isenberg wrote:
wgarvin wrote: Just today I found out how to tell both MSVC and GCC "do not inline this". I have not tried it yet with either compiler. The Microsoft syntax might work for ICC too; I have no idea.

For Microsoft:

#define NO_INLINE __declspec(noinline)
For GCC (version 3.1 or newer?)

#define NO_INLINE __attribute__((noinline))
I wonder what would be a reason to overrule the compiler not to inline?
__declspec(noinline) is probably only useful for C++ member functions with Java-like on-the-fly implementations inside the class declaration. If one uses pointers to member functions, there is one non-inlined incarnation of the referenced function anyway, even if it is otherwise inlined.
The compiler is sometimes stupid, and inlines initialization code at every place where it is *potentially* run. I know it should only run once!

Interestingly, MSVC also has a pragma

#pragma auto_inline(off)

that does about the same as the declspec.
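For illustration, here is a rough sketch of how the pragma could wrap a run-once initialization routine. The table and function names (g_squareMask, initTables, squareMask) are made up for this example, and the pragma is guarded with _MSC_VER because other compilers do not know it.

```cpp
#include <cstdint>

// Hypothetical lookup table that must be filled once before first use.
static std::uint64_t g_squareMask[64];
static bool g_tablesReady = false;

#ifdef _MSC_VER
#pragma auto_inline(off)   // MSVC only: no automatic inlining from here...
#endif
// Cold, run-once initialization that should stay out of the call sites.
static void initTables() {
    for (int sq = 0; sq < 64; ++sq)
        g_squareMask[sq] = std::uint64_t(1) << sq;
    g_tablesReady = true;
}
#ifdef _MSC_VER
#pragma auto_inline(on)    // ...to here
#endif

// Hot accessor: ideally only the cheap flag test and a call end up inlined.
static inline std::uint64_t squareMask(int sq) {
    if (!g_tablesReady)
        initTables();
    return g_squareMask[sq];
}
```

As far as I understand the MSVC documentation, the pragma only affects automatic inlining of the functions defined between the off/on pair, so it is a bit weaker than the declspec. With GCC the pragma is simply skipped here and the noinline attribute would be needed instead.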
Harald Johnsen

Re: forcing compilers to inline (or to not inline)

Post by Harald Johnsen »

wgarvin wrote: As another example: at work we had some code for the Cell SPU that was over 100 KB. It turned out that there were two constructors and two assignment operators that each expanded to about 500 bytes every time they were inlined. Moving those out of the header and into a separate .cpp file reduced the total code size to 90 KB. That really helps, because the SPU has only 256 KB of memory.
Constructors and other functions are inlined because you put them in the class definition. The compiler did not inline them because the functions were small, but because you implicitly told it to (by defining the functions in the class definition). Put them in a separate file rather than using exotic compiler switches.


HJ.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: forcing compilers to inline (or to not inline)

Post by wgarvin »

Harald Johnsen wrote:Constructors and other functions are inlined because you put them in the class definition. The compiler did not inline them because the functions were small, but because you implicitly told it to (by defining the functions in the class definition). Put them in a separate file rather than using exotic compiler switches.
Putting them in the class definition is only a hint to the compiler, just like using the inline or __inline keywords on a function definition is only a hint. (Actually, putting it in the class definition is equivalent to putting it outside but tagging it with the "inline" keyword, as long as there are no uses before the actual definition.) There are some semantics around it to do with linkage (can you take the address, for example?), but basically it's up to the compiler to decide what to inline or not inline.
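To make the equivalence concrete, here is a tiny sketch with two made-up structs (ScoreA and ScoreB are not from any real code). As far as the language is concerned both member functions are inline functions; whether the compiler actually expands them at a call site is still its own decision.

```cpp
// Defined inside the class body: implicitly an inline function.
struct ScoreA {
    int value;
    int doubled() const { return 2 * value; }
};

// Declared in the class, defined outside it with the inline keyword:
// the same thing as far as the language rules go.
struct ScoreB {
    int value;
    int doubled() const;
};
inline int ScoreB::doubled() const { return 2 * value; }
```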

Which is why this declspec/attribute stuff is occasionally useful. It's a more forceful, compiler-specific hint about the result you want. By wrapping these declspec things in a macro, you can apply this more forceful hint when using a compiler that supports it, and when using some other compiler you can fall back on its default behaviour or its __inline keyword behaviour.

Usually I prefer to let the compiler do what it thinks is appropriate. But sometimes you hit cases where it doesn't know that the particular bit of code is really hot or cold (without PGO it has no way to know this), so it ends up not inlining something performance-critical, or it ends up inlining tonnes of constructors or other things that cause code bloat for no appreciable performance gain. And that's when you want a macro like this to override the compiler's decision-making for that specific case.

I've never used the NO_INLINE one before, I just move them to another .cpp file. But that can mess up a nice tidy codebase. In the future I'll try the macro instead and see how it goes.
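Something like the following is what I have in mind for the shared header. It is only a sketch: it assumes the compiler can be detected through the predefined _MSC_VER and __GNUC__ macros, falls back to plain inline (or nothing) elsewhere, and the two functions are invented just to show a hot and a cold path.

```cpp
#include <cstdio>

#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#  define NO_INLINE    __declspec(noinline)
#elif defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define NO_INLINE    __attribute__((noinline))
#else
#  define FORCE_INLINE inline   // unknown compiler: ordinary hint only
#  define NO_INLINE             // unknown compiler: default behaviour
#endif

// Cold path (hypothetical): rarely executed, so keep it out of the callers.
NO_INLINE void reportBadSquare(int sq) {
    std::fprintf(stderr, "bad square index: %d\n", sq);
}

// Hot path (hypothetical): tiny and called everywhere, so ask for inlining.
FORCE_INLINE unsigned long long squareBit(int sq) {
    if (sq < 0 || sq > 63) {
        reportBadSquare(sq);
        return 0;
    }
    return 1ULL << sq;
}
```

The point of the fallback branch is that code using the macros still compiles on a compiler that supports neither extension; it just gets that compiler's usual inlining decisions.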
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: forcing compilers to inline (or to not inline)

Post by bob »

Aleks Peshkov wrote:
Gerd Isenberg wrote:I wonder what would be a reason to overrule the compiler not to inline?
For Visual C++ Express Edition there is little reason, but when targeting GCC it is very important. I need to tell it not to inline the base function of a chain of recursive nested calls, because GCC goes mad with endless inlining of already-inlined code. :)

In one of the earlier versions of my perft code, GCC inlined the recursive calls of a search function with many local variables into a single iterative function!
How can it possibly inline recursive code to an unknown depth???
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: forcing compilers to inline (or to not inline)

Post by wgarvin »

bob wrote:
Aleks Peshkov wrote:
Gerd Isenberg wrote:I wonder what would be a reason to overrule the compiler not to inline?
For Visual C++ Express Edition there is little reason, but when targeting GCC it is very important. I need to tell it not to inline the base function of a chain of recursive nested calls, because GCC goes mad with endless inlining of already-inlined code. :)

In one of the earlier versions of my perft code, GCC inlined the recursive calls of a search function with many local variables into a single iterative function!
How can it possibly inline recursive code to an unknown depth???
I think he just meant that it is pretty aggressive. If you have A calling B calling C calling A, and something else D calling A, then it might inline A into D, and then inline B into D, and then inline C into D. Suppose A was your main search function and you were calling it in a couple of different places... that might be a lot of code inlined into each of those places, for no good reason.
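Here is a toy version of that A/B/C/D situation, with NO_INLINE put on A so the chain cannot be pulled into D. The function bodies are placeholders (each one just counts the calls in the chain), and the macro definition assumes the usual _MSC_VER / __GNUC__ predefined macros.

```cpp
#if defined(_MSC_VER)
#  define NO_INLINE __declspec(noinline)
#elif defined(__GNUC__)
#  define NO_INLINE __attribute__((noinline))
#else
#  define NO_INLINE
#endif

int B(int depth);
int C(int depth);

// A is the entry point of the recursive chain; keeping it out of line means
// D gets a plain call instead of a copy of A (and possibly B and C) per call.
NO_INLINE int A(int depth) { return depth <= 0 ? 1 : 1 + B(depth - 1); }
int B(int depth)           { return depth <= 0 ? 1 : 1 + C(depth - 1); }
int C(int depth)           { return depth <= 0 ? 1 : 1 + A(depth - 1); }

// D calls A from two places; without NO_INLINE each could be expanded.
int D(int depth) { return A(depth) + A(depth - 1); }
```

The compiler is still free to inline B and C into A, which is usually fine because that happens once rather than at every call site of A.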

Apple's GCC documentation mentions the options --param max-inline-insns-recursive and --param max-inline-insns-recursive-auto, which might help control this behaviour.

Turning off -foptimize-sibling-calls (that is, passing -fno-optimize-sibling-calls) might also be useful.
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: forcing compilers to inline (or to not inline)

Post by Gerd Isenberg »

wgarvin wrote:
bob wrote: How can it possibly inline recursive code to an unknown depth???
I think he just meant that it is pretty aggressive. If you have A calling B calling C calling A, and something else D calling A, then it might inline A into D, and then inline B into D, and then inline C into D. Suppose A was your main search function and you were calling it in a couple of different places... that might be a lot of code inlined into each of those places, for no good reason.
How does gcc handle this recursive findDeBruijn function from:
http://chessprogramming.wikispaces.com/ ... +Generator

   //============================================
   // recursive search
   //============================================
   void findDeBruijn(U64 seq, int depth, int unique) {
      if ( (m_Lock & pow2[unique]) == 0 && unique != 32) {
         if ( depth == 0 ) {
            if ( ++m_dBCount == m_Match4nth )
               bitScanRoutineFound(seq);
         } else {
            m_Lock ^= pow2[unique];
            if ( depth > 2 && unique == 31 ) {
                findDeBruijn(seq | pow2[depth-1], depth-2, 62);
            } else {
                if ( depth > 1 )
                   findDeBruijn(seq, depth-1, (unique*2)&63);
                findDeBruijn(seq | pow2[depth-1], depth-1, (unique*2+1)&63);
            }
            m_Lock ^= pow2[unique];
         }
      }
   }
Is gcc able to make it iterative? MSVC2005 was not able to keep m_Lock inside a register.