Problem with functions not inlining

Posted: Wed Nov 04, 2009 1:26 am
by Greg Strong
I'm using SSE2 functions for what was supposed to be very fast pawn structure analysis, and it has turned out to be quite slow. Now I think I've tracked down the problem, but I don't know what to do about it. I think the problem is that even the smallest trivial functions are not being inlined, despite my best efforts (like making sure that aggressive inlining is turned on in the project settings, using the __inline keyword, etc.)

I've encapsulated the __m128i data type for clarity. Here's a fragment of the class, with a very simple function that seems to me like it should inline...

Code:

class xmm128i
{
  private:
	__m128i m_data;

  public:
	// load from a 16-byte-aligned address
	xmm128i( __m128i const *source )
	{ m_data = _mm_load_si128( source ); }

	xmm128i( __m128i source ): m_data(source) { }

	// xmm is a separate type (not shown) with an __m128i member As128i;
	// the loaded value has to be stored, otherwise m_data stays uninitialized
	xmm128i( xmm const &source )
	{ m_data = _mm_load_si128( &source.As128i ); }

	~xmm128i() { }

	operator __m128i() const
	{ return m_data; }


	// very simple operator ...

	xmm128i operator &( __m128i const &other ) const
	{ return xmm128i( _mm_and_si128( m_data, other ) ); }

	// many more operators ...
};
The operator & is (intended to be) a totally trivial wrapper around the _mm_and_si128 compiler intrinsic. I just can't imagine why it refuses to inline. Since I use these operators frequently and every one of them adds the overhead of a function call, the code that should be fast is winding up being very, very slow...

Does anyone have any idea what's going on?!? I'm using Visual C++ 2008.

Thanks for any help you can provide. I'm rather frustrated :)

Re: Problem with functions not inlining

Posted: Wed Nov 04, 2009 1:36 am
by michiguel
Can you make a macro of the whole thing? Ugly, but...

Miguel

Re: Problem with functions not inlining

Posted: Wed Nov 04, 2009 2:46 am
by Dann Corbit
You can almost always make something inline with __forceinline:
http://msdn.microsoft.com/en-us/library/z8y1yy88.aspx

However, it isn't always good to inline. If the inlined code grows enough to spill out of the instruction cache, it may well run slower instead of faster.
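A minimal sketch of how __forceinline is typically applied, for illustration only. FORCE_INLINE, Bitboard, push_pawns and single_pushes are hypothetical names invented here, not taken from Greg's code; the GCC/Clang branch uses the always_inline attribute, which is the rough counterpart on those compilers. The bit layout (a1 = bit 0) is also an assumption.

```cpp
// FORCE_INLINE is a hypothetical helper macro, not part of any library.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline
#endif

typedef unsigned long long Bitboard;

// Illustrative only: move every white pawn one rank up (a1 = bit 0 assumed).
FORCE_INLINE Bitboard push_pawns(Bitboard pawns)
{
    return pawns << 8;
}

// Pawns that can advance one square onto an empty square.
FORCE_INLINE Bitboard single_pushes(Bitboard pawns, Bitboard empty)
{
    return push_pawns(pawns) & empty;
}
```

Even with the keyword, the compiler may still decline in some cases (recursion, for example), so it is worth checking the generated code rather than trusting the annotation.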

Re: Problem with functions not inlining

Posted: Wed Nov 04, 2009 4:37 am
by jwes
Are you compiling a release version?

Re: Problem with functions not inlining

Posted: Wed Nov 04, 2009 5:54 am
by Gerd Isenberg
I don't have that problem, everything inlines very well with vc2008 release version. I have memory layout in a base class, binary operators as friends, and combined assignment ops returning a reference and no explicit destructor:

Code:

class xmm128i : public DBB {
public:
   // binary operator as a friend, both operands by const reference
   friend xmm128i operator& (const xmm128i &a, const xmm128i &b) {
      return xmm128i (_mm_and_si128(a.m_data, b.m_data));
   }


   // combined assignment returns a reference to *this
   xmm128i & operator &=( xmm128i const &other ) {
      m_data = _mm_and_si128( m_data, other.m_data );
      return *this;
   }
};
Also, all parameters are const ref xmm128i rather than __m128i.
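The layout described above could be sketched in a self-contained form roughly as follows. This is an assumption-laden illustration, not Gerd's actual code: the contents of DBB are not shown in the post, so the one-member struct here is a hypothetical stand-in, and low32 and masked_low are made-up helpers added only so the result can be checked.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical stand-in for the base class that owns the memory layout.
struct DBB {
    __m128i m_data;
};

class xmm128i : public DBB {
public:
    xmm128i(__m128i v) { m_data = v; }

    // Binary operator as a friend, both operands by const reference.
    friend xmm128i operator&(const xmm128i &a, const xmm128i &b) {
        return xmm128i(_mm_and_si128(a.m_data, b.m_data));
    }

    // Combined assignment returns a reference to *this.
    xmm128i &operator&=(const xmm128i &other) {
        m_data = _mm_and_si128(m_data, other.m_data);
        return *this;
    }

    // Made-up helper: copy the lowest 32-bit lane out for inspection.
    int low32() const { return _mm_cvtsi128_si32(m_data); }

    // No explicit destructor, as in the description above.
};

// Made-up caller exercising both operators.
inline int masked_low(int value, int mask) {
    xmm128i a(_mm_set1_epi32(value));
    xmm128i b(_mm_set1_epi32(mask));
    a &= b;
    return (a & b).low32();
}
```

To see whether the operators really disappear into the caller, one option is to compile with /FAs and look for a call instruction inside masked_low in the assembly listing.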

Re: Problem with functions not inlining

Posted: Wed Nov 04, 2009 1:21 pm
by Greg Strong
I tried __forceinline and that didn't work either. Also, __forceinline is supposed to throw a warning if a function can't be inlined, but it doesn't do that either. I'm beginning to think my installation of Visual Studio is messed up.
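One possible reason for the missing warning, offered as a guess rather than a diagnosis: the MSVC documentation lists C4714 ("function marked as __forceinline not inlined") as a level 4 warning, so it stays silent at the default /W3. A sketch of promoting it for a single source file follows; FORCE_INLINE, and_bits and caller are made-up names for the illustration.

```cpp
#if defined(_MSC_VER)
   // Report C4714 as if it were a level 1 warning in this file.
#  pragma warning(1 : 4714)
#  define FORCE_INLINE __forceinline
#else
   // Other compilers: plain inline; the pragma above does not apply.
#  define FORCE_INLINE inline
#endif

// Made-up trivial function that should always be inlined.
FORCE_INLINE unsigned and_bits(unsigned a, unsigned b)
{
    return a & b;
}

// If and_bits is not inlined here, MSVC should now say so with C4714.
unsigned caller(unsigned x)
{
    return and_bits(x, 0xFFu);
}
```

Compiling the whole project with /W4 should have the same effect, at the cost of many unrelated warnings.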

Re: Problem with functions not inlining

Posted: Thu Nov 05, 2009 9:30 am
by Sven
Greg Strong wrote:I tried __forceinline and that didn't work either. Also, __forceinline is supposed to throw a warning if a function can't be inlined, but it doesn't do that either. I'm beginning to think my installation of Visual Studio is messed up.
Maybe not your installation but your project settings? Have you set inlining to "all suitable"?

Sven

Re: Problem with functions not inlining

Posted: Thu Nov 05, 2009 2:11 pm
by Greg Strong
Sven Schüle wrote:
Greg Strong wrote:I tried __forceinline and that didn't work either. Also, __forceinline is supposed to throw a warning if a function can't be inlined, but it doesn't do that either. I'm beginning to think my installation of Visual Studio is messed up.
Maybe not your installation but your project settings? Have you set inlining to "all suitable"?

Sven
Yes, it's set that way. And I did a clean install on another machine. That didn't work either. I just can't imagine what is going on.

Re: Problem with functions not inlining

Posted: Thu Nov 05, 2009 5:05 pm
by steffan
Are you certain your project settings are set for "release" rather than "debug"? From what I remember of Visual Studio, debug effectively turns off optimisations, including inlining.
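A quick way to double-check which configuration a given file is really compiled under is to test the usual preprocessor macros. This is only a sketch: build_kind is a hypothetical helper, and it relies on the convention that MSVC defines _DEBUG when a debug runtime library is selected while release project templates define NDEBUG.

```cpp
#include <string>

// Hypothetical helper: reports which configuration macros were seen at compile time.
inline std::string build_kind()
{
#if defined(_DEBUG)
    return "debug (_DEBUG defined)";
#elif defined(NDEBUG)
    return "release (NDEBUG defined)";
#else
    return "unknown (neither _DEBUG nor NDEBUG defined)";
#endif
}
```

The macros only identify the configuration; inlining itself is governed by the optimization switches, where "Any Suitable" in the project settings corresponds to /Ob2 and /Ob0 turns inline expansion off.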

Re: Problem with functions not inlining

Posted: Thu Nov 05, 2009 5:08 pm
by Greg Strong
Yes, definitely release build for X64 platform ...