| Author |
Message |
Ronald de Man
Joined: 28 Feb 2012 Posts: 847
|
Post subject: Re: using Popcount and Prefetch with SSE4 hardware support Posted: Tue May 22, 2012 8:15 pm |
|
|
From Wikipedia:
| Quote: |
Intel SSE4 consists of 54 instructions. A subset consisting of 47 instructions, referred to as SSE4.1 in some Intel documentation, is available in Penryn. Additionally, SSE4.2, a second subset consisting of the 7 remaining instructions, is first available in Nehalem-based Core i7. Intel credits feedback from developers as playing an important role in the development of the instruction set.
AMD supports 4 instructions from the SSE4 instruction set, but have also added four new SSE instructions, naming the group SSE4a. These instructions are not found in Intel's processors supporting SSE4.1 and AMD processors only started supporting Intel's SSE4.1 and SSE4.2 in the Bulldozer-based FX processors. Support was added for SSE4a for unaligned SSE load-operation instructions (which formerly required 16-byte alignment). |
If I understand this correctly, SSE4a does NOT implement all of Intel's SSE4.
Among AMD processors, the full set of Intel's SSE4 instructions (SSE4.1 and SSE4.2) is available only on the Bulldozer-based FX processors.
If you are only using the POPCNT instruction, your program should run on all Intel processors that support SSE4.2 and on all AMD processors that support SSE4a.
This is how I understand it.
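Since POPCNT has its own CPUID feature bit (leaf 1, ECX bit 23), a program can test for the instruction itself instead of reasoning about SSE4.2 versus SSE4a. Below is a minimal sketch of such a runtime check with a software fallback. It assumes GCC or Clang on x86; the names popcount64, popcount64_soft, popcount64_hw and cpu_has_popcnt are made up for this illustration and do not come from any engine.

```c
#include <stdint.h>

/* Assumption: GCC/Clang builtins on x86; other setups use the fallback only. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define X86_GNUC 1
#else
#define X86_GNUC 0
#endif

/* Portable fallback: clears the lowest set bit once per iteration. */
int popcount64_soft(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Returns 1 if the CPU reports POPCNT support, 0 otherwise. */
int cpu_has_popcnt(void)
{
#if X86_GNUC
    return __builtin_cpu_supports("popcnt") != 0;
#else
    return 0;
#endif
}

#if X86_GNUC
/* Compiled with POPCNT enabled for this one function only, so the rest
   of the program still runs on CPUs without the instruction. */
__attribute__((target("popcnt")))
int popcount64_hw(uint64_t x)
{
    return (int)__builtin_popcountll(x);
}
#endif

/* Dispatches to the hardware instruction when available. */
int popcount64(uint64_t x)
{
#if X86_GNUC
    if (cpu_has_popcnt())
        return popcount64_hw(x);
#endif
    return popcount64_soft(x);
}
```

In a real engine you would probably do the check once at startup and store a function pointer, or simply ship separate popcnt and non-popcnt builds, rather than test on every call.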
For what it's worth: for me, prefetching inside make_move() immediately after calculating the new hashkey improves performance by about 2%, but only if I prefetch with the NTA hint (_MM_HINT_NTA). Using other hints or moving the prefetch closer to the actual read access either gave inconsistent results or made no difference compared with not prefetching. I have not yet tried prefetching the tt entry before storing.
I think you need to use performance counters to properly test this (which I have not done). |
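To make the placement of the prefetch concrete, here is a rough sketch of the idea: update the hash key in make_move() and immediately issue a non-temporal prefetch for the matching table slot. Everything about the data layout is an assumption for illustration (TTEntry, tt, TT_SIZE, tt_slot, Position, and the key_delta parameter are hypothetical), and it is not taken from any particular engine.

```c
#include <stddef.h>
#include <stdint.h>

/* Pick a prefetch primitive; falls back to a no-op if none is known. */
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define TT_PREFETCH(p) _mm_prefetch((const char *)(p), _MM_HINT_NTA)
#elif defined(__GNUC__)
#define TT_PREFETCH(p) __builtin_prefetch((p), 0, 0)
#else
#define TT_PREFETCH(p) ((void)(p))
#endif

/* Hypothetical transposition table entry. */
typedef struct {
    uint64_t key;
    uint32_t move;
    int16_t  score;
    uint8_t  depth;
    uint8_t  flags;
} TTEntry;

#define TT_SIZE (1u << 16)          /* must be a power of two */
static TTEntry tt[TT_SIZE];

/* Maps a hash key to its table slot. */
static TTEntry *tt_slot(uint64_t key)
{
    return &tt[key & (TT_SIZE - 1)];
}

/* Hypothetical position: only the hash key matters for this sketch. */
typedef struct {
    uint64_t hashkey;
} Position;

/* key_delta stands in for the XOR of all Zobrist terms the move changes. */
void make_move(Position *pos, uint64_t key_delta)
{
    pos->hashkey ^= key_delta;
    TT_PREFETCH(tt_slot(pos->hashkey));   /* start the load early */
    /* ... the rest of the board update would go here ... */
}

/* Later probe: by now the cache line has hopefully arrived. */
const TTEntry *tt_probe(uint64_t key)
{
    const TTEntry *e = tt_slot(key);
    return e->key == key ? e : NULL;
}
```

A prefetch is only a hint and never changes program results, so whether it helps can only be settled by measurement on the target machine.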
|
|
|
|