There are compilers and there are compilers

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: There are compilers and there are compilers

Post by Vinvin »

matthewlai wrote:
bob wrote:
Vinvin wrote:
bob wrote:
Joost Buijs wrote:
ymatioun wrote: New Intel Xeon CPUs include something called "Transactional Synchronization Extensions" (TSX).

I don't know exactly how they work, but those instructions were specifically designed to reduce synchronization overhead. Perhaps ICC emits those instructions, while GCC most certainly does not? That would explain why the improvement only shows up in parallel search.
TSX is disabled on all Haswell processors because there is an error in the implementation.
I think the E5-2660v2 is a Haswell processor, so the performance increase Bob sees must have another explanation.
BTW this is a v3 chip:

model name : Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
The v2 runs @2.2 GHz -> http://www.cpubenchmark.net/cpu.php?cpu ... 40+2.20GHz
and the v3 runs @2.6 GHz -> http://www.cpubenchmark.net/cpu.php?cpu ... 40+2.60GHz
Actually Intel understates the speed. Running all 20 cores at once gives a constant clock speed of 2.9 GHz. Only when you enable hyper-threading and run 40 threads does it slow down to 2.6 GHz. Run 3, 4, or 5 threads and you will see 3.3 GHz non-stop...
The reason why Haswell's base clock is so low is mostly because of AVX.

The limit is probably either power or thermal, and when you are not using AVX, half of the CPUs' floating point units are idle, and therefore they draw much less power.

By switching from SSE2 (128-bit) to AVX (256-bit) I get almost double the matrix multiplication throughput (and of course, if you are not doing floating point stuff at all, all FP units are idle).
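To put a number on the "almost double" figure, here is a rough sketch of the lane arithmetic. The names (`lanes`, `REGISTER_BITS`) are made up for illustration, and the result is only an upper bound: it ignores memory bandwidth and any clock drop under AVX load.

```python
# Lanes per SIMD register: register width divided by element width.
# REGISTER_BITS and lanes() are illustrative names, not from any library.

REGISTER_BITS = {"SSE2": 128, "AVX": 256}


def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements that fit in one vector register."""
    return register_bits // element_bits


for name, bits in REGISTER_BITS.items():
    print(f"{name}: {lanes(bits, 32)} floats or {lanes(bits, 64)} doubles per register")

# Ideal speedup from the wider registers alone; real code lands below
# this because of memory traffic and lower clocks under heavy AVX use.
ideal_speedup = lanes(256, 32) / lanes(128, 32)
print(f"ideal AVX/SSE2 speedup: {ideal_speedup:.1f}x")
```

So the best case is 2x per instruction, which fits a measured gain of "almost double".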

If you run 20 threads of AVX-intensive workload, it will probably go down to the base clocks.

People (overclockers) have done a lot of tests to show that AVX very significantly increases power draw and temperature for Haswells.
O.T.: A nice app to see the raw power of your CPU: http://www.numberworld.org/y-cruncher/#Download
It can use: AVX2, AVX, SSE4.1, SSE3, ...
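If you want to know which of those instruction sets your own CPU reports before downloading anything, something like the sketch below should work on Linux x86. It assumes the usual `flags` line in `/proc/cpuinfo`, where SSE3 shows up as `pni`; the function names are my own.

```python
# Sketch: report which SIMD extensions a Linux x86 machine advertises.
# Assumes the /proc/cpuinfo "flags" line; SSE3 is listed there as "pni".

WANTED = {"AVX2": "avx2", "AVX": "avx", "SSE4.1": "sse4_1", "SSE3": "pni"}


def parse_flags(cpuinfo_text: str) -> set:
    """Return the flag set of the first CPU entry in cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no flags line found


def supported(cpuinfo_text: str) -> dict:
    """Map each extension name to True/False based on the flags line."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in WANTED.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    for name, ok in supported(text).items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

On a non-Linux system the file is missing and everything prints "no", so treat that as "unknown" rather than "unsupported".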

Some records to break: http://www.numberworld.org/y-cruncher/#FastestTimes
OneTrickPony
Posts: 157
Joined: Tue Apr 30, 2013 1:29 am

Re: There are compilers and there are compilers

Post by OneTrickPony »

What CPU did you compile for?
i7 3770 (so a not-so-new quad), -march=core-avx-i in GCC