An AMD compiling hunch

Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

An AMD compiling hunch

Post by Dann Corbit »

I don't have my machine yet. (I could probably test on the one I bought for my youngest son, but then I would have to set up the whole compiler infrastructure, and it's just a Windows box.) Still, I think that most binaries for the new AMD 3xxx CPUs are probably being compiled badly, though not intentionally.
I guess that most people are compiling with -march=native, which is possibly a bad idea for these CPUs. I noticed, for instance, that in the Phoronix tests Stockfish performance dropped significantly when the -march=native switch was chosen. My theory of the problem goes as follows:
GCC probably does a simple query of the native machine to find the available instruction set and uses it wherever possible. On the AMD 3xxx this is probably a bad idea, because the AVX/BMI compiles are not as fast as the SSE3 compiles. So I suspect that with any third-generation AMD CPU, the right choice for architecture is:
-march=znver2

If anyone has one setup already, I would be interested to know if my hunch is correct.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
abulmo2
Posts: 433
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: An AMD compiling hunch

Post by abulmo2 »

You can check what options are activated with the following commands:

gcc -march=native -Q --help=target
gcc -march=znver2 -Q --help=target
On my computer (a first-generation Ryzen CPU), native and znver1 are identical, and the executable is significantly faster than with -march=generic.
Richard Delorme
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: An AMD compiling hunch

Post by Sesse »

It isn't really GCC's fault. Stockfish simply goes and asks the compiler (well, preprocessor) “does the CPU I'm compiling for support BMI2?”, and if so, sends BMI2 code to the compiler. GCC has no say in the matter once it has given a truthful answer. (Not that there's a good way to ask “does the CPU I'm compiling for have _fast_ BMI2?”…)
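
To make the point concrete, here is a minimal sketch of that kind of preprocessor check. It assumes GCC or Clang, which define __BMI2__ whenever BMI2 is enabled (for example through -mbmi2 or a -march value that implies it). The names extract_bits and pext_portable are made up for illustration; this is not Stockfish's actual code.

```cpp
#include <cstdint>
#if defined(__BMI2__)
#include <immintrin.h>  // provides _pext_u64 when BMI2 is enabled
#endif

// Hypothetical portable fallback: gathers the bits of `value` selected by
// `mask` into the low bits of the result, one mask bit per iteration.
inline std::uint64_t pext_portable(std::uint64_t value, std::uint64_t mask) {
    std::uint64_t result = 0;
    for (std::uint64_t out_bit = 1; mask != 0; out_bit <<= 1) {
        const std::uint64_t lowest = mask & (~mask + 1);  // lowest set mask bit
        if (value & lowest) {
            result |= out_bit;
        }
        mask &= mask - 1;  // clear the mask bit just handled
    }
    return result;
}

// The preprocessor can only answer "is BMI2 enabled for this build?".
// It cannot say whether PEXT is fast on the target CPU.
inline std::uint64_t extract_bits(std::uint64_t value, std::uint64_t mask) {
#if defined(__BMI2__)
    return _pext_u64(value, mask);      // hardware PEXT
#else
    return pext_portable(value, mask);  // software path
#endif
}
```

Built with -march=znver2, this takes the _pext_u64 branch, because that target truthfully reports BMI2; nothing in the macro says whether the instruction is fast.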
Eelco de Groot
Posts: 4561
Joined: Sun Mar 12, 2006 2:40 am

Re: An AMD compiling hunch

Post by Eelco de Groot »

For Stockfish, isn't the simplest option to compile with -modern instead of -BMI2? That avoids the slow PEXT/PDEP instructions that Gian-Carlo mentions in another post here. They are implemented (in Zen, I suppose, not in all recent AMD processors?), but Stockfish has a better solution that does the same job in software, and you don't confuse the compiler. It has worked so far with other AMD processors.
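
For anyone wondering what a software solution can look like, here is a rough sketch of the multiply-and-shift ("magic") idea: mask the occupancy, multiply by a constant, and shift, so no PEXT is needed. The constant below is a hand-picked one that works for the a-file only, assuming a1 is bit 0 and a8 is bit 56. The names are hypothetical and nothing here is taken from Stockfish's own tables.

```cpp
#include <cstdint>

// Squares a1..a8 as bits 0, 8, ..., 56 (assumed square-to-bit mapping).
constexpr std::uint64_t kFileAMask = 0x0101010101010101ULL;

// Main-diagonal constant, used here as an illustrative "magic" for this mask.
constexpr std::uint64_t kDiagonalMagic = 0x8040201008040201ULL;

// Multiply-and-shift index: gathers the eight a-file bits into one byte.
// The square on rank k (0-based) ends up at bit 7 - k of the result, so
// every distinct a-file occupancy gets a distinct index in 0..255.
constexpr unsigned file_a_index(std::uint64_t occupied) {
    return static_cast<unsigned>(
        ((occupied & kFileAMask) * kDiagonalMagic) >> 56);
}

static_assert(file_a_index(0) == 0, "empty file maps to index 0");
static_assert(file_a_index(kFileAMask) == 0xFF, "full file maps to 0xFF");
```

The bit order differs from what PEXT would return, which does not matter as long as the index is only used to look up a table built with the same function.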
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: An AMD compiling hunch

Post by Dann Corbit »

Yes, the modern flag makes the fastest code, apparently.
I suspect that the CPU does have other capabilities that native would enable.
Too bad there is a phony-baloney PEXT in the mix.
DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 10:11 pm

Re: An AMD compiling hunch

Post by DustyMonkey »

Are we also going to call instructions like XLAT "phony baloney" because they perform poorly (even on Intel)?

The idea that GCC doesn't have a say is wrong. The goal of the compiler should be to produce the fastest binary given the information it has. If the information comes from the "native" switch, then it should be doing more than just asking what instruction sets are supported. The fact that it doesn't is proof that it's not a very good optimizing compiler, which is why asmfish beats it significantly.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: An AMD compiling hunch

Post by mar »

DustyMonkey wrote: Thu Dec 12, 2019 5:54 am
The idea that GCC doesn't have a say is wrong. The goal of the compiler should be to produce the fastest binary given the information it has. If the information comes from the "native" switch, then it should be doing more than just asking what instruction sets are supported. The fact that it doesn't is proof that it's not a very good optimizing compiler, which is why asmfish beats it significantly.
if you tell the compiler to use pext via an intrinsic, then it will. that's what intrinsics are for
gcc is still the best optimizing compiler out there on average, whether you like it or not

the fact that hand-optimized assembly performs better shouldn't be surprising (how much faster is asmfish really, 30%?),
but the cost comes with a lot of extra effort involved (and always lagging behind latest SF patches)
considering that even the difference among the best optimizing compilers themselves can be around 5-10%, this is not a bad result at all
Martin Sedlak
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: An AMD compiling hunch

Post by Dann Corbit »

I expect that at some point AMD will put in real PEXT-type BMI instructions.
And I can control which things to include or exclude manually for now.
No big deal either way.
schack
Posts: 172
Joined: Thu May 27, 2010 3:32 am

Re: An AMD compiling hunch

Post by schack »

Sorry to ask a dumb question, but how could I compile Stockfish with this flag? I only know the typical ways to do -bmi2 and -modern.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: An AMD compiling hunch

Post by Dann Corbit »

-bmi2 uses pext
-modern only requires SSE instructions