Ryzen 2 and BMI2?

Joost Buijs · Post by **Joost Buijs** » Sat May 30, 2020 4:58 pm

Ozymandias wrote: ↑Sat May 30, 2020 9:59 am
Joost Buijs wrote: ↑Sat May 30, 2020 8:36 am
Zen2 is a nice processor, unfortunately it has some weaknesses.
Biggest weak spot so far: price.

Black Friday 2018: Ryzen 7 1700 for 165.99€ at Amazon.

Black Friday 2019: Ryzen 7 2700 for 149.99€ at Amazon.

Black Friday 2020: Ryzen 7 3700x for a similar price? If so, weakness removed.

Actually I don't care about price!

Gerd Isenberg · Post by **Gerd Isenberg** » Sat May 30, 2020 6:21 pm

Joost Buijs wrote: ↑Sat May 30, 2020 4:53 pm
Gerd Isenberg wrote: ↑Sat May 30, 2020 10:00 am
Gian-Carlo Pascutto wrote: ↑Fri May 29, 2020 10:40 pm
I tried to emulate PEXT in software and that runs faster than the native CPU instruction.
Do you have a fast implementation that you'd want to make public domain?
https://www.chessprogramming.org/BMI2

The serial implementations of PEXT and PDEP look quite similar.

Very well possible, but I'm sure I didn't get it from CPW because I never look there.

About five years ago I got this specific algorithm from somebody who has no connection with computer chess at all and claimed to be the original author, so I wonder what its origins are.

I only meant to say that the PEXT implementation of the Zen 2 is so bad that even software emulation runs faster. It's a pity, because on the AMD I have to replace PEXT with a series of masks and shifts, which performs clearly worse.

The routines are so obvious with some bit-twiddling experience that I would not claim ownership.

We had this discussion here in 2013, where the slightly modified pext/pdep routines came up
http://www.talkchess.com/forum3/viewtop ... 20&start=1
and were mentioned in wikispaces cpw in July 2013:
https://web.archive.org/web/20130706111 ... s.com/BMI2

I agree it is a shame that Zen is so slow with pext. AMD has to spend some transistors on a fast hardware pext in one cycle!

Best regards,
Gerd
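For readers who have not seen them, the serial PEXT and PDEP routines discussed above can be sketched roughly as follows. This is a minimal illustration in the style of the well-known bit-serial loop, not necessarily the exact code either poster used; the names `pext_soft` and `pdep_soft` are made up for this sketch.

```cpp
#include <cstdint>

// Software PEXT: gather the bits of `val` selected by `mask` into the
// low bits of the result. One loop iteration per set bit in `mask`.
inline std::uint64_t pext_soft(std::uint64_t val, std::uint64_t mask) {
    std::uint64_t res = 0;
    for (std::uint64_t bb = 1; mask != 0; bb += bb) {
        const std::uint64_t lowest = mask & (~mask + 1); // lowest set mask bit
        if (val & lowest) res |= bb;                     // copy it to the next result bit
        mask &= mask - 1;                                // clear that mask bit
    }
    return res;
}

// Software PDEP: scatter the low bits of `val` to the positions selected
// by `mask`. The loop is the mirror image of pext_soft.
inline std::uint64_t pdep_soft(std::uint64_t val, std::uint64_t mask) {
    std::uint64_t res = 0;
    for (std::uint64_t bb = 1; mask != 0; bb += bb) {
        const std::uint64_t lowest = mask & (~mask + 1); // lowest set mask bit
        if (val & bb) res |= lowest;                     // deposit the next source bit there
        mask &= mask - 1;                                // clear that mask bit
    }
    return res;
}
```

The cost is proportional to the number of set bits in the mask, which is why sparse masks (such as sliding-piece occupancy masks) are cheap to handle this way.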

Ozymandias · Post by **Ozymandias** » Sat May 30, 2020 6:23 pm

Joost Buijs wrote: ↑Sat May 30, 2020 4:58 pm
Actually I don't care about price!

But you'll agree most people do.

BTW, hi there.

Joost Buijs · Post by **Joost Buijs** » Sat May 30, 2020 7:34 pm

Gerd Isenberg wrote: ↑Sat May 30, 2020 6:21 pm
Joost Buijs wrote: ↑Sat May 30, 2020 4:53 pm
Gerd Isenberg wrote: ↑Sat May 30, 2020 10:00 am
Gian-Carlo Pascutto wrote: ↑Fri May 29, 2020 10:40 pm
I tried to emulate PEXT in software and that runs faster than the native CPU instruction.
Do you have a fast implementation that you'd want to make public domain?
https://www.chessprogramming.org/BMI2

The serial implementations of PEXT and PDEP look quite similar.

Very well possible, but I'm sure I didn't get it from CPW because I never look there.

About five years ago I got this specific algorithm from somebody who has no connection with computer chess at all and claimed to be the original author, so I wonder what its origins are.

I only meant to say that the PEXT implementation of the Zen 2 is so bad that even software emulation runs faster. It's a pity, because on the AMD I have to replace PEXT with a series of masks and shifts, which performs clearly worse.

The routines are so obvious with some bit-twiddling experience that I would not claim ownership.

We had this discussion here in 2013, where the slightly modified pext/pdep routines came up
http://www.talkchess.com/forum3/viewtop ... 20&start=1
and were mentioned in wikispaces cpw in July 2013:
https://web.archive.org/web/20130706111 ... s.com/BMI2

I agree it is a shame that Zen is so slow with pext. AMD has to spend some transistors on a fast hardware pext in one cycle!

Best regards,
Gerd

Sorry, but I didn't know that. I got it from somebody on a C++ programming forum; I think it was a few months after I built my 5960X PC, sometime late in 2014.

Recently I built a 3970X PC and was surprised to see how badly my engine performed on this machine. I use PEXT in several places: move generation, some parts of the evaluation function, and calculating the index for my material-balance table. The engine ran (single core) about twice as slow as on my Intel 6950X PC. So I started fiddling, first with the PEXT emulation routine, which already gave somewhat better results (also very surprising, BTW). Finally I had to replace everything with dedicated routines to (almost) get the old performance back.
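To illustrate what a "dedicated routine" can look like: when the mask is a contiguous run of bits, PEXT reduces to a single shift and mask. The sketch below is an assumption-laden example, not Joost's actual code. It uses the six inner squares of a rank, assumes a bitboard layout where a1 is bit 0 and h1 is bit 7, and the names `rank_mask`, `rank_index` and `pext_ref` are hypothetical.

```cpp
#include <cstdint>

// Reference bit-serial PEXT, included only to check the shortcut below.
inline std::uint64_t pext_ref(std::uint64_t val, std::uint64_t mask) {
    std::uint64_t res = 0;
    for (std::uint64_t bb = 1; mask != 0; bb += bb) {
        if (val & mask & (~mask + 1)) res |= bb; // test lowest set mask bit
        mask &= mask - 1;                        // clear it
    }
    return res;
}

// Mask of the six inner squares (files b..g) of the given rank, 0..7.
// Assumes a1 = bit 0, h1 = bit 7, a2 = bit 8, and so on.
inline std::uint64_t rank_mask(unsigned rank) {
    return 0x7EULL << (8 * rank);
}

// Dedicated replacement for pext(occ, rank_mask(rank)): because the mask
// is one contiguous run of bits, a shift followed by a 6-bit mask yields
// the same dense index without any loop or PEXT instruction.
inline unsigned rank_index(std::uint64_t occ, unsigned rank) {
    const unsigned shift = 8 * rank + 1;
    return static_cast<unsigned>((occ >> shift) & 0x3F);
}
```

Non-contiguous masks (files, diagonals, material signatures) need more work, typically several such shift-and-mask steps or a multiplication, which is where the extra cost comes from.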

Joost Buijs · Post by **Joost Buijs** » Sat May 30, 2020 7:43 pm

Ozymandias wrote: ↑Sat May 30, 2020 6:23 pm
Joost Buijs wrote: ↑Sat May 30, 2020 4:58 pm
Actually I don't care about price!
But you'll agree most people do.

BTW, hi there.

Hi, of course I do care about price, but I don't care if CPU-A costs 400 euro and CPU-B costs 500 euro, I'll just take the one that suits me the best.

Dann Corbit · Post by **Dann Corbit** » Sat May 30, 2020 9:48 pm

Maybe so, but unless it needs AVX512, I guess that my 3970x can keep up math wise with just about any IBM chip.
On the other hand, the BMI2 instructions sure would be nice for chess if they were done in silicon.
What they have done is worse than doing nothing. By that, I mean publishing through API that their hardware supports (for instance) PEXT, the compilers will generate it. That is why Stockfish builds for the native architecture STINK and SSE target builds perform well. Because AMD wanted a stupid little check box:
[x] BMI
So that people would think it is just as good as IBM hardware for that, but that lie is far, far worse than doing nothing.
They should either RIP OUT the terrible microcode they put in to "support" BMI, or they should implement it correctly in silicon. Nothing in between.

yurikvelo · Post by **yurikvelo** » Sun May 31, 2020 4:02 pm

By that, I mean publishing through API that their hardware supports (for instance) PEXT, the compilers will generate it.

There are many software vendors (proprietary software with binaries) which just ignore the existence of non-Intel CPUs.
They used to look for the string "GenuineIntel", but after some court decisions were forced to check only the instruction set, not the CPU brand name.

syzygy · Post by **syzygy** » Mon Jun 01, 2020 2:26 am

Dann Corbit wrote: ↑Sat May 30, 2020 9:48 pm Maybe so, but unless it needs AVX512, I guess that my 3970x can keep up math wise with just about any IBM chip.

Don't confuse Intel with IBM...

On the other hand, the BMI2 instructions sure would be nice for chess if they were done in silicon.
What they have done is worse than doing nothing. By that, I mean publishing through API that their hardware supports (for instance) PEXT, the compilers will generate it. That is why Stockfish builds for the native architecture STINK and SSE target builds perform well.

I doubt that a compiler like gcc generates pdep/pext instructions when compiling for the Ryzen architecture (in fact, I doubt that gcc ever generates them except where the code explicitly invokes them).

I do agree there is a problem though. Builds that use the pext/pdep instructions do work on Ryzen but perform horribly. It would be better if they crashed immediately so the user would know he should use another build.

I guess in theory an engine compiled to use pext/pdep could check whether it is running on Ryzen and exit with an error message...
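A rough sketch of such a startup check, for GCC or Clang on x86, might look like the following. The assumptions are mine, not from the thread: the CPU is treated as "slow PEXT" when the vendor string is "AuthenticAMD", the CPUID family is 17h (which covers Zen, Zen+ and Zen 2), and the BMI2 feature bit is set. The function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helper header for the CPUID instruction
#endif

// Decode the family from CPUID leaf 1 EAX: the extended family field is
// added only when the base family field is 0xF.
inline unsigned cpu_family(std::uint32_t eax) {
    const unsigned base = (eax >> 8) & 0xF;
    const unsigned ext  = (eax >> 20) & 0xFF;
    return base == 0xF ? base + ext : base;
}

// Assumption of this sketch: AMD family 17h parts that advertise BMI2
// execute PEXT/PDEP slowly.
inline bool pext_is_slow(const char* vendor, unsigned family, bool has_bmi2) {
    return has_bmi2 && std::strcmp(vendor, "AuthenticAMD") == 0 && family == 0x17;
}

// Query the running CPU. Returns false on non-x86 targets or when a
// required CPUID leaf is unavailable.
inline bool running_on_slow_pext_cpu() {
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return false;
    char vendor[13] = {};                 // 12 characters plus terminator
    std::memcpy(vendor + 0, &ebx, 4);     // vendor string order: EBX, EDX, ECX
    std::memcpy(vendor + 4, &edx, 4);
    std::memcpy(vendor + 8, &ecx, 4);
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return false;
    const unsigned family = cpu_family(eax);
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return false;
    const bool has_bmi2 = ((ebx >> 8) & 1u) != 0;  // leaf 7, subleaf 0, EBX bit 8
    return pext_is_slow(vendor, family, has_bmi2);
#else
    return false;
#endif
}
```

A PEXT build could call `running_on_slow_pext_cpu()` once at startup and, if it returns true, print a message recommending a non-PEXT build and exit.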

mvanthoor · Post by **mvanthoor** » Mon Jun 01, 2020 3:31 am

This problem is the reason I'm not going to build a 3950X computer on a higher-end B550 board. I'd rather have a 109?0X series Intel (at least the 12-core version) on the old X299 chipset, but the 10X series CPUs are nowhere in sight in the Netherlands, last time I looked. Especially the 10980X.

Still, it's the old story with AMD.... They do something incredible, and then have some massive handicap that prevents me from even considering them. (Same with ATI... In that respect, they're good for one another.) I had one AMD/VIA based computer almost 20 years ago (Thunderbird 1400 CPU), and it took me half a year with just THAT driver for the sound card, SUCH driver for the main board and THIS setting to get it to run stable. And even then it ran incredibly hot and needed a positively huge cooler.

Since then I never had an AMD computer again.

Ryzen is tempting with its price for speed, but I refuse, because support for BMI2 and PEXT is so bad; and I know I'll eventually want to go and use those instructions.

I also wonder if Intel is going to release a successor for the X299, keeping the socket so the 10X CPUs can be used, or if they introduce a new socket, chipset, and 11X series in 2021. In that case, the 10X series may actually turn out to be a paper launch.

Dann Corbit · Post by **Dann Corbit** » Mon Jun 01, 2020 3:41 am

syzygy wrote: ↑Mon Jun 01, 2020 2:26 am
I do agree there is a problem though. Builds that use the pext/pdep instructions do work on Ryzen but perform horribly. It would be better if they crashed immediately so the user would know he should use another build.

If you compile Stockfish using gcc with -march=znver2, it runs much slower.
Also, -mtune=native is OK, but -march=native is not.
The best gcc build for my machine is to give it the SSE flags and that is all.
Though I admit I only tried it with the older version 9 compiler; I guess that it might be fixed now with version 10.
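One way to keep the choice out of the compiler flags, sketched here under my own assumptions rather than taken from Stockfish, is to gate the hardware instruction behind an explicit opt-in macro instead of a feature macro such as `__BMI2__` (which GCC defines whenever BMI2 is enabled, for example by -march=native on a BMI2-capable CPU). `USE_HW_PEXT` and `extract_bits` are hypothetical names.

```cpp
#include <cstdint>

#if defined(USE_HW_PEXT)  // hypothetical opt-in, e.g. -DUSE_HW_PEXT -mbmi2
#include <immintrin.h>

// Hardware path: only compiled when the builder explicitly asks for it.
inline std::uint64_t extract_bits(std::uint64_t val, std::uint64_t mask) {
    return _pext_u64(val, mask);
}
#else

// Portable fallback: bit-serial extraction, one iteration per mask bit.
inline std::uint64_t extract_bits(std::uint64_t val, std::uint64_t mask) {
    std::uint64_t res = 0;
    for (std::uint64_t bb = 1; mask != 0; bb += bb) {
        if (val & mask & (~mask + 1)) res |= bb; // test lowest set mask bit
        mask &= mask - 1;                        // clear it
    }
    return res;
}
#endif
```

With this arrangement a build tuned for Zen 2 can still use -march=native for everything else while staying on the fallback path, and an Intel build opts in with the extra define.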
