- apply the child moves
- calculate new hashes
- evaluate child positions
- detect legal moves for child positions
Then for the next (n-1) iterations through the loop, that work is already done. But if the loop exits early, some of that work is wasted: up to (n-1) child positions' worth. The idea is that (hopefully) the SIMD path is more efficient even after paying for the wasted calculations. So does it work?
Yes


I have only done a few minutes of testing, but comparing the scalar x64 path to the 2-way SSE path:
- the SSE path discards about 10% of its work because of early loop exits (or an odd move list length)
- even so, the search is about 10% faster overall
The next test will be 4-way SIMD using AVX2. I have no idea where this one will land.
If you'd like to see the (now uglier) code, it's on GitHub here. Look in src/engine.h, around line 450, the "while( movesTried < ..." loop in NegaMax().
(The Linux build is broken again, I have to go fiddle with some declarations)
Cheers,