Comparison of all known Sliding lookup algorithms

Mergi · Post by **Mergi** » Sun Jun 05, 2022 2:26 pm

12th Gen Intel(R) Core(TM) i5-12600
Verify Engines...OK
Algorithm                                Million Queen/s

Implementation Comparison
Pext NoInline                            207.99
Pext Delegate                            186.19
Pext Inlined                             428.51
Pext Manual Inlining                     507.43
Pext Dynamic Compilation                 122.04
Pext Pinvoke C++                         436.45

CSharp Native Code
Switch                                   19.92
ObstructionDiff                          189.55
Leorik                                   208.32
HyperbolaQsc                             145.39
Pext Inlined                             352.81

The variance is all over the place though.

dangi12012 · Post by **dangi12012** » Sun Jun 05, 2022 3:06 pm

Mergi wrote: ↑Sun Jun 05, 2022 2:26 pm The variance is all over the place though.

Agreed will push updates today that increases sampling because 428 and 352 are not in the same ballpark. (Same code)
Also fancy magic will be there soon too.
Thx for sharing Intel results

dangi12012 · Post by **dangi12012** » Sun Jun 05, 2022 7:43 pm

Mergi wrote: ↑Sun Jun 05, 2022 2:26 pm The variance is all over the place though.

These numbers seem very high. Could you try again with latest commit?
I added a build.bat which does dotNET ahead of time compilation.

Added FancyMagic - Consistent Results

With this I now get these results:

Code: Select all

AMD Ryzen 9 5950X 16-Core Processor
Verify Engines...OK
Algorithm                                Million Queen/s

CSharp Native Code
Switch                                   28,53
ObstructionDiff                          179,56
Leorik                                   200,71
HyperbolaQsc                             242,35
FancyMagic                               260,71
Pext Inlined                             247,15

Fancy Magic is fastest now for me. I did not add /unsafe for now but then we can do raw pointer shenanigans which should help a lot for fancy and pext. The findings bring me more than expected - just comparing the speed of two languages alone would be quite boring indeed.

Please share your results! (sampling size of 1 should be taken with caution)
https://github.com/Gigantua/Chess_Movegen

dangi12012 · Post by **dangi12012** » Sun Jun 05, 2022 8:26 pm

UPDATE! - Unsafe Variations of FancyMagic and PEXT

Unsafe as per the C# spec where this is the name for raw pointer access which is uncommon in C#.

This should also be a first for C# (in terms of performance and variety) and now yields these results:

Code: Select all

AMD Ryzen 9 5950X 16-Core Processor
Verify Engines...OK
Algorithm                                Million Queen/s

CSharp Native Code
Switch                                   26,47
ObstructionDiff                          183,22
Leorik                                   183,49
HyperbolaQsc                             237,71
FancyMagic                               260,60
FancyMagic Unsafe                        305,44
Pext                                     249,18
Pext Unsafe                              311,25

The results are very reproducable (+-3%)

So each unsafe variant is a good 20% faster than the safe code - but still less than half the speed of native C++.
I want to reiterate that speed is not a metric of the language itself but a metric for the compiler.

In fact C# could be faster than C++ since runtime configuration in JIT could become a constexpr for the compiler. C++ ahead of time compilers cannot infer what a uci option will be for example. A JIT language could compile the changed methods on the fly with march=native and turn variables into inline x86 constants. Another example: A program launch configfile that set something to zero or false. All the disabled codeparts would be zero cost.

Unsafe in C# allows us to remove one addition that usually is needed:

Code: Select all

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe ulong Rook(int sq, ulong occupy)
{
    return Entries[sq].atk_rook[Bmi2.X64.ParallelBitExtract(occupy, Entries[sq].mask_rook)];
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ulong Bishop(int sq, ulong occupy)
{
   return Attacks[Entries[sq].offset_bish + Bmi2.X64.ParallelBitExtract(occupy, Entries[sq].mask_bish)];
}

Mergi · Post by **Mergi** » Sun Jun 05, 2022 8:42 pm

dangi12012 wrote: ↑Sun Jun 05, 2022 7:43 pm These numbers seem very high. Could you try again with latest commit?

Code: Select all

12th Gen Intel(R) Core(TM) i5-12600
Verify Engines...OK
Algorithm                                Million Queen/s

CSharp Native Code
Switch                                   21.82
ObstructionDiff                          207.48
Leorik                                   216.13
HyperbolaQsc                             246.81
FancyMagic                               398.16
FancyMagic Unsafe                        470.46
Pext                                     374.46
Pext Unsafe                              494.03

Here you go, seems to be the same. Building through the .bat file didn't seem to do anything for the performance though, was the same if i just built it through visual studio.

dangi12012 · Post by **dangi12012** » Sun Jun 05, 2022 9:00 pm

UPDATE! - Unsafe Variations of FancyMagic and PEXT V2

I just realized - everytime a [] indexer is used the C# runtime does a bounds check. Now moving the masks to unsafe memory as well as forcing sequential layout:

Code: Select all

[StructLayout(LayoutKind.Sequential)]
        public struct PextEntry
        {
            public ulong* atk_rook;
            public ulong* atk_bish;
            public ulong mask_rook;
            public ulong mask_bish;
         }

New reference performance can be achieved!

Since the main idea of all fast sliding algorithms is to be branchless the C# runtime does lay stones in our way by default by branching on every simple indexer...

Code: Select all

CSharp Native Code
Switch                                   26,29
ObstructionDiff                          183,91
Leorik                                   182,70
HyperbolaQsc                             237,76
FancyMagic                               265,46
FancyMagic Unsafe                        430,12
Pext                                     248,55
Pext Unsafe                              515,80

Here it can be seen how above ideas translate into C#
https://github.com/Gigantua/Chess_Moveg ... 40ad3f5beb

So Manual inlining - Direct memory access - Bypassing the C# runtime now gives us something that is many times faster than anything previously!

Mergi · Post by **Mergi** » Sun Jun 05, 2022 9:17 pm

Code: Select all

12th Gen Intel(R) Core(TM) i5-12600
Verify Engines...OK
Algorithm                                Million Queen/s

CSharp Native Code
Switch                                   21.97
ObstructionDiff                          208.04
Leorik                                   217.45
HyperbolaQsc                             244.37
FancyMagic                               378.90
FancyMagic Unsafe                        707.59
Pext                                     376.52
Pext Unsafe                              801.99

Ok, here's the updated results. That last patch was a nice performance bump!

Mergi · Post by **Mergi** » Sun Jun 05, 2022 9:19 pm

I wonder how the difference especially for "fancy magic unsafe" is so huge between the AMD and Intel processor though.

And it's not like i got a top of the line model either, the 12600 is more of a budget mid range one.

Ras · Post by **Ras** » Sun Jun 05, 2022 10:02 pm

Mergi wrote: ↑Sun Jun 05, 2022 9:19 pmI wonder how the difference especially for "fancy magic unsafe" is so huge between the AMD and Intel processor though.

Because Intel is more into unsafe processors to begin with?

lithander · Post by **lithander** » Mon Jun 06, 2022 1:19 pm

Code: Select all

AMD Ryzen 5 3600 6-Core Processor
Verify Engines...OK
Algorithm                                Million Queen/s

CSharp Native Code
Switch                                   16,55
ObstructionDiff                          142,80
Leorik                                   152,03
HyperbolaQsc                             200,27
FancyMagic                               212,88
FancyMagic Unsafe                        381,78
Pext                                     33,87
Pext Unsafe                              37,89

Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms

Re: Comparison of all known Sliding lookup algorithms