Kogge Stone, Vector Based

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
Post Reply
smatovic
Posts: 611
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Kogge Stone, Vector Based

Post by smatovic » Tue Jan 22, 2013 11:30 pm

Maybe it is of interest for someone,
the 8 directions for Kogge-Stone are done via an ulong8 vector datatype,
on my gpu it performs as fast as magic bitboards...

Code: Select all


        // generate pseudo legal moves

        while( bbWork )

            bbTemp = 0;
            bbMoves = 0;

            // pop 1st bit
            pos     = pop_1st_bit(bbWork);

            piece   = getPiece(board, pos);


            // all 8 directions in parrallel via vector ulong8
            pro8 = ~bbBlockers;
            gen8 = SetMaskBB[pos];
            r8 = shift8[piece];
            pro8 &= avoidWrap8[piece];

            // do kogge stone
            gen8 |= pro8 & (&#40;gen8 << r8&#41; | &#40;gen8 >> &#40;64-r8&#41;));
            pro8 &=       (&#40;pro8 << r8&#41; | &#40;pro8 >> &#40;64-r8&#41;));
            gen8 |= pro8 & (&#40;gen8 << 2*r8&#41; | &#40;gen8 >> &#40;64-2*r8&#41;));
            pro8 &=       (&#40;pro8 << 2*r8&#41; | &#40;pro8 >> &#40;64-2*r8&#41;));
            gen8 |= pro8 & (&#40;gen8 << 4*r8&#41; | &#40;gen8 >> &#40;64-4*r8&#41;));

            // Shift one for Captures
            gen8 = (&#40;gen8 << r8&#41; | &#40;gen8 >> &#40;64-r8&#41;)) & avoidWrap8&#91;piece&#93;;
            bbTemp = gen8.s0 | gen8.s1 | gen8.s2 | gen8.s3 | gen8.s4 | gen8.s5 | gen8.s6 | gen8.s7;     

            // Verify
            bbTemp &= ( piece == PAWN&#41; ? &#40;PawnAttackTables&#91;som*64+pos&#93; & bbOpposite&#41;  | &#40;PawnAttackTables&#91;&#40;som+2&#41;*64+pos&#93; & ~bbBlockers&#41;  &#58; AttackTables&#91;som*6*64+piece*64+pos&#93; ;

            // Pawn double square
            if ( piece == PAWN && ( ( som == WHITE && &#40;pos>>3&#41; == 1&#41; || &#40;som == BLACK && &#40;pos>>3&#41; == 6 ) ) ) &#123;
                to  = &#40;som == BLACK&#41;? pos-8  &#58; pos+8;
                cpt = &#40;som == BLACK&#41;? pos-16 &#58; pos+16;
                if (   (~bbBlockers & SetMaskBB&#91;to&#93;) && (~bbBlockers & SetMaskBB&#91;cpt&#93;) )
                    bbTemp |= SetMaskBB&#91;cpt&#93;;
            &#125;

            // Captures
            bbMoves  = &#40;bbTemp & bbOpposite&#41;;
            // Non cpatures
            bbMoves |= &#40;qs == 0&#41;? bbTemp & ~bbBlockers &#58; 0 ;


full code:
https://github.com/smatovic/Zeta/tree/z ... koggestone


--
Srdja

Gerd Isenberg
Posts: 2113
Joined: Wed Mar 08, 2006 7:47 pm
Location: Hattingen, Germany

Re: Kogge Stone, Vector Based

Post by Gerd Isenberg » Wed Jan 23, 2013 4:10 pm

Wow!

You process a vector of 8 bitboards with individial shifts per SIMD instruction, i.e. {9, 1, -7, -8, -9, -1, 7, 8} for a queen - positive amounts shifting left, negative shifting right, but you still need the generalized Kogge-Stone with rotate, since shifting left by negative amount is not shifting right, but shifing left with amount and 63 (mod 64).

Do you have any links to the concrete SIMD instruction set?

Gerd

smatovic
Posts: 611
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: Kogge Stone, Vector Based

Post by smatovic » Wed Jan 23, 2013 6:19 pm

the idea for the generalized shift is taken from the chessprogramming wiki,
Do you have any links to the concrete SIMD instruction set?
For Nvidia there is the PTX instruction set:

www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf

for AMD CAL

http://coachk.cs.ucf.edu/courses/CDA693 ... amming.pdf

--
Srdja

smatovic
Posts: 611
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: Kogge Stone, Vector Based

Post by smatovic » Wed Jan 23, 2013 6:49 pm

Just tested the idea with the AMD OpenCL compiler,
it doesn't work.

Code: Select all

Error&#58; Building Program &#40;clBuildProgram&#41;
, status&#58;-11buildlog&#58; "/tmp/OCLfv1xCw.cl", line 1357&#58; error&#58; vector operation requires identical
          element types
              gen8 |= pro8 & (&#40;gen8 << r8&#41; | &#40;gen8 >> &#40;64-r8&#41;));
                                    ^

--
Srdja

Gerd Isenberg
Posts: 2113
Joined: Wed Mar 08, 2006 7:47 pm
Location: Hattingen, Germany

Re: Kogge Stone, Vector Based

Post by Gerd Isenberg » Wed Jan 23, 2013 7:37 pm

smatovic wrote: For Nvidia there is the PTX instruction set:

www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf
Thanks. Pity - PTX has no rotate instruction. It is called SIMT (single-instruction, multiple-thread) rather than SIMD. So let say a shl.b64 a, b instruction is executed in 8 treads, each operating on corresponding elements of the vectors?

smatovic
Posts: 611
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: Kogge Stone, Vector Based

Post by smatovic » Wed Jan 23, 2013 7:50 pm

So let say a shl.b64 a, b instruction is executed in 8 treads, each operating on corresponding elements of the vectors?
Afaik, yes.

On Nvidia Fermi devices 32 threads are executed in SIMT fashion in one "Warp" over 4 clock cycles.

--
Srdja

smatovic
Posts: 611
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: Kogge Stone, Vector Based, 2*4 directions, ulong4

Post by smatovic » Mon Jan 28, 2013 5:03 pm

another solution is to split left and right shifts in 2*4 directions via ulong4.....


Code: Select all

__constant ulong4 shift4&#91;&#93; =

&#123;
&#123;     0, 0, 0, 0 &#125;,                      // EMTPY
&#123;     9, 0, 7, 8 &#125;,                     // PAWN
&#123;    17,10, 6,15 &#125;,                    // KNIGTH
&#123;     9, 1, 7, 8 &#125;,                   // KING
&#123;     9, 0, 7, 0 &#125;,                  // BISHOP
&#123;     0, 1, 0, 8 &#125;,                 // ROOK
&#123;     9, 1, 7, 8 &#125;                 // QUEEN
&#125;;    

Code: Select all

     
            // 2*4 directions via ulong4, left and right shifting
            s4 = shift4&#91;&#40;piece>>1&#41;&#93;;

            // 1*4 directions, left shift, in parrallel via vector ulong4
            pro4 = ~bbBlockers;
            gen4 = SetMaskBB&#91;pos&#93;;

            // do kogge stone
            gen4 |= pro4 & &#40;gen4 << s4&#41;;
            pro4 &=        &#40;pro4 << s4&#41;;
            gen4 |= pro4 & &#40;gen4 << 2*s4&#41;;
            pro4 &=        &#40;pro4 << 2*s4&#41;;
            gen4 |= pro4 & &#40;gen4 << 4*s4&#41;;

            // Shift one further
            gen4 =  &#40;gen4 << s4&#41;;
            bbTemp = gen4.s0 | gen4.s1 | gen4.s2 | gen4.s3;     

            // 1*4 directions, right shift, in parrallel via vector ulong4
            pro4 = ~bbBlockers;
            gen4 = SetMaskBB&#91;pos&#93;;

            // do kogge stone
            gen4 |= pro4 & &#40;gen4 >> s4&#41;;
            pro4 &=        &#40;pro4 >> s4&#41;;
            gen4 |= pro4 & &#40;gen4 >> 2*s4&#41;;
            pro4 &=        &#40;pro4 >> 2*s4&#41;;
            gen4 |= pro4 & &#40;gen4 >> 4*s4&#41;;

            // Shift one further
            gen4 =  &#40;gen4 >> s4&#41;;
            bbTemp |= gen4.s0 | gen4.s1 | gen4.s2 | gen4.s3;     

            // Verify
            bbTemp &= ( &#40;piece>>1&#41; == PAWN&#41; ? &#40;PawnAttackTables&#91;som*64+pos&#93; & bbOpposite&#41;  | &#40;PawnAttackTables&#91;&#40;som+2&#41;*64+pos&#93; & ~bbBlockers&#41;  &#58; AttackTables&#91;&#40;som*7*64&#41;+(&#40;piece>>1&#41;*64&#41;+pos&#93; ;

            // Captures
            bbMoves  = &#40;bbTemp & bbOpposite&#41;;
            // Non cpatures
            bbMoves |= &#40;qs == 0&#41;? bbTemp & ~bbBlockers &#58; 0 ;


--
Srdja

smatovic
Posts: 611
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: Kogge Stone, Vector Based, 2*4 directions, ulong4

Post by smatovic » Wed Jan 30, 2013 1:08 pm

argh, i forgot the wraps....but the example shows how the idea works...

--
Srdja

smatovic
Posts: 611
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: Kogge Stone, Vector Based, 2*4 directions...update

Post by smatovic » Mon Jun 06, 2016 5:04 pm

just for the record...

the different shifts did not perform well on the gpu,
so i redesigned the move gen a bit,
the same code is run for all pieces,
except castle, en passant and pawn promotion moves.

OepnCL pseudo code:

Code: Select all

    // ####################################################################
    // ### generate all moves except castle, en passant and pawn promo  ### 
    // ####################################################################
  __private bool qs = false;
  __private Bitboard bbTemp;
  __private Bitboard bbWork;
  __private Bitboard bbMove;
  __private Bitboard bbBlockers;
  __private Bitboard bbMe;
  __private Bitboard bbOpp;
  __private ulong4 bbWrap4;
  __private ulong4 bbPro4;
  __private ulong4 bbGen4;
...

for each $piece on $sq do

    bbWork  = SETMASKBB&#40;sq&#41;;
    // get koggestone wraps
    bbWrap4 = wraps4&#91;0&#93;;
    // generator and propagator &#40;piece and empty squares&#41;
    bbGen4  = &#40;ulong4&#41;bbWork;
    bbPro4  = &#40;ulong4&#41;(~bbBlockers&#41;;
    // kogge stone shift left via ulong4 vector
    bbPro4 &= bbWrap4;
    bbGen4 |= bbPro4  & &#40;bbGen4 << shift4&#41;;
    bbPro4 &=           &#40;bbGen4 << shift4&#41;;
    bbGen4 |= bbPro4  & &#40;bbGen4 << 2*shift4&#41;;
    bbPro4 &=           &#40;bbGen4 << 2*shift4&#41;;
    bbGen4 |= bbPro4  & &#40;bbGen4 << 4*shift4&#41;;
    bbGen4  = bbWrap4 & &#40;bbGen4 << shift4&#41;;
    bbTemp |= bbGen4.s0|bbGen4.s1|bbGen4.s2|bbGen4.s3;
    // get koggestone wraps
    bbWrap4 = wraps4&#91;1&#93;;
    // set generator and propagator &#40;piece and empty squares&#41;
    bbGen4  = &#40;ulong4&#41;bbWork;
    bbPro4  = &#40;ulong4&#41;(~bbBlockers&#41;;
    // kogge stone shift right via ulong4 vector
    bbPro4 &= bbWrap4;
    bbGen4 |= bbPro4  & &#40;bbGen4 >> shift4&#41;;
    bbPro4 &=           &#40;bbGen4 >> shift4&#41;;
    bbGen4 |= bbPro4  & &#40;bbGen4 >> 2*shift4&#41;;
    bbPro4 &=           &#40;bbGen4 >> 2*shift4&#41;;
    bbGen4 |= bbPro4  & &#40;bbGen4 >> 4*shift4&#41;;
    bbGen4  = bbWrap4 & &#40;bbGen4 >> shift4&#41;;
    bbTemp |= bbGen4.s0|bbGen4.s1|bbGen4.s2|bbGen4.s3;
    // consider knights
    bbTemp  = &#40;piece==KNIGHT&#41;?BBFULL&#58;bbTemp;
    // verify captures
    n       = &#40;piece==PAWN&#41;?stm&#58;piece;
    bbWork  = AttackTables&#91;n*64+sq&#93;;
    bbMove  = bbWork&bbTemp&bbOpp;
    // verify non captures
    bbWork  = &#40;piece==PAWN&#41;?&#40;AttackTablesPawnPushes&#91;stm*64+sq&#93;)&#58;bbWork;
    bbMove |= &#40;qs&#41;?BBEMPTY&#58;&#40;bbWork&bbTemp&~bbBlockers&#41;;

end

here the needed consta&#324;t tables

Code: Select all

// move generator costants
// wraps for kogge stone shifts
__constant ulong4 shift4 = &#40;ulong4&#41;( 9, 1, 7, 8&#41;;

__constant ulong4 wraps4&#91;2&#93; =
&#123;
  // wraps shift left
  &#40;ulong4&#41;(
            0xfefefefefefefe00, // <<9
            0xfefefefefefefefe, // <<1 */
            0x7f7f7f7f7f7f7f00, // <<7 */
            0xffffffffffffff00  // <<8 */
          ),

  // wraps shift right
  &#40;ulong4&#41;(
            0x007f7f7f7f7f7f7f, // >>9
            0x7f7f7f7f7f7f7f7f, // >>1 
            0x00fefefefefefefe, // >>7
            0x00ffffffffffffff  // >>8
          )
&#125;;
__constant Bitboard AttackTablesPawnPushes&#91;2*64&#93; = 
&#123;
  /* white pawn pushes */
  0x100,0x200,0x400,0x800,0x1000,0x2000,0x4000,0x8000,0x1010000,0x2020000,0x4040000,0x8080000,0x10100000,0x20200000,0x40400000,0x80800000,0x1000000,0x2000000,0x4000000,0x8000000,0x10000000,0x20000000,0x40000000,0x80000000,0x100000000,0x200000000,0x400000000,0x800000000,0x1000000000,0x2000000000,0x4000000000,0x8000000000,0x10000000000,0x20000000000,0x40000000000,0x80000000000,0x100000000000,0x200000000000,0x400000000000,0x800000000000,0x1000000000000,0x2000000000000,0x4000000000000,0x8000000000000,0x10000000000000,0x20000000000000,0x40000000000000,0x80000000000000,0x100000000000000,0x200000000000000,0x400000000000000,0x800000000000000,0x1000000000000000,0x2000000000000000,0x4000000000000000,0x8000000000000000,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,
  /* black pawn pushes */
  0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x1,0x2,0x4,0x8,0x10,0x20,0x40,0x80,0x100,0x200,0x400,0x800,0x1000,0x2000,0x4000,0x8000,0x10000,0x20000,0x40000,0x80000,0x100000,0x200000,0x400000,0x800000,0x1000000,0x2000000,0x4000000,0x8000000,0x10000000,0x20000000,0x40000000,0x80000000,0x100000000,0x200000000,0x400000000,0x800000000,0x1000000000,0x2000000000,0x4000000000,0x8000000000,0x10100000000,0x20200000000,0x40400000000,0x80800000000,0x101000000000,0x202000000000,0x404000000000,0x808000000000,0x1000000000000,0x2000000000000,0x4000000000000,0x8000000000000,0x10000000000000,0x20000000000000,0x40000000000000,0x80000000000000
&#125;;
/* piece attack tables */
__constant Bitboard AttackTables&#91;7*64&#93; = 
&#123;
  /* white pawn */
  0x200,0x500,0xa00,0x1400,0x2800,0x5000,0xa000,0x4000,0x20000,0x50000,0xa0000,0x140000,0x280000,0x500000,0xa00000,0x400000,0x2000000,0x5000000,0xa000000,0x14000000,0x28000000,0x50000000,0xa0000000,0x40000000,0x200000000,0x500000000,0xa00000000,0x1400000000,0x2800000000,0x5000000000,0xa000000000,0x4000000000,0x20000000000,0x50000000000,0xa0000000000,0x140000000000,0x280000000000,0x500000000000,0xa00000000000,0x400000000000,0x2000000000000,0x5000000000000,0xa000000000000,0x14000000000000,0x28000000000000,0x50000000000000,0xa0000000000000,0x40000000000000,0x200000000000000,0x500000000000000,0xa00000000000000,0x1400000000000000,0x2800000000000000,0x5000000000000000,0xa000000000000000,0x4000000000000000,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,
  /* black pawn */
  0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x2,0x5,0xa,0x14,0x28,0x50,0xa0,0x40,0x200,0x500,0xa00,0x1400,0x2800,0x5000,0xa000,0x4000,0x20000,0x50000,0xa0000,0x140000,0x280000,0x500000,0xa00000,0x400000,0x2000000,0x5000000,0xa000000,0x14000000,0x28000000,0x50000000,0xa0000000,0x40000000,0x200000000,0x500000000,0xa00000000,0x1400000000,0x2800000000,0x5000000000,0xa000000000,0x4000000000,0x20000000000,0x50000000000,0xa0000000000,0x140000000000,0x280000000000,0x500000000000,0xa00000000000,0x400000000000,0x2000000000000,0x5000000000000,0xa000000000000,0x14000000000000,0x28000000000000,0x50000000000000,0xa0000000000000,0x40000000000000,
  /* knight */
  0x20400,0x50800,0xa1100,0x142200,0x284400,0x508800,0xa01000,0x402000,0x2040004,0x5080008,0xa110011,0x14220022,0x28440044,0x50880088,0xa0100010,0x40200020,0x204000402,0x508000805,0xa1100110a,0x1422002214,0x2844004428,0x5088008850,0xa0100010a0,0x4020002040,0x20400040200,0x50800080500,0xa1100110a00,0x142200221400,0x284400442800,0x508800885000,0xa0100010a000,0x402000204000,0x2040004020000,0x5080008050000,0xa1100110a0000,0x14220022140000,0x28440044280000,0x50880088500000,0xa0100010a00000,0x40200020400000,0x204000402000000,0x508000805000000,0xa1100110a000000,0x1422002214000000,0x2844004428000000,0x5088008850000000,0xa0100010a0000000,0x4020002040000000,0x400040200000000,0x800080500000000,0x1100110a00000000,0x2200221400000000,0x4400442800000000,0x8800885000000000,0x100010a000000000,0x2000204000000000,0x4020000000000,0x8050000000000,0x110a0000000000,0x22140000000000,0x44280000000000,0x88500000000000,0x10a00000000000,0x20400000000000,
  /* king */
  0x302,0x705,0xe0a,0x1c14,0x3828,0x7050,0xe0a0,0xc040,0x30203,0x70507,0xe0a0e,0x1c141c,0x382838,0x705070,0xe0a0e0,0xc040c0,0x3020300,0x7050700,0xe0a0e00,0x1c141c00,0x38283800,0x70507000,0xe0a0e000,0xc040c000,0x302030000,0x705070000,0xe0a0e0000,0x1c141c0000,0x3828380000,0x7050700000,0xe0a0e00000,0xc040c00000,0x30203000000,0x70507000000,0xe0a0e000000,0x1c141c000000,0x382838000000,0x705070000000,0xe0a0e0000000,0xc040c0000000,0x3020300000000,0x7050700000000,0xe0a0e00000000,0x1c141c00000000,0x38283800000000,0x70507000000000,0xe0a0e000000000,0xc040c000000000,0x302030000000000,0x705070000000000,0xe0a0e0000000000,0x1c141c0000000000,0x3828380000000000,0x7050700000000000,0xe0a0e00000000000,0xc040c00000000000,0x203000000000000,0x507000000000000,0xa0e000000000000,0x141c000000000000,0x2838000000000000,0x5070000000000000,0xa0e0000000000000,0x40c0000000000000,
  /* bishop */
  0x8040201008040200,0x80402010080500,0x804020110a00,0x8041221400,0x182442800,0x10204885000,0x102040810a000,0x102040810204000,0x4020100804020002,0x8040201008050005,0x804020110a000a,0x804122140014,0x18244280028,0x1020488500050,0x102040810a000a0,0x204081020400040,0x2010080402000204,0x4020100805000508,0x804020110a000a11,0x80412214001422,0x1824428002844,0x102048850005088,0x2040810a000a010,0x408102040004020,0x1008040200020408,0x2010080500050810,0x4020110a000a1120,0x8041221400142241,0x182442800284482,0x204885000508804,0x40810a000a01008,0x810204000402010,0x804020002040810,0x1008050005081020,0x20110a000a112040,0x4122140014224180,0x8244280028448201,0x488500050880402,0x810a000a0100804,0x1020400040201008,0x402000204081020,0x805000508102040,0x110a000a11204080,0x2214001422418000,0x4428002844820100,0x8850005088040201,0x10a000a010080402,0x2040004020100804,0x200020408102040,0x500050810204080,0xa000a1120408000,0x1400142241800000,0x2800284482010000,0x5000508804020100,0xa000a01008040201,0x4000402010080402,0x2040810204080,0x5081020408000,0xa112040800000,0x14224180000000,0x28448201000000,0x50880402010000,0xa0100804020100,0x40201008040201,
  /* rook */
  0x1010101010101fe,0x2020202020202fd,0x4040404040404fb,0x8080808080808f7,0x10101010101010ef,0x20202020202020df,0x40404040404040bf,0x808080808080807f,0x10101010101fe01,0x20202020202fd02,0x40404040404fb04,0x80808080808f708,0x101010101010ef10,0x202020202020df20,0x404040404040bf40,0x8080808080807f80,0x101010101fe0101,0x202020202fd0202,0x404040404fb0404,0x808080808f70808,0x1010101010ef1010,0x2020202020df2020,0x4040404040bf4040,0x80808080807f8080,0x1010101fe010101,0x2020202fd020202,0x4040404fb040404,0x8080808f7080808,0x10101010ef101010,0x20202020df202020,0x40404040bf404040,0x808080807f808080,0x10101fe01010101,0x20202fd02020202,0x40404fb04040404,0x80808f708080808,0x101010ef10101010,0x202020df20202020,0x404040bf40404040,0x8080807f80808080,0x101fe0101010101,0x202fd0202020202,0x404fb0404040404,0x808f70808080808,0x1010ef1010101010,0x2020df2020202020,0x4040bf4040404040,0x80807f8080808080,0x1fe010101010101,0x2fd020202020202,0x4fb040404040404,0x8f7080808080808,0x10ef101010101010,0x20df202020202020,0x40bf404040404040,0x807f808080808080,0xfe01010101010101,0xfd02020202020202,0xfb04040404040404,0xf708080808080808,0xef10101010101010,0xdf20202020202020,0xbf40404040404040,0x7f80808080808080,
  /* queen */
  0x81412111090503fe,0x2824222120a07fd,0x404844424150efb,0x8080888492a1cf7,0x10101011925438ef,0x2020212224a870df,0x404142444850e0bf,0x8182848890a0c07f,0x412111090503fe03,0x824222120a07fd07,0x4844424150efb0e,0x80888492a1cf71c,0x101011925438ef38,0x20212224a870df70,0x4142444850e0bfe0,0x82848890a0c07fc0,0x2111090503fe0305,0x4222120a07fd070a,0x844424150efb0e15,0x888492a1cf71c2a,0x1011925438ef3854,0x212224a870df70a8,0x42444850e0bfe050,0x848890a0c07fc0a0,0x11090503fe030509,0x22120a07fd070a12,0x4424150efb0e1524,0x88492a1cf71c2a49,0x11925438ef385492,0x2224a870df70a824,0x444850e0bfe05048,0x8890a0c07fc0a090,0x90503fe03050911,0x120a07fd070a1222,0x24150efb0e152444,0x492a1cf71c2a4988,0x925438ef38549211,0x24a870df70a82422,0x4850e0bfe0504844,0x90a0c07fc0a09088,0x503fe0305091121,0xa07fd070a122242,0x150efb0e15244484,0x2a1cf71c2a498808,0x5438ef3854921110,0xa870df70a8242221,0x50e0bfe050484442,0xa0c07fc0a0908884,0x3fe030509112141,0x7fd070a12224282,0xefb0e1524448404,0x1cf71c2a49880808,0x38ef385492111010,0x70df70a824222120,0xe0bfe05048444241,0xc07fc0a090888482,0xfe03050911214181,0xfd070a1222428202,0xfb0e152444840404,0xf71c2a4988080808,0xef38549211101010,0xdf70a82422212020,0xbfe0504844424140,0x7fc0a09088848281,
&#125;;
I implemented it also on cpu, but it is not competitive to magic bitboards...at least it should be SIMD friendly now...

--
Srdja

Post Reply