What's Strelka's Secret Sauce?

bob · Post by **bob** » Tue Jan 22, 2008 11:49 pm

Gerd Isenberg wrote:
bob wrote: I have all the bitboards indexed by wtm. IE pawns[wtm] and so forth. All other masks are indexed the same way in the evaluation. Random hash values are also done the same way... Makes things far cleaner since all the white/black stuff is replaced by one general case that will work with either side. I now have things like

score += EvaluateKnights(white) - EvaluateKnights(black)

same sort of thing for move generation, you just pass it the color and it uses the same code. Only tiny exception is for pawns since negative shifts won't work on the PC and I don't want to make it too ugly (yet). I am looking at ways to generate pawn moves without the special-case white/black code as well, although 99% of the pawn generation code is now not duplicated...

Bob
I see, the generalized shift by rotate-and can be combined with the wrap 'and' for set-wise pawn attacks. Thus the additional cost for a generalized << +-8 is one read plus 'and'.
Code: Select all
// positve left, negative right shifts
int shift[8] = {9, 1,-7,-8,-9,-1, 7, 8};
 
U64 avoidWrap[8] =
{
   0xfefefefefefefe00,
   0xfefefefefefefefe,
   0x00fefefefefefefe,
   0x00ffffffffffffff,
   0x007f7f7f7f7f7f7f,
   0x7f7f7f7f7f7f7f7f,
   0x7f7f7f7f7f7f7f00,
   0xffffffffffffff00,
};
 
U64 shiftOne (U64 b, int dir8) {
   return rotateLeft (b, shift[dir8]) & avoidWrap[dir8];
}

#if defined X86INTRINSICS
U64 rotateLeft (U64 x, int s) {return _rotl64(x, s);}
#else
// likely translated to rol by "smart" compilers
U64 rotateLeft (U64 x, int s) {return (x << s) | (x >> (64-s));}
#endif

As I said, I have not yet tackled that. I already have one kludge that I don't like, which is the need for LSB() for advanced white pieces, MSB() for advanced black pieces. I buried that in a macro to make the code read better. Bottom line (so far) is that 22.0 is faster than 21.7, yet the node counts match in every test position in my validation suite. The smaller size has helped, plus a little cleaning up here and there has also helped...

Never did understand why + shifts are ok, but - shifts are not. Most every other machine I have programmed would do it either way just fine...

Edsel Apostol · Post by **Edsel Apostol** » Wed Jan 23, 2008 2:31 am

Ok. I think I've got your point.

If those things could be fixed on Strelka then it could catch up with the latest Rybka.

By the way, when will be the next Crafty release?

Good luck with CCT. My engine will be playing there also.

bob · Post by **bob** » Wed Jan 23, 2008 5:38 pm

Edsel Apostol wrote:Ok. I think I've got your point.

If those things could be fixed on Strelka then it could catch up with the latest Rybka.

By the way, when will be the next Crafty release?

Good luck with CCT. My engine will be playing there also.

21.7 will be released immediately following CCT.

22.0 will not be that far behind, but there is a lot of rewriting going on to eliminate all the black/white code. It is painful, and will be even more painful when I get to the biggies which are things like EvaluatePawns(), EvaluatePassedPawns(), and such...

rreagan · Post by **rreagan** » Thu Jan 24, 2008 4:39 am

bob wrote:Only tiny exception is for pawns since negative shifts won't work on the PC and I don't want to make it too ugly (yet). I am looking at ways to generate pawn moves without the special-case white/black code as well, although 99% of the pawn generation code is now not duplicated...

Bob

I don't know how efficient this is, but here is something I have done to accomplish this.

Code: Select all

enum { white, black };
BitBoard shift_forward (BitBoard b, int color) {
    return (b << 8) >> (color << 4);
}
pawn_pushes = shift_forward(friendly_pawns, side_to_move);

Substituting a table lookup for (color << 4) would work also.

Gerd Isenberg · Post by **Gerd Isenberg** » Thu Jan 24, 2008 10:24 am

rreagan wrote: I don't know how efficient this is, but here is something I have done to accomplish this.
Code: Select all
enum { white, black };
BitBoard shift_forward (BitBoard b, int color) {
    return (b << 8) >> (color << 4);
}
pawn_pushes = shift_forward(friendly_pawns, side_to_move);
Substituting a table lookup for (color << 4) would work also.

This is fine for pawn pushes, with most significant rank mapping (rank bit 3-5, file bit 0-2) so that black pawns may not temporary disappear from the board - three shift instructions.

Code: Select all

shl  ecx, 4
shl  rax, 8
shr  rax, cl

wgarvin · Post by **wgarvin** » Fri Jan 25, 2008 12:34 am

I'm not sure if you are interested in platforms other than Intel, but... variable-length shifts are slow on certain platforms. The biggest surprise for me was the inorder PPC cores in the Xbox360--a variable-length shift instruction has a latency of 16 cycles on that thing.

Gerd Isenberg · Post by **Gerd Isenberg** » Fri Jan 25, 2008 8:59 am

wgarvin wrote:I'm not sure if you are interested in platforms other than Intel, but... variable-length shifts are slow on certain platforms. The biggest surprise for me was the inorder PPC cores in the Xbox360--a variable-length shift instruction has a latency of 16 cycles on that thing.

You are right - I implicitly assume variable shift by cl on core2 or K8/K10 x86-64 target cpus. core2 has 1 cycle latency, 0.5 cycles reciprocal throughput. K8 has 1 cycle latency, 1/3 cycles reciprocal throughput. Of course with 16 cycles latency, variable shift is a no no - and one better abjures general shift - or probably relies on something like this:

Code: Select all

U64 genShift(U64 x, int s) {
   return (s > 0) ? (x << s) : (x >> -s);
}

with some hope of speculative execution of both shifts with a conditional move instruction - or

Code: Select all

U64 genShift(U64 x, int s) {
   int isNegative = s >> 31; // arithmetical shift -> -1 if negative, else 0
   x >>= -s &  isNegative;
   x <<=  s & ~isNegative;
   return x;
}

wgarvin · Post by **wgarvin** » Fri Jan 25, 2008 11:17 pm

Another idea is to try and use rotate instructions instead of shift instructions (if the rest of the bitboard is already masked off, or something). The rotate would work if you give it a "negative" 2's complement number. It has some downsides though...it might be more expensive than a shift, and its hard to convince C compilers to generate rotate instructions on their own.

Anyway for this case you have here it seems likely that those 3 shifts are pretty good. You guys surely know more about the performance characteristics of these modern x86-64 chips than I do. The last x86 chip I knew a lot about was the P4. Times have changed, and I haven't been paying enough attention.

[Edit: reading up in the thread, seems you've already been over the rotate ground. Sorry about that.]

Gerd Isenberg · Post by **Gerd Isenberg** » Fri Jan 25, 2008 11:57 pm

wgarvin wrote:Another idea is to try and use rotate instructions instead of shift instructions (if the rest of the bitboard is already masked off, or something). The rotate would work if you give it a "negative" 2's complement number. It has some downsides though...it might be more expensive than a shift, and its hard to convince C compilers to generate rotate instructions on their own.

Anyway for this case you have here it seems likely that those 3 shifts are pretty good. You guys surely know more about the performance characteristics of these modern x86-64 chips than I do. The last x86 chip I knew a lot about was the P4. Times have changed, and I haven't been paying enough attention.

[Edit: reading up in the thread, seems you've already been over the rotate ground. Sorry about that.]

Hehe, no problem

Threads are hard to follow with this php-boards. I have the gut feeling this was discussed x-times before in the history of CCC (and elsewhere). One more reason to collect it in the wiki.
http://chessprogramming.wikispaces.com/ ... erations23
I will add Russel's proposal for a generalized pawn shift in the pawn push section.

ernest · Post by **ernest** » Sat Feb 02, 2008 7:55 pm

Guetti wrote: So lets try to summarize the most important Strelka secrets:
1. ...........
2. Implementation of material tables
3. ...........

I think it is these "material tables", which Strelka simply took and copied from Rybka 1.0 Beta, which bring about the great leap in strength (+100 Elo), compared to Fruit 2.1
Without them, we would only have a bitboard Fruit 2.1

Do we know something about these "material tables" ???

What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?

Re: What's Strelka's Secret Sauce?