Best was to Recognize Know Endgames?

hgm · Post by **hgm** » Sat Aug 03, 2013 11:07 am

Note that the Spartacus approach is truly O(1). No search of any kind, just straightforward accesses to hash tables. To keep the size of the material table manageable, however, I have to use a more compact material index than just setting apart groups of bits for the counting. The drawback of that is that I cannot handle over-complete material combinations.

sje · Post by **sje** » Sat Aug 03, 2013 1:57 pm

Symbolic can recognize all of the 146 different 2-5 man endgame classes. A particular endgame class is identified by a binary search of a table of 292 (= 146 * 2) material distribution signatures (not hashes). The table search looks for both a direct and color reversed match at the same time and is very fast. This code is mostly used for the tablebase probing. (The Zobrist material hash is used for other stuff.)

With the thought that it may be useful to those writing their own chess program, here is Symbolic's lexicographically ordered endgame class name array:

Code: Select all

const std&#58;&#58;string Material&#58;&#58;EndgameNameVec&#91;EgkLen&#93; =
&#123;
  "KBBBK",
  "KBBK",
  "KBBKB",
  "KBBKN",
  "KBBKP",
  "KBBKQ",
  "KBBKR",
  "KBBNK",
  "KBBPK",
  "KBK",
  "KBKB",
  "KBKN",
  "KBKP",
  "KBNK",
  "KBNKB",
  "KBNKN",
  "KBNKP",
  "KBNKQ",
  "KBNKR",
  "KBNNK",
  "KBNPK",
  "KBPK",
  "KBPKB",
  "KBPKN",
  "KBPKP",
  "KBPKQ",
  "KBPKR",
  "KBPPK",
  "KK",
  "KNK",
  "KNKN",
  "KNKP",
  "KNNK",
  "KNNKB",
  "KNNKN",
  "KNNKP",
  "KNNKQ",
  "KNNKR",
  "KNNNK",
  "KNNPK",
  "KNPK",
  "KNPKB",
  "KNPKN",
  "KNPKP",
  "KNPKQ",
  "KNPKR",
  "KNPPK",
  "KPK",
  "KPKP",
  "KPPK",
  "KPPKB",
  "KPPKN",
  "KPPKP",
  "KPPKQ",
  "KPPKR",
  "KPPPK",
  "KQBBK",
  "KQBK",
  "KQBKB",
  "KQBKN",
  "KQBKP",
  "KQBKQ",
  "KQBKR",
  "KQBNK",
  "KQBPK",
  "KQK",
  "KQKB",
  "KQKN",
  "KQKP",
  "KQKQ",
  "KQKR",
  "KQNK",
  "KQNKB",
  "KQNKN",
  "KQNKP",
  "KQNKQ",
  "KQNKR",
  "KQNNK",
  "KQNPK",
  "KQPK",
  "KQPKB",
  "KQPKN",
  "KQPKP",
  "KQPKQ",
  "KQPKR",
  "KQPPK",
  "KQQBK",
  "KQQK",
  "KQQKB",
  "KQQKN",
  "KQQKP",
  "KQQKQ",
  "KQQKR",
  "KQQNK",
  "KQQPK",
  "KQQQK",
  "KQQRK",
  "KQRBK",
  "KQRK",
  "KQRKB",
  "KQRKN",
  "KQRKP",
  "KQRKQ",
  "KQRKR",
  "KQRNK",
  "KQRPK",
  "KQRRK",
  "KRBBK",
  "KRBK",
  "KRBKB",
  "KRBKN",
  "KRBKP",
  "KRBKQ",
  "KRBKR",
  "KRBNK",
  "KRBPK",
  "KRK",
  "KRKB",
  "KRKN",
  "KRKP",
  "KRKR",
  "KRNK",
  "KRNKB",
  "KRNKN",
  "KRNKP",
  "KRNKQ",
  "KRNKR",
  "KRNNK",
  "KRNPK",
  "KRPK",
  "KRPKB",
  "KRPKN",
  "KRPKP",
  "KRPKQ",
  "KRPKR",
  "KRPPK",
  "KRRBK",
  "KRRK",
  "KRRKB",
  "KRRKN",
  "KRRKP",
  "KRRKQ",
  "KRRKR",
  "KRRNK",
  "KRRPK",
  "KRRRK"
&#125;;

Don · Post by **Don** » Sat Aug 03, 2013 1:58 pm

Steve Maughan wrote:I'd like to add some basic endgame knowledge to Maverick. At first I'd like to add basic stuff - N + B + K vs. K etc. In the medium term I'd like to add knowledge which is a little more complex e.g. Philidor's endgame with P + R + K vs. R + K.

My question is - what's the best (fastest) way to recognize these positions?

The most obvious way is to do a series of "if then" statement:
Code: Select all
If &#40;WHITE_PAWNS <= 1 && BLACK_PAWNS <= 1&#41;&#123;
...
&#125;
Is there a better way?

Steve

I incrementally update a material only hash table and just use this table to score them. I have done this for a very long time but only recently have I added the general scoring for all positions - in other words the material only score for any position is also computed and hashed just like pawn structure.

The endgame table has a few fields because you don't always score each position the same way. You might for example only have a score and not want to use the normal evaluation, or you might want to compute a score with a specialized function just for that ending. We can also call the evaluation and modify what is returned in some cases - usually for endings we know are very drawish.

But we can add as many endings as we with without concern for the overhead.

syzygy · Post by **syzygy** » Sat Aug 03, 2013 1:59 pm

lucasart wrote:Yes, use material hash keys. You can construct them using a zobrist approach, but I prefer a more comprehensive method, that allows me to understand the values of the keys directly:
Code: Select all
uint64_t mat_key;

bits 0..3&#58; number of white pawns
bits 4..7&#58; number of black pawns
bits 8..11&#58; number of white knights
bits 12..15&#58; number of black knights
(...)
bits 32..35&#58; number of white queens
bits 36..39&#58; number of black queens

This is what I do as well (except for the order). I use the 16 highest bits to store random values which I use as an index into the material hash table.

So:

Code: Select all

uint64 mat_piece_key&#91;&#93; = &#123;
  0ULL,
  0x5ced000000000001ULL,
  0xe173000000000010ULL,
  0xd64d000000000100ULL,
  0xab88000000001000ULL,
  0x680b000000010000ULL,
  0x0000000000000000ULL,
  0ULL,
  0ULL,
  0xf209000000100000ULL,
  0xbb14000001000000ULL,
  0x58df000010000000ULL,
  0xa15f000100000000ULL,
  0x7c94001000000000ULL,
  0x0000000000000000ULL,
  0ULL
&#125;;

The material hash key is simply the sum of the individual keys for the type of each piece on the board (piece types ranging from 1-6 and 9-14). No need to separately keep track of the number of pieces of each type as e.g. Stockfish does (and needs to do in order to be able to update the material signature with Zobrist hashing).

lucasart · Post by **lucasart** » Sat Aug 03, 2013 2:11 pm

syzygy wrote:
lucasart wrote:Yes, use material hash keys. You can construct them using a zobrist approach, but I prefer a more comprehensive method, that allows me to understand the values of the keys directly:
Code: Select all
uint64_t mat_key;

bits 0..3&#58; number of white pawns
bits 4..7&#58; number of black pawns
bits 8..11&#58; number of white knights
bits 12..15&#58; number of black knights
(...)
bits 32..35&#58; number of white queens
bits 36..39&#58; number of black queens
This is what I do as well (except for the order). I use the 16 highest bits to store random values which I use as an index into the material hash table.

That's a very nice trick. It allows O(1) branching, and in a very elegant way!

Thanks for the tip. I will definitely use your method (16 highest bits for index) when I decide to implement a table lookup (for now only KPK and KBPK so the basic "if method" is fine).

sje · Post by **sje** » Sat Aug 03, 2013 2:17 pm

The endgame descriptor probe code is triggered mostly (maybe only) for tablebase access. No more than nine (= log2 N) probes per call are needed, and the time used is totally swamped by the ensuing system call to access a tablebase file.

Code: Select all

bool Position&#58;&#58;IdentifyEndgame&#40;Egk& egk, bool& isreversed&#41; const
&#123;
  egk = EgkNil;
  isreversed = false;

  if &#40;census.GetTotalCount&#40;) <= EgkMaxManLen&#41;
  &#123;
    const ui64 whitedist = census.GetDistribution&#40;ColorWhite&#41;;
    const ui64 blackdist = census.GetDistribution&#40;ColorBlack&#41;;
    const ui64 key = &#40;blackdist << 24&#41; | whitedist;
    si lo = 0, hi = EgkPairLen - 1;

    while &#40;IsEgkNil&#40;egk&#41; && &#40;lo <= hi&#41;)
    &#123;
      const si mi = &#40;lo + hi&#41; / 2;
      const ui64 tablekey = Material&#58;&#58;EndgameDescriptorVec&#91;mi&#93;.key;

      if &#40;key == tablekey&#41;
      &#123;
        egk = Material&#58;&#58;EndgameDescriptorVec&#91;mi&#93;.egk;
        isreversed = Material&#58;&#58;EndgameDescriptorVec&#91;mi&#93;.isreversed;
      &#125;
      else
      &#123;
        if &#40;key < tablekey&#41;
          hi = mi - 1;
        else
          lo = mi + 1;
      &#125;;
    &#125;;
  &#125;;

  return IsEgkNotNil&#40;egk&#41;;
&#125;

lucasart · Post by **lucasart** » Sat Aug 03, 2013 3:17 pm

lucasart wrote:
syzygy wrote:
lucasart wrote:Yes, use material hash keys. You can construct them using a zobrist approach, but I prefer a more comprehensive method, that allows me to understand the values of the keys directly:
Code: Select all
uint64_t mat_key;

bits 0..3&#58; number of white pawns
bits 4..7&#58; number of black pawns
bits 8..11&#58; number of white knights
bits 12..15&#58; number of black knights
(...)
bits 32..35&#58; number of white queens
bits 36..39&#58; number of black queens
This is what I do as well (except for the order). I use the 16 highest bits to store random values which I use as an index into the material hash table.
That's a very nice trick. It allows O(1) branching, and in a very elegant way!

Thanks for the tip. I will definitely use your method (16 highest bits for index) when I decide to implement a table lookup (for now only KPK and KBPK so the basic "if method" is fine).

Actually, there is always the back side of the coin in O(1) branching: memory wastage. Your method involves an array of 2^16 pointers of endgame functions (most are null and the rest point to the few endgame functions). That's 512 KB of RAM on x86-64 plateforms...

Also, since you use a 16 bit material zobrist, why do you need the 40 lower bits ?

In practice I think it's probably best to stick to the simple O(log(N)) method involving dichotomic search by material key. In practice it's only a few iterations, completely negligible...

Steve Maughan · Post by **Steve Maughan** » Sat Aug 03, 2013 5:17 pm

Thanks everyone - some great input.

Here's what I'm thinking of trying first. I'm going to try something like Steven Edwards' suggestion. I'll hash the material and then use the "n" smallest bits to lookup the hash value in a table and compare the key. This table will be small. If I have 20 endgames I'd like to recognize (40 valid possible keys), I imagine I'll be able to get it in a 64 entry table. The table entry will contain a pointer to the function which will evaluate the endgame.

Thanks again - some good discussion!

Steve

syzygy · Post by **syzygy** » Sat Aug 03, 2013 5:19 pm

lucasart wrote:Thanks for the tip. I will definitely use your method (16 highest bits for index) when I decide to implement a table lookup (for now only KPK and KBPK so the basic "if method" is fine).

You can use the entries in the material hash table also for storing the value of material (applying formulae as complicated as you like, no need to restrict to a simple sum of piece values), weights for king safety, weights for middle game / end game, parameters for adjusting reductions / extensions (in particular null move comes to mind), etc. So it doesn't have to be restricted to end games.

Using all 16 bits might be overkill. I think Stockfish uses material hash tables with 8192 entries or so. But that just means 16 bits is enough.

syzygy · Post by **syzygy** » Sat Aug 03, 2013 5:28 pm

lucasart wrote:Actually, there is always the back side of the coin in O(1) branching: memory wastage. Your method involves an array of 2^16 pointers of endgame functions (most are null and the rest point to the few endgame functions). That's 512 KB of RAM on x86-64 plateforms...

See my previous reply. Also note that if you only use it for a few endgame functions and only probe the table if there is little material on the board, most of the table will never be accessed, will never see you cache, and will do no harm whatsoever.

Btw, there is no real need to store an 8-byte function pointer. You can store a byte that you use as an index into an array of function pointers if non-zero. Or just do a switch() on the value of the byte if non-zero. The important thing is to reduce overhead as much as possible for the great majority of the positions for which you do not have special endgame knowledge.

Also, since you use a 16 bit material zobrist, why do you need the 40 lower bits ?

It is not Zobrist, because it is a sum and not a xor.

You need to detect collisions. The lower 40 bits unique identify the material, so give a perfect way to check for collisions. It may also be useful if somewhere in your engine you want to quickly test for a particular combination of material without having to store corresponding flags in the material hash table. It's quite flexible.

And again, 8192 entries might well be enough to have a 99.9% hit rate. I never bothered to test this. But I can already tell that there is not much if anything to lose by having a table that is too big, since the parts that are not accessed will simply not enter your cache.

In any event, my main point is that, as far as I can tell, my approach has clear advantages and no disadvantages over Steven's and Stockfish's way of calculating a material signature by xor'ing numbers. I just add numbers taken from a 1-dimensional array indexed by piece type. If you use xor, you have to take numbers from a 2-dimensional array indexed by piece type and by an index running from 1 to the number of pieces of a particular type and color. In addition, my hashing scheme is perfect, i.e. there are no collisions.

Best was to Recognize Know Endgames?

Re: Best was to Recognize Know Endgames?

A list of endgame classes

Re: Best was to Recognize Know Endgames?

Re: Best was to Recognize Know Endgames?

Re: Best was to Recognize Know Endgames?

And the table probe code

Re: Best was to Recognize Know Endgames?

Re: Best was to Recognize Know Endgames?

Re: Best was to Recognize Know Endgames?

Re: Best was to Recognize Know Endgames?