Black magic bitboards

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Volker Annuss
Posts: 180
Joined: Mon Sep 03, 2007 9:15 am

Re: Black magic bitboards

Post by Volker Annuss »

Gerd Isenberg wrote:How did you come to the idea of using black magics? Was it already used in Hermann or more recently tried in Arminius?
There were three things that came together for the idea.

1st I recently installed openSUSE 64 bit on my raspberry pi 3 and I wanted to test how fast my magic generator is on that machine. Optimizing the overlapping heavyly relies on testing if vectors of 64 bit numbers fit together. A set bit stands for a used table entry, a not set bit stands for an unused entry.

2nd there was this thread asking for dense magics that made me think about how magics work again.

And last but not least holidays, giving me some time to try it out. The changes to the optimizer were not that big, but it took some time to get rid of old optimized code that relied on the assumption that index == 0 is always used.

The changes are in Arminius since yesterday.
Volker Annuss
Posts: 180
Joined: Mon Sep 03, 2007 9:15 am

Re: Black magic bitboards

Post by Volker Annuss »

AlvaroBegue wrote:OK, now I finally understand the point. So it's a micro-optimization that will not change anything in practical terms for chess programming, but it's somewhat entertaining for some people as a challenge.
It was intended to get a smaller table, but since the size of the table does not drop significantly any more, I think you are right.[/quote]
Since you have the infrastructure to test these things, would you be able to test one variation for me? (((occ & mask) * magic) + 0x8000000000000000ull) >> (64 - n)

Whether a magic multiplier works or not doesn't depend on the presence of the constant, but the resulting offsets might be in a narrower range, because we have shifted 0 to the middle of the range.
The test of your variation is running. First results indicate, that it gives less good collisions to exploit and less big gaps of unused entries. Interesting enough to use some more CPU time on it to come closer to the optimum.
Volker Annuss
Posts: 180
Joined: Mon Sep 03, 2007 9:15 am

Re: Black magic bitboards

Post by Volker Annuss »

AlvaroBegue wrote:Since you have the infrastructure to test these things, would you be able to test one variation for me? (((occ & mask) * magic) + 0x8000000000000000ull) >> (64 - n)
That gave me a lookup table of size 90995*8. The sizes for the most compact per square tables without overlapping are 4767*8 for bishops and 101308*8 for rooks.

There are more gaps in the overlapped table than with the other methods.
niklasf
Posts: 42
Joined: Sat May 16, 2015 11:41 pm

Re: Black magic bitboards

Post by niklasf »

Very intresting idea. I tried shaving off a few bytes by spending more CPU time to find magics with minimal index range (and replacing them in your packing, or just tacking them on at the end).

It was barely worth it, saving only 519 entries, but here you go anyway:

Code: Select all

uint64_t lookup_table[87988];

bishop_magics[64] =
{
    { 0xa7020080601803d8ull, 60984 },
    { 0x13802040400801f1ull, 66046 },
    { 0x0a0080181001f60cull, 32910 },
    { 0x1840802004238008ull, 16369 },
    { 0xc03fe00100000000ull, 42115 },
    { 0x24c00bffff400000ull,   835 },
    { 0x0808101f40007f04ull, 18910 },
    { 0x100808201ec00080ull, 25911 },
    { 0xffa2feffbfefb7ffull, 63301 },
    { 0x083e3ee040080801ull, 16063 },
    { 0xc0800080181001f8ull, 17481 },
    { 0x0440007fe0031000ull, 59361 },
    { 0x2010007ffc000000ull, 18735 },
    { 0x1079ffe000ff8000ull, 61249 },
    { 0x3c0708101f400080ull, 68938 },
    { 0x080614080fa00040ull, 61791 },
    { 0x7ffe7fff817fcff9ull, 21893 },
    { 0x7ffebfffa01027fdull, 62068 },
    { 0x53018080c00f4001ull, 19829 },
    { 0x407e0001000ffb8aull, 26091 },
    { 0x201fe000fff80010ull, 15815 },
    { 0xffdfefffde39ffefull, 16419 },
    { 0xcc8808000fbf8002ull, 59777 },
    { 0x7ff7fbfff8203fffull, 16288 },
    { 0x8800013e8300c030ull, 33235 },
    { 0x0420009701806018ull, 15459 },
    { 0x7ffeff7f7f01f7fdull, 15863 },
    { 0x8700303010c0c006ull, 75555 },
    { 0xc800181810606000ull, 79445 },
    { 0x20002038001c8010ull, 15917 },
    { 0x087ff038000fc001ull,  8512 },
    { 0x00080c0c00083007ull, 73069 },
    { 0x00000080fc82c040ull, 16078 },
    { 0x000000407e416020ull, 19168 },
    { 0x00600203f8008020ull, 11056 },
    { 0xd003fefe04404080ull, 62544 },
    { 0xa00020c018003088ull, 80477 },
    { 0x7fbffe700bffe800ull, 75049 },
    { 0x107ff00fe4000f90ull, 32947 },
    { 0x7f8fffcff1d007f8ull, 59172 },
    { 0x0000004100f88080ull, 55845 },
    { 0x00000020807c4040ull, 61806 },
    { 0x00000041018700c0ull, 73601 },
    { 0x0010000080fc4080ull, 15546 },
    { 0x1000003c80180030ull, 45243 },
    { 0xc10000df80280050ull, 20333 },
    { 0xffffffbfeff80fdcull, 33402 },
    { 0x000000101003f812ull, 25917 },
    { 0x0800001f40808200ull, 32875 },
    { 0x084000101f3fd208ull,  4639 },
    { 0x080000000f808081ull, 17077 },
    { 0x0004000008003f80ull, 62324 },
    { 0x08000001001fe040ull, 18159 },
    { 0x72dd000040900a00ull, 61436 },
    { 0xfffffeffbfeff81dull, 57073 },
    { 0xcd8000200febf209ull, 61025 },
    { 0x100000101ec10082ull, 81259 },
    { 0x7fbaffffefe0c02full, 64083 },
    { 0x7f83fffffff07f7full, 56114 },
    { 0xfff1fffffff7ffc1ull, 57058 },
    { 0x0878040000ffe01full, 58912 },
    { 0x945e388000801012ull, 22194 },
    { 0x0840800080200fdaull, 70880 },
    { 0x100000c05f582008ull, 11140 },
};

rook_magics[64] =
{
    { 0x80280013ff84ffffull, 10890 },
    { 0x5ffbfefdfef67fffull, 50579 },
    { 0xffeffaffeffdffffull, 62020 },
    { 0x003000900300008aull, 67322 },
    { 0x0050028010500023ull, 80251 },
    { 0x0020012120a00020ull, 58503 },
    { 0x0030006000c00030ull, 51175 },
    { 0x0058005806b00002ull, 83130 },
    { 0x7fbff7fbfbeafffcull, 50430 },
    { 0x0000140081050002ull, 21613 },
    { 0x0000180043800048ull, 72625 },
    { 0x7fffe800021fffb8ull, 80755 },
    { 0xffffcffe7fcfffafull, 69753 },
    { 0x00001800c0180060ull, 26973 },
    { 0x4f8018005fd00018ull, 84972 },
    { 0x0000180030620018ull, 31958 },
    { 0x00300018010c0003ull, 69272 },
    { 0x0003000c0085ffffull, 48372 },
    { 0xfffdfff7fbfefff7ull, 65477 },
    { 0x7fc1ffdffc001fffull, 43972 },
    { 0xfffeffdffdffdfffull, 57154 },
    { 0x7c108007befff81full, 53521 },
    { 0x20408007bfe00810ull, 30534 },
    { 0x0400800558604100ull, 16548 },
    { 0x0040200010080008ull, 46407 },
    { 0x0010020008040004ull, 11841 },
    { 0xfffdfefff7fbfff7ull, 21112 },
    { 0xfebf7dfff8fefff9ull, 44214 },
    { 0xc00000ffe001ffe0ull, 57925 },
    { 0x4af01f00078007c3ull, 29574 },
    { 0xbffbfafffb683f7full, 17309 },
    { 0x0807f67ffa102040ull, 40143 },
    { 0x200008e800300030ull, 64659 },
    { 0x0000008780180018ull, 70469 },
    { 0x0000010300180018ull, 62917 },
    { 0x4000008180180018ull, 60997 },
    { 0x008080310005fffaull, 18554 },
    { 0x4000188100060006ull, 14385 },
    { 0xffffff7fffbfbfffull,     0 },
    { 0x0000802000200040ull, 38091 },
    { 0x20000202ec002800ull, 25122 },
    { 0xfffff9ff7cfff3ffull, 60083 },
    { 0x000000404b801800ull, 72209 },
    { 0x2000002fe03fd000ull, 67875 },
    { 0xffffff6ffe7fcffdull, 56290 },
    { 0xbff7efffbfc00fffull, 43807 },
    { 0x000000100800a804ull, 73365 },
    { 0x6054000a58005805ull, 76398 },
    { 0x0829000101150028ull, 20024 },
    { 0x00000085008a0014ull,  9513 },
    { 0x8000002b00408028ull, 24324 },
    { 0x4000002040790028ull, 22996 },
    { 0x7800002010288028ull, 23213 },
    { 0x0000001800e08018ull, 56002 },
    { 0xa3a80003f3a40048ull, 22809 },
    { 0x2003d80000500028ull, 44545 },
    { 0xfffff37eefefdfbeull, 36072 },
    { 0x40000280090013c1ull,  4750 },
    { 0xbf7ffeffbffaf71full,  6014 },
    { 0xfffdffff777b7d6eull, 36054 },
    { 0x48300007e8080c02ull, 78538 },
    { 0xafe0000fff780402ull, 28745 },
    { 0xee73fffbffbb77feull,  8555 },
    { 0x0002000308482882ull,  1009 },
};
Just in case they are useful for anyone: Here are all the dense magics (of which I ended up using only a few).

Code: Select all

// width without overlapping: 4606
bishops[64] = {
    0xa7020080601803d8ull,
    0x13802040400801f1ull,
    0x0a0080181001f60cull,
    0x1840802004238008ull,
    0x581fc40182094560ull,
    0xa0a01f0100824001ull,
    0x3ae7f04002008258ull,
    0x200c10101ec00080ull,
    0xb69d010040104803ull,
    0x833c802040102401ull,
    0xc0800080181001f8ull,
    0x0442008020048040ull,
    0x3253203c01880043ull,
    0xccf84020010081e4ull,
    0x3c0708101f400080ull,
    0x0d0614080fa00040ull,
    0xf300c0007e806002ull,
    0x41bea00040102002ull,
    0x53018080c00f4001ull,
    0xb00020820080104dull,
    0x0060200100080842ull,
    0x2100100021c58023ull,
    0xcc8808000fbf8002ull,
    0x1870060008300060ull,
    0x8800013e8300c030ull,
    0x0420009701806018ull,
    0x020003fa00802080ull,
    0x8700303010c0c006ull,
    0xc800181810606000ull,
    0x2a002038201c8010ull,
    0x20007004b80fc00eull,
    0x90c80c0c10083003ull,
    0x73600080cd02c040ull,
    0xdb60004066416020ull,
    0x83a80101f6010010ull,
    0x008b001c70080080ull,
    0xa00020c018003088ull,
    0x99001c600e001e30ull,
    0x84002007e0d7f078ull,
    0x0400003007f3f008ull,
    0x0800004100f90080ull,
    0x0220002080748040ull,
    0x2ba40080818300c0ull,
    0x1e23040080fb4080ull,
    0xa800003c80180030ull,
    0xc10000df80280050ull,
    0x8b8000200813f024ull,
    0x398000200405f812ull,
    0xc20000203e808182ull,
    0xc538000fa04032a5ull,
    0x800404000f808010ull,
    0x0005802808003f62ull,
    0xd0880001001fe05bull,
    0x72dd000040900a00ull,
    0x55003ee010200804ull,
    0xcd8000200febf209ull,
    0x20c000101ec10062ull,
    0x0a5f7c00101f4029ull,
    0x22048800000f8080ull,
    0x4a0c81005008003full,
    0x090a010001002020ull,
    0x945e388000801012ull,
    0xe73c000040400826ull,
    0xa500008060400fd8ull,
};

// width without overlapping: 97178
rooks[64] = {
    0x80500104d8000230ull,
    0x8018007400860002ull,
    0x8030010086000090ull,
    0x006006008120006cull,
    0x0050028010500023ull,
    0x003000c001900030ull,
    0x00300060003000c0ull,
    0x0058005806b00002ull,
    0x508028010114000aull,
    0x08000c0086000300ull,
    0x0000080401020008ull,
    0x0000200200200400ull,
    0x10003001c030000full,
    0x99001800c0180060ull,
    0x4f8018005fd00018ull,
    0x3580180030620018ull,
    0x10300018010c0008ull,
    0x2906000c00830004ull,
    0x8002000804010008ull,
    0x1804002002002004ull,
    0x0102002001002002ull,
    0x2211001000801040ull,
    0x4000004040008001ull,
    0x612c8007c007f020ull,
    0x1040200010080008ull,
    0x0100080010040011ull,
    0xa004010008020008ull,
    0x0000040020200200ull,
    0x2320020020010020ull,
    0x4af01f00078007c3ull,
    0x0a00008020200040ull,
    0x240fc00040000080ull,
    0x200008e800300031ull,
    0x0000080400100011ull,
    0x3400400880410010ull,
    0x4000200400200200ull,
    0x8800200100200200ull,
    0xa8001fc080070007ull,
    0x4000200080200040ull,
    0x2000802000200040ull,
    0x1c000202ec002800ull,
    0x0183000086000c00ull,
    0x703ffdc008004010ull,
    0x0020040002002020ull,
    0x290000302e803003ull,
    0x0400010000802020ull,
    0x140000100800a804ull,
    0x6054000a58005805ull,
    0x0829000101150028ull,
    0x1d00008a00850014ull,
    0x2b40007ec6030018ull,
    0xc020003da01e0020ull,
    0x0020002030018030ull,
    0x4080001800df8018ull,
    0xa3a80003f3a40048ull,
    0x9843d80000500028ull,
    0x000000804110102eull,
    0x52000080086eff41ull,
    0x150000564012ff8bull,
    0x024000100c080fd2ull,
    0x48300007e8080c02ull,
    0x2200000370040802ull,
    0x7100000400437802ull,
    0x00a2000828884302ull,
};
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Black magic bitboards

Post by syzygy »

I have just implemented this in Cfish:
https://github.com/syzygy1/Cfish/commit ... 067c1c4995

Unfortunately it is slightly slower on my machine (i7-3930K Sandybridge) than the plain magic approach based on Volker's white magics. This is a bit surprising, since the array size goes down from 89524 to 87988 entries.

But now I tried on my laptop, and black magics wins there. Interesting.
Volker Annuss wrote:Now I have a table size of 88316*8 for black magics and 88772*8 for white ones.
Then I am interested in your latest white magics. Did you post them somewhere, or would you mind posting them here?

Results on my Kabylake laptop for "cfish bench 128 1 17 >/dev/null" (nps):

Code: Select all

fancy magics   2321090
white magics   2343727
black magics   2353732
plain bmi2     2355310
fancy bmi2     2364420
These are all the best results after several tries (and with cpu governor set to performance and the process bound to a particular cpu). Gaussian distribution does not apply here.

The fancy magics and plain bmi2 are essentially those from Stockfish. The fancy bmi2 implementation uses 200K rook tables.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Black magic bitboards

Post by syzygy »

syzygy wrote:Results on my Kabylake laptop for "cfish bench 128 1 17 >/dev/null" (nps):

Code: Select all

fancy magics   2321090
white magics   2343727
black magics   2353732
plain bmi2     2355310
fancy bmi2     2364420
Redirecting stderr to a file seems to remove some further disturbances. I now get this:

Code: Select all

fancy magics   2337686
white magics   2362632
black magics   2362235
plain bmi2     2372801
fancy bmi2     2381846
Again I keep the best results (least disturbed by random OS stuff).

It seems white magics and black magics are virtually indistinguishable in performance.

BMI2 is still better, but Volker's overlapping-table approach gets pretty close.
Volker Annuss
Posts: 180
Joined: Mon Sep 03, 2007 9:15 am

Re: Black magic bitboards

Post by Volker Annuss »

Unfortunately compact is not the same as cache friendly. But compactness is much easier to optimize. When you have a access to a linux machine you may want to examine different lookup tables with cachegrind.

For what its worth, here are my most compact white magics.

Code: Select all

uint64_t lookup_table[88772];

struct
{
   uint64_t factor;
   int position;
}
bishop_magics[64] =
{
	{ 0x007fbfbfbfbfbfffu,   5378 },
	{ 0x0000a060401007fcu,   4093 },
	{ 0x0001004008020000u,   4314 },
	{ 0x0000806004000000u,   6587 },
	{ 0x0000100400000000u,   6491 },
	{ 0x000021c100b20000u,   6330 },
	{ 0x0000040041008000u,   5609 },
	{ 0x00000fb0203fff80u,  22236 },
	{ 0x0000040100401004u,   6106 },
	{ 0x0000020080200802u,   5625 },
	{ 0x0000004010202000u,  16785 },
	{ 0x0000008060040000u,  16817 },
	{ 0x0000004402000000u,   6842 },
	{ 0x0000000801008000u,   7003 },
	{ 0x000007efe0bfff80u,   4197 },
	{ 0x0000000820820020u,   7356 },
	{ 0x0000400080808080u,   4602 },
	{ 0x00021f0100400808u,   4538 },
	{ 0x00018000c06f3fffu,  29531 },
	{ 0x0000258200801000u,  45393 },
	{ 0x0000240080840000u,  12420 },
	{ 0x000018000c03fff8u,  15763 },
	{ 0x00000a5840208020u,   5050 },
	{ 0x0000020008208020u,   4346 },
	{ 0x0000804000810100u,   6074 },
	{ 0x0001011900802008u,   7866 },
	{ 0x0000804000810100u,  32139 },
	{ 0x000100403c0403ffu,  57673 },
	{ 0x00078402a8802000u,  55365 },
	{ 0x0000101000804400u,  15818 },
	{ 0x0000080800104100u,   5562 },
	{ 0x00004004c0082008u,   6390 },
	{ 0x0001010120008020u,   7930 },
	{ 0x000080809a004010u,  13329 },
	{ 0x0007fefe08810010u,   7170 },
	{ 0x0003ff0f833fc080u,  27267 },
	{ 0x007fe08019003042u,  53787 },
	{ 0x003fffefea003000u,   5097 },
	{ 0x0000101010002080u,   6643 },
	{ 0x0000802005080804u,   6138 },
	{ 0x0000808080a80040u,   7418 },
	{ 0x0000104100200040u,   7898 },
	{ 0x0003ffdf7f833fc0u,  42012 },
	{ 0x0000008840450020u,  57350 },
	{ 0x00007ffc80180030u,  22813 },
	{ 0x007fffdd80140028u,  56693 },
	{ 0x00020080200a0004u,   5818 },
	{ 0x0000101010100020u,   7098 },
	{ 0x0007ffdfc1805000u,   4451 },
	{ 0x0003ffefe0c02200u,   4709 },
	{ 0x0000000820806000u,   4794 },
	{ 0x0000000008403000u,  13364 },
	{ 0x0000000100202000u,   4570 },
	{ 0x0000004040802000u,   4282 },
	{ 0x0004010040100400u,  14964 },
	{ 0x00006020601803f4u,   4026 },
	{ 0x0003ffdfdfc28048u,   4826 },
	{ 0x0000000820820020u,   7354 },
	{ 0x0000000008208060u,   4848 },
	{ 0x0000000000808020u,  15946 },
	{ 0x0000000001002020u,  14932 },
	{ 0x0000000401002008u,  16588 },
	{ 0x0000004040404040u,   6905 },
	{ 0x007fff9fdf7ff813u,  16076 }
},
rook_magics[64] =
{
	{ 0x00280077ffebfffeu,  26304 },
	{ 0x2004010201097fffu,  35520 },
	{ 0x0010020010053fffu,  38592 },
	{ 0x0040040008004002u,   8026 },
	{ 0x7fd00441ffffd003u,  22196 },
	{ 0x4020008887dffffeu,  80870 },
	{ 0x004000888847ffffu,  76747 },
	{ 0x006800fbff75fffdu,  30400 },
	{ 0x000028010113ffffu,  11115 },
	{ 0x0020040201fcffffu,  18205 },
	{ 0x007fe80042ffffe8u,  53577 },
	{ 0x00001800217fffe8u,  62724 },
	{ 0x00001800073fffe8u,  34282 },
	{ 0x00001800e05fffe8u,  29196 },
	{ 0x00001800602fffe8u,  23806 },
	{ 0x000030002fffffa0u,  49481 },
	{ 0x00300018010bffffu,   2410 },
	{ 0x0003000c0085fffbu,  36498 },
	{ 0x0004000802010008u,  24478 },
	{ 0x0004002020020004u,  10074 },
	{ 0x0001002002002001u,  79315 },
	{ 0x0001001000801040u,  51779 },
	{ 0x0000004040008001u,  13586 },
	{ 0x0000006800cdfff4u,  19323 },
	{ 0x0040200010080010u,  70612 },
	{ 0x0000080010040010u,  83652 },
	{ 0x0004010008020008u,  63110 },
	{ 0x0000040020200200u,  34496 },
	{ 0x0002008010100100u,  84966 },
	{ 0x0000008020010020u,  54341 },
	{ 0x0000008020200040u,  60421 },
	{ 0x0000820020004020u,  86402 },
	{ 0x00fffd1800300030u,  50245 },
	{ 0x007fff7fbfd40020u,  76622 },
	{ 0x003fffbd00180018u,  84676 },
	{ 0x001fffde80180018u,  78757 },
	{ 0x000fffe0bfe80018u,  37346 },
	{ 0x0001000080202001u,    370 },
	{ 0x0003fffbff980180u,  42182 },
	{ 0x0001fffdff9000e0u,  45385 },
	{ 0x00fffefeebffd800u,  61659 },
	{ 0x007ffff7ffc01400u,  12790 },
	{ 0x003fffbfe4ffe800u,  16762 },
	{ 0x001ffff01fc03000u,      0 },
	{ 0x000fffe7f8bfe800u,  38380 },
	{ 0x0007ffdfdf3ff808u,  11098 },
	{ 0x0003fff85fffa804u,  21803 },
	{ 0x0001fffd75ffa802u,  39189 },
	{ 0x00ffffd7ffebffd8u,  58628 },
	{ 0x007fff75ff7fbfd8u,  44116 },
	{ 0x003fff863fbf7fd8u,  78357 },
	{ 0x001fffbfdfd7ffd8u,  44481 },
	{ 0x000ffff810280028u,  64134 },
	{ 0x0007ffd7f7feffd8u,  41759 },
	{ 0x0003fffc0c480048u,   1394 },
	{ 0x0001ffffafd7ffd8u,  40910 },
	{ 0x00ffffe4ffdfa3bau,  66516 },
	{ 0x007fffef7ff3d3dau,   3897 },
	{ 0x003fffbfdfeff7fau,   3930 },
	{ 0x001fffeff7fbfc22u,  72934 },
	{ 0x0000020408001001u,  72662 },
	{ 0x0007fffeffff77fdu,  56325 },
	{ 0x0003ffffbf7dfeecu,  66501 },
	{ 0x0001ffff9dffa333u,  14826 }
};
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Black magic bitboards

Post by syzygy »

Volker Annuss wrote:For what its worth, here are my most compact white magics.
Thanks! On my desktop machine, these numbers beat the old ones.

Optimising for cache friendliness is certainly more complicated, but it does seem possible to predict how well a particular table will do given a set of access statistics. When I find some time I might try to calculate some cache-efficiency predictions for different cache-line alignments of the big overlapping table and see if the results match the predictions.

But first I am going to see how gcc is cache-line aligning the big table. (It is probably not doing that at all.)