Also throw in PEXT-PDEP, which could have much smaller tables (~210 kb?). It would have 3 extra cycles of latency, and you'd have to load one extra per-square mask into a register, but unlike "table indirection" techniques it wouldn't increase cache pressure or read from any data-dependent memory address. [other than the initial lookup, I mean... it wouldn't have a 2nd, indirect read]Gerd Isenberg wrote:Yes, a comparison of plain magics vs. fancy magics vs. fancy PEXT BB in a concrete chess program would be interesting. My guess in Sven's Attack64 program (no hashtables) ...wgarvin wrote:I wonder if BMI2 - PEXT is any faster than plain magic, on the chips that support it?

(My guess would be about the same)

1. PEXT BB (pext, 840K lookup)

## Comparison of bitboard attack-getter variants

**Moderators:** bob, hgm, Harvey Williamson

**Forum rules**

This textbox is used to restore diagrams posted with the [d] tag before the upgrade.

### Re: Comparison of bitboard attack-getter variants

### Re: Comparison of bitboard attack-getter variants

Skippers magic numbers for a rook:

a1 6421413255654250074 15

b1 13888033412225728670 14

c1 17993041225886426135 14

d1 8431303145914873310 14

e1 2312909536200034874 14

f1 7089559834495306050 14

g1 7089559834495306050 14

h1 9776329826411170775 15

a2 9776329826411170775 14

b2 10671919824395214238 13

c2 9584587689412585948 13

d2 12463247837821257675 13

e2 10933243730962491516 13

f2 2197513883701256999 13

g2 2197513883701256999 13

h2 2197513883701256999 14

a3 2197513883701256999 14

b3 7379459750028623798 13

c3 8536373658874911835 13

d3 9058873454551509880 13

e3 16978707682773041813 13

f3 6894603258312343199 13

g3 9772981940334238863 13

h3 3010061618700396727 14

a4 9138715848917472678 14

b4 8946825690166064044 13

c4 1741381422703840678 13

d4 13504618072415878493 13

e4 7717130143132042619 13

f4 10926547954513760397 13

g4 8500953853965918070 13

h4 10291852374957294340 14

a5 9204520239974666050 14

b5 12978770388625982873 13

c5 10553176281524637154 13

d5 7191134437589288727 13

e5 15814816524759944704 13

f5 17867320427295343372 13

g5 1211756347517578554 13

h5 8870414688888387961 14

a6 16598640229040220155 14

b6 7601734490633330279 13

c6 2187188739867550352 13

d6 2187188739867550352 13

e6 5396606555544269889 13

f6 5918972895738479454 13

g6 12490284533597592539 13

h6 13385874527286668706 14

a7 3493660270167347783 14

b7 6180148784847834684 13

c7 312818463456668716 13

d7 8675176639765049047 13

e7 131336951255288700 13

f7 3079430851774730831 13

g7 17117658685416164976 13

h7 12337379024030885569 14

a8 2445164764652897478 15

b8 2445164764652897478 14

c8 2445164764652897478 14

d8 2445164764652897478 14

e8 1810469178542962494 14

f8 9277518831844041090 14

g8 9277518831844041090 14

h8 280894568413730798 15

One bit less. But I did not get further than c1. These were computed using a sloppy implementation of trial and error algorithm.

a1 3829341177380969488 14

b1 3108190927689962429 13

c1 6380065235461944931 13

a1 6421413255654250074 15

b1 13888033412225728670 14

c1 17993041225886426135 14

d1 8431303145914873310 14

e1 2312909536200034874 14

f1 7089559834495306050 14

g1 7089559834495306050 14

h1 9776329826411170775 15

a2 9776329826411170775 14

b2 10671919824395214238 13

c2 9584587689412585948 13

d2 12463247837821257675 13

e2 10933243730962491516 13

f2 2197513883701256999 13

g2 2197513883701256999 13

h2 2197513883701256999 14

a3 2197513883701256999 14

b3 7379459750028623798 13

c3 8536373658874911835 13

d3 9058873454551509880 13

e3 16978707682773041813 13

f3 6894603258312343199 13

g3 9772981940334238863 13

h3 3010061618700396727 14

a4 9138715848917472678 14

b4 8946825690166064044 13

c4 1741381422703840678 13

d4 13504618072415878493 13

e4 7717130143132042619 13

f4 10926547954513760397 13

g4 8500953853965918070 13

h4 10291852374957294340 14

a5 9204520239974666050 14

b5 12978770388625982873 13

c5 10553176281524637154 13

d5 7191134437589288727 13

e5 15814816524759944704 13

f5 17867320427295343372 13

g5 1211756347517578554 13

h5 8870414688888387961 14

a6 16598640229040220155 14

b6 7601734490633330279 13

c6 2187188739867550352 13

d6 2187188739867550352 13

e6 5396606555544269889 13

f6 5918972895738479454 13

g6 12490284533597592539 13

h6 13385874527286668706 14

a7 3493660270167347783 14

b7 6180148784847834684 13

c7 312818463456668716 13

d7 8675176639765049047 13

e7 131336951255288700 13

f7 3079430851774730831 13

g7 17117658685416164976 13

h7 12337379024030885569 14

a8 2445164764652897478 15

b8 2445164764652897478 14

c8 2445164764652897478 14

d8 2445164764652897478 14

e8 1810469178542962494 14

f8 9277518831844041090 14

g8 9277518831844041090 14

h8 280894568413730798 15

One bit less. But I did not get further than c1. These were computed using a sloppy implementation of trial and error algorithm.

a1 3829341177380969488 14

b1 3108190927689962429 13

c1 6380065235461944931 13

- hgm
**Posts:**23613**Joined:**Fri Mar 10, 2006 9:06 am**Location:**Amsterdam**Full name:**H G Muller-
**Contact:**

### Re: Comparison of bitboard attack-getter variants

The last number on each line is the number of bits you have to keep to avoid collisions? It seems way to high. For a1 (and other corners) it should be 12, for edge squares like b1 or a2 it should be 11, and for inner squares like d3 10.

Are you sure you ignore the most-distant squares of any ray? The occupancy of the square where the ray hits the edge is irrelevant, as there would never be anything behind it that could be blocked. If you don't mask a8 and h1 away, it would be understandable why you have difficulty getting below 14 bits for a1.

So far doing it by hand seems a lot faster. My previous post already gave 12 keys (for b2-g2, b3-g3) that only need 10 bits, with a trivial prescription for making 24 more (for the inner squares)...

Are you sure you ignore the most-distant squares of any ray? The occupancy of the square where the ray hits the edge is irrelevant, as there would never be anything behind it that could be blocked. If you don't mask a8 and h1 away, it would be understandable why you have difficulty getting below 14 bits for a1.

So far doing it by hand seems a lot faster. My previous post already gave 12 keys (for b2-g2, b3-g3) that only need 10 bits, with a trivial prescription for making 24 more (for the inner squares)...

### Re: Comparison of bitboard attack-getter variants

Yes it ignores the squares on the edge.hgm wrote:The last number on each line is the number of bits you have to keep to avoid collisions? It seems way to high. For a1 (and other corners) it should be 12, for edge squares like b1 or a2 it should be 11, and for inner squares like d3 10.

Are you sure you ignore the most-distant squares of any ray? The occupancy of the square where the ray hits the edge is irrelevant, as there would never be anything behind it that could be blocked. If you don't mask a8 and h1 away, it would be understandable why you have difficulty getting below 14 bits for a1.

So far doing it by hand seems a lot faster. My previous post already gave 12 keys (for b2-g2, b3-g3) that only need 10 bits, with a trivial prescription for making 24 more (for the inner squares)...

For my sloppy trial and error algorithm I had to add four bits. Otherwise I can wait for inifinity before it completes. So that is three bits too much.

Code: Select all

```
int nKeys = StraightMovesDict.Count;
NBits = ((int)Math.Log(nKeys, 2)) + 4;
public ulong StraightOccupancy(CPosition pos)
{
ulong occupancy = (LeftMoves & ~pos.ChessBoard.ColumnsBitBoard[0])
| (RightMoves & ~pos.ChessBoard.ColumnsBitBoard[7])
| (UpMoves & ~pos.ChessBoard.RowsBitBoard[7])
| (DownMoves & ~pos.ChessBoard.RowsBitBoard[0]);
return occupancy;
}
```

### Re: Comparison of bitboard attack-getter variants

If I would have more patience I can compute them one bit shorter. This computation already takes at least five minutes. So I cancelled it.

a1 3439443032492089405 14

b1 3135786584331814519 13

c1 8619110724855342808 13

d1 7848207856287267137 13

e1 3811733587375915473 13

f1 8762675609952735468 13

g1 15700248209820441285 13

h1 17752752118909405489 14

a2 5695838162267683422 13

b2 7748342069098114922 12

c2 6396338132881246243 12

d2 17270652676284938484 12

e2 3451001275122215817 12

f2 7225251425809932768 12

g2 1503419863794000794 12

h2 14685175808191312949 13

a3 4158265960666958673 13

b3 4158265960666958673 12

c3 13032946816902618242 12

d3 14705383267963762421 12

e3 9241527724179543109 12

f3 5502914338636096040 12

g3 5429984680954171549 12

h3 683733530782468012 13

a4 9941963182310300702 13

b4 1206233330754133353 12

c4 11095247725808078876 12

a1 3439443032492089405 14

b1 3135786584331814519 13

c1 8619110724855342808 13

d1 7848207856287267137 13

e1 3811733587375915473 13

f1 8762675609952735468 13

g1 15700248209820441285 13

h1 17752752118909405489 14

a2 5695838162267683422 13

b2 7748342069098114922 12

c2 6396338132881246243 12

d2 17270652676284938484 12

e2 3451001275122215817 12

f2 7225251425809932768 12

g2 1503419863794000794 12

h2 14685175808191312949 13

a3 4158265960666958673 13

b3 4158265960666958673 12

c3 13032946816902618242 12

d3 14705383267963762421 12

e3 9241527724179543109 12

f3 5502914338636096040 12

g3 5429984680954171549 12

h3 683733530782468012 13

a4 9941963182310300702 13

b4 1206233330754133353 12

c4 11095247725808078876 12

- hgm
**Posts:**23613**Joined:**Fri Mar 10, 2006 9:06 am**Location:**Amsterdam**Full name:**H G Muller-
**Contact:**

### Re: Comparison of bitboard attack-getter variants

Indeed. That is why I said doing it by hand would be a lot faster. Even a lot faster thanHenk wrote:Otherwise I can wait for inifinity before it completes.

*writing*the program to do it automatically.

### Re: Comparison of bitboard attack-getter variants

Just encountered a bug for diagonal moves. Combination with all bits enabled was not generated so my engine crashed for that entry was not in the table.

Same should hold for straight moves too but strangely engine did not crash.

So I have to repair the bug and start all over again with generating magic numbers.

If you don't believe in magic ...

Same should hold for straight moves too but strangely engine did not crash.

So I have to repair the bug and start all over again with generating magic numbers.

If you don't believe in magic ...