Whoa! 800 kB for rooks, and less than _50_ kB for bishops!
I think better technique is to use a separate table for Rank and File Attacks. Rank attacks need no multiplication, and I've figured out that with a clever implementation, they take only 0.5 kB. File attacks need magic multiplication but constant shifts with 57, and they're are only 8 different magic no.s you need. Besides, they can be computed at compile time, because successive magic is just two multiplied by previous magic. With this, rook tables need less than 0.5 + 8 * ( 8 * 64) = 4.5 kB.
What do you think?
On Rook tables in magic move generation
Moderators: hgm, Rebel, chrisw
-
- Posts: 646
- Joined: Wed Jun 18, 2014 2:30 pm
- Full name: Fahad Syed
-
- Posts: 27809
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: On Rook tables in magic move generation
This technique is known as 'kindergarten bitboards'. It requires more operations, because you have to calculate ranks and files separately. But you can hope to earn that back by using smaller tables, which fit in the L1 cache, so that you can do the multiple accesses (for ranks and files) to them in parallel, which you could not do to tables in L2.
An even less memory-consuming technique is to use 8-fold rotated bitboard, where the various rays are always stored contiguously in the direction of travel, and then use carry propagation to generate the accessible squares. This doesn't require any table lookups at all.
An even less memory-consuming technique is to use 8-fold rotated bitboard, where the various rays are always stored contiguously in the direction of travel, and then use carry propagation to generate the accessible squares. This doesn't require any table lookups at all.
-
- Posts: 646
- Joined: Wed Jun 18, 2014 2:30 pm
- Full name: Fahad Syed
Re: On Rook tables in magic move generation
Then I'm yet again in a quest for an original ideahgm wrote:This technique is known as 'kindergarten bitboards'. It requires more operations, because you have to calculate ranks and files separately. But you can hope to earn that back by using smaller tables, which fit in the L1 cache, so that you can do the multiple accesses (for ranks and files) to them in parallel, which you could not do to tables in L2.
An even less memory-consuming technique is to use 8-fold rotated bitboard, where the various rays are always stored contiguously in the direction of travel, and then use carry propagation to generate the accessible squares. This doesn't require any table lookups at all.
I find hyperbola quintessence with reverse bits algorithm far the most easiest way to generate attack masks. But reversing bits is a slow task and hardware intrinsics aren't available for it, but reversing is only required once in rank attacks.
I think now I'd find an equation whose root is attack mask of a given piece and occupancy. Then use Newton's method or householder's iteration. Just kidding.
-
- Posts: 589
- Joined: Tue Jun 04, 2013 10:15 pm
Re: On Rook tables in magic move generation
I was figuring that two orientations suffice for this, because you can ripple over intermediate bits if you clear (or set) them first. Still feels like a hassle.hgm wrote:An even less memory-consuming technique is to use 8-fold rotated bitboard, where the various rays are always stored contiguously in the direction of travel, and then use carry propagation to generate the accessible squares. This doesn't require any table lookups at all.
[Account deleted]
-
- Posts: 27809
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: On Rook tables in magic move generation
Interesting idea. What I like is that it would also work for oblique directions, e.g. Nightriders. And it isn't that much of a hassle. It just needs to apply the usual mask before as well as after the subtraction. And it is relatively cheap to apply to larger boards.
-
- Posts: 27809
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: On Rook tables in magic move generation
The way I had in mind (but so fare never applied) was to store the boards as individully addressible rays in 4 different directions. This would not put such a cramped limit on board size. Each ray would contain both a normal and bit-reversed copy of the occupation. Even on a 32-bit machine this could work upto 16x16 boards. The O^(O-2*R) trick could then be applied in both directions along the ray simultaneously. There would never be any need to actually reverse bits.
You would just need 4 board-size tables to indicate which rays from the ray set go through each square, and whne aan occupation changes you would have to update all rays through it.
You would just need 4 board-size tables to indicate which rays from the ray set go through each square, and whne aan occupation changes you would have to update all rays through it.