I'm a fan of copy / make. Undoing a move is a simple decrement: _positionIndex--;
spirch wrote: ↑Sat Mar 20, 2021 5:37 pm
this is not used for actual move which is something completely different, i do bitwise but i store way less information
You've described the purpose of 63 bits above and yet it's not used to play the move? If that's the case you have a lot of duplication. My advice is to reduce that down to a single 64 bit structure. Or a single 32 bit structure if you wish to handle history heuristic elsewhere.
spirch wrote: ↑Sat Mar 20, 2021 5:37 pm
also, my board is 0x88 which might affect how i do my packing, max position is 119 which need 7 bits to store
Bitboard or 0x88 board should not matter re: the move encoding scheme. Meaning, an efficient move-encoding scheme will help a bitboard engine or a mailbox engine.
"Max position is 119 which need 7 bits to store." I don't understand that sentence at all. What is max position?
Joost Buijs wrote: ↑Sat Mar 20, 2021 4:48 pmIn the past I used 8 bit from and 8 bit to as well, the scheme I use now gives me some extra bits to play with for future extensions. And it doesn't make any difference in speed on modern hardware with a modern compiler, at least I can't measure it. Al this speed stuff is nonsense anyway, it really doesn't matter if your engine is 1% slower or faster. All that matters is evaluation, this is where you can gain hundreds of Elo points.
It was different in the 80's when computers were too slow to be tactically sufficient, but the times of the 2 MHz. Intel 8080 are long gone.
All very true. But it is also a matter of code simplicity. Your posting really pre-empted mine, which I was writing in reply to the OP. But to comment on your format: if the struct thakes 32 bit anyway, why not just take 8 bits for the from- and to-square each, and put the 4 info bits in the 4th byte? Then you can keep the bit-twiddling out of the code. Even if it matters only an utterly irrelevant 1%, why do the slower thing if it is just as easy (if not easier) to do the faster thing.
What has always bothered me is that because iof the issue of endienness I cannot define a union { int all, struct { char from, to, flags, key;} } and rely on it that the key ends up in the high byte, so I can justs ort the moves as int. I still have to write ugly things like from = move >> 8 & 255 to access the byte that holds the from-square, even if the compiler completely optimizes that away.
The decoding of the moves in my format goes very natural:
to = move & 255;
if(to & 64) { // is special move
// rarely executed code
switch(moveDecode[to-64].type) {
case DOUBLEPUSH:
case CASTLING:
case EP:
case PROMOTION:
}
to = from + moveDecode[to-64].step; // true destination
}
I use bit-fields and let the compiler handle these, in the past this was slow, nowadays the compilers seem to produce highly optimized code for bit-fields. The difference in speed is so small (if there is any) that I can't measure it, it is very difficult to measure exact speed differences on modern processors, even when you turn off turbo-boost (precision-boost on AMD) the difference between each run will be several percent. Multi-threading doesn't help either.
emadsen wrote: ↑Sat Mar 20, 2021 5:48 pm
"Max position is 119 which need 7 bits to store." I don't understand that sentence at all. What is max position?
position 119 is H8
so if i want to store a move: from 71(h5) to 119(h8) i need to be able to store that 119 into bit, it look like : 1110111
spirch wrote: ↑Sat Mar 20, 2021 5:37 pmalso, my board is 0x88 which might affect how i do my packing, max position is 119 which need 7 bits to store
Most of my engines also use 0x88-like square numbering. That means the moveDecode table is also indexed as 0x88. But that just means I can store one data item for a given index at table[n], and another at table[n+8].
spirch wrote: ↑Sat Mar 20, 2021 5:47 pmfor castling, you need to figure out which rook was used
for en passant you need to figure out where was the pawn
No need to 'figure out' anything. The Rook origin and destination, the location of the e.p. victim, it is all in the moveDecode table. Extracting that information from the packed info requires you to make a copy of that info, shift it, and mask out the wanted part. That is 3 instructions. Which you would have to repeat for every other data item. That counts as 'figuring out'. Getting it them from the table would just be a single byte-size memory load, using an index that was already in a register.
thanks, that will be useful to clean up some messy manual inlining / duplicated code all over the place haha
That's the least important feature of the code I shared. The C# compiler is smart and inlines much on its own. Though I did find explicit use of the AggressiveInlining attribute on small methods in the hot code path helps, marginally. Very marginally.
The most important aspect of the code I shared is a single move encoding scheme that solves three problems: 1) stores characteristics necessary to peform the move on the board, 2) stores characteristics necessary to make reduction and pruning decisions about the move, and 3) stores scoring necessary to prioritize the move for proper move ordering. All in one 64 bit ulong.
As I said before, I sidestep the complications of undoing a move via copy / make. I simply decrement the position index. In my personal experience, the code is simpler and it's faster than mechanically undoing the move and correcting board state. Your mileage may vary depending on your implementation.