GCC gives me a nice even sizeof of 16 without any explicit packing.
Move structure and hash tables
-
gaard
- Posts: 463
- Joined: Mon Jun 07, 2010 3:13 am
- Location: Holland, MI
- Full name: Martin W
Re: Move structure and hash tables
-
Ras
- Posts: 2729
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Move structure and hash tables
gaard wrote: ↑Sun Oct 24, 2021 4:05 pm
struct TTEntry { u64 key : 48 = 0; u8 bound : 2 = 0; u8 age : 6 = 0; u8 depth; i16 score, value; u32 move; }
Maybe the problem is that the bitfield base types for the first eight bytes are incompatible. You could try it like this:
struct TTEntry {
    u64 key : 48 = 0;
    u64 bound : 2 = 0;
    u64 age : 6 = 0;
    u64 depth : 8;
    i16 score, value;
    u32 move;
};
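For reference, a minimal compile-time check of that layout - a sketch only, assuming u64/u32/i16 are aliases for the <cstdint> fixed-width types (the snippets above don't show the typedefs), and noting that bit-field packing is implementation-defined, though GCC and Clang both give 16 bytes here:

#include <cstdint>

using u64 = std::uint64_t;
using u32 = std::uint32_t;
using i16 = std::int16_t;

struct TTEntry {
    u64 key   : 48 = 0;   // verification key
    u64 bound :  2 = 0;   // bound type
    u64 age   :  6 = 0;   // aging counter
    u64 depth :  8 = 0;   // search draft
    i16 score = 0, value = 0;
    u32 move  = 0;
};

// One 8-byte unit of u64 bit-fields + 2 + 2 + 4 bytes = 16 bytes, no padding.
static_assert(sizeof(TTEntry) == 16, "TT entry should pack into 16 bytes");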
Rasmus Althoff
https://www.ct800.net
-
diep
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Move structure and hash tables
Well, I didn't follow the latest developments here yet, oldie diep - you really can mess up with hashtables. Boy, with the 2012 diep I sure f'ed up there with yet another hashtable attempt, in how to store where and how to probe. The idea was good, the implementation sucked. And it all is a lot of work.
gaard wrote: ↑Wed Oct 20, 2021 9:24 pm
I see a lot of effort going towards packing a move into 16 bits to save space in the hash table, or even for general use. But then you need special functions to determine, with the board state, whether a move is castling (although this one may be as simple as abs(orig - dest) == 2), en passant, a capture, the promotion piece, etc. I suppose some of those can be encoded in the remaining 4 bits of a 16-bit move, but usually not all of them. An extra 16 bits also leaves room for flags like direct check, reveal check, etc. Has anyone ever measured performance between the two schemes, that is, 16-bit move versus 32-bit? Is there any reason to believe the difference is worth more than a couple of Elo?
As for the structure itself: it has to do with how the RAM works. Not sure about DDR4 there, but DDR3 can abort a read after reading just 32 bytes.
So if you do several probes and you manage to keep the entries shorter by quite some bytes, it is worth it.
If I look in diep's source I see that basically every year I toyed with a new hashtable implementation for the transposition table (not to be confused with the evaluation table, the pawn hashtable, or other hashtables). In 2006, 2007 and 2008 I see different versions.
Here is how I stored it, for example, in 2008:
dword1  = 32: hash
dword2  = 20: score, 8: age, 4: flags
dword3  = 21: sicfm algorithm, 6: sicfm-delta-depth, 5: reductionsdone
dword3a = 20: sortscore, 12: empty
dword4  = 20: eval, 6: movdepth, 6: scoredepth
dword5  = 15: move, 2: bound, 15: hash (inverse side)
flags: matethreat, pvsingularmove, egtbscore, SideToMove
bound: istruebound, isbetabound
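To make that layout concrete, here is one way it could be written down as C++ bit-fields - a sketch only, with field names taken from the list above; the real diep code of course looks different:

#include <cstdint>

struct TTEntry2008 {
    // dword1 (stored XORed with dword4, see the decoding below)
    std::uint32_t hash;
    // dword2
    std::uint32_t score : 20;
    std::uint32_t age   :  8;
    std::uint32_t flags :  4;       // matethreat, pvsingularmove, egtbscore, SideToMove
    // dword3
    std::uint32_t sicfm             : 21;
    std::uint32_t sicfm_delta_depth :  6;
    std::uint32_t reductionsdone    :  5;
    // dword3a
    std::uint32_t sortscore : 20;
    std::uint32_t empty     : 12;
    // dword4
    std::uint32_t eval       : 20;
    std::uint32_t movdepth   :  6;
    std::uint32_t scoredepth :  6;
    // dword5 (stored XORed with dword2)
    std::uint32_t move         : 15;
    std::uint32_t bound        :  2;   // istruebound, isbetabound
    std::uint32_t hash_inverse : 15;   // hash for the inverse side
};

// 6 dwords of 32 bits each = 24 bytes per entry.
static_assert(sizeof(TTEntry2008) == 24, "entry should be 6 dwords");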
Decoding: dword1 = dword1' XOR dword4;
dword5 = dword5' XOR dword2;
dword2, dword3, dword3a and dword4 get stored unchanged, of course.
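A sketch of that XOR trick (the usual lockless-hashing idea), treating the entry as six 32-bit words in the order listed above, i.e. d[0] = dword1 through d[5] = dword5:

#include <cstdint>

// XOR is its own inverse, so the same function scrambles an entry before
// storing and unscrambles it after probing; a torn (half-written) entry
// then simply fails the key comparison.
void xor_scramble(std::uint32_t d[6])
{
    d[0] ^= d[4];   // dword1' = dword1 XOR dword4
    d[5] ^= d[1];   // dword5' = dword5 XOR dword2
    // dword2, dword3, dword3a and dword4 are stored unchanged.
}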
So in total it's 6 dwords of 32 bits, i.e. 24 bytes per entry. In that 2008 implementation I'm doing 8 probes.
So that's a total of 24 x 8 = 192 bytes. With DDR3 that's 3 cachelines, obviously.
The sortscore is stored because, for example, in alpha nodes (nodes where the score was <= alpha) we want to sort better. It's pretty complex to get that to work well, but it can deliver hundreds of Elo points with an engine - and sure, against some significant linear slowdown.
The sicfm algorithm I invented at the start of this century works in a much more complex way. It also improves the sorting. It has to do with detecting draw scores and avoiding storing draw-score bugs into the hashtable - and if you do store them, annotating that you did. Tricky to get bug-free, yet it also would improve the sorting code a lot.
The rest you guys can work out.
P.S. Note that many engines forward prune the last 9 plies or so in a pretty hard manner, by simply doing
if (depthleft <= 9 && !incheck && eval >= beta) return eval;
The reason this works is that such hard forward pruning doesn't mess up your sorting further back in the search, as you get a pretty good score estimate back. So it improves move ordering versus the many engines that do not forward prune but use a nullmove there. The nullmove does mess up sorting a lot - even though it's better to do than hard forward pruning. But then you need to correct your sortscores. It's a mixture of hashtable scores, nullmove scores and draw scores that you have to correct in <= alpha nodes.
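For illustration, here is roughly where such a hard pruning test sits in a search routine - a sketch with placeholder names (Position, evaluate, in_check and search_moves are stubs here, not anybody's actual engine code):

struct Position { /* board state */ };

static int  evaluate(const Position &)  { return 0; }     // static eval stub
static bool in_check(const Position &)  { return false; } // stub
static int  search_moves(Position &, int, int alpha, int) { return alpha; } // stub for the normal move loop

int search(Position &pos, int depthleft, int alpha, int beta)
{
    int eval = evaluate(pos);

    // Hard forward pruning in the last plies: if the static eval already
    // fails high and we are not in check, trust it and return immediately.
    if (depthleft <= 9 && !in_check(pos) && eval >= beta)
        return eval;

    // Otherwise fall through to the normal search (TT probe, move loop, nullmove, ...).
    return search_moves(pos, depthleft, alpha, beta);
}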