Move structure and hash tables

Discussion of chess software programming and technical issues.

Moderator: Ras

gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Move structure and hash tables

Post by gaard »

gaard wrote: Sun Oct 24, 2021 4:05 pm
Ras wrote: Sun Oct 24, 2021 11:02 am
gaard wrote: Sun Oct 24, 2021 3:51 amGcc lete get away with it, even allowing me to initialize the fields on construction, but msvc said no way. Something about bit fields of different types.
Can you show the code for the type declaration of the bitfield and the assignment?
Something like

struct TTEntry {
u64 key : 48 = 0;
u8 bound : 2 = 0;
u8 age : 6 = 0;
u8 depth;
i16 score, value;
u32 move;
}
Gcc gives me a nice even sizeof 16 without any explicit packing.
User avatar
Ras
Posts: 2729
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Move structure and hash tables

Post by Ras »

gaard wrote: Sun Oct 24, 2021 4:05 pm

Code: Select all

struct TTEntry {
    u64 key : 48 = 0;
    u8 bound : 2 = 0;
    u8 age : 6 = 0;
    u8 depth;
    i16 score, value;
    u32 move;
}
Maybe the problem is that the bitfield base types for the first eight bytes are incompatible. You could try it like this:

Code: Select all

struct TTEntry {
    u64 key : 48 = 0;
    u64 bound : 2 = 0;
    u64 age : 6 = 0;
    u64 depth : 8;
    i16 score, value;
    u32 move;
}
Also, please note that this way of initialisation requires C++20. Neither earlier C++ nor C can do it this way. I would keep data type declaration and initialisation separate anyway to avoid superfluous initialisation.
Rasmus Althoff
https://www.ct800.net
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Move structure and hash tables

Post by diep »

gaard wrote: Wed Oct 20, 2021 9:24 pm I see a lot of effort going towards packing a move in 16 bits to save space in the hash table or even for general use. But then you need special functions to determine, with the board state, whether a move is castling (although this one may be as simple as abs(orig - dest) == 2), en passant, capturing, promotion piece, etc. I suppose some of those can be encoded in the remaining 4 bits of a 16-bit move, but usually not all of them. An extra 16 bits also leaves room for flags like direct check, reveal check, etc. Has anyone every measured performance between the two schemes, that is, 16-bit move versus 32-bit. Is there any reason to believe the difference is worth more than a couple of Elo?
Well didn't follow latest developments here yet oldie diep - you really can mess up with hashtables. Boy the 2012 diep sure i f'ed up there with yet another hashtable attempt in how to store where and how to probe. Idea was good implementation sucked. And it all is al ot of work.

Yet the structure itself. It has to do with how the RAM works. Not sure about DDR4 there, but ddr3 can abort a read after reading just 32 bytes.
So if you do several probes and you manage to keep it shorter there by quite some bytes it is worth it.

If i look in diep's source i see that every year basically i toyed with a new hashtable implementation for the transpositiontable (not to confuse with evaluation table nor pawn hashtable nor other hashtables). In 2006,2007,2008 i see different versions :)

Here is how i stored for example in 2008:

dword1 = 32: hash
dword2 = 20:score 8:age 4:flags
dword3 = 21:sicfm algorithm 6: sicfm-delta-depth 5:reductionsdone
dword3a = 20:sortscore 12:empty
dword4 = 20:eval 6:movdepth 6:scoredepth
dword5 = 15:move 2:bound 15:hash (inverse side)
flags : matethreat, pvsingularmove, egtbscore, SideToMove
bound: istruebound, isbetabound

decoding: dword1 = dword1' XOR dword4;
dword5 = dword2' XOR dword5;
dword2,dword3,dword4 get stored unchanged of course

So in total it's 6 dwords of 32 bits. In total 24 bytes per entry. In that 2008 implementation i'm doing 8 probes.
So that's a total of 24 x 8 = 160+32 = 192 bytes in total. With DDR3 that's 3 cachelines obviously.

The sortscore is stored because for example in alfa nodes (nodes where score was <= alfa) we want to sort better. It's pretty complex to get that well to work but can deliver you hundreds of elopoints with an engine - and sure against some significant slowdown linearly.

the sicfm algorithm i invented start this century and works way more complex. Also improves the sorting. It has to do with detecting draw scores and avoiding storing drawscore bugs into hashtable. And if you do then annotate you do it. Tricky to get bugfree. Yet also would improve sorting code a lot.

The rest you guys can work out.

p.s. note that the reason why many engines that forward prune last 9 plies or so in pretty hard manner by simply doing
if( depthleft <= 9 && !incheck && eval >= beta ) then return eval;

That hard forward pruning doesn't mess up your sorting further back in the search as you get a pretty good score estimate back. So it improves move ordering versus many engines that do not forward prune yet use a nullmove there. The nullmove does mess up sorting a lot - even though it's better to do than hard forward pruning. Yet you need to correct your sortscores then. It's a mixture of hashtables, nullmove scores and draw scores you have to correct in <= alfa nodes.