Move structure and hash tables

Discussion of chess software programming and technical issues.

Moderator: Ras

Sopel
Posts: 392
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: Move structure and hash tables

Post by Sopel »

hgm wrote: Sat Oct 23, 2021 4:33 pm You mean like warning people who call posters names just because these have an opinion they don't agree with?
Saying that 30% of stockfish's runtime is spent on move generation is not an opinion. It's a wrong statement trying to derail the discussion towards his perft generator (which is completely irrelevant to this thread). He does it in every thread possible, and it is killing the forum. "Go into their discord and look it up - or ask them." is also a useless and offensive response.
dangi12012 wrote:No one wants to touch anything you have posted. That proves you now have negative reputations since everyone knows already you are a forum troll.

Maybe you copied your stockfish commits from someone else too?
I will look into that.
User avatar
hgm
Posts: 28443
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Move structure and hash tables

Post by hgm »

Well, so you just point out that is for perft, and nobody cares about that, which would make him look pretty stupid. A handfull of tangential postings is not 'derailing a discussion', and a single wrong or easily misunderstood claim in a posting that did address the topic (move encoding) even less so.. If it gets too much follow-up it becomes thread hijacking, and the discussion thread can be split. But as it was the discussion continued on the original topic. 'Go look it up' is not an insult in the sense of the charter.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Move structure and hash tables

Post by dangi12012 »

chrisw wrote: Sat Oct 23, 2021 5:14 pm
gaard wrote: Wed Oct 20, 2021 9:24 pm I see a lot of effort going towards packing a move in 16 bits to save space in the hash table or even for general use. But then you need special functions to determine, with the board state, whether a move is castling (although this one may be as simple as abs(orig - dest) == 2), en passant, capturing, promotion piece, etc. I suppose some of those can be encoded in the remaining 4 bits of a 16-bit move, but usually not all of them. An extra 16 bits also leaves room for flags like direct check, reveal check, etc. Has anyone every measured performance between the two schemes, that is, 16-bit move versus 32-bit. Is there any reason to believe the difference is worth more than a couple of Elo?
It makes little difference to performance.
Another disadvantage of 16-bit, apart from having to write more fiddly code for determining the characteristics of the move, is that you also start running into problems between finding move character before MakeMove() and after it, because you’re dependent on board state (after MakeMove, the moving piece type had to be got from sq-to, not sq-from and so on and so on). Or maybe for example you need to know something about the move at nply-1, or nply-2, well, you’ll also need a boardstate history, not just a move history.
For an engine in development where you probably can’t predict all possible move informations you might need, then 32 bit with OTT info would seem desirable. After the engine is fully complete, then think about cutting back to 16-bit, but I’ld predict it would by way down the list of important things to try and optimize.
I also like to add that the difference between 16bit or a bigger format will not make any difference. Since you probably need to store other metadata like the evaluation etc as well.

What many people dont think about is that X86 also has "Non Temporal" memory store which bypasses the memory hirarchy. I once wrote a raytracer where storing to memory is 2x faster and I know from a colleague that BLAS uses these as well:

MOVDQA xmmi, m128 vs MOVNTDQA xmmi, m128
https://lwn.net/Articles/255364/

So if you store - and not immediately read back from the same memory location this is a huge performance opportunity.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
User avatar
hgm
Posts: 28443
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Move structure and hash tables

Post by hgm »

It would require some redesign of TT architecture to make effective use of nontemporal memory access. Schemes that can only store a TT entry in a single location do not perform very well, as these offer no way for protecting the more valuable entries with large search depth. So an entry can be in multiple locations, meaning that multiple locations have to be probed. These are then all mapped to the same cache line. So the entire TT bucket will already be loaded into cache during the probing. And then a nontemporal store cannot bypass it, but will just dirty the cache line.

Another problem is that entries usually do not fit into a single machine word.

A possible solution is to store part of the hash signatures (e.g. 8 bits), possibly together with depth and aging info, in a single word. Then this word could be tested with a nontemporal read, and it would make it clear whether the sought entry can be in the table and where, or which entry should be replaced. This entry coul then be read and written nontemporally.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Move structure and hash tables

Post by dangi12012 »

hgm wrote: Sat Oct 23, 2021 9:45 pm So the entire TT bucket will already be loaded into cache during the probing. And then a nontemporal store cannot bypass it, but will just dirty the cache line.
Correct on all points except that non temporal writes are made to bypass any and all cache hirarchy and go directly to memory. Exactly for the reason to never evict good data because you write other unrelated data.
Non temporal is applicable where you write data and read back from that buffer later in the code. ie. not immediately.

An excellent summary on that topic can be found here.
http://www.nic.uoregon.edu/~khuck/ts/ac ... 05s03.html

Its a way of talking to the processor directly:
"a non-temporal store also hints to the processor that the program intends to write the entire cache line, completely replacing the current content, so that the cache line does not need to first be fetched from memory"
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Move structure and hash tables

Post by gaard »

Henk wrote: Fri Oct 22, 2021 8:34 pm Yes I regret storing move in 16 bits. Just make debugging as easy as possible. At least annoying.
Why all these tricks for a few elo. If you can't invent something better than just resign.
I decided to shave the key size down to 48 bits using the u64 key : 48; trick and it seems to be working. I'm sure there some hidden overhead but it's just a few more lines of code and it looks like it gets me down to a 16-byte tt entry from 18. I suppose storing the whole key is redundant anyways, when the lower bits are already used as index.
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Move structure and hash tables

Post by gaard »

gaard wrote: Sat Oct 23, 2021 11:10 pm
Henk wrote: Fri Oct 22, 2021 8:34 pm Yes I regret storing move in 16 bits. Just make debugging as easy as possible. At least annoying.
Why all these tricks for a few elo. If you can't invent something better than just resign.
I decided to shave the key size down to 48 bits using the u64 key : 48; trick and it seems to be working. I'm sure there some hidden overhead but it's just a few more lines of code and it looks like it gets me down to a 16-byte tt entry from 18. I suppose storing the whole key is redundant anyways, when the lower bits are already used as index.
Gcc lete get away with it, even allowing me to initialize the fields on construction, but msvc said no way. Something about bit fields of different types. Is there any real grave danger to just using a 32-bit key without clusters?
User avatar
hgm
Posts: 28443
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Move structure and hash tables

Post by hgm »

dangi12012 wrote: Sat Oct 23, 2021 10:59 pm Correct on all points except that non temporal writes are made to bypass any and all cache hirarchy and go directly to memory. Exactly for the reason to never evict good data because you write other unrelated data.
Non temporal is applicable where you write data and read back from that buffer later in the code. ie. not immediately.
If the data is already cached the write cannot alter the memory, because then the cache woul hold an obsolete value. Which would be returned instead of the correct one on a later read.
User avatar
Ras
Posts: 2730
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Move structure and hash tables

Post by Ras »

gaard wrote: Sun Oct 24, 2021 3:51 amGcc lete get away with it, even allowing me to initialize the fields on construction, but msvc said no way. Something about bit fields of different types.
Can you show the code for the type declaration of the bitfield and the assignment?
Rasmus Althoff
https://www.ct800.net
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Move structure and hash tables

Post by gaard »

Ras wrote: Sun Oct 24, 2021 11:02 am
gaard wrote: Sun Oct 24, 2021 3:51 amGcc lete get away with it, even allowing me to initialize the fields on construction, but msvc said no way. Something about bit fields of different types.
Can you show the code for the type declaration of the bitfield and the assignment?
Something like

struct TTEntry {
u64 key : 48 = 0;
u8 bound : 2 = 0;
u8 age : 6 = 0;
u8 depth;
i16 score, value;
u32 move;
}