hashtables

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Sven
Posts: 3833
Joined: Thu May 15, 2008 7:57 pm
Location: Berlin, Germany
Full name: Sven Schüle
Contact:

Re: hashtables

Post by Sven » Tue Oct 15, 2013 3:11 pm

rtitle wrote:No. If I'm using it only for move ordering, I'm calling the move generator anyway, so storing the index is sufficient. Similarly if I were storing evals and using the stored eval for improved alpha/beta bounds, I'd be calling the move generator anyway. The only case where you can *save* a call to the move generator by storing the full move is when you can actually re-use the previously computed eval and *not* do the search (i.e. the previous eval was from a search to the appropriate depth and its fail-high/fail-low flags don't prevent you from re-using it). I don't think this is as common a scenario as you guys think.

I actually used to do it the more common way, then I switched to my "minimal TT entry" approach as an experiment. Experimentation revealed it is beneficial in some situations and detrimental in others. It's not as clear-cut as you are claiming. It depends on the position. In an endgame where there are a lot of transpositions, a TT that stores rich information is beneficial. In a complex middlegame position, it doesn't pay off that much, i.e. you don't have as many true transpositions. Also it depends how deep you're searching. In fast time controls, you probably won't fill up a billion-entry TT table anyway, so a smaller-number-of-entries/richer-entries approach is better. But for deep analysis you do want to keep billions of entries and my "minimal TT entry" approach starts to pay off.

Rich
No :-)

If you store the best move in the TT you can save the move generator call: you only need to validate that the TT move is actually (pseudo-)legal in the current position, to avoid a crash. Then you simply search the TT move without ever calling the move generator, and only if the TT move does not cause a cutoff do you continue with the normal search by calling the move generator, not forgetting to remove the TT move from the move list so it is not searched twice.
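In code, the "don't search the TT move twice" step might look like this (a sketch with a hypothetical 16-bit `Move` encoding, not from any particular engine):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Move = uint16_t; // hypothetical packed from/to-square encoding

// After the TT move has already been searched first, drop it from the
// freshly generated move list so it is not searched a second time.
std::vector<Move> without_tt_move(std::vector<Move> moves, Move ttMove) {
    moves.erase(std::remove(moves.begin(), moves.end(), ttMove), moves.end());
    return moves;
}
```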

That's what almost everyone does at least.

And I do not believe that always calling the move generator is really faster than doing what I wrote above, even with a slightly larger TT entry size (9 instead of your 8 bytes) and therefore a slightly smaller number of TT entries (about 90% of yours).

Sven

rtitle
Posts: 33
Joined: Wed Aug 14, 2013 5:23 pm

Re: hashtables

Post by rtitle » Tue Oct 15, 2013 3:50 pm

I was tempted to see how many times we could say "No" to each other. :-) But upon reflection, you are right. I didn't think of the best-move-causes-a-cutoff case.

You say I have to validate it's legal. Is this because of possible hash collisions?

I'd still like to keep my TT entries at 8 bytes, both for easier arithmetic and for atomic reads & writes. As I pointed out in a different thread (pun intended), what most chess engines are doing - unsynchronized access to TT entries from different threads - is not safe unless you can read & write the entries atomically. However, I might be able to store the move and still keep the entry within 8 bytes.
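A sketch of the atomic-8-byte idea using C++11 `std::atomic<uint64_t>`; the packed layout is left abstract and the names are illustrative, but the point is that a probing thread can never observe a half-written entry:

```cpp
#include <atomic>
#include <cstdint>

// One 8-byte TT slot, read and written as a single atomic 64-bit word.
struct TTSlot {
    std::atomic<uint64_t> raw{0};

    // Relaxed ordering is enough here: we only need the 8 bytes to be
    // written/read indivisibly, not to synchronize other memory.
    void store(uint64_t packed)  { raw.store(packed, std::memory_order_relaxed); }
    uint64_t load() const        { return raw.load(std::memory_order_relaxed); }
};
```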

Rich

Sven
Posts: 3833
Joined: Thu May 15, 2008 7:57 pm
Location: Berlin, Germany
Full name: Sven Schüle
Contact:

Re: hashtables

Post by Sven » Tue Oct 15, 2013 4:08 pm

rtitle wrote:You say I have to validate it's legal. Is this because of possible hash collisions?
Yes ;-) Or because of bugs or other irregularities, and also simply to be safe.
rtitle wrote:unsynchronized access to TT entries from different threads - is not safe unless you can read & write the entries atomically.
A lot has been written about that topic. You might want to experiment with lockless hashing (see here, for instance), and you might also be interested in a paper published by Bob Hyatt and Anthony Cozzie about the effect of hash collisions in a computer chess program. I also remember a quite recent CCC thread about atomicity.
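For reference, the lockless-hashing trick can be sketched like this (field names are mine, not from any particular engine): the key is stored XORed with the data, so a torn write by another thread makes the check fail and the entry is simply rejected rather than trusted.

```cpp
#include <cstdint>

struct XorEntry {
    uint64_t keyXorData; // hash key XOR payload
    uint64_t data;       // payload (move, depth, score, flags packed somehow)
};

void xor_store(XorEntry &e, uint64_t key, uint64_t data) {
    e.data = data;
    e.keyXorData = key ^ data;
}

// Returns false on a key mismatch OR a torn (half-updated) entry.
bool xor_probe(const XorEntry &e, uint64_t key, uint64_t &dataOut) {
    if ((e.keyXorData ^ e.data) != key) return false;
    dataOut = e.data;
    return true;
}
```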
rtitle wrote:However, I might be able to store the move and still keep the entry within 8 bytes.
Yes, e.g. with 32 bit hash lock + 16 bit best move + 16 bit other data (your two depths), if you do not store any score-related information in the TT entry.
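One possible packing of that layout into a single 64-bit word (the helper names are made up for illustration):

```cpp
#include <cstdint>

// 32-bit hash lock | 16-bit best move | 16-bit other data (e.g. two depths).
uint64_t pack(uint32_t lock, uint16_t move, uint16_t other) {
    return (uint64_t)lock << 32 | (uint64_t)move << 16 | other;
}

uint32_t lock_of (uint64_t e) { return (uint32_t)(e >> 32); }
uint16_t move_of (uint64_t e) { return (uint16_t)(e >> 16); }
uint16_t other_of(uint64_t e) { return (uint16_t)e; }
```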

Sven

jdart
Posts: 3924
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: hashtables

Post by jdart » Tue Oct 15, 2013 5:26 pm

In an endgame where there are a lot of transpositions, a TT that stores rich information is beneficial. In a complex middlegame position, it doesn't pay off that much
But when it does pay off, it pays off a lot, saving you from re-searching potentially thousands of nodes. Also, many chess programs visit the same node repeatedly at different depths due to things like internal iterative deepening.

--Jon

rtitle
Posts: 33
Joined: Wed Aug 14, 2013 5:23 pm

Re: hashtables

Post by rtitle » Tue Oct 15, 2013 6:16 pm

Yes, I've thought about internal iterative deepening as a way to deal with the engine "hitting a wall" on deep searches when the TT fills up and it's no longer getting the benefit of best-move info at nodes near the leaves (because those nodes are not getting stored in the already-full TT). I didn't know the IID idea was implemented in other engines. I'm writing my engine from scratch and haven't looked at other engines.

I was thinking that if I implement some kind of IID scheme, the internal search would use its own temporary smaller TT off to the side. One thing I want to avoid is displacing "high value" TT entries in the main TT (those near the root & based on deep searches) with lower value TT entries (i.e. for nodes near the leaves). Seems like if the internal iterative deepening search is sharing the same TT, it'd be doing exactly that, right? Or am I misunderstanding how other engines implement it?

Rich

jdart
Posts: 3924
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: hashtables

Post by jdart » Tue Oct 15, 2013 7:52 pm

One thing I want to avoid is displacing "high value" TT entries in the main TT (those near the root & based on deep searches) with lower value TT entries (i.e. for nodes near the leaves).
Doesn't happen, because the hashtable code typically implements a replacement scheme that keeps entries which have the same signature but greater depth.
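One sketch of such a depth-preferred replacement test. The policy for a slot holding a *different* position varies between engines (always-replace, buckets, aging, etc.); here it simply always replaces:

```cpp
#include <cstdint>

struct Entry {
    uint64_t key;   // zero means the slot is empty
    int depth;      // move, bound, score, etc. omitted
};

// Keep an existing entry for the same signature unless the new
// search is at least as deep; always fill empty or foreign slots.
bool should_replace(const Entry &old, uint64_t newKey, int newDepth) {
    if (old.key == 0) return true;        // empty slot
    if (old.key != newKey) return true;   // different position: always replace (simplest policy)
    return newDepth >= old.depth;         // same position: prefer the deeper entry
}
```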

--Jon

rtitle
Posts: 33
Joined: Wed Aug 14, 2013 5:23 pm

Re: hashtables

Post by rtitle » Wed Oct 16, 2013 1:03 am

jdart wrote:
One thing I want to avoid is displacing "high value" TT entries in the main TT (those near the root & based on deep searches) with lower value TT entries (i.e. for nodes near the leaves).
Doesn't happen, because the hashtable code typically implements a replacement scheme that keeps entries which have the same signature but greater depth.

--Jon
OK, but in that case I'm not clear on what internal iterative deepening is buying you. If the TT is almost full, the internal searches will write almost no entries (because they're shallow searches). That's why I had planned to have my internal searches (done for move-ordering) operate with small temporary TT's separate from the main TT.

User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 2:27 pm

Re: hashtables

Post by Don » Wed Oct 16, 2013 1:13 am

flok wrote:Hi,

According to http://chessprogramming.wikispaces.com/ ... s%20Stored you need to store not only hash, depth and score but some other parameters well.

Apart from the "age", why would I do that? Shouldn't it work just as well if, when I enter a node, I either look up the hash for that node and return the associated score (if the depth of that TT entry is greater than what can be reached now), or else search the tree below it and store the hash/score(/age)?

regards
With alpha-beta pruning, the score returned from a node is usually not exact. It would be exact if you used pure minimax without alpha-beta pruning.

For example, if a move returns a score greater than or equal to beta it is called a beta cutoff. You don't search the rest of the moves, so you don't really know how good the score might be for the parent position. So the score is a LOWER BOUND: you know only that it is AT LEAST what was returned by the move that caused the cutoff. If you put that in the hash table you must also note that the score is a lower bound, so that it is used correctly elsewhere. If the best score is below alpha, the same reasoning applies: you know it is an UPPER BOUND; the true score cannot be better than that, but it might be much worse.

If the score returned from a searched node is strictly between alpha and beta, it's a trustworthy score. In a sense it is BOTH a lower and an upper bound: it's an exact score.
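Don's three cases are usually stored as a bound flag alongside the score. A minimal sketch of how a probing search might decide whether a stored score can be trusted at the current (alpha, beta) window (names are illustrative):

```cpp
#include <cstdint>

enum Bound : uint8_t { LOWER, UPPER, EXACT };

// EXACT scores are always usable. A LOWER bound can only prove a
// fail-high (score >= beta); an UPPER bound only a fail-low (score <= alpha).
bool usable(Bound b, int score, int alpha, int beta) {
    if (b == EXACT) return true;
    if (b == LOWER) return score >= beta;
    return score <= alpha; // UPPER
}
```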


Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.

Daniel Shawul
Posts: 3868
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: hashtables

Post by Daniel Shawul » Wed Oct 16, 2013 1:27 am

This reminds me of a similar case from programming chess on GPUs. Because of memory constraints it was not possible to save the whole move list, say int[256], in shared memory, so I used a method similar to Richard's indexing. What I had was a function gen(i) that generates a specific move i, skipping all the moves up to i-1 while doing as little work as possible and then generating move_i. There is no doubt that this is much slower than saving the whole move list, but I didn't find a better approach at the time. Anyway, for hashtables, saving only the move index is indeed not better overall, since you would have to call the move generator every time.
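A toy illustration of the gen(i) idea, using knight moves from one square so it stays self-contained (the real GPU version walks the full piece/direction loops the same way, counting instead of storing):

```cpp
// Return the i-th legal knight move from square (f, r) on an 8x8 board,
// in a fixed deterministic order, without ever materializing a move list.
// Returns false if fewer than i+1 such moves exist. Illustrative only.
bool knight_move_i(int f, int r, int i, int &tf, int &tr) {
    static const int df[8] = {1, 2, 2, 1, -1, -2, -2, -1};
    static const int dr[8] = {2, 1, -1, -2, -2, -1, 1, 2};
    int n = 0;
    for (int k = 0; k < 8; ++k) {
        int nf = f + df[k], nr = r + dr[k];
        if (nf < 0 || nf > 7 || nr < 0 || nr > 7) continue; // off the board
        if (n++ == i) { tf = nf; tr = nr; return true; }    // i-th move found
    }
    return false;
}
```

Memory use is O(1) per lookup, at the cost of redoing the enumeration work for every index, which is exactly the trade-off described above.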

User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 2:27 pm

Re: hashtables

Post by Don » Wed Oct 16, 2013 1:46 am

Daniel Shawul wrote:This reminds me of a similar case from programming chess on GPUs. Because of memory constraints it was not possible to save the whole move list, say int[256], in shared memory, so I used a method similar to Richard's indexing. What I had was a function gen(i) that generates a specific move i, skipping all the moves up to i-1 while doing as little work as possible and then generating move_i. There is no doubt that this is much slower than saving the whole move list, but I didn't find a better approach at the time. Anyway, for hashtables, saving only the move index is indeed not better overall, since you would have to call the move generator every time.
In the old days of 8 bit programs Sargon used an interesting scheme for storing the book compactly. Each move was represented in 6 bits as an offset into a deterministic ordered set of legal moves and the other 2 bits were considered left and right parenthesis. So a sequence of bytes could represent a tree structure easily.
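A guess at how such a book byte might decode; the exact bit assignment here is my assumption for illustration, not Sargon's actual format:

```cpp
#include <cstdint>

// Low 6 bits: index into a deterministic, ordered list of legal moves.
// Top 2 bits: open/close "parenthesis" markers, so a flat byte stream
// can encode a variation tree.
struct BookByte {
    int  moveIndex;   // 0..63
    bool open, close; // tree-structure bits
};

BookByte decode(uint8_t b) {
    return { b & 0x3F, (b & 0x40) != 0, (b & 0x80) != 0 };
}
```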

You could do better than that by having a move generator that generates moves in a GOOD move order, perhaps using a 1-ply search, and then Huffman coding each move based on its position in that ordering.

I often wondered if you could build an incredibly compact book by doing deeper searches that produce a VERY good, deterministic move ordering. By default the book move is the first move in that ordering, up to whatever point the book ends; the book is then stored as a set of exceptions to the default (plus a stopping point) instead of the actual moves themselves.

A similar idea is to use this special generator in conjunction with a bloom filter. I have not worked out any details - just random daydreams.

It's not that important now, of course, since book size is just not a limitation these days, but it's still fun to think about. Another fun problem: what is the most compact representation for storing a chess position? FEN notation is more compact than an array, but falls far short of how compact one could actually get.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
