Crafty Transpostion Table Question

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty Transpostion Table Question

Post by bob »

LiquidNitrogenOverclocker wrote:
bob wrote: Any of the above will work just fine, but (1) introduces overhead for the validation procedure, while 2 introduces tons of memory/cache issues as well as stalling processes when there really is no conflict. I just took the solution with the least impact on NPS since all work equally well.
I had to read over this a few times. Then the lightbulb went off.

That's awesome!

:)
:)
FrancoisK
Posts: 80
Joined: Tue Jul 18, 2006 10:46 pm

Re: Crafty Transpostion Table Question

Post by FrancoisK »

In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty Transpostion Table Question

Post by bob »

FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.
FrancoisK
Posts: 80
Joined: Tue Jul 18, 2006 10:46 pm

Re: Crafty Transpostion Table Question

Post by FrancoisK »

bob wrote:
FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.
yes, that is indeed what I understood from your first post.
I am just saying that I chose (1) myself and that i am pretty sure the overhead you are talking about in your last post is completely negligible ("(1) introduces overhead for the validation procedure").
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty Transpostion Table Question

Post by bob »

FrancoisK wrote:
bob wrote:
FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.
yes, that is indeed what I understood from your first post.
I am just saying that I chose (1) myself and that i am pretty sure the overhead you are talking about in your last post is completely negligible ("(1) introduces overhead for the validation procedure").
To properly validate a move in Crafty, it was > 1% overhead, which is not something I would ignore. And when it provides a little more safety in hashing when doing SMP searches, it is even better. The "lockless hash" idea introduces one extra XOR on a store, and on a probe. That's cheaper than any sort of validation procedure I could imagine, by orders of magnitude
User avatar
phhnguyen
Posts: 1434
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Crafty Transpostion Table Question

Post by phhnguyen »

bob wrote: SImple. This is the "lockless hashing" idea for parallel search.

The issue is that different threads can store into the same hash table address at approximately the same time. There are two 64 bit values, the signature (A) and the score/etc (B). If two processors try to store at the same time, you want to get eihere A1,B1 or A2,B2 stored in the table, but due to timing you can get A1,B2 or A2, B1 which is a problem.

If, when you store A1, B1, you xor them to modify the stored signature, and when you probe that entry, you xor again to recover the original signature, you eliminate the possibility of getting a match on A1, but getting data B2 which would be wrong.

The lockless hashing paper is on my web page at www.cis.uab.edu, where you click on faculty then my name...

If you don't do this, you have to use a normal Lock()/Unlock() which is way excessive overhead due to all the hash storing and probing going on.

Hi Bob,

I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.

What is your opinion / experience? Many thanks.
Pham
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: Crafty Transpostion Table Question

Post by wgarvin »

phhnguyen wrote:I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.
The XOR trick is probably cheaper. Anyway, it is incredibly cheap, you won't notice any impact on performance at all.
User avatar
phhnguyen
Posts: 1434
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Crafty Transpostion Table Question

Post by phhnguyen »

wgarvin wrote:
phhnguyen wrote:I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.
The XOR trick is probably cheaper. Anyway, it is incredibly cheap, you won't notice any impact on performance at all.
Why cheaper?

If you set the hash key in the middle of hash data and using some "trick" of pointers, you can set, access that hash key it without any use of XOR nor shift.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty Transpostion Table Question

Post by bob »

phhnguyen wrote:
bob wrote: SImple. This is the "lockless hashing" idea for parallel search.

The issue is that different threads can store into the same hash table address at approximately the same time. There are two 64 bit values, the signature (A) and the score/etc (B). If two processors try to store at the same time, you want to get eihere A1,B1 or A2,B2 stored in the table, but due to timing you can get A1,B2 or A2, B1 which is a problem.

If, when you store A1, B1, you xor them to modify the stored signature, and when you probe that entry, you xor again to recover the original signature, you eliminate the possibility of getting a match on A1, but getting data B2 which would be wrong.

The lockless hashing paper is on my web page at www.cis.uab.edu, where you click on faculty then my name...

If you don't do this, you have to use a normal Lock()/Unlock() which is way excessive overhead due to all the hash storing and probing going on.

Hi Bob,

I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.

What is your opinion / experience? Many thanks.
Pham

In thinking about it, it is not really a given that this would work on the PC. Nothing to prevent one thread from writing the first 4 bytes, then the next 8, then the next 4, but after the first 4, that thread gets suspended for an instant, leaving wrong data combined with right data. The processor is certainly free to complete all writes done by the active process at any point in time after a context-switch, which could be an issue.

Good idea, and it might work most of the time.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Crafty Transpostion Table Question

Post by mcostalba »

bob wrote:
FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.
I guess that is clear for you that you have not answered the question.

Here the issue is hash key aliasing, nothing to do with SMP, it has to do with the fact that relation between a position and its hash key is not univoque.

Hope it is clear, in case is not I state more clearly: you cannot avoid checking for illegality even in single thread case as long as hash keys are not univocal to positions (and are not).

Of course you can say that is a very very very rare event, but this doesn't change the fact that you cannot skip legality check if you want to be sure that a crash will _never_ occur.