Crafty Transposition Table Question

bob · Post by **bob** » Fri Jun 04, 2010 2:49 pm

LiquidNitrogenOverclocker wrote:
bob wrote: Any of the above will work just fine, but (1) introduces overhead for the validation procedure, while 2 introduces tons of memory/cache issues as well as stalling processes when there really is no conflict. I just took the solution with the least impact on NPS since all work equally well.
I had to read over this a few times. Then the lightbulb went off.

That's awesome!

FrancoisK · Post by **FrancoisK** » Fri Jun 04, 2010 3:11 pm

In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
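For readers who want a concrete picture of option (1), here is a minimal sketch of one cheap way to validate a TT move: accept it only if it appears in the move list already generated for the current position. BugChess2's actual validation code is not shown in this thread, so the `Move` type and the function name `tt_move_in_list` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed move encoding (from/to/flags); not taken from any engine. */
typedef uint32_t Move;

/*
 * Returns 1 if tt_move appears in the list of moves generated for the
 * current position, 0 otherwise. A move that came from a colliding hash
 * entry will normally be absent from the list and is rejected before it
 * can be passed to Make/Unmake.
 */
static int tt_move_in_list(Move tt_move, const Move *moves, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (moves[i] == tt_move) {
            return 1;
        }
    }
    return 0;
}
```

This only helps when the move list is already available at the point of the probe; an engine that tries the hash move before generating any moves would need a dedicated legality test instead.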

bob · Post by **bob** » Fri Jun 04, 2010 3:24 pm

FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François

My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will result in a second king on the board). Rather than validating the move, I prevent the out-of-order stores caused by multiple threads from producing the problem in the first place.

FrancoisK · Post by **FrancoisK** » Fri Jun 04, 2010 3:36 pm

bob wrote:
FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will result in a second king on the board). Rather than validating the move, I prevent the out-of-order stores caused by multiple threads from producing the problem in the first place.

Yes, that is indeed what I understood from your first post.
I am just saying that I chose (1) myself and that I am pretty sure the overhead you mention in your last post is completely negligible ("(1) introduces overhead for the validation procedure").

bob · Post by **bob** » Fri Jun 04, 2010 5:17 pm

FrancoisK wrote:
bob wrote:
FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will result in a second king on the board). Rather than validating the move, I prevent the out-of-order stores caused by multiple threads from producing the problem in the first place.
Yes, that is indeed what I understood from your first post.
I am just saying that I chose (1) myself and that I am pretty sure the overhead you mention in your last post is completely negligible ("(1) introduces overhead for the validation procedure").

Properly validating a move in Crafty cost more than 1% overhead, which is not something I would ignore. And since the XOR approach also provides a little more safety in hashing during SMP searches, it is even better. The "lockless hash" idea adds one extra XOR on a store and one on a probe. That is cheaper than any validation procedure I could imagine, by orders of magnitude.
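The store and probe described in this thread can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Crafty's code: the entry layout, the table size, and the names `TTEntry`, `tt_store`, and `tt_probe` are all hypothetical, and the plain loads and stores shown here would likely be relaxed atomics in a modern multithreaded C program.

```c
#include <stdint.h>

/* Hypothetical two-word entry; field names are illustrative only. */
typedef struct {
    uint64_t key_xor_data; /* signature (A) XORed with the data word (B) */
    uint64_t data;         /* packed score, depth, best move, etc. (B)   */
} TTEntry;

#define TT_SIZE 1024u /* assumed power-of-two table size */
static TTEntry table[TT_SIZE];

/* Store: one extra XOR folds the data word into the stored signature. */
static void tt_store(uint64_t key, uint64_t data) {
    TTEntry *e = &table[key & (TT_SIZE - 1)];
    e->key_xor_data = key ^ data;
    e->data = data;
}

/*
 * Probe: one extra XOR recovers the signature. If the two words came from
 * different stores (A1 with B2), the recovered value will not equal the
 * probing key unless the mixed words happen to collide, so the entry is
 * treated as a miss.
 */
static int tt_probe(uint64_t key, uint64_t *data_out) {
    const TTEntry *e = &table[key & (TT_SIZE - 1)];
    uint64_t data = e->data;
    if ((e->key_xor_data ^ data) != key) {
        return 0; /* miss, or a torn entry */
    }
    *data_out = data;
    return 1;
}
```

A torn entry is therefore simply discarded as a miss; it costs one lost hit rather than a corrupted board.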

phhnguyen · Post by **phhnguyen** » Fri Jun 04, 2010 5:21 pm

bob wrote: Simple. This is the "lockless hashing" idea for parallel search.

The issue is that different threads can store into the same hash table address at approximately the same time. There are two 64-bit values, the signature (A) and the score/etc. (B). If two processors try to store at the same time, you want to get either A1,B1 or A2,B2 stored in the table, but due to timing you can get A1,B2 or A2,B1, which is a problem.

If, when you store A1, B1, you xor them to modify the stored signature, and when you probe that entry, you xor again to recover the original signature, you eliminate the possibility of getting a match on A1, but getting data B2 which would be wrong.

The lockless hashing paper is on my web page at www.cis.uab.edu, where you click on faculty then my name...

If you don't do this, you have to use a normal Lock()/Unlock() which is way excessive overhead due to all the hash storing and probing going on.

Hi Bob,

I am wondering whether there is another "lockless" solution: put the hash key in the middle of the hash data (e.g., 32-bit data + 64-bit hash key + 32-bit data). Any wrong pair of A and B would simply destroy the hash key, so we could avoid the XOR.

What is your opinion / experience? Many thanks.
Pham

wgarvin · Post by **wgarvin** » Fri Jun 04, 2010 5:28 pm

phhnguyen wrote: I am wondering whether there is another "lockless" solution: put the hash key in the middle of the hash data (e.g., 32-bit data + 64-bit hash key + 32-bit data). Any wrong pair of A and B would simply destroy the hash key, so we could avoid the XOR.

The XOR trick is probably cheaper. Anyway, it is incredibly cheap, you won't notice any impact on performance at all.

phhnguyen · Post by **phhnguyen** » Fri Jun 04, 2010 5:51 pm

wgarvin wrote:
phhnguyen wrote: I am wondering whether there is another "lockless" solution: put the hash key in the middle of the hash data (e.g., 32-bit data + 64-bit hash key + 32-bit data). Any wrong pair of A and B would simply destroy the hash key, so we could avoid the XOR.
The XOR trick is probably cheaper. Anyway, it is incredibly cheap, you won't notice any impact on performance at all.

Why cheaper?

If you put the hash key in the middle of the hash data and use some pointer "trick", you can set and access that hash key without any XOR or shift.

bob · Post by **bob** » Fri Jun 04, 2010 7:41 pm

phhnguyen wrote:
bob wrote: Simple. This is the "lockless hashing" idea for parallel search.

The issue is that different threads can store into the same hash table address at approximately the same time. There are two 64-bit values, the signature (A) and the score/etc. (B). If two processors try to store at the same time, you want to get either A1,B1 or A2,B2 stored in the table, but due to timing you can get A1,B2 or A2,B1, which is a problem.

If, when you store A1, B1, you xor them to modify the stored signature, and when you probe that entry, you xor again to recover the original signature, you eliminate the possibility of getting a match on A1, but getting data B2 which would be wrong.

The lockless hashing paper is on my web page at www.cis.uab.edu, where you click on faculty then my name...

If you don't do this, you have to use a normal Lock()/Unlock() which is way excessive overhead due to all the hash storing and probing going on.

Hi Bob,

I am wondering whether there is another "lockless" solution: put the hash key in the middle of the hash data (e.g., 32-bit data + 64-bit hash key + 32-bit data). Any wrong pair of A and B would simply destroy the hash key, so we could avoid the XOR.

What is your opinion / experience? Many thanks.
Pham

In thinking about it, it is not really a given that this would work on the PC. Nothing to prevent one thread from writing the first 4 bytes, then the next 8, then the next 4, but after the first 4, that thread gets suspended for an instant, leaving wrong data combined with right data. The processor is certainly free to complete all writes done by the active process at any point in time after a context-switch, which could be an issue.

Good idea, and it might work most of the time.
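The interleaving described above can be replayed deterministically to see why the key-in-the-middle layout is not guaranteed to catch a mixed entry. This is a single-threaded simulation, not a real race; the 16-byte layout and every name here (`MidKeyEntry`, `replay_interleaving`, and the accessors) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte entry: 32-bit data, 64-bit key, 32-bit data. */
typedef struct {
    unsigned char bytes[16];
} MidKeyEntry;

/* Each helper models one of the three separate writes a store performs. */
static void write_lo(MidKeyEntry *e, uint32_t lo)   { memcpy(e->bytes, &lo, 4); }
static void write_key(MidKeyEntry *e, uint64_t key) { memcpy(e->bytes + 4, &key, 8); }
static void write_hi(MidKeyEntry *e, uint32_t hi)   { memcpy(e->bytes + 12, &hi, 4); }

static uint32_t read_lo(const MidKeyEntry *e)  { uint32_t v; memcpy(&v, e->bytes, 4); return v; }
static uint64_t read_key(const MidKeyEntry *e) { uint64_t v; memcpy(&v, e->bytes + 4, 8); return v; }
static uint32_t read_hi(const MidKeyEntry *e)  { uint32_t v; memcpy(&v, e->bytes + 12, 4); return v; }

/*
 * Replays the schedule from the post: thread 1 writes its first 4 bytes and
 * is suspended, thread 2 stores a complete entry, then thread 1 resumes and
 * writes its remaining 12 bytes.
 */
static void replay_interleaving(MidKeyEntry *e,
                                uint32_t lo1, uint64_t key1, uint32_t hi1,
                                uint32_t lo2, uint64_t key2, uint32_t hi2) {
    write_lo(e, lo1);      /* thread 1, then suspended          */
    write_lo(e, lo2);      /* thread 2 stores its whole entry   */
    write_key(e, key2);
    write_hi(e, hi2);
    write_key(e, key1);    /* thread 1 resumes and finishes     */
    write_hi(e, hi1);
}
```

After this schedule the entry holds thread 1's key intact, but its first data word belongs to thread 2, so a probe with thread 1's key would match and read mixed data.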

mcostalba · Post by **mcostalba** » Fri Jun 04, 2010 8:09 pm

bob wrote:
FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.

François
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will result in a second king on the board). Rather than validating the move, I prevent the out-of-order stores caused by multiple threads from producing the problem in the first place.

I guess it is clear to you that you have not answered the question.

The issue here is hash key aliasing, which has nothing to do with SMP; it comes from the fact that the relation between a position and its hash key is not one-to-one.

I hope that is clear; in case it is not, I will state it more plainly: you cannot avoid checking for illegality, even in the single-threaded case, as long as hash keys do not map uniquely to positions (and they do not).

Of course you can say that it is a very, very, very rare event, but this doesn't change the fact that you cannot skip the legality check if you want to be sure that a crash will _never_ occur.
