LiquidNitrogenOverclocker wrote:I had to read over this a few times. Then the lightbulb went off.bob wrote: Any of the above will work just fine, but (1) introduces overhead for the validation procedure, while 2 introduces tons of memory/cache issues as well as stalling processes when there really is no conflict. I just took the solution with the least impact on NPS since all work equally well.
That's awesome!
Crafty Transpostion Table Question
Moderators: hgm, Rebel, chrisw
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Crafty Transpostion Table Question
-
- Posts: 80
- Joined: Tue Jul 18, 2006 10:46 pm
Re: Crafty Transpostion Table Question
In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.
François
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.
François
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Crafty Transpostion Table Question
My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.
François
-
- Posts: 80
- Joined: Tue Jul 18, 2006 10:46 pm
Re: Crafty Transpostion Table Question
yes, that is indeed what I understood from your first post.bob wrote:My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.
François
I am just saying that I chose (1) myself and that i am pretty sure the overhead you are talking about in your last post is completely negligible ("(1) introduces overhead for the validation procedure").
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Crafty Transpostion Table Question
To properly validate a move in Crafty, it was > 1% overhead, which is not something I would ignore. And when it provides a little more safety in hashing when doing SMP searches, it is even better. The "lockless hash" idea introduces one extra XOR on a store, and on a probe. That's cheaper than any sort of validation procedure I could imagine, by orders of magnitudeFrancoisK wrote:yes, that is indeed what I understood from your first post.bob wrote:My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.
François
I am just saying that I chose (1) myself and that i am pretty sure the overhead you are talking about in your last post is completely negligible ("(1) introduces overhead for the validation procedure").
-
- Posts: 1437
- Joined: Wed Apr 21, 2010 4:58 am
- Location: Australia
- Full name: Nguyen Hong Pham
Re: Crafty Transpostion Table Question
bob wrote: SImple. This is the "lockless hashing" idea for parallel search.
The issue is that different threads can store into the same hash table address at approximately the same time. There are two 64 bit values, the signature (A) and the score/etc (B). If two processors try to store at the same time, you want to get eihere A1,B1 or A2,B2 stored in the table, but due to timing you can get A1,B2 or A2, B1 which is a problem.
If, when you store A1, B1, you xor them to modify the stored signature, and when you probe that entry, you xor again to recover the original signature, you eliminate the possibility of getting a match on A1, but getting data B2 which would be wrong.
The lockless hashing paper is on my web page at www.cis.uab.edu, where you click on faculty then my name...
If you don't do this, you have to use a normal Lock()/Unlock() which is way excessive overhead due to all the hash storing and probing going on.
Hi Bob,
I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.
What is your opinion / experience? Many thanks.
Pham
-
- Posts: 838
- Joined: Thu Jul 05, 2007 5:03 pm
- Location: British Columbia, Canada
Re: Crafty Transpostion Table Question
The XOR trick is probably cheaper. Anyway, it is incredibly cheap, you won't notice any impact on performance at all.phhnguyen wrote:I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.
-
- Posts: 1437
- Joined: Wed Apr 21, 2010 4:58 am
- Location: Australia
- Full name: Nguyen Hong Pham
Re: Crafty Transpostion Table Question
Why cheaper?wgarvin wrote:The XOR trick is probably cheaper. Anyway, it is incredibly cheap, you won't notice any impact on performance at all.phhnguyen wrote:I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.
If you set the hash key in the middle of hash data and using some "trick" of pointers, you can set, access that hash key it without any use of XOR nor shift.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Crafty Transpostion Table Question
phhnguyen wrote:bob wrote: SImple. This is the "lockless hashing" idea for parallel search.
The issue is that different threads can store into the same hash table address at approximately the same time. There are two 64 bit values, the signature (A) and the score/etc (B). If two processors try to store at the same time, you want to get eihere A1,B1 or A2,B2 stored in the table, but due to timing you can get A1,B2 or A2, B1 which is a problem.
If, when you store A1, B1, you xor them to modify the stored signature, and when you probe that entry, you xor again to recover the original signature, you eliminate the possibility of getting a match on A1, but getting data B2 which would be wrong.
The lockless hashing paper is on my web page at www.cis.uab.edu, where you click on faculty then my name...
If you don't do this, you have to use a normal Lock()/Unlock() which is way excessive overhead due to all the hash storing and probing going on.
Hi Bob,
I am wondering if it is another solution for "lockless" if we set the hash key in the middle of the hash data (e. g: 32 bit data + 64 bit hash key + 32 bit data). Any wrong pair of A and B will simply destroy the hash key and we can avoid using the Xor.
What is your opinion / experience? Many thanks.
Pham
In thinking about it, it is not really a given that this would work on the PC. Nothing to prevent one thread from writing the first 4 bytes, then the next 8, then the next 4, but after the first 4, that thread gets suspended for an instant, leaving wrong data combined with right data. The processor is certainly free to complete all writes done by the active process at any point in time after a context-switch, which could be an issue.
Good idea, and it might work most of the time.
-
- Posts: 2684
- Joined: Sat Jun 14, 2008 9:17 pm
Re: Crafty Transpostion Table Question
I guess that is clear for you that you have not answered the question.bob wrote:My issue is the "best move". I use Make/Unmake, which means an illegal move can wreck the chessboard (for example, the move O-O after the king has moved elsewhere will results in a second king on the board. Rather than validating the move, I prevent the out-of-order-stores caused by multiple threads from producing the problem in the first place.FrancoisK wrote:In BugChess2 I validate the move after each TT probe on a single thread (no SMP yet) to protect the program against a very rare crash due to hash key collision.
The validating code is really straightforward and I am pretty sure it has a completely negligible impact on performance. I can check if you are interested.
François
Here the issue is hash key aliasing, nothing to do with SMP, it has to do with the fact that relation between a position and its hash key is not univoque.
Hope it is clear, in case is not I state more clearly: you cannot avoid checking for illegality even in single thread case as long as hash keys are not univocal to positions (and are not).
Of course you can say that is a very very very rare event, but this doesn't change the fact that you cannot skip legality check if you want to be sure that a crash will _never_ occur.