lockless hashing

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: lockless hashing

Post by syzygy »

bob wrote:
syzygy wrote:
bob wrote:
syzygy wrote:
bob wrote:That's not the problem he is describing.
The point is that that problem does not exist in what Peter was describing, namely the use of C++11/C11 atomics to implement the xor-trick correctly.
It doesn't stop "false sharing".
Correctly implementing the xor-trick using C++11/C11 atomics has nothing to do with false sharing, so I think you are talking about something entirely different.

We're in this subthread: http://talkchess.com/forum/viewtopic.ph ... 73&t=60122
See, it's just about "lockless hashing" which is really the xor-trick. Nothing to do with Folkert's alternative approach of using mutexes which is discussed in a different subthread.
If you back up about 6 posts in this sub-thread, Martin was talking explicitly about false-sharing, which is what I commented on...
No, you commented on Peter's reply who explained to Martin that Martin misunderstood him ("There are no RMW operations involved though when using memory_order_relaxed.") and then went on to explain how he implemented the xor-trick. No false sharing there, duh.

Folkert: I don't like the xor-trick as its expects certain hardware behavior.
Peter: using C+11 atomics it is portable.
Folkert: but that comes with a performance cost
Peter: no it does not
Martin (mistakenly): false sharing
Peter: (you are mistaken) this is how xor-trick is implemented correctly
Bob: (did not properly read the above)
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: lockless hashing

Post by bob »

syzygy wrote:
bob wrote:
syzygy wrote:
bob wrote:
syzygy wrote:
bob wrote:That's not the problem he is describing.
The point is that that problem does not exist in what Peter was describing, namely the use of C++11/C11 atomics to implement the xor-trick correctly.
It doesn't stop "false sharing".
Correctly implementing the xor-trick using C++11/C11 atomics has nothing to do with false sharing, so I think you are talking about something entirely different.

We're in this subthread: http://talkchess.com/forum/viewtopic.ph ... 73&t=60122
See, it's just about "lockless hashing" which is really the xor-trick. Nothing to do with Folkert's alternative approach of using mutexes which is discussed in a different subthread.
If you back up about 6 posts in this sub-thread, Martin was talking explicitly about false-sharing, which is what I commented on...
No, you commented on Peter's reply who explained to Martin that Martin misunderstood him ("There are no RMW operations involved though when using memory_order_relaxed.") and then went on to explain how he implemented the xor-trick. No false sharing there, duh.

Folkert: I don't like the xor-trick as its expects certain hardware behavior.
Peter: using C+11 atomics it is portable.
Folkert: but that comes with a performance cost
Peter: no it does not
Martin (mistakenly): false sharing
Peter: (you are mistaken) this is how xor-trick is implemented correctly
Bob: (did not properly read the above)
You can't do a hash table without reading, modifying and writing. Simply not possible. Martin was pointing out that the reading and modifying is a BIG problem thanks to cache. I simply agreed and explained why.

The XOR trick is not a solution to cache invalidation any more than any other approach is. If you have more than one updater, you have the problem...

So perhaps I don't understand how you guys think the new C++ stuff will solve that problem, and why anyone would think that the lockless hash trick would solve it either. It solves the bad move from interleaved update problem, doesn't do a think for cache invalidation issues...

End of story for me?
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: lockless hashing

Post by syzygy »

bob wrote:
syzygy wrote:
bob wrote:
syzygy wrote:
bob wrote:
syzygy wrote:
bob wrote:That's not the problem he is describing.
The point is that that problem does not exist in what Peter was describing, namely the use of C++11/C11 atomics to implement the xor-trick correctly.
It doesn't stop "false sharing".
Correctly implementing the xor-trick using C++11/C11 atomics has nothing to do with false sharing, so I think you are talking about something entirely different.

We're in this subthread: http://talkchess.com/forum/viewtopic.ph ... 73&t=60122
See, it's just about "lockless hashing" which is really the xor-trick. Nothing to do with Folkert's alternative approach of using mutexes which is discussed in a different subthread.
If you back up about 6 posts in this sub-thread, Martin was talking explicitly about false-sharing, which is what I commented on...
No, you commented on Peter's reply who explained to Martin that Martin misunderstood him ("There are no RMW operations involved though when using memory_order_relaxed.") and then went on to explain how he implemented the xor-trick. No false sharing there, duh.

Folkert: I don't like the xor-trick as its expects certain hardware behavior.
Peter: using C+11 atomics it is portable.
Folkert: but that comes with a performance cost
Peter: no it does not
Martin (mistakenly): false sharing
Peter: (you are mistaken) this is how xor-trick is implemented correctly
Bob: (did not properly read the above)
You can't do a hash table without reading, modifying and writing. Simply not possible. Martin was pointing out that the reading and modifying is a BIG problem thanks to cache. I simply agreed and explained why.

The XOR trick is not a solution to cache invalidation any more than any other approach is. If you have more than one updater, you have the problem...

So perhaps I don't understand how you guys think the new C++ stuff will solve that problem, and why anyone would think that the lockless hash trick would solve it either. It solves the bad move from interleaved update problem, doesn't do a think for cache invalidation issues...

End of story for me?
Oh my....
Martin was talking about Folkert's 256 r/w locks which will obviously be quite heavily contested and which will suffer to an extreme degree of false sharing unless each lock is placed in its own cache line. Peter was talking about something entirely different.
What a waste again. Why don't you ever read instead of blurb?