A1: With a mutex, the variables touched between the calls to lock and unlock may already have been updated by another thread before the next thread gets to use them. With this specific spinlock there is no such guarantee, so declare those variables either volatile or atomic.
A2: The spinlock is a bit faster, based on my experience with Hannibal's SMP.
With my volatile example, it might miss one read due to the race, but the next read will get the correct value once the write is done.
Aha! So while the write "is in progress", the old value can still be read by some CPUs while other CPUs already see the new value. Woohoo. So much for your original statement...
My original statement: ONCE the write is done, everybody gets the new value. That has not changed one scintilla.
That is a modified statement with exactly zero content. You now define "write is done" as "everybody sees the new value".
Originally we talked about this:
bob wrote:
syzygy wrote:
hgm wrote:Returning the local value in the core's private cache is always fine. If that wasn't the currently valid value (because it was changed in DRAM, a shared higher cache level or some other core's private cache), it would be no longer in your cache.
Already on x86 there is no general guarantee that the value read from cache is identical to the value stored in DRAM by another thread, especially if you consider multi-socket systems. What is guaranteed (on x86, not on other architectures) is that if CPU1 writes to A and then to B, and (some small time later) CPU2 reads B and then A, it will retrieve the new value from A if it retrieved the new value from B. But it is OK if it retrieves old (cached) values for both A and B or if it retrieves the old value for B and the new value for A.
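The A/B ordering guarantee described above can be expressed portably with C++11 atomics; a minimal sketch (the variable names A and B and the function names are mine, chosen to match the example):

```cpp
#include <atomic>
#include <thread>

std::atomic<int> A{0}, B{0};

void writer() {
    A.store(1, std::memory_order_relaxed);
    B.store(1, std::memory_order_release); // release: publishes the store to A
}

void reader(int* a_seen, int* b_seen) {
    *b_seen = B.load(std::memory_order_acquire); // acquire: pairs with the release store
    *a_seen = A.load(std::memory_order_relaxed);
}
```

If the reader sees the new value of B, the acquire/release pairing guarantees it also sees the new value of A; seeing old values for both, or old B with new A, remains allowed, exactly as described above.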
It is NOT ok to retrieve old values. The caches on Intel SPECIFICALLY prevent this by their snooping and inter-cache forwarding. Where is this stuff coming from? On Intel, the value you read will be the LAST value written by any other CPU. That's guaranteed.
What I described to HGM is completely correct. What you wrote is not. Period.
syzygy wrote:The best way to implement a spinlock in C++11 is probably to use std::atomic_flag type and the std::atomic_flag_test_and_set function. This would be completely portable.
edit: it seems Hannibal already uses C++11 atomics. Maybe the first engine to do that?
edit2: this is not an optimal implementation of spinlocks. If multiple processors are spinning on the same lock this will eat up a lot of bandwidth. I'll try to come up with something better later.
Last edited by syzygy on Wed Mar 26, 2014 12:09 pm, edited 2 times in total.
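For reference, a minimal C++11 spinlock along the lines syzygy describes (the class name Spinlock is mine; this is the simple version, with the bandwidth caveat from edit2 still applying):

```cpp
#include <atomic>

class Spinlock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // spin until test_and_set observes the flag clear;
        // acquire ordering makes the critical section visible after lock()
        while (flag.test_and_set(std::memory_order_acquire))
            ; // busy-wait
    }
    void unlock() {
        flag.clear(std::memory_order_release);
    }
};
```

std::atomic_flag is the one atomic type guaranteed lock-free by the standard, which is why it is the natural building block here.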
MyType my_var;
std::mutex mtx;
{
std::lock_guard<std::mutex> lk(mtx);
// arbitrary code reading/writing my_var
} // mutex will be unlocked by the destructor of lock_guard, even if your own code throws an exception
If you have a class that you want to access from multiple threads, just put the mutex as a class member and put a lock_guard in every public member function.
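A minimal sketch of that pattern (the Counter class is hypothetical, just to show the shape):

```cpp
#include <mutex>

// one mutex per object, a lock_guard at the top of every public member function
class Counter {
    mutable std::mutex mtx; // mutable so that const member functions can lock it
    int value = 0;
public:
    void increment() {
        std::lock_guard<std::mutex> lk(mtx);
        ++value;
    }
    int get() const {
        std::lock_guard<std::mutex> lk(mtx);
        return value;
    }
};
```

Note the mutable on the mutex: this is the same point discussed below, where locking in a const method is logically const even though the mutex itself is modified.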
If your type is trivially copyable, you can put it in a std::atomic variable and then you don't have to write the mutex and locking yourself.
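A sketch of the trivially-copyable case (the Point struct and the publish function are my own illustration):

```cpp
#include <atomic>

struct Point { int x, y; }; // trivially copyable, so std::atomic<Point> is allowed

std::atomic<Point> shared_point{Point{0, 0}};

void publish() {
    // the whole struct is stored as one unit: readers never see a torn half-write
    shared_point.store(Point{3, 4});
}
```

On x86-64 an 8-byte struct like this is typically lock-free; for larger types the implementation may fall back to an internal lock (check with is_lock_free()).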
btw. mutable is superfluous in this particular case, it just marks members that can be modified by const methods
i just forgot to type const. anyway, the intent is to derive classes from that, so when they lock/unlock it's const from a logical point of view, but not from a strict programming point of view, since the mutex of the base class is modified. that's exactly the right place to use mutable. lock() and unlock() should be const.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
syzygy wrote:The best way to implement a spinlock in C++11 is probably to use std::atomic_flag type and the std::atomic_flag_test_and_set function. This would be completely portable.
edit: it seems Hannibal already uses C++11 atomics. Maybe the first engine to do that?
It's in the dev version of Hannibal but not yet in the last public release. Only Komodo, Senpai and a version of SF are using C++11 that I'm aware of. I'm not sure if Komodo is using atomics. So maybe Hannibal is the first engine to use them.
Also I don't understand why the bool is declared atomic and not volatile instead.
Don't know about C++11, but in C++ a bool is probably atomic anyway, so what it needs is to be told that it is modifiable simultaneously from different threads.
Your implementation is the classic "shadow spin lock", which works just fine. I do exactly the same in Crafty (with one exception, in a moment). The Linux kernel also uses the same form as you do.
My only difference is that I do something like this using C pseudo-code:
while (xchg(lock, 1)) { while (lock) pause; }
The pause helps if you use hyper-threading, causing the physical core to switch to the other logical core rather than wasting time and interfering with the core that is actually doing real work.
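Bob's pseudo-code spelled out with C++11 atomics (the cpu_pause wrapper name is mine; _mm_pause is the x86 PAUSE intrinsic and is x86-only, hence the guarded fallback):

```cpp
#include <atomic>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64)
#include <immintrin.h>
static void cpu_pause() { _mm_pause(); } // x86 PAUSE instruction
#else
static void cpu_pause() {}               // fallback: plain busy-wait
#endif

std::atomic<int> lock_word{0};

void spin_lock() {
    // outer exchange attempts to grab the lock; the inner read-only loop
    // spins in the local cache without generating bus traffic
    while (lock_word.exchange(1, std::memory_order_acquire)) {
        while (lock_word.load(std::memory_order_relaxed))
            cpu_pause(); // yields execution resources to the sibling hyper-thread
    }
}

void spin_unlock() {
    lock_word.store(0, std::memory_order_release);
}
```

The inner load-only loop is the "shadow" part: spinning on a plain read instead of repeating the exchange keeps the cache line in shared state, which addresses the bandwidth concern syzygy raised about naive test-and-set loops.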