A1: With a mutex, the variables touched between the calls to lock and unlock may already have been updated by another thread before the next thread gets to use them. With this specific spinlock there is no such guarantee, so declare those variables either volatile or atomic.
A2: The spinlock is a bit faster, based on my experience with Hannibal's SMP.
With my volatile example, it might miss one read due to the race, but the next read will get the correct value once the write is done.
Aha! So while the write "is in progress", the old value can still be read by some CPUs while other CPUs already see the new value. Woohoo. So much for your original statement...
My original statement: ONCE the write is done, everybody gets the new value. That has not changed one scintilla.
That is a modified statement with exactly zero content. You now define "write is done" as "everybody sees the new value".
Originally we talked about this:
bob wrote:
syzygy wrote:
hgm wrote:Returning the local value in the core's private cache is always fine. If that wasn't the currently valid value (because it was changed in DRAM, a shared higher cache level or some other core's private cache), it would be no longer in your cache.
Already on x86 there is no general guarantee that the value read from cache is identical to the value stored in DRAM by another thread, especially if you consider multi-socket systems. What is guaranteed (on x86, not on other architectures) is that if CPU1 writes to A and then to B, and (some small time later) CPU2 reads B and then A, it will retrieve the new value from A if it retrieved the new value from B. But it is OK if it retrieves old (cached) values for both A and B or if it retrieves the old value for B and the new value for A.
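The A/B ordering guarantee described above can be expressed portably with C++11 atomics; a minimal sketch (the variable names A and B and the function names are mine, chosen to match the example):

```cpp
#include <atomic>
#include <thread>

std::atomic<int> A{0}, B{0};

void writer() {
    A.store(1, std::memory_order_relaxed);
    B.store(1, std::memory_order_release); // release: publishes the store to A
}

void reader(int* a_seen, int* b_seen) {
    *b_seen = B.load(std::memory_order_acquire); // acquire: pairs with the release store
    *a_seen = A.load(std::memory_order_relaxed);
}
```

If the reader sees the new value of B, the acquire/release pairing guarantees it also sees the new value of A; seeing old values for both, or old B with new A, remains allowed, exactly as described above.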
It is NOT ok to retrieve old values. The caches on Intel SPECIFICALLY prevent this by their snooping and inter-cache forwarding. Where is this stuff coming from? On Intel, the value you read will be the LAST value written by any other CPU. That's guaranteed.
What I described to HGM is completely correct. What you wrote is not. Period.
syzygy wrote:The best way to implement a spinlock in C++11 is probably to use std::atomic_flag type and the std::atomic_flag_test_and_set function. This would be completely portable.
edit: it seems Hannibal already uses C++11 atomics. Maybe the first engine to do that?
edit2: this is not an optimal implementation of spinlocks. If multiple processors are spinning on the same lock this will eat up a lot of bandwidth. I'll try to come up with something better later.
Last edited by syzygy on Wed Mar 26, 2014 12:09 pm, edited 2 times in total.
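For reference, a minimal C++11 spinlock along the lines syzygy describes (the class name Spinlock is mine; this is the simple version, with the bandwidth caveat from edit2 still applying):

```cpp
#include <atomic>

class Spinlock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // spin until test_and_set observes the flag clear;
        // acquire ordering makes the critical section visible after lock()
        while (flag.test_and_set(std::memory_order_acquire))
            ; // busy-wait
    }
    void unlock() {
        flag.clear(std::memory_order_release);
    }
};
```

std::atomic_flag is the one atomic type guaranteed lock-free by the standard, which is why it is the natural building block here.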
MyType my_var;
std::mutex mtx;
{
std::lock_guard<std::mutex> lk(mtx);
// arbitrary code reading/writing my_var
} // mutex will be unlocked by the destructor of lock_guard, even if your own code throws an exception
If you have a class that you want to access from multiple threads, just put the mutex as a class member and put a lock_guard in every public member function.
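A minimal sketch of that pattern (the Counter class is hypothetical, just to show the shape):

```cpp
#include <mutex>

// one mutex per object, a lock_guard at the top of every public member function
class Counter {
    mutable std::mutex mtx; // mutable so that const member functions can lock it
    int value = 0;
public:
    void increment() {
        std::lock_guard<std::mutex> lk(mtx);
        ++value;
    }
    int get() const {
        std::lock_guard<std::mutex> lk(mtx);
        return value;
    }
};
```

Note the mutable on the mutex: this is the same point discussed below, where locking in a const method is logically const even though the mutex itself is modified.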
If your type is trivially copyable, you can put it in a std::atomic variable and then you don't have to write the mutex and locking yourself.
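A sketch of the trivially-copyable case (the Point struct and the publish function are my own illustration):

```cpp
#include <atomic>

struct Point { int x, y; }; // trivially copyable, so std::atomic<Point> is allowed

std::atomic<Point> shared_point{Point{0, 0}};

void publish() {
    // the whole struct is stored as one unit: readers never see a torn half-write
    shared_point.store(Point{3, 4});
}
```

On x86-64 an 8-byte struct like this is typically lock-free; for larger types the implementation may fall back to an internal lock (check with is_lock_free()).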
btw. mutable is superfluous in this particular case, it just marks members that can be modified by const methods
i just forgot to type const. anyway, the intent is to derive classes from that, so when they lock/unlock it's const from a logical point of view, but not from a strict programming point of view, since the mutex of the base class is modified. that's exactly the right place to use mutable. lock() and unlock() should be const.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
syzygy wrote:The best way to implement a spinlock in C++11 is probably to use std::atomic_flag type and the std::atomic_flag_test_and_set function. This would be completely portable.
edit: it seems Hannibal already uses C++11 atomics. Maybe the first engine to do that?
It's in the dev version of Hannibal but not yet in the last public release. Only Komodo, Senpai and a version of SF are using C++11 that I'm aware of. I'm not sure if Komodo is using atomics. So maybe Hannibal is the first engine to use them.
Also I don't understand why the bool is declared atomic and not volatile instead.
Don't know about C++11, but in C++ a bool is probably atomic anyway, so what it needs is to be told that it is modifiable simultaneously from different threads.
Your implementation is the classic "shadow spin lock", which works just fine. I do exactly the same in Crafty (with one exception, in a moment). The Linux kernel also uses the same form as you do.
My only difference is that I do something like this using C pseudo-code:
while (xchg(lock, 1)) { while (lock) pause; }
The pause helps if you use hyper-threading, causing the physical core to switch to the other logical core rather than wasting time and interfering with the core that is actually doing real work.
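Bob's pseudo-code spelled out with C++11 atomics (the cpu_pause wrapper name is mine; _mm_pause is the x86 PAUSE intrinsic and is x86-only, hence the guarded fallback):

```cpp
#include <atomic>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64)
#include <immintrin.h>
static void cpu_pause() { _mm_pause(); } // x86 PAUSE instruction
#else
static void cpu_pause() {}               // fallback: plain busy-wait
#endif

std::atomic<int> lock_word{0};

void spin_lock() {
    // outer exchange attempts to grab the lock; the inner read-only loop
    // spins in the local cache without generating bus traffic
    while (lock_word.exchange(1, std::memory_order_acquire)) {
        while (lock_word.load(std::memory_order_relaxed))
            cpu_pause(); // yields execution resources to the sibling hyper-thread
    }
}

void spin_unlock() {
    lock_word.store(0, std::memory_order_release);
}
```

The inner load-only loop is the "shadow" part: spinning on a plain read instead of repeating the exchange keeps the cache line in shared state, which addresses the bandwidth concern syzygy raised about naive test-and-set loops.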