volatile?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Edsel Apostol
Posts: 803
Joined: Mon Jul 17, 2006 5:53 am
Full name: Edsel Apostol

Re: volatile?

Post by Edsel Apostol »

lucasart wrote:Consider these two implementation of class Lockable, a base class to any kind of global data that must be accessed under lock:

version 1: std::mutex

Code: Select all

class Lockable {
    mutable std::mutex mtx;
public:
    void lock() { mtx.lock(); }
    void unlock() { mtx.unlock(); }
};
version 2: std::atomic

Code: Select all

class Lockable {
    mutable std&#58;&#58;atomic<bool> busy;
public&#58;
    Lockable&#40;)&#58; busy&#40;false&#41; &#123;&#125;
    void lock&#40;) &#123;
        while &#40;busy&#41;;
        busy = true;
    &#125;
    void unlock&#40;) &#123; busy = false; &#125;
&#125;;
Q1: are these functionally equivalent?
Q2: is the second version likely to be faster?
For version 2 you can probably use this one (from Hannibal):

Code: Select all

class Spinlock &#123;
public&#58;
    Spinlock&#40;) &#58; m_Lock&#40;false&#41; &#123; &#125;
    void lock&#40;) &#123; while &#40;m_Lock.exchange&#40;true&#41;); &#125;
    void unlock&#40;) &#123; m_Lock.store&#40;false&#41;; &#125;
private&#58;
    std&#58;&#58;atomic<bool> m_Lock;
&#125;;
A1: In the case of mutex the variables between the call to lock and unlock may already be updated by the latest thread before the next thread could use them. In the case of this specific Spinlock, there is no guarantee, so either declare those variables volatile or atomic.

A2. Spinlock is a bit faster based on my experience with Hannibal SMP.

Another version of Spinlock from Hannibal:

Code: Select all

class Spinlock &#123;
public&#58;
    Spinlock&#40;) &#123; m_Lock.clear&#40;std&#58;&#58;memory_order_release&#41;; &#125;
    void lock&#40;) &#123; while &#40;m_Lock.test_and_set&#40;std&#58;&#58;memory_order_acquire&#41;); &#125;
    void unlock&#40;) &#123; m_Lock.clear&#40;std&#58;&#58;memory_order_release&#41;; &#125;
private&#58;
    std&#58;&#58;atomic_flag m_Lock;
&#125;;
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: volatile?

Post by syzygy »

bob wrote:
syzygy wrote:
With my volatile example, it might miss one read due to the race, the next read will get the correct value. Once the write is done.
Aha! So while the write "is in progress" the old value can stil be read by some CPUs while other CPUs already see the new value. Woohoo. So much for your original statement..
My original statement: ONCE the write is done, everybody gets new value. That has not changed one scintilla.
That is a modified statement with exactly zero content. You now define "write is done" as "everybody sees the new value".

Originally we talked about this:
bob wrote:
syzygy wrote:
hgm wrote:Returning the local value in the core's private cache is always fine. If that wasn't the currently valid value (because it was changed in DRAM, a shared higher cache level or some other core's private cache), it would be no longer in your cache.
Already on x86 there is no general guarantee that the value read from cache is identical to the value stored in DRAM by another thread, especially if you consider multi-socket systems. What is guaranteed (on x86, not on other architectures) is that if CPU1 writes to A and then to B, and (some small time later) CPU2 reads B and then A, it will retrieve the new value from A if it retrieved the new value from B. But it is OK if it retrieves old (cached) values for both A and B or if it retrieves the old value for B and the new value for A.
It is NOT ok to retrieve old values. The caches on Intel SPECIFICALLY prevent this by their snooping and inter-cache forwarding. Where is this stuff coming from? On Intel, the value you read will be the LAST value written by any other CPU. That's guaranteed.
What I described to HGM is completely correct. What you wrote is not. Period.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: volatile?

Post by lucasart »

Edsel Apostol wrote:
lucasart wrote:Consider these two implementation of class Lockable, a base class to any kind of global data that must be accessed under lock:

version 1: std::mutex

Code: Select all

class Lockable &#123;
    mutable std&#58;&#58;mutex mtx;
public&#58;
    void lock&#40;) &#123; mtx.lock&#40;); &#125;
    void unlock&#40;) &#123; mtx.unlock&#40;); &#125;
&#125;;
version 2: std::atomic

Code: Select all

class Lockable &#123;
    mutable std&#58;&#58;atomic<bool> busy;
public&#58;
    Lockable&#40;)&#58; busy&#40;false&#41; &#123;&#125;
    void lock&#40;) &#123;
        while &#40;busy&#41;;
        busy = true;
    &#125;
    void unlock&#40;) &#123; busy = false; &#125;
&#125;;
Q1: are these functionally equivalent?
Q2: is the second version likely to be faster?
For version 2 you can probably use this one (from Hannibal):

Code: Select all

class Spinlock &#123;
public&#58;
    Spinlock&#40;) &#58; m_Lock&#40;false&#41; &#123; &#125;
    void lock&#40;) &#123; while &#40;m_Lock.exchange&#40;true&#41;); &#125;
    void unlock&#40;) &#123; m_Lock.store&#40;false&#41;; &#125;
private&#58;
    std&#58;&#58;atomic<bool> m_Lock;
&#125;;
A1: In the case of mutex the variables between the call to lock and unlock may already be updated by the latest thread before the next thread could use them. In the case of this specific Spinlock, there is no guarantee, so either declare those variables volatile or atomic.

A2. Spinlock is a bit faster based on my experience with Hannibal SMP.

Another version of Spinlock from Hannibal:

Code: Select all

class Spinlock &#123;
public&#58;
    Spinlock&#40;) &#123; m_Lock.clear&#40;std&#58;&#58;memory_order_release&#41;; &#125;
    void lock&#40;) &#123; while &#40;m_Lock.test_and_set&#40;std&#58;&#58;memory_order_acquire&#41;); &#125;
    void unlock&#40;) &#123; m_Lock.clear&#40;std&#58;&#58;memory_order_release&#41;; &#125;
private&#58;
    std&#58;&#58;atomic_flag m_Lock;
&#125;;
Thanks! Exactly what I was looking for.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: volatile?

Post by syzygy »

syzygy wrote:The best way to implement a spinlock in C++11 is probably to use std::atomic_flag type and the std::atomic_flag_test_and_set function. This would be completely portable.
http://anki3d.org/spinlock/

Code: Select all

#include <atomic>
 
class SpinLock
&#123;
public&#58;
    void lock&#40;)
    &#123;
        while&#40;lck.test_and_set&#40;std&#58;&#58;memory_order_acquire&#41;)
        &#123;&#125;
    &#125;
 
    void unlock&#40;)
    &#123;
        lck.clear&#40;std&#58;&#58;memory_order_release&#41;;
    &#125;
 
private&#58;
    std&#58;&#58;atomic_flag lck = ATOMIC_FLAG_INIT;   
&#125;;
edit: it seems Hannibal already uses C++11 atomics. Maybe the first engine to do that?
edit2: this is not an optimal implementation of spinlocks. If multiple processors are spinning on the same lock this will eat up a lot of bandwidth. I'll try to come up with something better later.
Last edited by syzygy on Wed Mar 26, 2014 12:09 pm, edited 2 times in total.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: volatile?

Post by mar »

lucasart wrote:

Code: Select all

class Lockable &#123;
    mutable std&#58;&#58;mutex mtx;
public&#58;
    void lock&#40;) &#123; mtx.lock&#40;); &#125;
    void unlock&#40;) &#123; mtx.unlock&#40;); &#125;
&#125;;
btw. mutable is superfluous in this particular case, it just marks members that can be modified by const methods
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: volatile?

Post by Rein Halbersma »

lucasart wrote:Consider these two implementation of class Lockable, a base class to any kind of global data that must be accessed under lock:

version 1: std::mutex

Code: Select all

class Lockable &#123;
    mutable std&#58;&#58;mutex mtx;
public&#58;
    void lock&#40;) &#123; mtx.lock&#40;); &#125;
    void unlock&#40;) &#123; mtx.unlock&#40;); &#125;
&#125;;
version 2: std::atomic

Code: Select all

class Lockable &#123;
    mutable std&#58;&#58;atomic<bool> busy;
public&#58;
    Lockable&#40;)&#58; busy&#40;false&#41; &#123;&#125;
    void lock&#40;) &#123;
        while &#40;busy&#41;;
        busy = true;
    &#125;
    void unlock&#40;) &#123; busy = false; &#125;
&#125;;
Q1: are these functionally equivalent?
Q2: is the second version likely to be faster?
You should be using std::lock_guard because that makes your code exception-safe.

Code: Select all

    MyType my_var;
    std&#58;&#58;mutex mtx;
    &#123;
        std&#58;&#58;lock_guard<std&#58;&#58;mutex> lk&#40;mtx_);
        // aribtary code reading/writing my_var;
    &#125; // mutex will be unlocked by destructor of lock_guard, even if your own code throws an exception
If you have class that you want to access from multiple threads, just put the mutex as a class member and put the lock_guard in every public member function.

If your type is trivially copyable, you can put it in a std::atomic variable and then you don't have to write the mutex and locking yourself.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: volatile?

Post by lucasart »

mar wrote:
lucasart wrote:

Code: Select all

class Lockable &#123;
    mutable std&#58;&#58;mutex mtx;
public&#58;
    void lock&#40;) &#123; mtx.lock&#40;); &#125;
    void unlock&#40;) &#123; mtx.unlock&#40;); &#125;
&#125;;
btw. mutable is superfluous in this particular case, it just marks members that can be modified by const methods
i just forgot to type const. anyway, the intent is to derive classes from that, so when they lock/unlock it's const from a logical point of view, but not strictly from a programming pt of view since the mutex of the base class is modified. that's exactly where it's the right place to use mutable. lock() and unlock() should be const.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Edsel Apostol
Posts: 803
Joined: Mon Jul 17, 2006 5:53 am
Full name: Edsel Apostol

Re: volatile?

Post by Edsel Apostol »

syzygy wrote:
syzygy wrote:The best way to implement a spinlock in C++11 is probably to use std::atomic_flag type and the std::atomic_flag_test_and_set function. This would be completely portable.
http://anki3d.org/spinlock/

Code: Select all

#include <atomic>
 
class SpinLock
&#123;
public&#58;
    void lock&#40;)
    &#123;
        while&#40;lck.test_and_set&#40;std&#58;&#58;memory_order_acquire&#41;)
        &#123;&#125;
    &#125;
 
    void unlock&#40;)
    &#123;
        lck.clear&#40;std&#58;&#58;memory_order_release&#41;;
    &#125;
 
private&#58;
    std&#58;&#58;atomic_flag lck = ATOMIC_FLAG_INIT;   
&#125;;
edit: it seems Hannibal already uses C++11 atomics. Maybe the first engine to do that?
It's in the dev version of Hannibal but not yet in the last public release. Only Komodo, Senpai and a version of SF is using C++11 that I'm aware of. I'm not sure if Komodo is using atomics. So maybe Hannibal is the first engine to use that.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: volatile?

Post by bob »

Daniel Shawul wrote:There are test_ant_set atomic intrinsics to achieve just that. In windows I use the below to get spinlocks

Code: Select all

#             define l_lock&#40;x&#41;     while&#40;InterlockedExchange&#40;&#40;LPLONG&#41;&&#40;x&#41;,1&#41; != 0&#41; &#123;while&#40;&#40;x&#41; != 0&#41;;&#125;
Also I don't understand why the bool is declared atomic and not volatile instead.
Don't know about C++11 but in C++ bool is probably atomic anyway so what it needs is to be told that it is modifiable simultaneously from different threads.
Your implementation is the classic "shadow spin lock" which works just fine. I do exactly the same in Crafty (with one exception in a moment). The linux kernel uses the same form as you do also.

My only difference is that I do something like this using C pseudo-code:

while (xchg(lock,1)) {while(lock) pause;}

The pause helps if you use hyper-threading, causing the physical core to switch to the other logical core rather than wasting time and interfering with the core that is actually doing real work.
User avatar
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: volatile?

Post by rvida »

Edsel Apostol wrote:Only Komodo, Senpai and a version of SF is using C++11 that I'm aware of.
Critter uses anonymous unions and typed enums, so I guess that makes it somewhat C++11 (-ish) too.