volatile?

hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: volatile?

Post by hgm »

syzygy wrote:For multithreaded programs using volatile is not necessary if you are properly using the synchronisation primitives of a thread library, e.g. pthreads or C++11 threads.
I don't get this, but that is perhaps because I never used synchronization primitives of a thread library. If I write

while(busy && notReady) DoSomething(&notReady); 
and the compiler does not know that 'busy' is volatile... what is to stop it from optimizing it to

if(busy) while(notReady) DoSomething(&notReady);
?

Is it then improper use to do this without locking? This is a single atomic read, so locking the access to it seems very wasteful.
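
A minimal sketch of one way to poll the flag without volatile and without a lock, assuming a C11 compiler with <stdatomic.h>. The body of DoSomething() and the helper run_until_done() are made up for illustration; only 'busy' and 'notReady' come from the snippet above. Making 'busy' an atomic object means the compiler is expected to perform the load on every iteration, and memory_order_relaxed asks for no ordering beyond that, so on x86 this should compile to an ordinary load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared flag; another thread would store false to request a stop. */
static atomic_bool busy = true;

/* Hypothetical work counter, only here so the sketch has something to do. */
static int units_done = 0;

/* Hypothetical stand-in for DoSomething(): does one unit of work and
   clears *notReady after the third unit. */
static void DoSomething(bool *notReady)
{
    assert(notReady != NULL);
    if (++units_done >= 3)
        *notReady = false;
}

/* Same loop as above, but 'busy' is read through an atomic load, which the
   compiler is expected to repeat on every iteration. Returns units done. */
static int run_until_done(void)
{
    bool notReady = true;
    while (atomic_load_explicit(&busy, memory_order_relaxed) && notReady)
        DoSomething(&notReady);
    return units_done;
}
```

The thread that wants to stop the loop would call atomic_store(&busy, false); no mutex is involved on either side.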
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: volatile?

Post by syzygy »

bob wrote:Sorry, but the above is wrong.
Oh boy, there we go again...
The above will fail because variable was not changed in this code between the first reference and the second. Another thread COULD have acquired the lock, modified variable, and then released the lock. When you access variable a second time, the compiler is free to use the original value (now stored in i) rather than re-loading it again, even though the other thread might have modified it inside the locks.
Wrong. If "lock" is based on pthreads primitives or a similar library, which is the case I am talking about, then "lock" acts as a memory barrier. The compiler will reload "variable". No need for making it volatile.

"The above" is right.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: volatile?

Post by syzygy »

bob wrote:The code looks ugly to me and is probably not safe.
If you had just read:
syzygy wrote:It seems splitPoint->moveCount is always accessed under lock protection (except for an assert), so it seems it could be made non-volatile.
Please spare us your confused "contributions" if you can't take the time to read first.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: volatile?

Post by bob »

syzygy wrote:
bob wrote:Sorry, but the above is wrong.
Oh boy, there we go again...
The above will fail because variable was not changed in this code between the first reference and the second. Another thread COULD have acquired the lock, modified variable, and then released the lock. When you access variable a second time, the compiler is free to use the original value (now stored in i) rather than re-loading it again, even though the other thread might have modified it inside the locks.
Wrong. If "lock" is based on pthreads primitives or a similar library, which is the case I am talking about, then "lock" acts as a memory barrier. The compiler will reload "variable". No need for making it volatile.

"The above" is right.
Not in my pthreads library. Lock just acquires the lock or blocks. We can compare library code if you wish.

BTW the lock I use is the same lock the linux kernel guys use, for reference, except they don't do the "pause" instruction. I don't use the pthreads locks because of the unnecessary blocking overhead.

BTW last time I looked the barrier/fence stuff was done INSIDE the CPU. Doesn't do a THING for a compiler that loads a value way up at the top, and then uses that value much later in execution.

So perhaps we are talking apples and oranges. Volatile is critical to make this work right at the software level. Locks are critical to make it work right at the hardware level. The two are not interchangeable and one can't stand in for the other, both are required.
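
To make the kind of lock being described concrete, here is a rough sketch of a test-and-test-and-set spinlock with a "pause" in the wait loop, written with C11 atomics. This is not the actual engine or Linux kernel code; spinlock_t, cpu_relax(), spin_lock(), spin_unlock() and locked_increment() are names invented for the example, and the inline asm assumes GCC or Clang on x86. In this C11 formulation the acquire and release orderings apply to the compiler as well as to the CPU, which is why the counter below is a plain int.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } spinlock_t;

/* Spin-wait hint; emits "pause" on x86 with GCC/Clang, nothing elsewhere. */
static inline void cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ __volatile__("pause");
#endif
}

static void spin_lock(spinlock_t *l)
{
    assert(l != NULL);
    /* Try to grab the lock; on failure spin on plain loads until it looks free. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire)) {
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            cpu_relax();
    }
}

static void spin_unlock(spinlock_t *l)
{
    assert(l != NULL);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

static spinlock_t counter_lock = { 0 };
static int counter = 0;   /* not volatile; only touched with the lock held */

/* Increments the shared counter under the lock and returns the new value. */
static int locked_increment(void)
{
    spin_lock(&counter_lock);
    int value = ++counter;
    spin_unlock(&counter_lock);
    return value;
}
```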
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: volatile?

Post by syzygy »

hgm wrote:
syzygy wrote:For multithreaded programs using volatile is not necessary if you are properly using the synchronisation primitives of a thread library, e.g. pthreads or C++11 threads.
I don't get this, but that is perhaps because I never used synchronization primitives of a thread library. If I write

while(busy && notReady) DoSomething(&notReady); 
and the compiler does not know that 'busy' is volatile... what is to stop it from optimizing it to

if(busy) while(notReady) DoSomething(&notReady);
?

Is it then improper use to do this without locking? This is a single atomic read, so locking the access to it seems very wasteful.
If you want to adhere to the "rules" of C+pthreads, you need to lock here. Or better, replace the whole thing by a synchronisation primitive provided by pthreads.

You can build your own, as you do here, and then volatile can be useful. But you'll have to trust your compiler to not do strange things. See the threads of a while back for what strange things compilers are allowed to do. (This means the implementor of a pthreads-like library needs to work together with the implementor of the compiler to make sure the combination works correctly.)
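
As a sketch of what replacing the flag polling by a pthreads primitive could look like when the waiting thread has nothing else to do: a condition variable guarding a plain bool. All names here (state_lock, state_changed, stop_waiting, wait_until_idle, finisher, run_handshake) are invented for the example, and it assumes POSIX threads are available. Since 'busy' is only read and written with the mutex held (or after pthread_join), it is not declared volatile.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_changed = PTHREAD_COND_INITIALIZER;
static bool busy = true;   /* plain bool, protected by state_lock */

/* Called by the thread that finishes the work. */
static void stop_waiting(void)
{
    pthread_mutex_lock(&state_lock);
    busy = false;
    pthread_cond_signal(&state_changed);
    pthread_mutex_unlock(&state_lock);
}

/* Blocks until 'busy' becomes false; the loop handles spurious wakeups. */
static void wait_until_idle(void)
{
    pthread_mutex_lock(&state_lock);
    while (busy)
        pthread_cond_wait(&state_changed, &state_lock);
    pthread_mutex_unlock(&state_lock);
}

/* Thread entry point for the demo. */
static void *finisher(void *unused)
{
    (void)unused;
    stop_waiting();
    return NULL;
}

/* Starts the finisher thread, waits for it, and returns the final flag. */
static bool run_handshake(void)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, finisher, NULL);
    assert(rc == 0);
    wait_until_idle();
    pthread_join(t, NULL);
    return busy;
}
```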
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: volatile?

Post by bob »

syzygy wrote:
bob wrote:The code looks ugly to me and is probably not safe.
If you had just read:
syzygy wrote:It seems splitPoint->moveCount is always accessed under lock protection (except for an assert), so it seems it could be made non-volatile.
Please spare us your confused "contributions" if you can't take the time to read first.
Please read what I wrote. Locks do NOT avoid the need for volatile. Not now, not ever. I might not be able to acquire the lock if someone else has it, but I can certainly keep a cached value of the variable for a long time, which can certainly be a problem...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: volatile?

Post by bob »

hgm wrote:
syzygy wrote:For multithreaded programs using volatile is not necessary if you are properly using the synchronisation primitives of a thread library, e.g. pthreads or C++11 threads.
I don't get this, but that is perhaps because I never used synchronization primitives of a thread library. If I write

while(busy && notReady) DoSomething(&notReady); 
and the compiler does not know that 'busy' is volatile... what is to stop it from optimizing it to

if(busy) while(notReady) DoSomething(&notReady);
?

Is it then improper use to do this without locking? This is a single atomic read, so locking the access to it seems very wasteful.
Correct. It can do that even if you DO use a lock. Locks don't force re-loading of variables previously loaded into a register. They just synchronize reads/writes so that you get a consistent/expected result. The compiler has no idea what goes on inside the lock library call, nor should it. And the memory fence is meaningless here, as this is software caching done by the compiler, not hardware caching.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: volatile?

Post by Rein Halbersma »

hgm wrote:
syzygy wrote:For multithreaded programs using volatile is not necessary if you are properly using the synchronisation primitives of a thread library, e.g. pthreads or C++11 threads.
I don't get this, but that is perhaps because I never used synchronization primitives of a thread library. If I write

while(busy && notReady) DoSomething(&notReady); 
and the compiler does not know that 'busy' is volatile... what is to stop it from optimizing it to

if(busy) while(notReady) DoSomething(&notReady);
?

Is it then improper use to do this without locking? This is a single atomic read, so locking the access to it seems very wasteful.
There are many such "benign" data races. In the end, they are all evil, given a sufficiently aggressively optimizing compiler. See this Intel blog or this paper.

Furthermore, the claim that locks are bad for performance does not hold the way one would expect. Contention for locks hurts performance, not the locks themselves. A few hundred million hash entries with a few dozen threads should scale just as well with or without locks. See this paper.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: volatile?

Post by syzygy »

bob wrote:
syzygy wrote:
bob wrote:The code looks ugly to me and is probably not safe.
If you had just read:
syzygy wrote:It seems splitPoint->moveCount is always accessed under lock protection (except for an assert), so it seems it could be made non-volatile.
Please spare us your confused "contributions" if you can't take the time to read first.
Please read what I wrote. Locks do NOT avoid the need for volatile. Not now, not ever. I might not be able to acquire the lock if someone else has it, but I can certainly keep a cached value of the variable for a long time, which can certainly be a problem...
Do you understand the term "memory barrier"?
After pthread_mutex_lock() the compiler knows it must reload all values from memory.

Look it up!!
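
A small sketch of the pattern under discussion, assuming POSIX threads: 'variable' is a plain int, every access happens between pthread_mutex_lock() and pthread_mutex_unlock(), and POSIX lists both among the functions that synchronize memory. The helper names (set_variable, get_variable, writer, read_around_writer) are invented for illustration. The pthread_join() in the demo also synchronizes on its own, so this shows the shape of the code rather than proving anything about a particular compiler.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int variable = 0;   /* deliberately not volatile */

static void set_variable(int v)
{
    pthread_mutex_lock(&lock);
    variable = v;
    pthread_mutex_unlock(&lock);
}

static int get_variable(void)
{
    pthread_mutex_lock(&lock);
    int v = variable;      /* read from memory while holding the lock */
    pthread_mutex_unlock(&lock);
    return v;
}

/* Thread entry point: stores the int that 'arg' points to. */
static void *writer(void *arg)
{
    set_variable(*(int *)arg);
    return NULL;
}

/* Reads 'variable', lets another thread change it, reads it again,
   and returns the difference between the two reads. */
static int read_around_writer(int new_value)
{
    int before = get_variable();
    pthread_t t;
    int rc = pthread_create(&t, NULL, writer, &new_value);
    assert(rc == 0);
    pthread_join(t, NULL);
    int after = get_variable();
    return after - before;
}
```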
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: volatile?

Post by hgm »

Rein Halbersma wrote: Contention for locks hurts performance, not the locks themselves.
Well, the last time I looked, machine instructions typically used for implementing locks, like XCHG reg, mem, or instructions using an explicit LOCK prefix, were incredibly expensive (like 48 clocks). I don't know if recent CPUs now have made that problem completely go away.
syzygy wrote:After pthread_mutex_lock() the compiler knows it must reload all values from memory.
So the compiler recognizes this specific function call, and does not treat it like any other?

If the compiler does such a thing, it would be much cheaper to have an explicit memory fence function that doesn't do anything (and for which the compiler knows it doesn't do anything, so it can even optimize the call away), just for the purpose of telling the compiler "here all memory could unpredictably change", rather than using an expensive mutex where none was needed.

It still seems very inefficient to me, because the programmer would be forced to put in such calls to prevent optimization of a few volatile variables, but because the compiler does not know which, this wrecks optimization of all non-volatiles. The 'volatile' keyword seems a far more precise device.
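
Something close to that "fence function that doesn't do anything" does exist. A sketch, assuming C11 <stdatomic.h>: atomic_signal_fence() generates no fence instruction, and in practice GCC and Clang treat it as a point across which values in shared memory are not kept in registers (the GCC-specific spelling is an empty asm with a "memory" clobber). compiler_barrier() and spin_while_busy() are names made up for the example, and max_spins exists only so the sketch terminates by itself. Formally, reading a plain variable that another thread writes is still a data race in C11, so this relies on compiler behaviour rather than on the standard, and it does affect every variable, not just 'busy'.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Plain, non-volatile flag; another thread is assumed to clear it. */
static bool busy = true;

/* Compiler-only barrier: no fence instruction is generated. */
static inline void compiler_barrier(void)
{
    atomic_signal_fence(memory_order_seq_cst);
    /* GCC/Clang alternative: __asm__ __volatile__("" ::: "memory"); */
}

/* Spins while 'busy' is set, with a barrier in each iteration so the
   flag is read again. Returns the number of iterations performed. */
static int spin_while_busy(int max_spins)
{
    assert(max_spins >= 0);
    int spins = 0;
    while (busy && spins < max_spins) {
        compiler_barrier();
        ++spins;
    }
    return spins;
}
```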