trick question

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: trick question

Post by syzygy »

jwes wrote:I only write to this variable from one thread and do not care if it is read wrong once (as long as it is either the old value or the new value). I believe that is enough. If it is not, I hope someone will explain it to me.
But the C or C++ standard does not guarantee that the value read will be either the new or the old value. On a machine that writes a value bit for bit, for example, the reading thread could see a combination of old and new bits.

The compiler might also be able to figure out that a thread only reads the variable and never writes it, so that that thread can "safely" cache its value in a register or a local variable.

This is why, according to the C11 and C++11 standards, reading a non-atomic value that may simultaneously be written by another thread is undefined behaviour. (Earlier standards know nothing about threads, so by definition they guarantee nothing about threading behaviour.)

If you make the variable atomic and access it with relaxed memory ordering, all is fine. If accesses to the variable are atomic on the underlying machine architecture, this should not cause any overhead. The compiler is still allowed to temporarily cache the value of an atomic variable (so if your code accesses it a couple of times in a row, the compiler may assume that all accesses return the same value and keep that value temporarily in a register), but it will not cache the value indefinitely (so the code is guaranteed to eventually see a write by another thread).
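
A minimal sketch of that relaxed-atomic version in C++11 (the names are placeholders, not jwes's actual variable):

Code: Select all

#include <atomic>

// One thread writes, others read. With std::atomic and relaxed ordering
// there is no data race, and on machines where such loads and stores are
// naturally atomic the generated code is the same plain move instructions
// as for a non-atomic int.
std::atomic<int> shared_value{0};

void writer()
{
    shared_value.store(42, std::memory_order_relaxed);
}

int reader()
{
    // Sees either the old or the new value, never a mix of bits, and is
    // guaranteed to observe the writer's store eventually.
    return shared_value.load(std::memory_order_relaxed);
}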

Or you can just leave things as they are and accept that your code makes more assumptions about the machine architecture and the compiler than the C and C++ standards guarantee. This could bite you in the long run, but that may be acceptable to you.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: trick question

Post by syzygy »

Ras wrote:
AlvaroBegue wrote:I'll try to explain what I understand. If it's not atomic, a thread might feel free to do something like keeping its value in a register and not read it from memory the next time around.
I think you are mixing up atomic and volatile. They have nothing to do with each other.
But atomic is the right concept here.

Volatile is unnecessarily inefficient, as it would require the compiler to actually access the memory location every time the code accesses the variable. With atomic, the compiler only needs to write the final value, or can read the value once and reuse it a number of times (just not indefinitely).
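
A small illustration of that difference (hypothetical flag names, not from anyone's engine):

Code: Select all

#include <atomic>

volatile bool stop_volatile = false;   // every access below must touch memory
std::atomic<bool> stop_atomic{false};  // consecutive relaxed reads may be merged

bool poll_twice_volatile()
{
    bool a = stop_volatile;   // one load from memory
    bool b = stop_volatile;   // a second, separate load: volatile reads may not be fused
    return a && b;
}

bool poll_twice_atomic()
{
    // The compiler is permitted to perform a single load and reuse it for
    // both reads (in practice it usually still emits two), but it may not
    // cache the value forever: a polling loop will eventually see a new value.
    bool a = stop_atomic.load(std::memory_order_relaxed);
    bool b = stop_atomic.load(std::memory_order_relaxed);
    return a && b;
}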
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: trick question

Post by syzygy »

hgm wrote:It seems that all accesses to volatile variables are forced to be actually made, and in program order. If that is true, I don't see how anything could go wrong if all data shared between threads is declared as volatile.
What can go wrong (maybe not in the specific synchronisation example that started this topic, but in general) is that the processor reorders the memory accesses to the shared data in a way that the program's logic does not expect.
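
A sketch of the kind of trap meant here (all names are mine): one thread publishes a result and then sets a flag, and another thread checks the flag and then reads the result. A volatile flag does not order the non-volatile data store, and weakly ordered CPUs may reorder the stores as well; a release store paired with an acquire load does give the needed ordering.

Code: Select all

#include <atomic>

static int compute() { return 42; }   // stand-ins for real work
static void use(int) {}

int result;                        // ordinary, non-atomic shared data
volatile bool ready_v = false;     // volatile flag: not enough by itself
std::atomic<bool> ready_a{false};  // atomic flag used with release/acquire

void publisher_broken()
{
    result = compute();
    ready_v = true;   // the compiler may move the non-volatile store to
                      // 'result' past this, and on weakly ordered CPUs the
                      // hardware may reorder the two stores as well
}

void publisher_ok()
{
    result = compute();
    ready_a.store(true, std::memory_order_release);  // data store ordered
                                                     // before the flag
}

void consumer_ok()
{
    while (!ready_a.load(std::memory_order_acquire))
        ;             // spin until the flag is observed
    use(result);      // guaranteed to see the publisher's value
}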
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: trick question

Post by Ras »

hgm wrote:It seems that all accesses to volatile variables are forced to be actually made, and in program order.
But only in relation to other volatile accesses, not to non-volatile accesses. The other thing is that volatile only guarantees that the compiler doesn't do reordering - it does not address out-of-order execution at the CPU level. That would require memory barriers (which are, by the way, included in mutexes).
If that is true, I don't see how anything could go wrong if all data shared between threads is declared as volatile.
If the variables are not atomic (which mostly means: not a plain int), but are e.g. complex structures or bitfields, then the writing thread can be preempted in the middle of the write, and the reading thread gets corrupted data. That can be solved either with mutexes or with lockless algorithms.

I'd always take a mutex unless benchmarks prove it is a bottleneck, because mutexes are easy to get right while lockless algorithms are hard, especially with regard to future hardware. With a chess engine, the most important place for lockless algorithms is hash-table access. Bob has written a lot on that.
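
For reference, the lockless hash-table scheme alluded to here (the XOR trick described by Hyatt) looks roughly like this; the entry layout and names are made up for the sketch, and a strictly standard-conforming version would additionally make the two words relaxed atomics:

Code: Select all

#include <cstdint>

// Two-word hash entry: the key is stored XORed with the data word, so a
// torn write (key from one store interleaved with data from another) is
// detected on probe and the entry is simply treated as a miss.
struct HashEntry {
    uint64_t key_xor_data;
    uint64_t data;
};

void store_entry(HashEntry *e, uint64_t key, uint64_t data)
{
    e->key_xor_data = key ^ data;
    e->data = data;
}

bool probe_entry(const HashEntry *e, uint64_t key, uint64_t *data_out)
{
    uint64_t d = e->data;
    if ((e->key_xor_data ^ d) != key)
        return false;   // wrong key or torn entry: ignore it
    *data_out = d;
    return true;
}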
When volatile flags would be used in an attempt to synchronize access to non-volatile data, this just seems improper use to me.
Yep, that works only when you know that there is no out-of-order execution and you are using compiler barriers. In embedded systems, this is not unusual.

And then there's even more fun with ARM's big.LITTLE architecture, where you have radically different cores and the application can be pushed around between them dynamically, depending on its activity.
rjgibert
Posts: 317
Joined: Mon Jun 26, 2006 9:44 am

Re: trick question

Post by rjgibert »

Code: Select all

nodes++;

if (nodes == every200) {
    check_keyboard();
    every200 += 200;
}
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: trick question

Post by AlvaroBegue »

rjgibert wrote:

Code: Select all

nodes++;

if (nodes == every200) {
    check_keyboard();
    every200 += 200;
}
My code (I posted earlier) is slightly more complicated because I also count quiescence-search nodes, but I only call check_keyboard() from the main search function.
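
The earlier code isn't quoted here, but a sketch of that kind of arrangement (all names hypothetical) could be:

Code: Select all

#include <cstdint>

static uint64_t nodes = 0;         // incremented in both search() and qsearch()
static uint64_t next_check = 200;  // next node count at which to poll input

void check_keyboard();             // assumed to exist elsewhere

// Called from the main search only. Because qsearch() also bumps the
// counter, 'nodes' may have jumped past the threshold, so test with >=
// rather than == and resynchronise the next threshold from the current count.
static void maybe_check_input(void)
{
    if (nodes >= next_check) {
        check_keyboard();
        next_check = nodes + 200;
    }
}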