Rein Halbersma wrote: Contention for locks hurts performance, not the locks themselves.
Well, the last time I looked, the machine instructions typically used for implementing locks, such as XCHG reg, mem or anything with an explicit LOCK prefix, were incredibly expensive (something like 48 clocks). I don't know whether recent CPUs have made that problem go away completely.
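To make concrete which instructions I mean, here is a minimal sketch of an uncontended test-and-set lock using C11 atomics. The names lock_flag, try_lock and unlock are mine, purely for illustration. On x86 a compiler will typically turn the test-and-set into an XCHG with a memory operand, which is implicitly locked, so the cost is paid even when no other thread ever touches the lock.

```c
#include <stdatomic.h>

/* Hypothetical lock word for illustration; clear means "unlocked". */
static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

/* Try to take the lock once. Returns 1 if we got it, 0 if it was already held.
   On x86 this typically compiles to an XCHG reg, mem. */
int try_lock(void)
{
    return !atomic_flag_test_and_set_explicit(&lock_flag, memory_order_acquire);
}

/* Release the lock. On x86 this is usually just an ordinary store. */
void unlock(void)
{
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}
```

Timing try_lock() followed by unlock() in a tight single-threaded loop would be one way to see what the locked instruction costs on a given CPU without any contention.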
syzygy wrote: After pthread_mutex_lock() the compiler knows it must reload all values from memory.
So the compiler recognizes this specific function call, and does not treat it like any other?
If the compiler does such a thing, it would be much cheaper to have an explicit memory-fence function that does nothing (and that the compiler knows does nothing, so it can even optimize the call away), purely to tell the compiler "here all memory could change unpredictably", rather than using an expensive mutex where none was needed.
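Something close to the fence I describe can be sketched with C11's atomic_signal_fence(), which restrains only the compiler and typically emits no machine instruction at all. GCC and Clang users often write asm volatile("" ::: "memory") for the same purpose. The names compiler_barrier, payload and reload_difference below are hypothetical, and note that this does nothing to order memory operations between CPUs.

```c
#include <stdatomic.h>

/* Hypothetical variable that something else (e.g. a signal handler) may change. */
static int payload = 0;

/* Compiler-only barrier: the compiler may not move memory accesses across it
   or keep 'payload' cached in a register over it. Typically no instruction
   is generated for it. */
void compiler_barrier(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

/* Reads 'payload' twice; the barrier forces the second read to go to memory
   again instead of reusing the first value. */
int reload_difference(void)
{
    int before = payload;
    compiler_barrier();
    int after = payload;
    return after - before;
}
```

In a single-threaded run nothing changes payload, so reload_difference() returns 0; the point is only that the generated code contains two loads and no locked instruction.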
It still seems very inefficient to me: the programmer is forced to insert such calls to prevent optimization of a few variables that are effectively volatile, but because the compiler does not know which ones, it wrecks optimization of all the non-volatile ones too. The 'volatile' keyword seems a far more precise device.
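A small sketch of what I mean by more precise, with hypothetical names stop_requested and sum_until_stopped: only the flag is marked volatile, so only that one variable is reloaded on every access, and the compiler stays free to keep everything else in registers. One caveat I am aware of: ISO C does not promise atomicity or inter-thread ordering for volatile, so this shows the optimization effect only.

```c
/* Only this flag is re-read from memory on every access. */
static volatile int stop_requested = 0;

/* Sums 0 .. limit-1, checking the flag before each iteration.
   'sum' and 'i' are ordinary variables and can live in registers. */
int sum_until_stopped(int limit)
{
    int sum = 0;
    for (int i = 0; i < limit && !stop_requested; i++) {
        sum += i;
    }
    return sum;
}
```

With the flag clear, sum_until_stopped(4) adds 0+1+2+3; once the flag is set, the loop exits before the first iteration.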