Questions on volatile keyword and memory barriers

Pradu
Posts: 287
Joined: Sat Mar 11, 2006 3:19 am
Location: Atlanta, GA

Questions on volatile keyword and memory barriers

Post by Pradu »

I've recently learnt about memory barriers and am rather confused about what they are meant to do and how they are different from what the volatile keyword does.

From my understanding:

- A read barrier makes sure all read operations after the barrier happen after read operations before the barrier.
- A write barrier makes sure all write operations after the barrier happen after write operations before the barrier.

It seems to me that memory barriers are just a way to keep the order of reads and writes happening in the right order.
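As a rough sketch of that ordering idea, here is how the two kinds of barrier might pair up when one thread hands a value to another. It assumes a C11 compiler with `<stdatomic.h>`, which none of the quoted sources mention; `payload`, `ready`, `publish` and `try_consume` are made-up names for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary, non-atomic data */
static atomic_bool ready;  /* flag that announces the data (starts false) */

/* Writer side: store the data, then a release fence (the "write barrier"
   role), then set the flag. The data store cannot be ordered after the
   flag store. */
static void publish(int value) {
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader side: check the flag, then an acquire fence (the "read barrier"
   role), then read the data. The data load cannot be ordered before the
   flag load. */
static bool try_consume(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return true;
}
```

The fences only constrain ordering around the flag; whether the flag itself is re-read on each call is guaranteed here by the atomic load, not by the fence.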

However, I've seen texts online which imply that barriers replace what the volatile keyword does. From my understanding,

- Applying volatile to a variable makes the code read it from "main memory" every time it is read with acquire semantics (read barrier?) and written to "main memory" with release semantics (write barrier?).

However, I haven't read anywhere that a read barrier can force the compiler to emit code that reads from "main memory" every time but I see it being implied:

http://www.mjmwired.net/kernel/Document ... armful.txt :

Code:

Another situation where one might be tempted to use volatile is
when the processor is busy-waiting on the value of a variable.  The right
way to perform a busy wait is:

    while (my_variable != what_i_want)
        cpu_relax();

http://lwn.net/Articles/233482/ :

Code:

 (c) if you spin on a value [that's] changing, you should use "cpu_relax()" or
     "barrier()" anyway, which will force gcc to re-load any values from
     memory over the loop.

The MSDN documentation also seems to say that barriers can be used as a complete replacement for the volatile keyword:

http://msdn.microsoft.com/en-us/library/f20w0x5e.aspx

Code:

Marking memory with a memory barrier is similar to marking memory with the volatile (C++) keyword. However, a memory barrier is more efficient because reads and writes are forced to complete at specific points in the program rather than globally. The optimizations that can occur if a memory barrier is used cannot occur if the variable is declared volatile.

What I would like to know is if memory barriers can be used to completely replace the volatile keyword. Do read barriers force the compiler to emit code that makes the code read from "main memory" in addition to making sure all reads complete as implied by some of the documents above?

Would appreciate any clarification on my confused understanding of memory barriers and the volatile keyword.
Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: Questions on volatile keyword and memory barriers

Post by Gian-Carlo Pascutto »

I believe that barriers can indeed replace volatile (and have better defined semantics).

You can find some very interesting discussion by Linus Torvalds about this on the Linux Kernel mailing list.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on volatile keyword and memory barriers

Post by bob »

Memory barriers are used on architectures that do out-of-order memory writes. On those, the usual place for a write barrier is where you acquire a lock, write some values to memory, and then clear the lock. Since clearing the lock is itself a write, you do _not_ want it to happen before the previous writes have been done, else you lose the protection of the lock and your critical section becomes unprotected.
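A minimal sketch of that lock pattern, assuming C11 atomics are available (`demo_lock` and its functions are hypothetical names, not from any library). The release ordering on the unlock is what keeps the clearing write from becoming visible before the writes made inside the critical section.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag held; } demo_lock;
#define DEMO_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try to take the lock once. Acquire ordering: accesses in the critical
   section cannot be moved before the lock is taken. */
static bool demo_trylock(demo_lock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until the lock is ours. */
static void demo_acquire(demo_lock *l) {
    while (!demo_trylock(l)) {
        /* busy-wait */
    }
}

/* Clear the lock. Release ordering: every write made while holding the
   lock is visible before another thread can see the lock as free. */
static void demo_release(demo_lock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

On x86 the release store typically needs no extra fence instruction; on weakly ordered machines the compiler is expected to emit one.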

volatile is a compiler directive, and has nothing to do with the architecture. It simply says "this value can change spontaneously". This instructs the compiler to re-read the value from memory each time it is accessed, rather than trying to optimize and keep the value in a register. This is an issue in two common places.

One is where you have a threaded/parallel application and another thread can change a value you are occasionally checking. The compiler doesn't know about the concept of threading, so it looks at your source and sees a load here from X, then a little further it sees another load from X, and concludes "OK, X is not modified between the first and second load, so I can use the first value." Not good for parallel algorithms.

The other is memory-mapped I/O, which operating systems use for many devices. If you want to discover if a disk drive is busy, you can go to a specific memory address and read it. The disk controller is watching for references to that address and it will supply the value rather than memory. This means that the value can change based on the status of the external hard drive, and by using volatile you tell the compiler "each time I load from this address, you can't assume it hasn't changed just because I haven't written to it, so go to memory each time and get the new value."
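To make the device case concrete, here is a small sketch of polling a status register through a volatile pointer. The `STATUS_BUSY` bit and the function name are invented for illustration; a real controller defines its own register layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_BUSY 0x1u  /* hypothetical "device busy" bit */

/* Poll the status register until the busy bit clears, giving up after
   max_polls reads. Because the pointer target is volatile, the compiler
   must perform a real load on every iteration instead of reusing the
   first value it read. */
static bool wait_until_idle(const volatile uint32_t *status,
                            unsigned max_polls) {
    for (unsigned i = 0; i < max_polls; i++) {
        if ((*status & STATUS_BUSY) == 0)
            return true;
    }
    return false;
}
```

For testing on an ordinary machine, the "register" can simply be a volatile variable whose address is passed in.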

Hope that helps...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on volatile keyword and memory barriers

Post by bob »

Gian-Carlo Pascutto wrote:I believe that barriers can indeed replace volatile (and have better defined semantics).

You can find some very interesting discussion by Linus Torvalds about this on the Linux Kernel mailing list.
How? Without the volatile keyword, something like

while (v);

will either never loop, or will loop infinitely. Even though another thread can change it. Because the compiler will optimize the load away since it doesn't see any local writes to the value.
vladstamate
Posts: 161
Joined: Thu Jan 08, 2009 9:06 pm
Location: San Francisco, USA

Re: Questions on volatile keyword and memory barriers

Post by vladstamate »

bob wrote:
Gian-Carlo Pascutto wrote:I believe that barriers can indeed replace volatile (and have better defined semantics).

You can find some very interesting discussion by Linus Torvalds about this on the Linux Kernel mailing list.
How? Without the volatile keyword, something like

while (v);

will either never loop, or will loop infinitely. Even though another thread can change it. Because the compiler will optimize the load away since it doesn't see any local writes to the value.
I agree with that. I recently spent a lot of time debugging the MP code of plisk and realised that the compiler had decided to entirely remove my equivalent of the while(v); code because the v variable was not being modified in the function. It was, however, modified in another thread. Adding volatile to variable v made the while statement work.

Debug code was fine since the compiler did not optimize the code out.

Regards,
Vlad.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on volatile keyword and memory barriers

Post by bob »

vladstamate wrote:
bob wrote:
Gian-Carlo Pascutto wrote:I believe that barriers can indeed replace volatile (and have better defined semantics).

You can find some very interesting discussion by Linus Torvalds about this on the Linux Kernel mailing list.
How? Without the volatile keyword, something like

while (v);

will either never loop, or will loop infinitely. Even though another thread can change it. Because the compiler will optimize the load away since it doesn't see any local writes to the value.
I agree with that. I recently spent a lot of time debugging the MP code of plisk and realised that the compiler had decided to entirely remove my equivalent of the while(v); code because the v variable was not being modified in the function. It was, however, modified in another thread. Adding volatile to variable v made the while statement work.

Debug code was fine since the compiler did not optimize the code out.

Regards,
Vlad.
The problem is actually _much_ worse than that. It is more than possible that a non-volatile variable is treated as volatile purely by luck: the compiler doesn't have enough registers to hold the value for any length of time, and therefore has to load it each time it is referenced. Then a later version of the compiler optimizes better, frees up a register, and working code now fails, and the assumption is "this compiler is broken, it works on the old compiler..." Or you change the code slightly so that it can be better optimized and then you get burned.

This is the part of parallel programming that takes _careful_ analysis during the design, or it will take a ton of analysis during the debugging.
Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: Questions on volatile keyword and memory barriers

Post by Zach Wegner »

Pradu wrote:However I've seen text online which imply it replaces what the volatile keyword does. From my understanding,

- Applying volatile to a variable makes the code read it from "main memory" every time it is read with acquire semantics (read barrier?) and written to "main memory" with release semantics (write barrier?).

However, I haven't read anywhere that a read barrier can force the compiler to emit code that reads from "main memory" every time but I see it being implied:

...
I hadn't thought of this before, but it makes perfect sense. A read barrier guarantees that the reads after it are done after the ones before it. If you have a read-barrier-read sequence, the second read has to come after the first, not even at the same time. If it is stored in a register that means that the second read isn't happening after the first.

I think the real problem is getting the compiler to recognize the fence. Just putting an mfence asm instruction isn't suitable, since the compiler has to recognize that and adjust the reads so they complete in order--the asm instruction only applies to memory, not registers obviously. So the barrier should either be an intrinsic, or the compiler should see the instruction (if it parses through inline asm like GCC does), and the compiler can then "commit" modified registers to memory before the barrier and reread memory after it. There were some discussions about this at the last CCT...
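A sketch of the two layers involved, using GCC/Clang-specific constructs (the empty asm with a "memory" clobber and the `__sync_synchronize()` builtin); the wrapper and helper names are made up. The first wrapper only constrains the compiler, the second is also meant to emit a hardware fence where the target needs one.

```c
/* Compiler-only barrier: emits no instruction, but the compiler must
   assume memory changed, so it writes back pending stores and reloads
   values afterwards. */
static inline void compiler_barrier(void) {
    __asm__ __volatile__("" ::: "memory");
}

/* Full barrier: a compiler barrier plus whatever fence instruction the
   target architecture requires. */
static inline void full_barrier(void) {
    __sync_synchronize();
}

/* The store is committed before the barrier and the load is redone
   after it, rather than being forwarded from a register. */
static int store_then_reload(int *slot, int value) {
    *slot = value;
    full_barrier();
    return *slot;
}

/* Spin while *flag is nonzero, up to limit passes; the barrier forces
   *flag to be re-read from memory on every pass. */
static int spins_while_set(const int *flag, int limit) {
    int n = 0;
    while (*flag && n < limit) {
        compiler_barrier();
        n++;
    }
    return n;
}
```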
Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: Questions on volatile keyword and memory barriers

Post by Gian-Carlo Pascutto »

bob wrote: How? Without the volatile keyword, something like

while (v);

will either never loop, or will loop infinitely. Even though another thread can change it. Because the compiler will optimize the load away since it doesn't see any local writes to the value.
The barrier will do exactly that: force the read to come from memory.

Here's a demonstration program for GCC:

Code:

#include <stdio.h>

/* Compiler-only barrier: emits no instruction, but tells GCC that
   memory may have changed, so cached values must be reloaded. */
static inline void barrier(void) {
        asm volatile ("" : : : "memory");
}

int main(void) {
        int v = 1;
        int c = 0;

        /* Let the address escape so GCC cannot prove v is unreachable. */
        printf("%p\n", (void *)&v);

        while (v) {
                //barrier();
                c++;
        }

        printf("%d\n", c);
}

If you uncomment barrier, the loop will read v from memory just as I said, despite the variable itself not being volatile.

Note that GCC really is nasty: if you don't put the first printf, it will correctly infer nothing could possibly ever know where the variable is, and it will optimize away the entire program.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on volatile keyword and memory barriers

Post by bob »

Gian-Carlo Pascutto wrote:
bob wrote: How? Without the volatile keyword, something like

while (v);

will either never loop, or will loop infinitely. Even though another thread can change it. Because the compiler will optimize the load away since it doesn't see any local writes to the value.
The barrier will do exactly that: force the read to come from memory.

Here's a demonstration program for GCC:

Code:

#include <stdio.h>

/* Compiler-only barrier: emits no instruction, but tells GCC that
   memory may have changed, so cached values must be reloaded. */
static inline void barrier(void) {
        asm volatile ("" : : : "memory");
}

int main(void) {
        int v = 1;
        int c = 0;

        /* Let the address escape so GCC cannot prove v is unreachable. */
        printf("%p\n", (void *)&v);

        while (v) {
                //barrier();
                c++;
        }

        printf("%d\n", c);
}

If you uncomment barrier, the loop will read v from memory just as I said, despite the variable itself not being volatile.

Note that GCC really is nasty: if you don't put the first printf, it will correctly infer nothing could possibly ever know where the variable is, and it will optimize away the entire program.
Wait. That isn't a barrier. It is a volatile function which is a different issue that says "everything done prior to this function has to be completed before this function is executed." All it does as far as I can tell is limit the optimizer such that nothing can be moved beyond the call to the volatile function. Which is safe on X86. But if your architecture does out of order writes, you will be beyond screwed unless the compiler is smart enough to put a real barrier/fence instruction immediately preceding that function call. I tried this on an old alpha box and it didn't, but I am pretty sure this is not the latest gcc on that machine...
Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: Questions on volatile keyword and memory barriers

Post by Gian-Carlo Pascutto »

bob wrote: It is a volatile function which is a different issue that says "everything done prior to this function has to be completed before this function is executed."
Yes, so it's a barrier. Note that the volatile in there is purely a GCC convention and doesn't have anything to do with volatile variables.

If you had been on Windows, I'd have put _ReadBarrier and there would have been no volatile word in there at all.
bob wrote: All it does as far as I can tell is limit the optimizer such that nothing can be moved beyond the call to the volatile function.
Yes, which is why it's called a barrier.

Just look in the Linux kernel that you're running how barriers are implemented...
bob wrote: Which is safe on X86. But if your architecture does out of order writes, you will be beyond screwed unless the compiler is smart enough to put a real barrier/fence instruction immediately preceding that function call. I tried this on an old alpha box and it didn't, but I am pretty sure this is not the latest gcc on that machine...
Newer gcc versions have __sync_synchronize to emit a hardware-dependent barrier. I'm not aware of any portable way to emit one with older gccs. The code I posted is ok for x86, but for other architectures you want to add a "fence" (or whatever the equivalent instruction is) to the asm block.
For the code posted that is not even needed, it just needs v to come from memory instead of a register.

But anyway, case proven: you can get the code you posted working perfectly without using volatile and with only barriers.
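For comparison, a sketch of the same spin loop with a second thread actually clearing the flag. This version assumes C11 atomics and POSIX threads, neither of which is used in the program above, and `clear_flag` and `spin_until_cleared` are illustrative names. The atomic load gives both effects discussed in this thread: the value is re-read on every pass, and the acquire ordering acts as the read barrier.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int v = 1;  /* the flag being spun on; not volatile */

/* Second thread: clear the flag with release ordering. */
static void *clear_flag(void *arg) {
    (void)arg;
    atomic_store_explicit(&v, 0, memory_order_release);
    return NULL;
}

/* Spin until the other thread clears v. Returns the number of loop
   passes, or -1 if the thread could not be started. */
static long spin_until_cleared(void) {
    pthread_t t;
    long c = 0;

    atomic_store(&v, 1);
    if (pthread_create(&t, NULL, clear_flag, NULL) != 0)
        return -1;

    /* Each pass performs a real load of v. */
    while (atomic_load_explicit(&v, memory_order_acquire))
        c++;

    pthread_join(t, NULL);
    return c;
}
```

The pass count depends on scheduling, so it will vary from run to run; depending on the toolchain the program may need to be built with -pthread.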