-2

It is well known that asm volatile ("" ::: "memory") can serve as a compiler barrier, preventing the compiler from reordering memory accesses across it. For example, this is mentioned in https://preshing.com/20120625/memory-ordering-at-compile-time/, section "Explicit Compiler Barriers".

However, all the articles I can find only state that asm volatile ("" ::: "memory") can serve as a compiler barrier, without explaining why the "memory" clobber effectively forms one. The GCC online documentation only says that the special "memory" clobber tells the compiler that the assembly code may perform memory reads or writes other than those specified in the operand lists. But how does that semantic stop the compiler from reordering memory accesses across it? I tried to answer this myself but failed, so I ask here: why can asm volatile ("" ::: "memory") serve as a compiler barrier, based on the semantics of the "memory" clobber?

Please note that I am asking about a "compiler barrier" (in effect at compile time), not the stronger "memory barrier" (in effect at run time). For convenience, I quote the semantics of the "memory" clobber from the GCC online documentation below:

The "memory" clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters). To ensure memory contains correct values, GCC may need to flush specific register values to memory before executing the asm. Further, the compiler does not assume that any values read from memory before an asm remain unchanged after that asm; it reloads them as needed. Using the "memory" clobber effectively forms a read/write memory barrier for the compiler.

Peter Cordes
zzzhhh
  • 1
    You cannot reorder an instruction if its effect is unknown. – n. 1.8e9-where's-my-share m. Jun 11 '21 at 21:23
  • IMO it tells the compiler to complete all memory operations (and consequently other operations as well) before the next statements, like in this example: https://godbolt.org/z/fKv6GaGET – 0___________ Jun 11 '21 at 23:20
  • 1
    I think you asked a very similar question a day or two ago, and as with it, I don't see why the quoted passage doesn't already answer the question. Can you give a specific example of a possible reordering, for which you are not sure whether or why the quoted text forbids it? Say, a snippet of C source together with assembly (pseudo)code that you are unsure whether the compiler could emit. Then someone can probably explain more concretely. – Nate Eldredge Jun 11 '21 at 23:31
  • @Nate Eldredge: You don't see it because you are an expert, and you think the two questions are very similar for the same reason. But I did not see the similarity between the reads-everything / writes-everything effect and the effect of a compiler barrier until just now -- they are both "the desired effect of requiring memory to be in sync." It is the phrase "in sync" that reminded me of the analogy in https://preshing.com/20120710/memory-barriers-are-like-source-control-operations and then all at once I understood the similarity you referred to in the comment. – zzzhhh Jun 12 '21 at 22:26

1 Answer

1

If a variable is potentially read or written, it matters what order that happens in. The point of a "memory" clobber is to make sure the reads and/or writes in an asm statement happen at the right point in the program's execution.

Any read of a C variable's value that happens in the source after an asm statement must be after the memory-clobbering asm statement in the compiler-generated assembly output for the target machine, otherwise it might be reading a value before the asm statement would have changed it.

Any read of a C var in the source before an asm statement similarly must stay sequenced before, otherwise it might incorrectly read a modified value.

Similar reasoning applies to assignments to (writes of) C variables before/after any asm statement with a "memory" clobber. This is just like a function call to an "opaque" function, one whose definition the compiler can't see.

No reads or writes can reorder with the barrier in either direction, therefore no operation before the barrier can reorder with any operation after the barrier, or vice versa.
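As a rough illustration of the reasoning above, here is a minimal sketch in GNU C (GCC or Clang assumed). The names compiler_barrier, payload, ready, and publish are made up for this example and do not come from the question.

```c
/* Empty asm statement with a "memory" clobber (GNU C extension).
 * It emits no instructions; it only constrains the compiler. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

int payload;   /* hypothetical shared data */
int ready;     /* hypothetical "data is valid" flag */

void publish(int value)
{
    payload = value;     /* the asm might read payload, so this store
                            has to be emitted before the barrier */
    compiler_barrier();
    ready = 1;           /* the asm might read or write ready, so this
                            store cannot be hoisted above the barrier */
}
```

Because neither store may cross the barrier, the compiler cannot swap them with each other. This only constrains code generation: no fence instruction is emitted, so a weakly ordered CPU may still reorder the two stores at run time.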


Another way to look at it: the actual machine memory contents must match the C abstract machine at that point. The compiler-generated asm has to respect that, by storing any variable values from registers to memory before the start of an asm("":::"memory") statement, and afterwards it has to assume that any registers that had copies of variable values might not be up to date anymore. So they have to be reloaded if they're needed.

This reads-everything / writes-everything assumption for the "memory" clobber is what keeps the asm statement from reordering at all at compile time wrt. all accesses, even non-volatile ones. The volatile is already implicit from being an asm() statement with no "=..." output operands, and is what stops it from being optimized away entirely (and with it the memory clobber).
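A small sketch of the flush/reload effect described above, again assuming GNU C inline asm. The names shared_counter, store_twice, and read_twice_sum are hypothetical.

```c
int shared_counter;   /* hypothetical global, reachable from anywhere */

void store_twice(void)
{
    shared_counter = 1;                      /* without the barrier this would be
                                                a dead store the compiler could drop */
    __asm__ __volatile__("" ::: "memory");   /* asm may read shared_counter, so the
                                                value 1 has to be in memory here */
    shared_counter = 2;
}

int read_twice_sum(void)
{
    int a = shared_counter;                  /* first load */
    __asm__ __volatile__("" ::: "memory");   /* asm may have written shared_counter */
    int b = shared_counter;                  /* so the register copy in a cannot be
                                                reused; a second load is needed */
    return a + b;
}
```

Comparing the compiler output with and without the asm statements (for example on Compiler Explorer at -O2) should show the extra store and the extra load, though the exact instructions depend on the target and compiler version.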


Note that only potentially "reachable" C variables are affected. For example, escape analysis can still let the compiler keep a local int i in a register across a "memory" clobber, as long as the asm statement itself doesn't have the address as an input.

Just like a function call: for (int i=0;i<10;i++) {foobar("%d\n", i);} can keep the loop counter in a register, and just copy it to the 2nd arg-passing register for foobar every iteration. There's no way foobar can have a reference to i because its address hasn't been stored anywhere or passed anywhere.

(This is fine for the memory barrier use-case; no other thread could have its address either.)
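The reachability point can be sketched as follows, assuming GNU C. The function names sum_local and sum_escaped are invented for illustration, and what a given compiler actually does with them may vary.

```c
/* total never has its address taken, so no asm statement can reach it.
 * The compiler is allowed to keep it in a register across the barrier. */
int sum_local(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += i;
        __asm__ __volatile__("" ::: "memory");
    }
    return total;
}

/* Here the address of total is handed to an asm statement as an input,
 * so total has escaped: later "memory" clobbers have to be treated as
 * possibly reading or writing it. */
int sum_escaped(int n)
{
    int total = 0;
    __asm__ __volatile__("" : : "r"(&total) : "memory");
    for (int i = 0; i < n; i++) {
        total += i;
        __asm__ __volatile__("" ::: "memory");
    }
    return total;
}
```

Both functions return the same value. The difference is only in what the compiler may assume: in the second one it typically has to store total to the stack before each barrier and reload it afterwards.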



Peter Cordes
  • 1
    I understood it all at once when reading "Just like a function call to an "opaque" function". https://preshing.com/20120625/memory-ordering-at-compile-time/ gives the explanation I want, but for functions. Since it is in section "Implied Compiler Barriers", I failed to apply the same reasoning to `asm("":::"memory")` in section "Explicit Compiler Barriers". Now I think they are both implicit. Thank you for the detailed answer. PS, it's cool to point out that `volatile` is implicit so we don't need to write it. – zzzhhh Jun 12 '21 at 02:05
  • 1
    PS, an example related to "reachable" can be found here: https://gcc.gnu.org/legacy-ml/gcc-help/2019-09/msg00016.html – zzzhhh Jun 12 '21 at 21:26