
If I have `atomic_bool flag;`, how can I write C code to toggle it in a way that's atomic, portable, and efficient? By "efficient", I mean that it should assemble on x86_64 to `lock xorb $1, flag(%rip)`. The "obvious" `flag = !flag;` is out because it isn't atomic as a whole: it is a separate atomic load followed by an atomic store. My next guess would be `flag ^= true;`, which assembled to this mess on GCC:

        movzbl  flag(%rip), %eax
0:
        movb    %al, -1(%rsp)
        xorl    $1, %eax
        movl    %eax, %edx
        movzbl  -1(%rsp), %eax
        lock cmpxchgb   %dl, flag(%rip)
        jne     0b

And this mess on Clang:

        movb    flag(%rip), %al
0:
        andb    $1, %al
        movl    %eax, %ecx
        xorb    $1, %cl
        lock            cmpxchgb        %cl, flag(%rip)
        jne     0b

Then I tried specifying a weaker memory order by using `atomic_fetch_xor_explicit(&flag, true, memory_order_acq_rel);` instead. This does what I want on Clang, but GCC refuses to compile it, with `error: operand type '_Atomic atomic_bool *' {aka '_Atomic _Bool *'} is incompatible with argument 1 of '__atomic_fetch_xor'`. Interestingly, if my type is `atomic_char` instead of `atomic_bool`, then both GCC and Clang emit the assembly that I want. Is there a way to do what I want with `atomic_bool`?
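
For reference, the character-typed variant mentioned above could be wrapped so that callers only ever see `bool`. This is a minimal sketch, not a claim about what any compiler emits: the `toggle_flag` type and the `toggle_flag_*` function names are hypothetical, and it uses `atomic_uchar` rather than `atomic_bool` because the fetch-and-modify functions are specified for atomic integer types.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wrapper type: the flag is stored in an unsigned char that
   only ever holds 0 or 1, so atomic_fetch_xor can be applied to it. */
typedef struct {
    atomic_uchar bits;
} toggle_flag;

/* Store a bool. The parameter type forces the value to 0 or 1 before it
   reaches the underlying unsigned char. */
static inline void toggle_flag_set(toggle_flag *f, bool value)
{
    atomic_store_explicit(&f->bits, (unsigned char)value, memory_order_release);
}

/* Read the flag back as a bool. */
static inline bool toggle_flag_get(toggle_flag *f)
{
    return atomic_load_explicit(&f->bits, memory_order_acquire) != 0;
}

/* Atomically flip the flag and return the value it held before the flip. */
static inline bool toggle_flag_toggle(toggle_flag *f)
{
    return atomic_fetch_xor_explicit(&f->bits, 1, memory_order_acq_rel) != 0;
}
```

Because the only writers are `toggle_flag_set` and `toggle_flag_toggle`, nothing other than 0 or 1 can end up in the byte as long as callers go through these functions rather than touching `bits` directly.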

Peter Cordes
  • 3
    Note from standard: [7.17.7.5](https://port70.net/~nsz/c/c11/n1570.html#7.17.7.5) xor is not applicable to `atomic_bool`. (I wonder if the "None of these operations is applicable to atomic_bool" part from the standard suggests that compilers _should_ refuse to compile such code, making clang non-conformant here.) – KamilCuk Aug 17 '20 at 16:10
  • 2
    This is a deliberate choice in gcc. See [bugzilla 68966](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68966) and [bugzilla 68908](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68908#c13). – P.P Aug 17 '20 at 16:22
  • 3
    Is there a reason you can't use `atomic_char flag;`? – chtz Aug 17 '20 at 16:27
  • @chtz I *can*, but it feels like an awful hack, and then it'd be much easier for someone to mistakenly put something other than 0 or 1 in it, which would end badly. – Joseph Sible-Reinstate Monica Aug 17 '20 at 16:30
  • Efficiency just doesn't have anything to do with the code generation. Count about 150 cycles for the lock and you'll rarely be disappointed. – Hans Passant Aug 17 '20 at 16:33
  • @JosephSible-ReinstateMonica gcc considers it not allowed per the standard (even if there's a question about what "not applicable" in C11 7.17.7.5 means). You could instead use `typedef unsigned char mybool_t;`, and then you could use it with the gcc intrinsics. It's not ideal, having to create another "bool" type, but it could be a good enough workaround (at least, that's what I did when I had this problem in the past :) – P.P Aug 17 '20 at 16:37
  • 1
    @JosephSible-ReinstateMonica Using `_Bool` isn't any safer than using `char` when it comes to "someone setting it to a value other than 0 or 1", just use `atomic_char`, and if it is a global, then provide `set`/`get` functions that accept a bool and write to a char. If you really think that it is a hack, remember that C bool type is just a hacked in typedef for `_Bool`, whatever that is, which is a hack that is here simply to keep backward compatibility with code that may have used "bool" with assumption that it isn't a keyword, and there's no point in worrying about such things anyway. – Yamirui Aug 17 '20 at 17:23
  • 3
    @Yamirui `_Bool flag = 2;` won't actually store a 2, but `char flag = 2;` will, so I disagree that `_Bool` isn't any safer. – Joseph Sible-Reinstate Monica Aug 17 '20 at 17:31
  • 1
    @JosephSible-ReinstateMonica both are fundamentally wrong to do, and it is a problem with the programmer doing it. Worrying about someone using `2` when there's no valid boolean value that can be mapped from `2` is like worrying about someone passing a `NULL` to an API that clearly states it is undefined behaviour to do so. This is just how it is in C and it is too late to change that now. – Yamirui Aug 17 '20 at 20:17
  • 2
    @Yamirui: So you're arguing that `bool foo = return_zero_or_nonzero();` is an error, and you should have written `bool foo = (f() != 0);` to explicitly booleanize, rather than rely on implicit conversion to bool? That's one style choice, but with `unsigned char` the compiler isn't going to warn you if you get it wrong. – Peter Cordes Jun 02 '22 at 17:13
  • If compilers suck at `flag ^= 1;`, that's a missed optimization on their part, and should get reported (https://github.com/llvm/llvm-project/issues and https://gcc.gnu.org/bugzilla/enter_bug.cgi?product=gcc). If the return value is unused, yes, `lock xorb` is optimal. And if it is used, `lock btc $0, flag(%rip)`. – Peter Cordes Jun 02 '22 at 17:24
  • @PeterCordes Do the compilers actually suck at that, or is the mess they make necessary for that to be `memory_order_seq_cst` (which is what the standard requires for any operation where you don't specify an explicit one, even though I don't need it)? – Joseph Sible-Reinstate Monica Jun 02 '22 at 17:28
  • 1
    `lock xorb` is a full barrier and more than sufficient for a seq_cst RMW, just like `lock addl` is safe for `atomic_fetch_add`. (Or `lock xaddl` if the return value is used.) x86 can't do atomic RMWs with anything less than a full memory barrier, so `atomic_fetch_add_explicit` for a relaxed integer add only allows compile-time reordering, still `lock add` or `lock xadd` in the asm, same as seq_cst. See [The strong-ness of x86 store instruction wrt. SC-DRF?](https://stackoverflow.com/q/70249647) re: `xchg` or other locked insn being as strong as a full SC *fence*. – Peter Cordes Jun 02 '22 at 17:36
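
Since the comments note that the fetch-and-modify functions are not specified for `atomic_bool`, one way to toggle an `atomic_bool` itself using only operations that are defined for it is a compare-exchange loop. The sketch below is an illustration under that assumption; the helper name `atomic_bool_toggle` is hypothetical, and nothing here promises a single `lock xorb` (the output may well resemble the `cmpxchg` loops shown in the question).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically flip *flag and return its previous value.
   Uses compare-exchange, which is defined for atomic_bool, instead of
   atomic_fetch_xor, which is not. */
static inline bool atomic_bool_toggle(atomic_bool *flag)
{
    /* Initial guess at the current value; relaxed is enough because the
       compare-exchange below validates it. */
    bool expected = atomic_load_explicit(flag, memory_order_relaxed);

    /* On failure (including a spurious failure of the weak form),
       `expected` is refreshed with the value currently stored, so the
       loop simply retries with the new negation. */
    while (!atomic_compare_exchange_weak_explicit(flag, &expected, !expected,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
        /* retry */
    }
    return expected;
}
```

Returning the old value costs nothing extra here, since the loop already has it in hand when the exchange succeeds.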

0 Answers