Efficient way to toggle an atomic_bool

Question

Given `atomic_bool flag;`, how can I toggle it in C in a way that is atomic, portable, and efficient? By "efficient" I mean that it should assemble on x86_64 to `lock xorb $1, flag(%rip)`. The "obvious" `flag = !flag;` is out because it isn't actually atomic: it is a separate atomic load and atomic store, not a single read-modify-write. My next guess was `flag ^= true;`, which assembled to this mess on GCC:

        movzbl  flag(%rip), %eax
0:
        movb    %al, -1(%rsp)
        xorl    $1, %eax
        movl    %eax, %edx
        movzbl  -1(%rsp), %eax
        lock cmpxchgb   %dl, flag(%rip)
        jne     0b

And this mess on Clang:

        movb    flag(%rip), %al
0:
        andb    $1, %al
        movl    %eax, %ecx
        xorb    $1, %cl
        lock            cmpxchgb        %cl, flag(%rip)
        jne     0b

Then I tried specifying a weaker memory order by using `atomic_fetch_xor_explicit(&flag, true, memory_order_acq_rel);` instead. This does what I want on Clang, but GCC now completely fails to compile it, with `error: operand type '_Atomic atomic_bool *' {aka '_Atomic _Bool *'} is incompatible with argument 1 of '__atomic_fetch_xor'`. Interestingly, if my type is an `atomic_char` instead of an `atomic_bool`, then both GCC and Clang emit the assembly that I want. Is there a way to do what I want with `atomic_bool`?
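
For reference, a minimal sketch of the `atomic_char` variant described above. It assumes the flag only ever holds 0 or 1, and the names `flag_toggle` and `flag_get` are hypothetical, chosen only for illustration:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: stored as atomic_char and only ever holds 0 or 1. */
static atomic_char flag;

/* Toggle the flag. The fetched value is deliberately discarded, so the
   compiler is free to use a plain locked XOR on x86_64. */
static void flag_toggle(void)
{
    atomic_fetch_xor_explicit(&flag, 1, memory_order_acq_rel);
}

/* Read the flag back as a bool. */
static bool flag_get(void)
{
    return atomic_load_explicit(&flag, memory_order_acquire) != 0;
}
```

Nothing in the type stops a caller from storing a value other than 0 or 1, which is the drawback of this form.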

Note from the standard: per [7.17.7.5](https://port70.net/~nsz/c/c11/n1570.html#7.17.7.5), xor is not applicable to `atomic_bool`. (I wonder if the "None of these operations is applicable to atomic_bool" part of the standard suggests that compilers _should_ refuse to compile such code, making clang non-conformant here.) — KamilCuk, Aug 17 '20 at 16:10
This is a deliberate choice in gcc. See [bugzilla 68966](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68966) and [bugzilla 68908](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68908#c13). — P.P, Aug 17 '20 at 16:22
@chtz I *can*, but it feels like an awful hack, and then it'd be much easier for someone to mistakenly put something other than 0 or 1 in it, which would end badly. — Joseph Sible-Reinstate Monica, Aug 17 '20 at 16:30
Efficiency just doesn't have anything to do with the code generation. Count about 150 cycles for the lock and you'll rarely be disappointed. — Hans Passant, Aug 17 '20 at 16:33
@JosephSible-ReinstateMonica gcc considers it not allowed per the standard (even if there's a question about what "not applicable" in C11 7.17.7.5 means). You could instead use `typedef unsigned char mybool_t;` and then use it with the gcc intrinsics. It's not ideal, having to create another "bool" type, but it could be a good enough workaround. (At least, that's what I did when I had this problem in the past :) — P.P, Aug 17 '20 at 16:37
@JosephSible-ReinstateMonica Using `_Bool` isn't any safer than using `char` when it comes to "someone setting it to a value other than 0 or 1", just use `atomic_char`, and if it is a global, then provide `set`/`get` functions that accept a bool and write to a char. If you really think that it is a hack, remember that C bool type is just a hacked in typedef for `_Bool`, whatever that is, which is a hack that is here simply to keep backward compatibility with code that may have used "bool" with assumption that it isn't a keyword, and there's no point in worrying about such things anyway. — Yamirui, Aug 17 '20 at 17:23
@Yamirui `_Bool flag = 2;` won't actually store a 2, but `char flag = 2;` will, so I disagree that `_Bool` isn't any safer. — Joseph Sible-Reinstate Monica, Aug 17 '20 at 17:31
@JosephSible-ReinstateMonica both are fundamentally wrong to do and it is a problem with the programmer doing it. Worrying about someone using `2` when there's no valid boolean value that can be mapped from `2` is like worrying about someone passing a `NULL` to an API that clearly states that doing so is undefined behaviour. This is just how it is in C and it is too late to change that now. — Yamirui, Aug 17 '20 at 20:17
@Yamirui: So you're arguing that `bool foo = return_zero_or_nonzero();` is an error, and you should have written `bool foo = (f() != 0);` to explicitly booleanize, rather than rely on implicit conversion to bool? That's one style choice, but with `unsigned char` the compiler isn't going to warn you if you get it wrong. — Peter Cordes, Jun 02 '22 at 17:13
If compilers suck at `flag ^= 1;`, that's a missed optimization on their part, and should get reported (https://github.com/llvm/llvm-project/issues and https://gcc.gnu.org/bugzilla/enter_bug.cgi?product=gcc). If the return value is unused, yes, `lock xorb` is optimal. And if it is used, `lock btc $0, flag(%rip)`. — Peter Cordes, Jun 02 '22 at 17:24
@PeterCordes Do the compilers actually suck at that, or is the mess they make necessary for that to be `memory_order_seq_cst` (which is what the standard requires for any operation where you don't specify an explicit one, even though I don't need it)? — Joseph Sible-Reinstate Monica, Jun 02 '22 at 17:28
`lock xorb` is a full barrier and more than sufficient for a seq_cst RMW, just like `lock addl` is safe for `atomic_fetch_add`. (Or `lock xaddl` if the return value is used.) x86 can't do atomic RMWs with anything less than a full memory barrier, so `atomic_fetch_add_explicit` for a relaxed integer add only allows compile-time reordering, still `lock add` or `lock xadd` in the asm, same as seq_cst. See [The strong-ness of x86 store instruction wrt. SC-DRF?](https://stackoverflow.com/q/70249647) re: `xchg` or other locked insn being as strong as a full SC *fence*. — Peter Cordes, Jun 02 '22 at 17:36
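
For code that has to keep a real `atomic_bool` and stay within the operations C11 defines for it, the portable fallback is a compare-exchange loop, which is essentially what the `cmpxchgb` loops in the question implement. A minimal sketch follows; the name `flag_toggle_cas` is hypothetical, and the memory orders are an assumption mirroring the `memory_order_acq_rel` used in the question:

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool flag;

/* Toggle the flag with a compare-exchange loop and return the value it
   held before the toggle. Uses only operations that C11 defines for
   atomic_bool, so it does not rely on atomic_fetch_xor. */
static bool flag_toggle_cas(void)
{
    bool old = atomic_load_explicit(&flag, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(&flag, &old, !old,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
        /* On failure, old now holds the current value; retry with it. */
    }
    return old;
}
```

This is portable but should not be expected to compile to a single `lock xorb`. As noted above, on x86 the weaker memory order only affects compile-time reordering, since any locked instruction is already a full barrier.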
