C++20: How is the returning from atomic::wait() guaranteed by the standard?

Question

This is a language-lawyer question.

First of all, does the a.wait() in the following code always get to return?

std::atomic_int a{ 0 };

void f()
{
    a.store(1, std::memory_order_relaxed);
    a.notify_one();
}
int main()
{
    std::thread thread(f);

    a.wait(0, std::memory_order_relaxed);//always return?

    thread.join();
}

I believe the standard's intention is that a.wait() always get to return. (Otherwise atomic::wait/notify would be useless, isn't it?) But I think the current standard text cannot guarantee this.

The relevant part of the standard is in §31.6 [atomics.wait] paragraph 4:

A call to an atomic waiting operation on an atomic object M is eligible to be unblocked by a call to an atomic notifying operation on M if there exist side effects X and Y on M such that:

(4.1) — the atomic waiting operation has blocked after observing the result of X,

(4.2) — X precedes Y in the modification order of M, and

(4.3) — Y happens before the call to the atomic notifying operation.

and §31.8.2 [atomics.types.operations] paragraph 29~33:

void wait(T old, memory_order order = memory_order::seq_cst) const volatile noexcept;

void wait(T old, memory_order order = memory_order::seq_cst) const noexcept;

Effects: Repeatedly performs the following steps, in order:

(30.1) — Evaluates load(order) and compares its value representation for equality against that of old.

(30.2) — If they compare unequal, returns.

(30.3) — Blocks until it is unblocked by an atomic notifying operation or is unblocked spuriously.

void notify_one() volatile noexcept;

void notify_one() noexcept;

Effects: Unblocks the execution of at least one atomic waiting operation that is eligible to be unblocked (31.6) by this call, if any such atomic waiting operations exist.

With the above wording, I see two problems:

If the wait() thread saw the value in step (30.1), compared it equal to old in step (30.2), and got scheduled out; then in another thread notify_one() stepped in and saw no blocking thread, doing nothing; the subsequent blocking in step (30.3) would never be unblocked. Here isn't it necessary for the standard to say "wait() function atomically performs the evaluation-compare-block operation", similar to what is said about condition_variable::wait()?
There's no synchronization between notify_*() and unblocking of wait(). If in step (30.3), the thread was unblocked by an atomic notifying operation, it would repeat step (30.1) to evaluate load(order). Here there is nothing preventing it from getting the old value. (Or is there?) Then it would block again. Now no one would wake it.

Is the above concern just nit-picking, or defect of the standard?

@NateEldredge, I think the `store(1)` in `f` corresponds to the `Y` in the standard. It is sequenced before the `notify`, hence happens before the `notify`. — zwhconst, Dec 04 '21 at 19:15
So then I don't see a problem. The `wait` is an atomic wait operation, and if it loads 0, then it is eligible to be unblocked by the notify. So the notify must in fact unblock it. It's the entire call to `wait` that's an atomic wait operation, not merely step 3 of it. — Nate Eldredge, Dec 04 '21 at 19:31
Unlike Nicol Bolas below, I see no way to prove that the store happens-before the second load. So I agree that the second load could still see the old value. Now, it seems to still be true that the wait is eligible to be unblocked by that same notify that already happened, but I don't think that's meant to force it to unblock itself immediately and keep loading `a` until it finally sees the new value. AFAICT you're supposed to be able to notify a thread even if the variable hasn't changed, and it should go back to sleep until you call `notify()` a second time. — Nate Eldredge, Dec 04 '21 at 20:31
I'm not even sure that making everything `seq_cst` would fix it. What if both the loads in the wait precede the store in the total order S? That would be extremely counterintuitive, but I don't see how to disprove it. I am beginning to wonder if #2 is indeed a defect, and that they meant to have the notify synchronize with the resulting unblock (though not with any spurious unblock). — Nate Eldredge, Dec 04 '21 at 20:41
I think your #1 is very similar to https://stackoverflow.com/questions/65556595/c20-thread-possibly-waiting-on-stdatomic-forever?rq=1 — Nate Eldredge, Dec 04 '21 at 20:46
@NateEldredge _I'm not even sure that making everything `seq_cst` would fix it. What if both the loads in the wait precede the store in the total order S?_ Till now, I was thinking `seq_cst` reads are like RMW operations — see the last value from the modification order of M, or at least something like the last value stored by a `seq_cst` write to M. Which is not really the case… — Language Lawyer, Dec 04 '21 at 21:02
Well, I think that much is true, at least if every operation in sight is `seq_cst`. But which one is the "last value" is kind of tautological. If your load happens-before a store, then you see the old value; but even with `seq_cst` the converse is not true. If the coherence ordering on `a` were "initialize to 0, load 0 first time, load 0 second time, store 1" then everything is fine; the "store 1" doesn't synchronize with anything, since nobody loads the value 1. To contradict it, I think we have to show that the "store 1" happens-before the second load, which I see no way to prove. — Nate Eldredge, Dec 04 '21 at 21:15
@NateEldredge _But which one is the "last value" is kind of tautological_ Ehm, why? Modification order of M is specified and the write of 1 is the last operation there when `wait` is unblocked. The thing is, even `seq_cst` doesn't require reading the value from the last operation in modification order of M. — Language Lawyer, Dec 04 '21 at 21:20
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/239829/discussion-between-nate-eldredge-and-language-lawyer). — Nate Eldredge, Dec 04 '21 at 21:37

score 1 · Answer 1 · answered Dec 04 '21 at 23:23

#1 is pretty much addressed by C++20 thread possibly waiting on std::atomic forever. The wait() operation is clearly eligible to be unblocked by the notify(), and is the only such operation, so the notify() must unblock it. The eligible "wait operation" is the entire call, not only step 30.3.

If an implementation performs steps 30.1-3 in a non-atomic fashion, such that the notify can happen "between" steps 1 and 3, then it has to somehow ensure that step 3 unblocks anyway.

#2 is stickier. At this point I think you are right: the standard doesn't guarantee that the second load gets the value 1; and if it doesn't, then it will presumably block again and never be woken up.

The use of the relaxed memory ordering makes it pretty clear in this example. If we wanted to prove that the second load must see 1, the only way I can see is to invoke write-read coherence (intro.races p18) which requires that we prove the store happens before the load, in the sense of intro.races p10. This in turn requires that somewhere along the way, we have some operation in one thread that synchronizes with some operation in the other (you can't get inter-thread happens before without a synchronizes with, unless there are consume operations which is not the case here). Usually you get synchronizes with from a pairing of an acquire load with a release store (atomics.order p2), and here we have no such thing; nor, as far as I can tell, anything else that would synchronize. So we don't have a proof.

In fact, I think the problem persists even if we upgrade to seq_cst operations. We could then have both loads coherence-ordered before the store, and the total order S of atomics.order p4 would go "first load, second load, store". I don't see that contradicting anything. We would still have to show a synchronizes with to rule this out, and again we can't. There might appear to be a better chance than in the relaxed case, since seq_cst loads and stores are acquire and release respectively. But the only way to use this would be if one of the loads were to take its value from the store, i.e. if one of the loads were to return 1, and we are assuming that is not the case. So again this undesired behavior seems consistent with all the rules.

It does make you wonder if the Standard authors meant to require the notification to synchronize with the unblocking. That would fix the problem, and I would guess real-life implementations already include the necessary barriers.

But indeed, I am not seeing this specified anywhere.

The only possible way out that I can see is that "eligible to be unblocked" applies to the entire wait operation, not just to a single iteration of it. But it seems clear that the intent was that if you are unblocked by a notify and the value has not changed, then you block again until a second notify occurs (or spurious wakeup).

It's starting to look to me like a defect.

C++20: How is the returning from atomic::wait() guaranteed by the standard?

1 Answers1

Linked