1

I'm thinking about a "simple" problem. I have a database with messages and the email addresses to send them to:

emailAddress string
message string 
Sent bool

Is it possible to write a system in which I can be sure that, whatever happens, the email will be sent only once? I don't care whether it was received; let's assume that the email send operation is atomic and that just after sending I know whether it succeeded or not.

So pseudocode can look like:

OpenTransaction
   try
      UpdateRow (sent = true)
      SendEmail()
      CommitTransaction
   catch
      RollbackTransaction

But in this case, after the email is sent, someone may turn off the server, and then that information will not be stored in the DB.

I would like to read a bit about such problems. Do you have any resources to take a look at?

BobDalgleish
  • 4,694
Snorlax
  • 117

3 Answers

8

But in this case, after the email is sent, someone may turn off the server, and then that information will not be stored in the DB.

This is not something you can 100% cover in code.

It's always possible that the server loses power exactly between the "send" and "confirm it being sent" steps. You can't write code that forces the machine to stay powered for longer than it physically is.

If you instead invert the steps, and first store the confirmation and only then send the email, then you run the risk (when the power goes out) of having registered an email as sent but not actually having sent it.

The simple question becomes: is it better to have sent an email twice, or to have sent no email at all?
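The trade-off between the two orderings can be illustrated with a small simulation. This is only a sketch: the `Crash` exception, the `run` function and the in-memory `state` dictionary are hypothetical stand-ins for a power loss, the sending job and the database, and the "mark" step is assumed to be durable as soon as it runs.

```python
class Crash(Exception):
    """Stands in for the server losing power (hypothetical)."""


def run(order, crash_between_steps):
    """Run one attempt and return (emails_sent, marked_sent)."""
    state = {"emails_sent": 0, "marked_sent": False}

    def send():
        state["emails_sent"] += 1

    def mark():
        state["marked_sent"] = True

    steps = [send, mark] if order == "send_first" else [mark, send]
    try:
        steps[0]()
        if crash_between_steps:
            raise Crash()
        steps[1]()
    except Crash:
        pass  # the process died; whatever ran before the crash stays done
    return state["emails_sent"], state["marked_sent"]


# Sent but not recorded: a later retry would send the email a second time.
print(run("send_first", crash_between_steps=True))  # (1, False)
# Recorded but not sent: the email is never delivered.
print(run("mark_first", crash_between_steps=True))  # (0, True)
```

Neither ordering removes the window between the two steps; it only decides which failure you get.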

In distributed systems, there are only two possible approaches:

  • Send at most once
  • Send at least once

In either scenario, you'd ideally send your message/command exactly once. The distinction between them is what should happen when the scenario is not ideal, i.e. when a problem occurs.

In "send at most once", you accept that it's better to not send a message/command than it is to send it twice. This is the less common choice, reserved for cases where taking an action twice must be avoided at all costs (e.g. medical equipment that provides dosages, credit card transactions, ...).

"Send at least once" is the more common scenario, which accepts that it's better to mistakenly send a message/command more than once than it is to never send it at all. This is the better choice for scenarios where there is no real consequence from sending something twice. The extra bandwidth used is negligible, and receiving an email twice is a minor glitch.

Eventual consistency is founded on the "at least once" principle, as it entails retrying something until it finally succeeds. If you retry something, that inherently means you try something more than once. If you try something more than once, it's inherently possible that more than one of those attempts succeeded (possibly without you knowing), and thus it's plausible for something to be done more than once.
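To make the retry argument concrete, here is a minimal sketch of a sender that retries until it sees a confirmation. The `FlakyChannel` class and `deliver_with_retries` function are hypothetical names for illustration; the channel is assumed to deliver every message but to lose the first few confirmations.

```python
def deliver_with_retries(send, max_attempts=5):
    """Call send() until it reports success; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        if send():
            return attempt
    raise RuntimeError("gave up after %d attempts" % max_attempts)


class FlakyChannel:
    """Delivers every message but loses the first `lost_acks` confirmations."""

    def __init__(self, lost_acks):
        self.lost_acks = lost_acks
        self.delivered = 0

    def send(self):
        self.delivered += 1  # the message really does arrive
        if self.lost_acks > 0:
            self.lost_acks -= 1
            return False  # the sender cannot tell this apart from a failed send
        return True


channel = FlakyChannel(lost_acks=2)
attempts = deliver_with_retries(channel.send)
print(attempts, channel.delivered)  # 3 3
```

The sender believes it succeeded once, on the third attempt, while the recipient has received three copies.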

There is no such thing as "send exactly once without fail". More broadly, there is no such thing as "do without fail".
If you're struggling to find an example of how your system might ever fail, consider what would happen if a nuclear missile or planetkiller asteroid were to hit your server park.

Flater
  • 49,580
  • note: there is "send at least once and have the receiver ignore duplicates", but email does not have this option (until the email gets to the human). – user253751 Mar 03 '20 at 13:53
  • 2
    This is a good example of the 'two generals problem' https://en.wikipedia.org/wiki/Two_Generals%27_Problem – Robin Bennett Mar 03 '20 at 14:16
  • @user253751: Mail clients do tend to condense emails with the same subject/sender into a single conversation, which does somewhat cut down on the possible clutter. – Flater Mar 03 '20 at 14:24
  • 1
    @RobinBennett: I wanted to add that problem but it felt like it would be too lengthy. We discussed the two generals problem in college and it always stuck with me. – Flater Mar 03 '20 at 14:25
  • I think it only needs a brief mention, the wiki page does a good job of discussing it, and there's plenty more for anyone that googles it. Sometimes it helps to know that you've run into a problem that has been studied in detail, and that there is no easy solution. – Robin Bennett Mar 03 '20 at 14:39
3

Record the attempt to send and the success of sending separately.

Begin
try
   UpdateRow(status = sending)
   Commit
catch
   Rollback

Begin
try
   SendEmail()
   UpdateRow(status = sent)
   Commit
catch
   Rollback

If something goes wrong part way through, you will have records stranded in the "sending" state. You'll need something to clean these up but there shouldn't be many of them.
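As a rough sketch of this idea, the following uses Python's standard `sqlite3` module with an in-memory database. The `outbox` table, its columns, and the `process`, `stranded` and `broken_sender` names are assumptions made for illustration, and the send happens between the two transactions rather than inside the second one.

```python
import sqlite3


def open_outbox():
    """Create an in-memory outbox table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY,"
        " email_address TEXT NOT NULL,"
        " message TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def process(conn, row_id, send_email):
    """Claim one row, send it, then record success in a second transaction."""
    with conn:  # transaction 1: commit the claim before sending
        claimed = conn.execute(
            "UPDATE outbox SET status = 'sending'"
            " WHERE id = ? AND status = 'pending'",
            (row_id,),
        ).rowcount
    if claimed != 1:
        return False  # already claimed or sent; do not send again
    address, message = conn.execute(
        "SELECT email_address, message FROM outbox WHERE id = ?", (row_id,)
    ).fetchone()
    send_email(address, message)  # a failure here leaves the row in 'sending'
    with conn:  # transaction 2: record the successful send
        conn.execute("UPDATE outbox SET status = 'sent' WHERE id = ?", (row_id,))
    return True


def stranded(conn):
    """Return ids of rows whose outcome is unknown."""
    rows = conn.execute("SELECT id FROM outbox WHERE status = 'sending' ORDER BY id")
    return [row[0] for row in rows]


def broken_sender(address, message):
    raise ConnectionError("simulated failure during send")


conn = open_outbox()
with conn:
    conn.executemany(
        "INSERT INTO outbox (email_address, message) VALUES (?, ?)",
        [("a@example.com", "hello"), ("b@example.com", "hi")],
    )

process(conn, 1, lambda address, message: None)  # sends normally
try:
    process(conn, 2, broken_sender)  # fails part way through
except ConnectionError:
    pass
print(stranded(conn))  # [2]
```

Row 2 stays in the "sending" state, which tells you where to look but not whether the email actually went out.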

Phill W.
  • 12,181
  • 1
    If there's a problem during sending, we don't know whether the email was already sent at that point. Thus, just setting the sent status before attempting the SendEmail() might boil down to the same thing in practice. – amon Mar 03 '20 at 12:52
  • "If something goes wrong part way through, you will have records stranded in the "sending" state. You'll need something to clean these up" Note that this is built on the assumption that it's acceptable to remove "sending" entries which may not actually have been sent. It requires you to accept the possibility of not sending emails, which is orthogonal to the concept of eventual consistency (which this question is tagged as) – Flater Mar 03 '20 at 13:42
  • 1
    This doesn't solve the problem, it just lets you know which rows have a problem - and that might have been obvious before. – Robin Bennett Mar 03 '20 at 14:14
  • @Flater: Your reply is based on the assumption that "clean these up" means "delete them". :-)
    Depending on /why/ the message couldn't be sent, the messages could be set back to the "ready-to-send" status and go round the loop again. Alternatively there may be genuine reasons why a message cannot be sent and that requires manual intervention (a.k.a "cleaning up").
    – Phill W. Mar 04 '20 at 14:44
  • @PhillW.: The point is that you can't know if they have been sent or not (did it not get sent, or did the send confirmation never get registered? You can't know that for sure), regardless of why they maybe weren't sent. – Flater Mar 04 '20 at 14:47
0

You can solve this problem for practical purposes with two computers.

The first one (A) sends, the second (B) monitors, in this case the network traffic.

When A sends a message, B can see the network traffic and detect a successful send, as denoted by the email server returning a "221 Bye".

Now if computer A crashes before it can record the success, it knows that the message may have been sent, i.e. status = "sending". Computer A can check this unknown message with computer B: "Hi, I crashed, did that message go through?"

If B crashes, A needs to know about it, so it should keep asking B "are you alive?" and stop sending if it doesn't get a response.

So now you only have an issue if A and B crash at the same time (or in networking edge cases). You can manage this risk with separate power supplies, physical separation, a third monitor computer, etc.
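The recovery conversation between A and B might look roughly like the sketch below. The `Monitor` class and the `recover` function are hypothetical; how B actually observes the traffic is not modelled, and the sketch assumes B sees every successful send.

```python
class Monitor:
    """Computer B: remembers which message ids it saw confirmed on the wire."""

    def __init__(self):
        self.confirmed = set()
        self.alive = True

    def observe_success(self, message_id):
        self.confirmed.add(message_id)

    def was_sent(self, message_id):
        if not self.alive:
            # A must stop sending when B does not answer.
            raise ConnectionError("monitor is down")
        return message_id in self.confirmed


def recover(statuses, monitor):
    """After A restarts, resolve every message left in 'sending'."""
    for message_id, status in statuses.items():
        if status == "sending":
            seen = monitor.was_sent(message_id)
            statuses[message_id] = "sent" if seen else "pending"
    return statuses


monitor = Monitor()
monitor.observe_success("m1")  # B saw m1 go through before A crashed

statuses = {"m1": "sending", "m2": "sending", "m3": "sent"}
print(recover(statuses, monitor))
# {'m1': 'sent', 'm2': 'pending', 'm3': 'sent'}
```

If B is down during recovery, `was_sent` raises and the unknown messages stay in "sending", which is the case where both machines have failed.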

Ewan
  • 75,506