The SMTP service we use is based on AWS Simple Email Service (SES), and we have updated the Airflow config file with the hostname, SMTP user and password, port number, mail-from address, etc.
This is how it is set up in the config file. The mail-from address is a Google Group whose domain is owned by our SES account and has been verified in SES.
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = email-smtp.us-west-2.amazonaws.com
smtp_starttls = True
smtp_ssl = False
# Example: smtp_user = airflow
smtp_user = <user name>
# Example: smtp_password = airflow
smtp_password = <password>
smtp_port = 587
smtp_mail_from = <verified email service>  # this address is verified in SES
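To sanity-check these settings outside Airflow, a small standalone script can exercise the same host, port, and STARTTLS setup (a sketch only; the user, password, and addresses are placeholders for the redacted values above):

```python
import smtplib
from email.message import EmailMessage

def send_test_email(user, password, recipient):
    """Send one test message using the same SES SMTP settings as airflow.cfg.

    Host, port, and STARTTLS mirror the [smtp] section above; user, password,
    and the addresses are placeholders for the real (redacted) values.
    """
    msg = EmailMessage()
    msg["Subject"] = "Airflow SMTP test"
    msg["From"] = "<verified email service>"  # smtp_mail_from
    msg["To"] = recipient
    msg.set_content("Test message sent directly via smtplib.")

    with smtplib.SMTP("email-smtp.us-west-2.amazonaws.com", 587) as smtp:
        smtp.starttls()              # smtp_starttls = True
        smtp.login(user, password)   # smtp_user / smtp_password
        smtp.send_message(msg)
```

If this script fails from inside the same container, the problem is with connectivity or credentials rather than with Airflow itself.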
And in the DAG, we set the default arguments (which include the recipient that send_email_smtp uses) as follows:
args = {
    'owner': 'airflow',
    # this recipient address is also verified; emails sent to it notify the Slack channel
    'email': ['*******s5t8@salesforceiq.slack.com'],
    'depends_on_past': False,
    'start_date': datetime(2021, 5, 2),
    'end_date': datetime(2021, 6, 2),
}
Then I added a DAG task that was supposed to fail and trigger an email. However, according to the log, the email was not sent because of this error:
[2021-05-06 21:22:46,787] {taskinstance.py:1194} INFO - Marking task as FAILED. dag_id=testemaildag, task_id=fail_task, execution_date=20210506T212242, start_date=20210506T212246, end_date=20210506T212246
[2021-05-06 21:22:46,804] {configuration.py:338} WARNING - section/key [smtp/smtp_user] not found in config
[2021-05-06 21:22:46,804] {configuration.py:338} WARNING - section/key [smtp/smtp_user] not found in config
[2021-05-06 21:22:46,804] {taskinstance.py:1200} ERROR - Failed to send email to: ['v8i4h9j3e2n1s5t8@salesforceiq.slack.com']
[2021-05-06 21:22:46,805] {taskinstance.py:1201} ERROR - [Errno 99] Cannot assign requested address
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.7/dist-packages/airflow/operators/python_operator.py", line 113, in execute
return_value = self.execute_callable()
File "/usr/local/lib/python3.7/dist-packages/airflow/operators/python_operator.py", line 118, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/mnt/apps/airflow_dags/test_alert2.py", line 11, in throwerror
raise ValueError("Failure")
ValueError: Failure
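As far as I can tell, [Errno 99] on Linux is EADDRNOTAVAIL, meaning the socket layer could not assign a local address for the outgoing connection (rather than the remote side refusing it). A quick check:

```python
import errno
import os

# On Linux, errno 99 is EADDRNOTAVAIL ("Cannot assign requested address"):
# the client could not assign a local address for the outgoing socket,
# not a refusal from the remote SMTP server.
print(errno.errorcode[errno.EADDRNOTAVAIL], os.strerror(errno.EADDRNOTAVAIL))
```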
I searched for this error online and found a relevant answer (https://stackoverflow.com/a/64330227/4240869), but I don't quite understand the fix I am supposed to apply to make it work. In my case, the Airflow service runs in a container, so I tried some quick fixes, including:
- exposing port 587 in the Dockerfile
- adding port 587 as a reserved port in the Nomad job
This is the state after I made those changes: port 587 is exposed on the Docker image and the host is listening on it. However, I still see the same error when testing the job failure notification.
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
<container id> <image tag> "/entrypoint.sh webs…" 2 minutes ago Up 2 minutes 10.27.0.119:587->587/tcp, 10.27.0.119:587->587/udp, 5555/tcp, 10.27.0.119:8080->8080/tcp, 10.27.0.119:8080->8080/udp, 8793/tcp airflow-webserver-098d9c23-c6f5-0c85-f970-83e8e5ab9fc6
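To rule out basic reachability, the SES endpoint can be probed from inside the container with a few lines of Python (a sketch; host and port are the values from the config above). As I understand it, exposing 587 only affects inbound mappings, while sending mail just needs an outgoing connection:

```python
import socket

def can_reach(host, port, timeout=5):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures and connection errors
        return False

# Host and port taken from the [smtp] section of airflow.cfg.
print(can_reach("email-smtp.us-west-2.amazonaws.com", 587))
```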
I found a GitHub issue similar to my case (https://github.com/puckel/docker-airflow/issues/338), and it looks like it hasn't been resolved. I wonder if this is a known problem that hasn't been fixed yet.