
I am using Glue bookmarking to process data. My job is scheduled daily, but can also be launched manually. Since I use bookmarks, the Glue job can sometimes start with no new data to process, and the DataFrame read is then empty. In that case, I want to end the job cleanly, since there is nothing to do. I tried:

import sys

if df.rdd.isEmpty():
    # No new data since the last bookmark: commit and stop
    job.commit()
    sys.exit(0)

However, the job terminates in error with SystemExit: 0.

How can I end the job successfully?

  • My question is a fork of the non-working answer to https://stackoverflow.com/questions/67028388/how-to-stop-exit-a-aws-glue-job-pyspark – Jérémy Sep 22 '21 at 09:27

1 Answer


After some testing, I discovered from @Glyph's answer that:

os._exit() terminates immediately at the C level and does not perform any of the normal tear-downs of the interpreter.
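
To see the difference in isolation, here is a small standalone demonstration (plain Python, no Glue needed):

import os
import sys

try:
    sys.exit(0)  # raises SystemExit, so surrounding code can catch it
except SystemExit as exc:
    print(f"caught SystemExit: {exc.code}")

os._exit(0)  # terminates the process immediately; nothing below runs
print("never reached")

Because sys.exit() works by raising SystemExit, a wrapper such as the Glue job runner can catch it and report the run as failed; os._exit() bypasses the exception machinery entirely, so nothing can intercept it.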

This is exactly what I was looking for. The final solution is:

import os

if df.rdd.isEmpty():
    job.commit()
    # os._exit() requires an exit status; 0 marks the run as successful
    os._exit(0)
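
For context, here is a minimal sketch of how this pattern fits into a complete bookmark-enabled Glue job. The database and table names and the transformation_ctx label are placeholders I made up; the rest is standard Glue boilerplate:

import os
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# With bookmarks enabled, the source only yields records added since the
# last committed run, so an empty frame is a normal outcome.
# "my_database" and "my_table" are hypothetical catalog names.
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="my_database",
    table_name="my_table",
    transformation_ctx="source",
)
df = dyf.toDF()

if df.rdd.isEmpty():
    job.commit()   # persist the bookmark state first
    os._exit(0)    # then exit without raising SystemExit

# ... normal processing of df would go here ...

job.commit()

Note that job.commit() has to run before os._exit(): since os._exit() skips all interpreter cleanup, anything not committed by that point is lost.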