I followed the instructions at https://docs.aws.amazon.com/glue/latest/dg/monitor-continuous-logging-enable.html and can log messages from the driver. But when I try to use the logger inside a map function, like this:

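from awsglue.context import GlueContext
from awsglue.transforms import Map
from pyspark.context import SparkContext

sc = SparkContext()
glueContext = GlueContext(sc)
logger = glueContext.get_logger()
logger.info("starting glue job...")  # succeeds: runs on the driver
...
def transform(item):
    logger.info("starting transform...")  # fails: logger is captured by the function sent to workers
    ...transform logic...

Map.apply(frame = dynamicFrame, f = transform)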

I get this error:

PicklingError: Could not serialize object: TypeError: can't pickle _thread.RLock objects

From what I have read, this message means the logger object cannot be serialized (pickled) when the function that references it is sent to the workers.
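The serialization failure can be illustrated outside Glue with plain Python. This is only a sketch: `DriverSideLogger` and `can_pickle` are hypothetical names, and the class is a stand-in that holds a `threading.RLock`, which is an assumption about why the real logger fails to pickle. The `transform` function shows a common pattern of looking the logger up inside the function with the standard `logging` module, so no logger object has to be shipped with it; whether that output reaches Glue's executor log stream is not verified here.

```python
import logging
import pickle
import threading


class DriverSideLogger:
    """Hypothetical stand-in for a logger that holds a lock internally."""

    def __init__(self):
        self._lock = threading.RLock()

    def info(self, msg):
        with self._lock:
            print(msg)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


def transform(item):
    # Look the logger up by name inside the function, so the function
    # does not capture any logger object created on the driver.
    log = logging.getLogger("transform")
    log.info("starting transform...")
    return item


# The lock inside the stand-in logger makes the whole object unpicklable.
print(can_pickle(DriverSideLogger()))  # False
```

Any function that closes over an object like `DriverSideLogger` inherits the same problem, because the captured object has to be serialized together with the function.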

What's the correct way to log from an AWS Glue worker?

  • https://stackoverflow.com/a/52064491/4326922 – Prabhakar Reddy Sep 14 '21 at 01:55
  • @PrabhakarReddy Thank you for the link! But that's not what I wanted. That post talks about how to aggregate logs and send them back to the driver, but I want to log on the executor itself. Glue has an executor log stream that I can access, but I can't find a way to write logs to it – Xiqiang Lin Sep 21 '21 at 00:19
  • @XiqiangLin, did you ever figure this out? – TheHud Jan 19 '22 at 16:38