
I have successfully run Python code locally that accesses an RDS MySQL instance in AWS by importing the mysql.connector package.

brew unlink mysql
brew install mysql-connector-c
sed -i -e 's/libs="$libs -l "/libs="$libs -lmysqlclient -lssl -lcrypto"/g' /usr/local/Cellar/mysql/8.0.21/bin/mysql_config
C_INCLUDE_PATH=/Users/myself/OneDrive/ASEME/libs LDFLAGS=-L/usr/local/opt/openssl/lib  pip3 install MYSQL-python
brew unlink mysql-connector-c
brew link --overwrite mysql

However, I now need to move the code to AWS Glue, and I don't know how to configure the environment when the only option is to upload a library as a .zip to an S3 bucket.

I have tried zipping the src folder of the Python connector (https://dev.mysql.com/downloads/connector/python/?os=26), but when running this job:

import mysql.connector
import sys
import boto3
import os

ENDPOINT="yyy"
PORT="3306"
USR="admin"
REGION="zzz"
DBNAME="xxx"
os.environ['LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN'] = '1'

#gets the credentials from .aws/credentials
session = boto3.Session(profile_name='default')
client = boto3.client('rds')

I get the error:

  File "/tmp/runscript.py", line 123, in <module>
    runpy.run_path(temp_file_path, run_name='__main__')
  File "/usr/local/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/local/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/glue-python-scripts-be10tih2/filescontrol.py", line 1, in <module>
ModuleNotFoundError: No module named 'mysql'
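
To narrow down where the import fails, one option is to print what the job can actually see before importing. The following is only a diagnostic sketch: `describe_module` is a hypothetical helper name, and it assumes nothing about Glue beyond the standard library being available.

```python
import importlib.util
import sys


def describe_module(name):
    """Return where `name` would be imported from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised for dotted names whose parent package is missing or malformed.
        return None
    if spec is None:
        return None
    # Regular modules have an origin; namespace packages only have search locations.
    return spec.origin or str(list(spec.submodule_search_locations or []))


# Show every location Python searches, then whether `mysql` resolves to any of them.
for entry in sys.path:
    print("sys.path entry:", entry)
print("mysql ->", describe_module("mysql"))
```

If the last line prints `None`, the uploaded archive is either not on `sys.path` or does not contain a top-level `mysql` folder.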

How can I run this code? If the problem is that a MySQL installation is needed apart from the Python connector, then I don't think Glue can handle code that connects to MySQL through a Python library.
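
For reference, here is a minimal sketch of one way to build the .zip so that the `mysql` package sits at the root of the archive, which is where `import mysql.connector` would look for it once the archive is on the path. The function name `zip_packages` and the example paths are hypothetical, and the sketch assumes the pure-Python `mysql/connector` package has already been unpacked into a local folder.

```python
import os
import zipfile


def zip_packages(src_dir, zip_path):
    """Zip everything under src_dir so its sub-folders sit at the archive root.

    Returns the list of entry names written, for a quick sanity check.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(src_dir):
            for filename in files:
                full_path = os.path.join(root, filename)
                # Store paths relative to src_dir, e.g. "mysql/connector/__init__.py",
                # rather than including the parent folder name in the archive.
                arcname = os.path.relpath(full_path, src_dir)
                archive.write(full_path, arcname)
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


# Hypothetical usage: the folder passed in must directly contain "mysql/".
# names = zip_packages("connector_unpacked/lib", "mysql_connector.zip")
# print(names[:5])
```

The entry names should start with `mysql/`; if they start with another folder name, the import will not find the package.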

user2728349
  • Are you planning to run this code in a Python shell job or a Glue PySpark job? – Prabhakar Reddy Jul 28 '20 at 11:50
  • I can do it in a Python shell job, but I could also use PySpark. I tried Python shell first because, as far as I can see, only Python shell supports .zip libraries that are not pure Python. Python shell is also cheaper, so I split the whole ETL into Python and PySpark steps. If it's not possible in Python shell, I will move to PySpark. – user2728349 Jul 28 '20 at 14:05
  • As you are running in a Python shell job, can you try using easy_install as in https://stackoverflow.com/a/54852126/4326922 ? – Prabhakar Reddy Aug 04 '20 at 08:10

0 Answers