0

I want to submit a PySpark job whose .py files live in different folders. In particular, I want to keep configuration files and common tools in dedicated folders. The only option I know of for submitting extra files is --py-files, so how do I submit whole folders? My code structure looks like this:

--conf folder
|  --origin.conf
|  --scenes.conf
--tools folder
|  --utils.py
|  --vali.py
-- other folders...
Peng He

2 Answers

3
  • Create a Python package to organize the code.
  • Zip the package or build an egg file.
  • Submit your app, passing the egg or zip file via --py-files (or the pyFiles argument of SparkContext).
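The zip step above can be sketched roughly as follows. This is only an illustration, not the answerer's own script: the helper name zip_package and the driver script main.py are hypothetical, and it assumes the tools folder contains an __init__.py so that it can be imported as a package from the archive.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip package_dir so that its folder name is the top-level entry in the archive."""
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to the parent folder, e.g. "tools/utils.py",
                # so that "from tools import utils" resolves inside the zip.
                zf.write(full_path, os.path.relpath(full_path, parent))
    return zip_path


# Hypothetical usage, run from the project root:
#   zip_package("tools", "tools.zip")
# and then, on the command line (main.py is an assumed driver script):
#   spark-submit --py-files tools.zip --files conf/origin.conf,conf/scenes.conf main.py
```

For the configuration files, one option is the --files flag of spark-submit shown in the comment; files shipped that way can be located at runtime with pyspark.SparkFiles.get("origin.conf"). Whether that fits depends on how the .conf files are read in your code.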
0

This Cloudera page has some examples of distributing Python packages to Spark executors: Running Spark Python Applications

WhyNot