
QUESTION

Given the context below, how do I resolve the exception I receive when I run this code?

SETUP

I am new to python and spark/pyspark and working on a legacy code base that uses:

  • Python 3.6
  • Spark 2.4.1
  • PySpark 2.4.3

I am running the code locally on macOS Monterey (12.1) with an M1 chip.

The code imports SparkSession and creates a new session:

from pyspark.sql import SparkSession

class SomeClass:
  spark = SparkSession.builder.getOrCreate()

  # ... some additional code ...
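
For context: my understanding is that when no JVM gateway exists yet, getOrCreate() launches one by executing ./bin/spark-submit under whatever directory pyspark resolves as SPARK_HOME, which is where the path in the exception below comes from. A quick way to check that resolution (sketch; _find_spark_home is a pyspark-internal helper, so this may be version-specific):

import os
from pyspark.find_spark_home import _find_spark_home  # internal pyspark helper

print("SPARK_HOME env var:", os.environ.get("SPARK_HOME"))
print("pyspark resolves:", _find_spark_home())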

EXCEPTION

At runtime, I am able to import SparkSession, but Python throws the following exception when initializing the SparkSession:

[Errno 2] No such file or directory: '/home/[user]/Spark/spark-2.4.1-bin-without-hadoop/./bin/spark-submit': '/home/[user]/Spark/spark-2.4.1-bin-without-hadoop/./bin/spark-submit'

[stack trace...]

This code runs without exception for other users.
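
For reference, a minimal snippet that exercises the same code path, independent of the legacy class (sketch; the master and appName values are illustrative):

from pyspark.sql import SparkSession

# getOrCreate() is where the FileNotFoundError above is raised, while
# launching the JVM via spark-submit.
spark = SparkSession.builder.master("local[*]").appName("repro").getOrCreate()
print(spark.version)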

CURRENT DEBUGGING

I've verified the file exists:

|--[userHome]
    |--Spark
         |--spark-2.4.1-bin-without-hadoop
             |--hadoop-3.0.0
             |--bin
             |   |--beeline
             |   |--beeline.cmd                  
             |   |--find-spark-home
             |   |--find-spark-home.cmd
             |   |--load-spark-env.cmd
             |   |--pyspark
             |   |--pyspark.cmd
             |   |--pyspark2.cmd
             |   |--run-example
             |   |--run-example.cmd
             |   |--spark-class
             |   |--spark-class.cmd
             |   |--spark-class2.cmd
             |   |--spark-shell
         --> |   |--spark-submit
         --> |   |--spark-submit.cmd
         --> |   |--spark-submit2.cmd
             |   |--sparkR
             |   |--sparkR.cmd
             |   |--sparkR2.cmd
             |
             |--<Other Directories>
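
The same check from inside Python (sketch; the path mirrors the tree above, with expanduser standing in for /home/[user]):

import os

spark_home = os.path.expanduser("~/Spark/spark-2.4.1-bin-without-hadoop")
print(sorted(os.listdir(os.path.join(spark_home, "bin"))))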
          

Environment variables are as follows:

SPARK_VERSION=2.4.1
SPARK_HOME=/home/$USER/Spark/spark-${SPARK_VERSION}-bin-without-hadoop/libexec
HADOOP_USER_NAME=$USER
HADOOP_VERSION='3.0.0'
HADOOP_HOME=${SPARK_HOME}/hadoop-${HADOOP_VERSION}
HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
SPARK_DIST_CLASSPATH="${HADOOP_HOME}/etc/hadoop/*:${HADOOP_HOME}/share/hadoop/common/lib/*:${HADOOP_HOME}/share/hadoop/common/*:${HADOOP_HOME}/share/hadoop/hdfs/*:${HADOOP_HOME}/share/hadoop/hdfs/lib/*:${HADOOP_HOME}/share/hadoop/hdfs/*:${HADOOP_HOME}/share/hadoop/yarn/lib/*:${HADOOP_HOME}/share/hadoop/yarn/*:${HADOOP_HOME}/share/hadoop/mapreduce/lib/*:${HADOOP_HOME}/share/hadoop/mapreduce/*:${HADOOP_HOME}/share/hadoop/tools/lib/*"
SPARK_LOCAL_IP=127.0.0.1
PYSPARK_PYTHON=python3.6
PYSPARK_DRIVER_PYTHON=python3.6
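
Because variables exported in a shell profile don't always reach an IDE-launched interpreter, this sketch confirms what the Python process itself sees (variable names as set above):

import os

# Print the Spark-related variables as the interpreter sees them.
for name in ("SPARK_VERSION", "SPARK_HOME", "HADOOP_HOME", "PYSPARK_PYTHON"):
    print(name, "=", os.environ.get(name))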

When I attempt to execute ./bin from the terminal, permission is denied:

~/Spark/spark-2.4.1-bin-without-hadoop > ./bin
==> zsh: permission denied: ./bin

and when trying to run the spark-submit script or PySpark from the terminal, I receive errors stating that the files don't exist.

~/Spark/spark-2.4.1-bin-without-hadoop >  ./bin/spark-submit
==>./bin/spark-submit: line 27: /home/[user]/Spark/spark-2.4.1-bin-without-hadoop/bin/spark-class: No such file or directory
==>./bin/spark-submit: line 27: exec: /home/[user]/Spark/spark-2.4.1-bin-without-hadoop/bin/spark-class: cannot execute: No such file or directory

~/Spark/spark-2.4.1-bin-without-hadoop > ./bin/pyspark
==> ./bin/pyspark: line 24: /home/[user]/Spark/spark-2.4.1-bin-without-hadoop/bin/load-spark-env.sh: No such file or directory
==>./bin/pyspark: line 77: /home/[user]/Spark/spark-2.4.1-bin-without-hadoop/bin/spark-submit: No such file or directory
==>./bin/pyspark: line 77: exec: /home/[user]/Spark/spark-2.4.1-bin-without-hadoop/bin/spark-submit: cannot execute: No such file or directory
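
For what it's worth, "cannot execute: No such file or directory" for a script that visibly exists can also be caused by the interpreter named on the script's first line rather than the script itself. A sketch to inspect that line from Python (path illustrative):

import os

# Read the shebang of the script the error names; expect something like
# b'#!/usr/bin/env bash\n'.
p = os.path.expanduser("~/Spark/spark-2.4.1-bin-without-hadoop/bin/spark-class")
with open(p, "rb") as f:
    print(f.readline())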

I have verified that I have read, write, and execute permissions for ./bin and ./bin/pyspark:

drwxr-xr-x   30 [user]  [group]    960 [Date/Time] bin

-rwxr-xr-x  1 [user]  [group]  2987 [Date/Time] pyspark
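
And the equivalent permission check from inside Python (sketch; path illustrative):

import os

spark_home = os.path.expanduser("~/Spark/spark-2.4.1-bin-without-hadoop")
for rel in ("bin", "bin/pyspark", "bin/spark-submit"):
    p = os.path.join(spark_home, rel)
    print(rel, "exists:", os.path.exists(p), "executable:", os.access(p, os.X_OK))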

PREVIOUS REFERENCES

I have referenced the following articles and have not been able to solve the issue:

Permission denied error when setting up local Spark instance and running pyspark

Why I take "spark-shell: Permission denied" error in Spark Setup?
