We have a CSV file called survey.csv and we need to load it into an RDD.

We tried this:

rdd_test = survey_results.csv.map(lambda x: (x, 1)) 

It doesn't work. Can anyone help?

Adriaan
Kirsten
  • Welcome to Stack Overflow! Please take the [tour] and read up on [ask], as well as [mcve]. [edit]ing the question with a sample of your CSV file (only a few rows and columns please) and elaborating on what doesn't work (is there an error, wrong/no data, something else?) would help us help you. – Adriaan May 19 '22 at 12:27

1 Answer

SparkContext.textFile reads a text file and returns an RDD of its lines, one string per line:

from pyspark import SparkContext

# Create the Spark context
sc = SparkContext()

# Read the CSV file into an RDD, one string per line
lines = sc.textFile("./survey.csv")

Source

Helpful SO post
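
The attempt in the question calls .map on a bare file name rather than on an RDD, so there is nothing to map over. Once textFile has produced an RDD of lines, each line still needs to be split into fields. The following is a minimal sketch of that step, not a definitive implementation. The helper names parse_csv_line and load_csv_rdd are hypothetical, and the has_header flag is an assumption, since the question does not show the file's contents.

```python
import csv


def parse_csv_line(line):
    """Split one CSV line into a list of fields, honoring quoted commas."""
    return next(csv.reader([line]))


def load_csv_rdd(sc, path, has_header=True):
    """Return an RDD of field lists for the CSV file at `path`.

    `sc` is an existing SparkContext. `has_header` is an assumption
    about the file; set it to False if the first row is data.
    """
    lines = sc.textFile(path)
    if has_header:
        header = lines.first()
        # Drop every line that is identical to the header row
        lines = lines.filter(lambda line: line != header)
    return lines.map(parse_csv_line)
```

With the sc created above, rows = load_csv_rdd(sc, "./survey.csv") should give an RDD of field lists, and rows.map(lambda x: (x[0], 1)) would then pair the first column with 1, in the spirit of the original attempt. Because textFile splits on newlines, this sketch does not handle quoted fields that span several lines. If a DataFrame is acceptable, reading the file with SparkSession.read.csv and taking its .rdd attribute is another route.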

EoinS