
I currently have a PySpark DataFrame like this:

x_data  y_data
2.5      2.5
2.5      2.5
2.5      2.5
2.5      2.5

I want to add a new column, Name, whose value is "Smith" in every row.

**How do I create a data frame like this using pyspark?**


x_data  y_data    Name
2.5      2.8      Smith
7.5      5.1      Smith
1.5      1.5      Smith
8.5      6.5      Smith

1 Answer


You can use withColumn together with F.lit to add a column that holds the same literal value in every row:

import pyspark.sql.functions as F

df2 = df.withColumn('Name', F.lit('Smith'))