I am trying to use a "chained when" function. In other words, I'd like to get more than two outputs.

I tried using the same logic as nested IF functions in Excel:

  df.withColumn("device_id", when(col("device")=="desktop",1)).otherwise(when(col("device")=="mobile",2)).otherwise(null))

But that doesn't work since I can't put a tuple into the "otherwise" function.

Grr
Fede

1 Answer

Have you tried:

from pyspark.sql import functions as F
df.withColumn('device_id', F.when(F.col('device') == 'desktop', 1).when(F.col('device') == 'mobile', 2).otherwise(None))

Note that when chaining when functions you do not need to wrap the successive calls in an otherwise function.

pault
Grr