Why does the column 1st_from_end contain null in the following?
from pyspark.sql.functions import split

df = sqlContext.createDataFrame([('a b c d',)], ['s'])
df.select(split(df.s, ' ')[0].alias('0th'),
          split(df.s, ' ')[3].alias('3rd'),
          split(df.s, ' ')[-1].alias('1st_from_end')
          ).show()
I thought using [-1] was a Pythonic way to get the last item in a list. Why doesn't it work in PySpark?