
I am using PySpark and want to insert-overwrite partitions into an existing Hive table.

  • in this use case saveAsTable() is not suitable, because it overwrites the whole existing table
  • insertInto() is behaving strangely: I have 3 partition levels, but it is inserting only one

And what is the right way to use save()? Can save() take options like database name and table name to insert into, or only an HDFS path?

Example:

df\
.write\
.format('orc')\
.mode('overwrite')\
.option('database', db_name)\
.option('table', table_name)\
.save()
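For what it's worth, a commonly suggested approach for this use case (a sketch only, not verified against your table layout; `db_name`, `table_name`, and the partition column order are assumptions) is to enable dynamic partition overwrite and then use insertInto(), so that only the partitions present in the DataFrame get replaced:

```python
# Sketch: overwrite only the partitions present in df, assuming a
# Hive-enabled SparkSession and an existing partitioned table
# db_name.table_name (names are placeholders).
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# With "dynamic" mode (Spark 2.3+), mode('overwrite') replaces only the
# matching partitions instead of truncating the whole table.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

# insertInto() matches columns by position, not by name: the DataFrame's
# column order must match the table definition, with the three partition
# columns last.
df.write \
    .mode('overwrite') \
    .insertInto('{}.{}'.format(db_name, table_name))
```

Note that save() itself targets a path-based data source; the database/table options shown in the question's example are not how the built-in ORC source resolves a Hive table, which is why insertInto() (or saveAsTable()) is the table-oriented entry point.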
  • What about [Overwrite specific partitions in spark dataframe write method](https://stackoverflow.com/questions/38487667/overwrite-specific-partitions-in-spark-dataframe-write-method) ? – mazaneicha Mar 10 '22 at 14:28

0 Answers