
I am using PySpark and want to insert-overwrite partitions into an existing Hive table.

  • in this use case saveAsTable() is not suitable, because it overwrites the whole existing table
  • insertInto() is behaving strangely: I have 3 partition levels, but it is inserting only one

And what is the right way to use save()? Can save() take options like database name and table name to insert into, or only an HDFS path?

Example:

df\
.write\
.format('orc')\
.mode('overwrite')\
.option('database', db_name)\
.option('table', table_name)\
.save()
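For what it's worth, a commonly suggested approach for this use case (a sketch only, not verified against your table layout; `db_name`, `table_name`, and the partition column order are assumptions) is to enable dynamic partition overwrite and then use insertInto(), so that only the partitions present in the DataFrame get replaced:

```python
# Sketch: overwrite only the partitions present in df, assuming a
# Hive-enabled SparkSession and an existing partitioned table
# db_name.table_name (names are placeholders).
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# With "dynamic" mode (Spark 2.3+), mode('overwrite') replaces only the
# matching partitions instead of truncating the whole table.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

# insertInto() matches columns by position, not by name: the DataFrame's
# column order must match the table definition, with the three partition
# columns last.
df.write \
    .mode('overwrite') \
    .insertInto('{}.{}'.format(db_name, table_name))
```

Note that save() itself targets a path-based data source; the database/table options shown in the question's example are not how the built-in ORC source resolves a Hive table, which is why insertInto() (or saveAsTable()) is the table-oriented entry point.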
  • What about [Overwrite specific partitions in spark dataframe write method](https://stackoverflow.com/questions/38487667/overwrite-specific-partitions-in-spark-dataframe-write-method) ? – mazaneicha Mar 10 '22 at 14:28

0 Answers