Suppose I have two DataFrames:
df1 : col1 col2 col3
df2 : col1 col2 col4

I would like to join the two DataFrames on col1 and col2 without defining a new alias for either table.

I don't want to write

df = df1.join(df2, (df1.col1 == df2.col1) & (df1.col2 == df2.col2))

because repeating the DataFrame names in the condition is clumsy. I also want to remove the duplicated join columns after the join, so that the final DataFrame has only col1, col2, col3, and col4.

How can I achieve that?

mytabi

1 Answer

For Spark DataFrames, pass the join column names as a list. The result then contains each join column only once.

df3 = df1.join(df2, ['col1', 'col2'])
df3.show()
Prince Francis