1

I have five dataframes each has unique columns and one common column id no.. Say each dataframe has the following columns:

  1. df1: id no, time, date, age, name.
  2. df2: id no, gender, address, employment, birth_date.
  3. df3: id no, .....etc.
  4. df4: id no, ......etc.
  5. df5: id no, .......etc.

I have used merge as the following:

first1 = pd.merge(df1, df2, how= 'left', on = 'id_no')
first2 = pd.merge(first1, df3, how= 'left', on = 'id_no')
first3 = pd.merge(first2, df4, how= 'left', on = 'id_no')
combineall = pd.merge(first3, df5, how= 'left', on = 'id_no')

The problem: The columns of df3 are missing from combineall dataframe. Though when I print df3 alone I see all the contents available. How my df3 gets missing during merging? Is there I make this process easier and less problematic.

Goal: I want to have comabinall dataframe with the all the columns from df1,df2,df3,df4,df5 merged based on id_no.

Firis
  • 43
  • 8

1 Answers1

1

Try out this:

from functools import reduce
finaldf = reduce(lambda left,right: pd.merge(left, right, on='id_no', how='left'), [df1,df2,df3,df4,df5])
Aditi
  • 804
  • 10
  • 27