I have two DataFrames of different sizes. I want to match them on one column and add new columns to the second one.

df1

ColumnA    ColumnZ   ...  ColumnX
   2           3           Paris
   6           7           London
   7           1           Milan
   2           5           Madrid
   
[274 rows x 106 columns]

*In df1, ColumnX always holds a different city; values are never repeated.

df2

 ColumnX    Info  ...  Num_Num   
 Madrid     High          4
 Madrid     Low3          2
 Paris      Low           4
 Milan      Moderate      1
 Milan      High2         9
 Milan      High          0
 Beijing    Moderate      8
 
[979 rows x 7 columns]

*In df2, ColumnX has repeated values and can contain values that do not appear in df1 ColumnX; for example, Beijing is not in df1 ColumnX.

I want to check where df1['ColumnX'] == df2['ColumnX'] and add two new columns to df2, df2['ColumnA'] and df2['ColumnZ'], taking their values from df1['ColumnA'] and df1['ColumnZ']:

 ColumnX    Info  ....  Num_Num    ColumnA    ColumnZ
 Madrid     High            4          2          5
 Madrid     Low3            2          2          5
 Paris      Low             4          2          3
 Milan      Moderate        1          7          1
 Milan      High2           9          7          1
 Milan      High            0          7          1
 Beijing    Moderate        8          NaN        NaN
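A minimal sketch of how that table could be produced with a left merge, as the comment below suggests. The two frames here are small stand-ins built from the sample rows, and it assumes the second df1 column shown above is ColumnZ. `validate="many_to_one"` is an optional check that ColumnX really is unique in df1.

```python
import pandas as pd

# Stand-ins for the real frames, built from the sample rows in the question.
# Assumption: the second df1 column in the sample is ColumnZ.
df1 = pd.DataFrame({
    "ColumnA": [2, 6, 7, 2],
    "ColumnZ": [3, 7, 1, 5],
    "ColumnX": ["Paris", "London", "Milan", "Madrid"],
})
df2 = pd.DataFrame({
    "ColumnX": ["Madrid", "Madrid", "Paris", "Milan", "Milan", "Milan", "Beijing"],
    "Info": ["High", "Low3", "Low", "Moderate", "High2", "High", "Moderate"],
    "Num_Num": [4, 2, 4, 1, 9, 0, 8],
})

# Keep only the key and the columns to bring over, then left-merge so every
# df2 row survives; cities missing from df1 (Beijing) get NaN.
merged = df2.merge(
    df1[["ColumnX", "ColumnA", "ColumnZ"]],
    on="ColumnX",
    how="left",
    validate="many_to_one",  # raises if ColumnX is duplicated in df1
)
print(merged)
```

Because of the NaN for unmatched cities, ColumnA and ColumnZ come back as floats in the merged frame.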

Then I want to add another column to df2, df2['Output'], filled by rules like these:

If df2['Info'] == 'High1' and ColumnA > 2 and ColumnZ > 0 -> add 1 to ['Output'], if not add 0
If df2['Info'] == 'Low' and ColumnA == 0 and ColumnZ < 1 -> add 0 to ['Output'], if not add 1
If df2['Info'] == 'Low3' and ColumnA > ColumnB and ColumnA + ColumnB > 4 -> add 1 to ['Output'], if not add 0
*There are a lot of comparisons to add. ['Output'] only takes 0 or 1 as values.
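One way rules like these might be written without looping over rows is `numpy.select`: one condition per Info value, paired with the boolean test for that value. This is only a sketch. The frame below is made-up illustration data, ColumnB is assumed to have been merged in from df1 the same way as ColumnA, and rows matching no rule are assumed to get 0.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the merged df2; ColumnB is hypothetical here.
merged = pd.DataFrame({
    "Info":    ["High1", "High1", "Low", "Low", "Low3", "Low3", "Moderate"],
    "ColumnA": [3, 1, 0, 2, 5, 1, np.nan],
    "ColumnB": [0, 0, 0, 0, 1, 2, np.nan],
    "ColumnZ": [1, 1, 0, 0, 0, 0, np.nan],
})

info = merged["Info"]
a, b, z = merged["ColumnA"], merged["ColumnB"], merged["ColumnZ"]

# Which rule applies to each row ...
conditions = [info == "High1", info == "Low", info == "Low3"]
# ... and the test that yields 1 (True) or 0 (False) under that rule.
choices = [
    (a > 2) & (z > 0),        # High1: 1 if both hold, else 0
    ~((a == 0) & (z < 1)),    # Low: 0 if both hold, else 1
    (a > b) & (a + b > 4),    # Low3: 1 if both hold, else 0
]

# Assumption: rows whose Info matches no rule default to 0.
merged["Output"] = np.select(conditions, choices, default=False).astype(int)
print(merged)
```

Comparisons against NaN evaluate to False, so unmatched cities fall through to whatever the rule gives for a failed test. Further rules would be added as extra entries in both lists.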

Is it possible to do something like this? I've tried to compare the two columns directly, but since they have different sizes, I keep getting an error.
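For illustration, here is a small sketch of what likely goes wrong. `==` between two Series compares them element by element and needs matching indexes, so it raises a `ValueError` when the lengths differ. A membership test with `isin` works regardless of size. The two Series are stand-ins for the ColumnX columns.

```python
import pandas as pd

# Stand-ins for df1['ColumnX'] and df2['ColumnX'] with different lengths.
cities_df1 = pd.Series(["Paris", "London", "Milan", "Madrid"], name="ColumnX")
cities_df2 = pd.Series(["Madrid", "Madrid", "Paris", "Milan", "Beijing"], name="ColumnX")

# Element-wise comparison fails because the indexes do not line up.
try:
    cities_df1 == cities_df2
    raised = False
except ValueError as err:
    raised = True
    print("Comparison failed:", err)

# Membership test: for each df2 city, is it present anywhere in df1?
in_df1 = cities_df2.isin(cities_df1)
print(in_df1.tolist())
```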

I don't know if this is the right approach with the pandas library, as I'm new to it.

Thank you

Shintaro
  • `df2.merge(df1, on='ColumnX', how='left')`. Slice `df1` to reduce to the wanted columns if needed. – mozway Apr 22 '22 at 14:41

0 Answers