I have two pandas DataFrames:
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'ID_df1': [1, 2, 3, 4],
                    'Name_df1': ['John', 'Alex', 'Alan', 'Marie'],
                    'Cod_job_df1': [10, 20, 30, 40]})
df2 = pd.DataFrame({'Cod_job_df2': [10, 200, 40],
                    'Info_df2': [55, 66, 88]})
I need to compare the job code columns of the two DataFrames, i.e. check whether 'Cod_job_df1' and 'Cod_job_df2' are equal. Where the codes match, I want to fill a new column in df1 with the corresponding information from df2. So I wrote the following code:
# Create the new column
df1['New_column'] = ''
for i in range(len(df1)):
    for j in range(len(df2)):
        # compare the job codes
        if df1['Cod_job_df1'].iloc[i] == df2['Cod_job_df2'].iloc[j]:
            # assign the matching info (a single .loc call avoids chained assignment)
            df1.loc[df1.index[i], 'New_column'] = df2['Info_df2'].iloc[j]
The code works, but it is slow on larger DataFrames. Is there a faster way to get the same result? This is the output I get:
ID_df1  Name_df1  Cod_job_df1  New_column
1       John      10           55
2       Alex      20
3       Alan      30
4       Marie     40           88
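
For reference, one possible way to avoid the nested loop is to treat df2 as a lookup table and map the job codes through it. The sketch below is only an illustration under some assumptions: the names `lookup` and `result` are made up for this example, and duplicate codes in df2 are resolved by keeping the last occurrence, which is what the loop above effectively does. Codes with no match come out as NaN rather than an empty string.

```python
import pandas as pd

df1 = pd.DataFrame({'ID_df1': [1, 2, 3, 4],
                    'Name_df1': ['John', 'Alex', 'Alan', 'Marie'],
                    'Cod_job_df1': [10, 20, 30, 40]})
df2 = pd.DataFrame({'Cod_job_df2': [10, 200, 40],
                    'Info_df2': [55, 66, 88]})

# Build a lookup Series: job code -> info.
# Assumption: if a code appears more than once in df2, the last row wins,
# mirroring the behaviour of the nested loop.
lookup = (df2.drop_duplicates('Cod_job_df2', keep='last')
             .set_index('Cod_job_df2')['Info_df2'])

# Map every job code in df1 through the lookup in one vectorized step.
# Codes that do not appear in df2 become NaN.
result = df1.copy()
result['New_column'] = result['Cod_job_df1'].map(lookup)

print(result)
```

If an empty string is preferred for unmatched rows, `result['New_column'].fillna('')` could be applied afterwards, at the cost of turning the column into object dtype. A left `merge` on the two code columns would be another option with a similar effect.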