I have 2 excel files that i will be using. The filepath for both excel files are passed into a function. The filepath being passed into the function for excel file1 = "Test". The filepath being passed into the function for excel file2 = "Report". The function creates the workbook for the output file and name the sheet that we will be populating “List”
I want to create 2 data frames, df1 for excel file1, and df2 for excel file2.
I want to put the column “Name” and its values from excel file1 into df1. (just a small example)
Name
0 James
1 Ben
2 Frank
3 Charles
4 Tim
I want to put the columns “Name” and “State” and their values from excel file2 into df2. (just a small example, df2 is bigger)
Name State
0 Ben DC
1 Sam MD
2 Tim NC
3 Charles PA
4 Frank IN
5 James VA
I want to compare the cell in the 1st row under the column “Name” from df1 (cv1) to the cell in the 1st row of the column “Name” of df2 (cv2).
If cv1==cv2, I want to copy the cell under the column “State” of the same row from df2 and write it to column “Best” of the sheet “List” of the workbook the function created.
Else increment the row of df2, repeat the comparison.
I want to do this for all the rows in df1 that have a name in the column “name”
The output should look like this
Best
0 VA
1 DC
2 IN
3 PA
4 NC
Below is my code that I have for creating the data frames. I am not sure how-to do the comparison using for loops. Any and all assistance is greatly appreciated.
import os
import pandas as pd
def page(Test,Report):
# select columns i want to work with
compare_column1 = ["Name"]
compare_column2 = ["Name", "State"]
write_column = ["Best"]
# create dataframes for the column to copy to the output file and columns to compare
Df1 = pd.DataFrame(columns=compare_column1)
Df2 = pd.DataFrame(columns=compare_column2)
# compare cell for every row under "Name" column from df1(cv1) to every cell under “Name”
# column df2, if the 2 cells are equal, then copy the cell under the column “State” of that row
# of df2 and write it to the column “Best” of the workbook with the sheet name “List” the
# function created
# else increment the row for df2, compare cv1 to cv2…. Do this for all names in the “Name”
# column
# df1
df_file2 = pd.read_excel(Test)
df_file3 = pd.read_csv(Report)
for i in range (0, length(df1):
cv1 = df1.loc[i], compare_column1]
for j in range (0, length(df2):
cv2 = df2.loc[j], compare_column1] #not sure how to select “Name” column
if cv1==cv2:
cv1.to_excel(writer, sheet_name=write_column, header=false, index=false, startrow=1)