I have two tables, these two tables are the data of sales staff and sales managers from January to February:
The name of table is 【1】commission--finish.xlsx, the contents of the January table are as follows :
The name of table is 【2】commission--finish.xlsx, the contents of the February table are as follows:
I want to calculate the sum of the Indicators and the sum of ActualSales of the GeneralRepresentation is in the hospital he is in charge of.
The code is :
import os
import pandas as pd
import openpyxl
sales_catalog = r"D:\\Hospital"
df_one = pd.DataFrame()
monthlist = []
for root,dir,files in os.walk(sales_catalog):
for file in files:
if file.endswith('finish.xlsx') and file.startswith('【'):
excel_input_address = os.path.join(sales_catalog, file)
monthlist.append(file[1:-24])
df_one = df_one.append(pd.read_excel(excel_input_address, sheet_name = 'sheet1', skiprows=2))
else:
pass
monthlist.sort(key=int)
lastmonth = monthlist[-1]
df_one = df_one.loc[:,['City', 'MarketingManager', 'GeneralRepresentation', 'Hospital', 'Indicators', 'Examination\nBilling', 'Digestion\nBilling']].dropna(axis=0,subset = ['Hospital'])
df_two = df_one.groupby(['Hospital']).agg({'Indicators': 'sum', 'Examination\nBilling': 'sum', 'Digestion\nBilling': 'sum'}).reset_index().rename(columns={'Indicators': 'Indicators∑(1-' + lastmonth + 'month)'})
df_two['ActualSales∑(1-' + lastmonth + 'month)'] = df_two['Examination\nBilling'] + df_two['Digestion\nBilling']
df_two = df_two.loc[:,['Hospital', 'Indicators∑(1-' + lastmonth + 'month)', 'ActualSales∑(1-' + lastmonth + 'month)']].dropna(axis=0,subset = ['Hospital'])
df_three = df_one.loc[:,['City', 'Hospital', 'MarketingManager', 'GeneralRepresentation']].dropna(axis=0,subset = ['Hospital']).drop_duplicates()
df_four = pd.merge(df_two,df_three,on="Hospital",how="outer")
df_four = df_four.loc[:,['City', 'Hospital', 'Indicators∑(1-' + lastmonth + 'month)', 'ActualSales∑(1-' + lastmonth + 'month)', 'MarketingManager', 'GeneralRepresentation']]
init_month_column = 1
for month in monthlist:
file = "【" + month + "】commission--finish.xlsx"
excel_address = os.path.join(sales_catalog, file)
df_months = pd.read_excel(excel_address, sheet_name = 'sheet1', skiprows=2)
df_months = df_months.loc[:,['Hospital', 'Examination\nBilling', 'Digestion\nBilling']].dropna(axis=0,subset = ['Hospital'])
df_months.drop_duplicates(subset=['Hospital'],keep='first',inplace=True)
df_months = df_months.rename(columns={'Digestion\nBilling':'Digestion', 'Examination\nBilling':'Examination'})
df_months.insert(init_month_column, month + "month", df_months['Digestion'] + df_months['Examination'])
df_four = pd.merge(df_four,df_months,on='Hospital',how="outer")
df_four = df_four.rename(columns={"Digestion_x":"Digestion","Examination_x":"Examination","Digestion_y":"Digestion","Examination_y":"Examination"})
try:
df_four.to_excel('D:\\Hospital\\SalesCommission.xlsx', sheet_name='sheet1')
print('The data has been saved in the table, please check the table!')
except PermissionError as e:
print('*****Note: The current data is not saved in the table! Reason: The table has been opened, please close the table and execute the program before the data can be saved!*****')
After I run the code, the result is as follows:
The above result table data is incorrect. When there are multiple GeneralRepresentation in the same hospital, the data of Indicators∑(1-2month) and ActualSales∑(1-2month) is wrong, so it becomes the sum of Indicators and ActualSales of multiple GeneralRepresentation. At the same time, the monthly Examination and Digestion are also wrong.
The result table I want is as follows, :
The Indicators∑(1-2 month), ActualSales∑(1-2month), the monthly Examination and Digestion data of the GeneralRepresentation of the same hospital must be calculated separately.
May I ask if there is a problem with my line of code?
df_two = df_one.groupby(['Hospital']).agg({'Indicators':'sum','Examination\nBilling':'sum','Digestion\nBilling ':'sum'}).reset_index().rename(columns={'Indicators':'Indicators∑(1-' + lastmonth +'month)'})
How should I modify the code to achieve my purpose?