I have a dataframe which I need to split into multiple dataframes based on the value of a specific column. Then I need to add a calculated column to each of the newly created dataframes. Is there a way to do this within a single for loop?

I've managed to split my original dataframe into multiple dataframes using the solutions listed here and here, but I can't seem to integrate the new column calculation into either option.

My current code is below. Any suggestions on how to modify it? I've tried both keeping the two for loops separate and nesting one for loop inside the other. Neither seems to work:

# create new dataframes based on region id in the original dataframe
region_df_list = []
for id, df_id in df.groupby('Region/Division'):
    print(df_id)
    region_df_list.append(df_id)

# create a new column within each regional dataframe based on the value_counts
# of a column named 'value' when 'value' is grouped by the column "variable"
for region in region_df_list:
    region.groupby(by="variable", group_keys=False).value.value_counts(normalize=True).reset_index(name='proportion_avail_region')
    # region_df_list.groupby... doesn't work either

Thanks!
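
One possible approach, offered as a minimal sketch rather than a definitive answer: the expression in the second loop returns a new dataframe that is never assigned to anything, so the regional dataframes stay unchanged. Computing the proportions inside the same loop that does the split, and merging them back on "variable" and "value", does both steps in one pass. The sample data and the names region_dfs, region_id, region_df and proportions are hypothetical; only the column names come from the code above.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "Region/Division": ["East", "East", "East", "West", "West", "West", "West"],
    "variable": ["a", "a", "b", "a", "a", "a", "b"],
    "value": [1, 0, 1, 1, 1, 0, 0],
})

region_dfs = {}  # maps each region id to its dataframe with the new column
for region_id, region_df in df.groupby("Region/Division"):
    # Share of each 'value' within each 'variable', computed for this region only.
    proportions = (
        region_df.groupby("variable")["value"]
        .value_counts(normalize=True)
        .reset_index(name="proportion_avail_region")
    )
    # Attach the proportion to every original row; merge returns a new dataframe,
    # so the result has to be stored.
    region_dfs[region_id] = region_df.merge(
        proportions, on=["variable", "value"], how="left"
    )

print(region_dfs["East"])
```

A dict keyed by region id is used here instead of a list so that each regional dataframe can be looked up by name; list(region_dfs.values()) gives the list form if that is preferred. Note that merge resets the row index of each regional dataframe.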
