How to aggregate strings and empty rows for several cases identified by an ID in Python?

Question

Currently, one row describes an event, and a case consists of several events. However, I need a DataFrame with one whole case per row. So I tried to encode the "activities" and the "time", numbered 1 to 6, for each case and write them into new columns.

My question is now: how can I collapse these rows so that one complete case, with all of its information, ends up in a single row? (Please have a look at the attached picture.)

I will delete the redundant rows in the next step, so I do not care whether the values appear in all rows of each case or only in the first row of each case.

Here is my attempt. I am struggling with groupby, and I don't even know if this is the right approach.

Thanks for any help! :)

for i in df.index:
    # Build the target column names from this row's event number (1 to 6)
    event = str(df.loc[i, 'case_event'])
    act_col = 'activity_' + event
    time_col = 'time_' + event
    # Copy this row's values into its event-specific columns
    df.loc[i, act_col] = df.loc[i, 'activity']
    df.loc[i, time_col] = df.loc[i, 'rel_time']

df.head(6)
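
For comparison, the same wide layout can probably be produced without a row-by-row loop by pivoting on the event number. This is only a sketch: the sample values are made up, the column names are taken from the loop above, and it assumes every (case_id, case_event) pair occurs at most once (otherwise pivot raises an error).

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "case_id": [1, 1, 1, 2, 2],
    "case_event": [1, 2, 3, 1, 2],
    "activity": ["A", "B", "C", "A", "D"],
    "rel_time": [0.0, 1.5, 4.0, 0.0, 2.0],
})

# One row per case, one column per (value, event number) combination.
wide = df.pivot(index="case_id", columns="case_event",
                values=["activity", "rel_time"])

# Flatten the two-level column labels into activity_1, time_1, ...
wide.columns = [
    f"{'time' if name == 'rel_time' else name}_{event}"
    for name, event in wide.columns
]
wide = wide.reset_index()
print(wide)
```

Cases with fewer events than the longest case simply get missing values in the unused columns.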

Here I start to struggle:

df['activity_6'] = (df.groupby(['case_id'], sort=False)['activity_6']
        .sum()
        .reset_index())

df.head(6)
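
One likely reason the attempt above misbehaves: .sum().reset_index() returns a small two-column DataFrame with one row per case, which does not line up with the event rows it is assigned to. A possible way around this, sketched under the assumption that each event column holds at most one non-null value per case, is to let groupby pick the first non-null value of every column. The sample data below is invented to mimic the loop's output.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the loop's output:
# each event row fills only its own activity_N / time_N columns.
df = pd.DataFrame({
    "case_id":    [1, 1, 2, 2],
    "activity_1": ["A", None, "A", None],
    "activity_2": [None, "B", None, "D"],
    "time_1":     [0.0, np.nan, 0.0, np.nan],
    "time_2":     [np.nan, 1.5, np.nan, 2.0],
})

# Option 1: one row per case. first() takes the first non-null
# value of every column within each group.
per_case = df.groupby("case_id", sort=False).first().reset_index()

# Option 2: keep every event row, but broadcast each case's value
# to all of its rows (redundant rows can be dropped afterwards).
cols = [c for c in df.columns if c != "case_id"]
filled = df.copy()
filled[cols] = df.groupby("case_id", sort=False)[cols].transform("first")

print(per_case)
print(filled)
```

Option 1 already gives the one-row-per-case result, so the later clean-up step would not be needed; option 2 matches the "values in all rows" variant mentioned in the question.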

Picture of Output:
