
I have the following pandas dataframe:

import pandas as pd
df = pd.DataFrame([[1,2,3,'a'],[4,5,6,'a'],[2,4,1,'a'],[2,4,1,'b'],[4,9,6,'b'],[2,4,1,'b']], index=[0,1,2,0,1,2], columns=['aa','bb','cc','cat'])


   aa  bb  cc cat
0   1   2   3   a
1   4   5   6   a
2   2   4   1   a
0   2   4   1   b
1   4   9   6   b
2   2   4   1   b

I need to add together the rows that share the same index, so that the numeric columns are summed and the cat strings are concatenated:

   aa  bb  cc cat
0   3   6   4  ab
1   8  14  12  ab
2   4   8   2  ab

I used the following code:

df_ab = df[df['cat'] == 'a'] + df[df['cat'] == 'b']

But is this the most Pythonic way?

m13op22
jAguesses

3 Answers


Use groupby and agg, giving each column its own aggregation:

df.groupby(df.index).agg({'aa': 'sum',
                          'bb': 'sum',
                          'cc': 'sum',
                          'cat': ''.join})

Or pass numeric_only=False (simpler, but I wouldn't recommend it):

df.groupby(df.index).sum(numeric_only=False)

Both output:

    aa  bb  cc cat
0   3   6   4  ab
1   8  14  12  ab
2   4   8   2  ab
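
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the groupby/agg approach. It rebuilds the frame from the question; the name result is only illustrative and does not appear in the original posts.

```python
import pandas as pd

# Rebuild the frame from the question (duplicate index labels 0, 1, 2).
df = pd.DataFrame(
    [[1, 2, 3, 'a'], [4, 5, 6, 'a'], [2, 4, 1, 'a'],
     [2, 4, 1, 'b'], [4, 9, 6, 'b'], [2, 4, 1, 'b']],
    index=[0, 1, 2, 0, 1, 2],
    columns=['aa', 'bb', 'cc', 'cat'],
)

# One group per index label: sum the numeric columns,
# concatenate the strings in 'cat' in their original row order.
result = df.groupby(df.index).agg(
    {'aa': 'sum', 'bb': 'sum', 'cc': 'sum', 'cat': ''.join}
)
print(result)
```

Unlike the a + b expression in the question, this does not depend on knowing the category values in advance, so it should also work when more than two rows share an index label.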
rafaelc

We can check the dtype of each column and choose which agg function to use accordingly:

df.groupby(level=0).agg(lambda x: x.sum() if x.dtype != 'object' else ''.join(x))
Out[271]: 
   aa  bb  cc cat
0   3   6   4  ab
1   8  14  12  ab
2   4   8   2  ab
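
A possible variant of the same idea, sketched under the assumption that comparing against 'object' is too brittle for your data (for example if the text column uses a dedicated string dtype): pandas.api.types.is_numeric_dtype can make the decision instead. The helper name sum_or_join is hypothetical, not part of pandas.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Same frame as in the question.
df = pd.DataFrame(
    [[1, 2, 3, 'a'], [4, 5, 6, 'a'], [2, 4, 1, 'a'],
     [2, 4, 1, 'b'], [4, 9, 6, 'b'], [2, 4, 1, 'b']],
    index=[0, 1, 2, 0, 1, 2],
    columns=['aa', 'bb', 'cc', 'cat'],
)

def sum_or_join(col):
    # col is one column of one group (a Series):
    # sum it if it is numeric, otherwise concatenate its values as strings.
    return col.sum() if is_numeric_dtype(col) else ''.join(col)

result = df.groupby(level=0).agg(sum_or_join)
print(result)
```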
BENY

Use this one-liner :)

import numpy as np

(df.reset_index().groupby("index")
 .agg(lambda x: np.sum(x) if x.dtype == "int" else "".join(x)))
ivallesp