Combine rows where variables are the same using python pandas

Question

I have a pandas DataFrame with rows like this:

   Same1  Same2  Diff3  Encoded1  Encoded2  Encoded3
0     33     22    150         0         0         0
1     33     22    300         1         0         1

I want to combine all rows that share the same 'Same1' and 'Same2' values by summing the remaining columns:

   Same1  Same2  Diff3  Encoded1  Encoded2  Encoded3
0     33     22    450         1         0         1

What would be the cleanest way to achieve this using pandas?

Executable python code: https://trinket.io/python3/1da371fd04

score 2 · Accepted Answer · answered May 24 '22 at 15:09

You can try

out = df.groupby(['Same1', 'Same2']).agg('sum').reset_index()

print(out)

   Same1  Same2  Diff3  Encoded1  Encoded2  Encoded3
0     33     22    450         1         0         1
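For anyone who wants to reproduce this without the trinket link, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the two rows shown in the question, so the construction step is an assumption about how the original data was created. It passes the string 'sum' rather than the built-in function, which selects the same groupby sum.

```python
import pandas as pd

# Rebuilt from the two rows in the question (the construction itself is assumed)
df = pd.DataFrame({
    "Same1": [33, 33],
    "Same2": [22, 22],
    "Diff3": [150, 300],
    "Encoded1": [0, 1],
    "Encoded2": [0, 0],
    "Encoded3": [0, 1],
})

# Group on the two key columns, sum everything else,
# then turn the group keys back into ordinary columns
out = df.groupby(["Same1", "Same2"]).agg("sum").reset_index()
print(out)
```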

answered May 24 '22 at 15:09

Ynjxsjmh


score 1 · Answer 2 · answered May 24 '22 at 15:09

You can use groupby with as_index=False to get the expected result:

df.groupby(['Same1', 'Same2'], as_index=False).sum()

Output:

   Same1  Same2  Diff3  Encoded1  Encoded2  Encoded3
0     33     22    450         1         0         1
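As a rough check, as_index=False should give the same frame as grouping first and calling reset_index() afterwards. The sketch below compares the two on a small frame; the third row (Same1 = 44) is hypothetical and was added only so that there is more than one group.

```python
import pandas as pd

# First two rows come from the question; the third row is a made-up extra group
df2 = pd.DataFrame({
    "Same1": [33, 33, 44],
    "Same2": [22, 22, 22],
    "Diff3": [150, 300, 10],
    "Encoded1": [0, 1, 1],
    "Encoded2": [0, 0, 1],
    "Encoded3": [0, 1, 0],
})

# Keys stay as regular columns because as_index=False
a = df2.groupby(["Same1", "Same2"], as_index=False).sum()

# Keys become the index first, then are moved back into columns
b = df2.groupby(["Same1", "Same2"]).sum().reset_index()

print(a)
print(a.equals(b))
```

Either spelling works; as_index=False just saves the extra reset_index() call.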

answered May 24 '22 at 15:09

tlentali

