How do I manage repeating rows in Pandas

Question

How do I organise this three-column data set by collapsing the repeating rows?

Country       Year      Temperature
US            1990       25
US            1990       27 
US            1990       24
US            1991       26
Canada        1990       20
 .             .          .

Into

Country       Year      AvgTemp
US            1990       25.33
US            1991       26
Canada        1990       20

I can do this with groupby when only the 'Year' and 'Temp' columns are involved. But how do I do it when three columns are involved?

(P.S. I am new to pandas.)

This is just: `df.groupby(['Country', 'Year'])['Temperature'].mean()` — Erfan, Jun 14 '20 at 16:32
To match your expected output with the new column name, use named aggregations instead: `df.groupby(['Country', 'Year']).agg(AvgTemp=('Temperature', 'mean')).reset_index()` — Erfan, Jun 14 '20 at 16:35

score 1 · Answer 1 · edited Jun 14 '20 at 16:34

You can pass multiple columns to groupby() as a list, like this:

df.groupby(['Country', 'Year'])['Temperature'].mean().reset_index()
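
For anyone who wants to try this end to end, here is a minimal sketch. It assumes the column is named Temperature, as in the question's table, and rebuilds only the five rows that the question shows (the elided rows are left out). Note that groupby sorts the group keys by default, so Canada comes first in the result.

```python
import pandas as pd

# Sample rows copied from the question (only the rows that are shown there).
df = pd.DataFrame({
    "Country": ["US", "US", "US", "US", "Canada"],
    "Year": [1990, 1990, 1990, 1991, 1990],
    "Temperature": [25, 27, 24, 26, 20],
})

# Group on both key columns, average the temperature within each group,
# then turn the group keys back into ordinary columns.
result = df.groupby(["Country", "Year"])["Temperature"].mean().reset_index()
print(result)
```

The averaged column keeps the name Temperature here; renaming it to AvgTemp is a separate step.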

edited Jun 14 '20 at 16:34

Ch3steR

answered Jun 14 '20 at 16:32

DataVizPyR

warped · Answer 2 · 2020-06-14T16:37:07.327

1

df.groupby(['Country', 'Year']).mean().reset_index().rename(columns={'Temperature':'AvgTemp'})
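
A short sketch of how this compares with the named-aggregation form suggested in the comments, using only the rows shown in the question. Passing sort=False is my own addition, to keep the groups in the order they first appear (US before Canada) as in the expected output. Rounding is applied only for display.

```python
import pandas as pd

# Sample rows copied from the question (only the rows that are shown there).
df = pd.DataFrame({
    "Country": ["US", "US", "US", "US", "Canada"],
    "Year": [1990, 1990, 1990, 1991, 1990],
    "Temperature": [25, 27, 24, 26, 20],
})

# Variant 1: average every remaining column, then rename the result.
renamed = (
    df.groupby(["Country", "Year"], sort=False)
      .mean()
      .reset_index()
      .rename(columns={"Temperature": "AvgTemp"})
)

# Variant 2: named aggregation, which picks the column and names the output at once.
named = (
    df.groupby(["Country", "Year"], sort=False)
      .agg(AvgTemp=("Temperature", "mean"))
      .reset_index()
)

print(renamed.round(2))
```

With a single value column the two variants give the same frame. If the real data has more columns, the first variant tries to average all of them, so selecting the column explicitly or using named aggregation is likely the safer choice.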

edited Jun 14 '20 at 16:37

answered Jun 14 '20 at 16:33

warped