How to combine two dataframes and average like values?

Question

I'm pretty new to machine learning. I have two dataframes that have movie ratings in them. Some of the movie ratings have the same movie title, but different number ratings while other rows have movie titles that the other data frame doesn't have. I was wondering how I would be able to combine the two dataframes and average any ratings that have the same movie name. Thanks for the help!

Removed the `machine-learning` and `numpy` tags since they have nothing to do with the question. Also, please don't post images of data frames; transcribing images is tedious. Instead, add the output of `df.to_dict()` to the question, which makes reproducing your data locally very easy. Please go through [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) - Ch3steR, Jul 22 '20 at 15:19

Ch3steR · Accepted Answer · 2020-07-22T15:45:16.473

You can use `pd.concat` with `GroupBy.agg`:

# df = pd.DataFrame({'Movie':['IR', 'R'], 'rating':[95, 90], 'director':['SB', 'RC']})
# df1 = pd.DataFrame({'Movie':['IR', 'BH'], 'rating':[93, 88], 'director':['SB', 'RC']})

(pd.concat([df, df1]).groupby('Movie', as_index=False).
                      agg({'rating':'mean', 'director':'first'}))

  Movie  rating director
0    BH    88.0       RC
1    IR    94.0       SB
2     R    90.0       RC

Or `df.append` (deprecated in pandas 1.4 and removed in 2.0, so this works only on older versions):

df.append(df1).groupby('Movie',as_index=False).agg({'rating':'mean', 'director':'first'})

  Movie  rating director
0    BH    88.0       RC
1    IR    94.0       SB
2     R    90.0       RC

If you want the Movie column as the index, remove `as_index=False` from `groupby`. The `as_index` parameter of `df.groupby` defaults to `True`, which makes Movie the index of the result.
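As a minimal sketch of the index variant, assuming the same sample frames as in the commented setup above (the names `by_movie` and `ir_rating` are only illustrative):

```python
import pandas as pd

# Sample frames mirroring the setup in the answer.
df = pd.DataFrame({'Movie': ['IR', 'R'], 'rating': [95, 90], 'director': ['SB', 'RC']})
df1 = pd.DataFrame({'Movie': ['IR', 'BH'], 'rating': [93, 88], 'director': ['SB', 'RC']})

# as_index is left at its default (True), so 'Movie' becomes the index.
by_movie = (pd.concat([df, df1])
              .groupby('Movie')
              .agg({'rating': 'mean', 'director': 'first'}))

# With 'Movie' as the index, a single averaged rating can be looked up by label.
ir_rating = by_movie.loc['IR', 'rating']
```

Having the title as the index is convenient for label lookups like the one above; keep `as_index=False` if you would rather have Movie as an ordinary column.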

If you want to maintain the original order, set the `sort` parameter to `False` in `groupby`.

(df.append(df1).groupby('Movie',as_index=False, sort=False).
                agg({'rating':'mean', 'director':'first'}))

  Movie  rating director
0    IR    94.0       SB
1     R    90.0       RC
2    BH    88.0       RC
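Since `DataFrame.append` is no longer available in pandas 2.0 and later, the order-preserving version can also be written with `pd.concat`. This is a sketch assuming the same sample frames; the names `combined` and `result` are illustrative:

```python
import pandas as pd

# Sample frames mirroring the setup in the answer.
df = pd.DataFrame({'Movie': ['IR', 'R'], 'rating': [95, 90], 'director': ['SB', 'RC']})
df1 = pd.DataFrame({'Movie': ['IR', 'BH'], 'rating': [93, 88], 'director': ['SB', 'RC']})

# Stack the two frames; ignore_index=True gives the rows a fresh 0..n-1 index.
combined = pd.concat([df, df1], ignore_index=True)

# sort=False keeps groups in order of first appearance instead of sorting by title.
result = (combined
          .groupby('Movie', as_index=False, sort=False)
          .agg({'rating': 'mean', 'director': 'first'}))
```

Note that `'first'` simply keeps the director from the first row seen for each movie, so it assumes both frames agree on that column for shared titles.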
