I have a DataFrame that looks roughly like this:

Continent   1                                2
Country     USA                 Canada       Germany   France
City        Boston   Chicago    Vancouver    Cologne   Paris     Marseille
Date
---------------------------------------------------------------------------
2018-01-01  176      10982      794          34225     1875      29001
2018-02-01  500      756        10001        4523      11022     NaN

What I would like to do is create a df2 with relative values per row (e.g. instead of 176, I want to show what percentage 176 is of the total for the row 2018-01-01).

If I try df / df.sum(axis=1) * 100, I get 'ValueError: cannot join with no overlapping index names'.

It does work for a single row, however: df.iloc[0,:] / df.iloc[0,:].sum() * 100

And it does work with a workaround (transpose and sum columns):

df2 = df.T / df.T.sum() * 100
df2 = df2.T

So I guess it has something to do with the multi-level column header?

rubensch
    The problem is how DataFrame and Series division aligns when you use the `/` operator. Instead, use `DataFrame.div`, which lets you specify the appropriate axis for alignment: `df.div(df.sum(axis=1), axis=0)` – ALollz Jan 05 '22 at 22:16
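
To make the suggestion in the comment concrete, here is a minimal sketch. It rebuilds a small frame from the sample values in the question; the exact construction (`MultiIndex.from_tuples`, a `DatetimeIndex` named `Date`) and the helper name `row_totals` are assumptions about how the real data is set up. With the `/` operator, pandas matches the Series index against the DataFrame's columns, which appears to be what triggers the join error here; `div` with `axis=0` matches it against the row index instead.

```python
import numpy as np
import pandas as pd

# Rebuild a small frame shaped like the one in the question (assumed layout).
columns = pd.MultiIndex.from_tuples(
    [
        (1, "USA", "Boston"),
        (1, "USA", "Chicago"),
        (1, "Canada", "Vancouver"),
        (2, "Germany", "Cologne"),
        (2, "France", "Paris"),
        (2, "France", "Marseille"),
    ],
    names=["Continent", "Country", "City"],
)
index = pd.Index(pd.to_datetime(["2018-01-01", "2018-02-01"]), name="Date")
df = pd.DataFrame(
    [
        [176, 10982, 794, 34225, 1875, 29001],
        [500, 756, 10001, 4523, 11022, np.nan],
    ],
    index=index,
    columns=columns,
)

# Row totals: a Series indexed by Date (NaN values are skipped by default).
row_totals = df.sum(axis=1)

# axis=0 aligns the Series with the row index rather than the columns.
df2 = df.div(row_totals, axis=0) * 100
```

Each row of `df2` should then add up to 100, and the missing Marseille value for 2018-02-01 stays `NaN`. The transpose workaround from the question should give the same numbers; `div` just avoids the two transposes.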
