56

I'm looking for a simple way to sort a pandas dataframe by the absolute value of a particular column, but without actually changing the values within the dataframe. Something similar to sorted(df, key=abs). So if I had a dataframe like:

    a   b
0   1   -3
1   2   5 
2   3   -1
3   4   2
4   5   -9

The resultant sorted data when sorting on 'b' would look like:

    a   b
2   3   -1
3   4   2
0   1   -3
1   2   5 
4   5   -9
cs95
  • 330,695
  • 80
  • 606
  • 657
afinit
  • 895
  • 1
  • 9
  • 16

3 Answers3

61

UPDATE

Since 0.17.0 order and sort have been deprecated (thanks @Ruggero Turra), you can use sort_values to achieve this now:

In[16]:

df.reindex(df.b.abs().sort_values().index)
Out[16]: 
   a  b
2  3 -1
3  4  2
0  1 -3
1  2  5
4  5 -9
cs95
  • 330,695
  • 80
  • 606
  • 657
EdChum
  • 339,461
  • 188
  • 752
  • 538
  • I'd love to see a version of this that can do this across multiple indices, so that, e.g., if index1 can be either "Fruit" or "Vegetables", and index2 is any fruit or vegetable, and the values are the total spent on that item, it would be possible to sort the data frame by absolute value of the cost of the type of food (index1), and *then*, internally, by the cost of each type of fruit or vegetable. – HaPsantran Jul 26 '16 at 20:12
  • 7
    just for everyone who is here and is searching for the values with the biggest abs listet first, change the code to: `df.reindex(df.b.abs().sort_values(ascending=False).index)` – PV8 Apr 16 '20 at 10:37
37

Perfect Simple Solution with Pandas > V_1.1.0:

Use the parameter key in the sort_values() function:

import pandas as pd
ttt = pd.DataFrame({'A': ['a', 'b', 'c', 'd', 'e', 'f'], 'B': [-3, -2, -1, 0, 1, 2]})

ttt.sort_values(by='B', key=abs)

will yield:

    A   B
3   d   0
2   c   -1
4   e   1
1   b   -2
5   f   2
0   a   -3
Lucecpkn
  • 561
  • 4
  • 8
  • Had not noticed the `key` kwarg until seeing this answer! – Shan Dou Apr 23 '21 at 06:19
  • `ttt.sort_values(by='B', key=pd.Series.abs)` also works, perhaps as well with older versions of Pandas? – Cactus Philosopher Jul 14 '21 at 17:59
  • @CactusPhilosopher: Not sure about the older versions, because according to [the doc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html), the `key` arg is "new in version 1.1.0.", [released on July 28, 2020](https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.1.0.html). – Lucecpkn Jul 15 '21 at 16:10
18

Towards more idiomatic pandas: Use argsort

A cleaner approach would be to call Series.argsort on the absolute values, and then index:

df.iloc[df['b'].abs().argsort()]

   a  b
2  3 -1
3  4  2
0  1 -3
1  2  5
4  5 -9

If you need to reset the index, use Series.reset_index,

df.iloc[df['b'].abs().argsort()].reset_index(drop=True)

   a  b
0  3 -1
1  4  2
2  1 -3
3  2  5
4  5 -9

Lastly, since argsort does not have an ascending parameter to specify ascending/descending order, you will need to negate df['b'].abs() to sort by descending order.

df.iloc[(-df['b'].abs()).argsort()]

   a  b
4  5 -9
1  2  5
0  1 -3
3  4  2
2  3 -1

You can do this with NumPy as well—use np.abs and ndarray.argsort.

df.iloc[np.abs(df['b'].values).argsort()]

   a  b
2  3 -1
3  4  2
0  1 -3
1  2  5
4  5 -9

Or, for descending order,

df.iloc[(-np.abs(df['b'].values)).argsort()]

   a  b
4  5 -9
1  2  5
0  1 -3
3  4  2
2  3 -1
cs95
  • 330,695
  • 80
  • 606
  • 657