9

My Pandas data frame contains the following data:

product,values
 a1,     10
 a5,     20
 a10,    15
 a2,     45
 a3,     12
 a6,     67

I have to sort this data frame based on the product column. Thus, I would like to get the following output:

product,values
 a10,     15
 a6,      67
 a5,      20
 a3,      12
 a2,      45
 a1,      10

Unfortunately, I'm facing the following error:

ErrorDuringImport(path, sys.exc_info())

ErrorDuringImport: problem in views - type 'exceptions.Indentation

Community
  • 1
  • 1
Sai Rajesh
  • 1,612
  • 4
  • 19
  • 40

1 Answers1

17

You can first extract digits and cast to int by astype. Then sort_values of column sort and last drop this column:

df['sort'] = df['product'].str.extract('(\d+)', expand=False).astype(int)
df.sort_values('sort',inplace=True, ascending=False)
df = df.drop('sort', axis=1)
print (df)
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10

It is necessary, because if use only sort_values:

df.sort_values('product',inplace=True, ascending=False)
print (df)
  product  values
5      a6      67
1      a5      20
4      a3      12
3      a2      45
2     a10      15
0      a1      10

Another idea is use natsort library:

from natsort import index_natsorted, order_by_index

df = df.reindex(index=order_by_index(df.index, index_natsorted(df['product'], reverse=True)))
print (df)
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10
jezrael
  • 729,927
  • 78
  • 1,141
  • 1,090