1

I have columns with similar names but numeric suffixes that represent different occurrences of each column. For example, I have columns (company_1, job_title_1, location_1, company_2, job_title_2, location_2). I would like to order these columns grouped together by the prefix (before the underscore) and then sequentially by the suffix (after the underscore).

How I would like the columns to be: company_1, company_2, job_title_1, job_title_2, location_1, location_2.

Here's what I tried from this question:

df = df.reindex(sorted(df.columns), axis=1)

This resulted in the order: company_1, company_10, company_11 (skipping over 2-9)

codr
  • 51
  • 6
  • Does this answer your question? [Naturally sorting Pandas DataFrame](https://stackoverflow.com/questions/29580978/naturally-sorting-pandas-dataframe) – Henry Ecker Jul 22 '21 at 18:29

2 Answers2

4

This type of sorting is called natural sorting. (Full details in the linked duplicate Naturally sorting Pandas DataFrame)

Via natsort

import pandas as pd
from natsort import natsorted

df = pd.DataFrame(columns=[f'company_{i}' for i in [5, 2, 3, 4, 1, 10]])

print(df)

Before sort:

Empty DataFrame
Columns: [company_5, company_2, company_3, company_4, company_1, company_10]
Index: []

natsorted:

df = df.reindex(natsorted(df.columns), axis=1)
Empty DataFrame
Columns: [company_1, company_2, company_3, company_4, company_5, company_10]
Index: []

Lexicographic sorting with sorted:

df = df.reindex(sorted(df.columns), axis=1)
Empty DataFrame
Columns: [company_1, company_10, company_2, company_3, company_4, company_5]
Index: []
Henry Ecker
  • 31,792
  • 14
  • 29
  • 50
1

Just for the sake of completeness, you can also get the desired result by passing a function to sorted that splits the strings into name, index tuples.


def index_splitter(x):
    """Example input: 'job_title_1'
       Output: ('job_title', 1)
    """
    *name, index = x.split("_")
    return '_'.join(name), int(index)

df = df.reindex(sorted(df.columns, key=index_splitter), axis=1)
fsimonjetz
  • 3,293
  • 1
  • 4
  • 17