-2

I've got a pandas DataFrame with almost a thousand columns.

The column titles are like

[smth,smth,smth,smth.....a,b,c,d,e]

How would I rearrange the columns to move a, b, c, d, e to the start:

[a,b,c,d,e,smth,smth......]
mozway
Adi Krish

4 Answers

0

If I were you, I would use the built-in .pop() method together with .insert().

pop() removes the column from the DataFrame and returns it. insert() then puts it back at position 0, which shifts all the other columns one place to the right:

first_column = df.pop('A')
df.insert(0, 'A', first_column)

You could repeat this for each of the other columns, and if there are too many columns to do by hand, you could run a loop.
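The loop mentioned above could look something like the sketch below. It assumes each popped column is put back with DataFrame.insert, and that the target position increases by one per column so that the moved columns keep their relative order. The frame and the to_front list are made-up illustration data, not taken from the question.

```python
import pandas as pd

# Hypothetical example frame; the real one would have ~1000 columns.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["X", "Y", "A", "B", "C"])

# Columns to move to the front, in the order they should appear.
to_front = ["A", "B", "C"]

for position, name in enumerate(to_front):
    column = df.pop(name)              # removes the column and returns it
    df.insert(position, name, column)  # re-inserts it at the target position

print(list(df.columns))  # ['A', 'B', 'C', 'X', 'Y']
```

Note that this modifies df in place. Inserting every column at position 0 instead would also work, but it would leave the moved columns in reverse order.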

There is also a good write-up on this at GeeksforGeeks:

https://www.geeksforgeeks.org/how-to-move-a-column-to-first-position-in-pandas-dataframe/

ZachUhlig
0

A clean and efficient way is to use reindex:

cols = list(df.columns)
df = df.reindex(columns=cols[-5:]+cols[:-5])  # reindex returns a new DataFrame, so assign it back

Example:

df = pd.DataFrame([], columns=['X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E'])
print(df)

cols = list(df.columns)
df = df.reindex(columns=cols[-5:]+cols[:-5])
print(df)

output:

Empty DataFrame
Columns: [X, Y, Z, A, B, C, D, E]
Index: []

Empty DataFrame
Columns: [A, B, C, D, E, X, Y, Z]
Index: []
mozway
0

If you simply want to move the last n columns to the front, you can build the new column order and overwrite the DataFrame with that selection:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(100, 5)), columns=list('ABCDE'))
cols = list(df.columns)
cols_shift = 2
new_cols = []
new_cols.extend(cols[-cols_shift:])
new_cols.extend(cols[:-cols_shift])
new_cols

df = df[new_cols]
df 
karas27
0

An explicit way is as follows. Note that it doesn't matter where the columns you want first are currently located.

# example setup
import numpy as np
import pandas as pd

cols = 'foo,bar,smth,smth_else,a,b,c,d,e'.split(',')
df = pd.DataFrame(np.random.randint(0,10, size=(4, len(cols))), columns=cols)

Then, say the columns you want first are ['a', 'b', 'c', 'd', 'e']:

first = ['a', 'b', 'c', 'd', 'e']

out = df[first + [k for k in df.columns if k not in first]]

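# or (sort=False keeps the remaining columns in their original order;
# by default, difference() sorts them):
out = df[pd.Index(first).append(df.columns.difference(first, sort=False))]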


>>> out
   a  b  c  d  e  foo  bar  smth  smth_else
0  7  1  0  2  5    7    2     1          9
1  6  7  7  4  3    7    3     1          2
2  8  9  3  6  2    0    1     5          8
3  1  2  0  3  3    2    4     1          4
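If this reordering is needed in more than one place, the list-comprehension variant above could be wrapped in a small helper. This is only a sketch: the name move_to_front and the explicit check for missing columns are my own additions for illustration, not part of pandas.

```python
import pandas as pd

def move_to_front(df, first):
    """Return a new DataFrame with the columns in `first` placed first.

    Hypothetical helper: the remaining columns keep their original order.
    """
    missing = [name for name in first if name not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    wanted = set(first)  # set lookup keeps this fast with ~1000 columns
    rest = [name for name in df.columns if name not in wanted]
    return df[list(first) + rest]

# Made-up example frame
df = pd.DataFrame([[0] * 6], columns=["foo", "bar", "a", "b", "c", "smth"])
out = move_to_front(df, ["a", "b", "c"])
print(list(out.columns))  # ['a', 'b', 'c', 'foo', 'bar', 'smth']
```

The original df is left unchanged, since selecting with a list of column names returns a new DataFrame.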
Pierre D