dataframe.columns changes order of list

Question

I import tab-delimited data with pandas and assign new column names via dataframe.columns = []. However, when I assign the column names, their order changes.

This is my data:

"ID_final"  "Value01"   "Value02"   "Value03"   "Value04"   "Value05"   "Value06"   "Value07"   "Value08"   "Value09"   "Value10"   "Value11"   "Value12"
724 0.00332 0.00224 0.00186 0.00131 0.00108 0.09092 0.14388 0.02926 0.01127 0.00829 0.00593 0.00448
1029    0.00317 0.00221 0.00193 0.00139 0.00128 0.04204 0.09327 0.02509 0.01035 0.00776 0.00561 0.00438
1700    0.0051  0.00353 0.00304 0.00233 0.00189 0.13548 0.21747 0.04044 0.01531 0.01173 0.00856 0.00667

And this is what I do:

import pandas as pd 

dataframe = pd.read_csv('data.txt', sep='\t') 

header = {
        'ID',
        'January',
        'February',
        'March',
        'April',
        'May',
        'June',
        'July',
        'August',
        'September',
        'October',
        'November',
        'December'}

dataframe.columns = header

After I've assigned the column names, the order of the header has changed: it starts with September, with the other months following more or less randomly. How can I keep the order of header?

Can you check this solution? https://stackoverflow.com/questions/36539396/how-to-create-a-dataframe-while-preserving-order-of-the-columns — Shobhit Kumar, Aug 16 '19 at 11:08
Sets don't preserve order. In fact, my original question doesn't matter because ordering only applies to dictionaries from V 3.6 upwards. Sets are still unordered. — roganjosh, Aug 16 '19 at 11:10

jezrael · Accepted Answer · 2019-08-16T11:12:51.297

I believe you need to pass the values as a list to the names parameter of read_csv. It is also necessary to set header=0 to overwrite the old column names:

header = [
        'ID',
        'January',
        'February',
        'March',
        'April',
        'May',
        'June',
        'July',
        'August',
        'September',
        'October',
        'November',
        'December']
dataframe = pd.read_csv('data.txt', sep='\t', header=0, names=header)

An alternative solution is to skip the first row, which holds the old header:

dataframe = pd.read_csv('data.txt', sep='\t', skiprows=1, names=header)
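The two read_csv calls above should produce the same frame. Here is a minimal sketch to check that, assuming a shortened three-column stand-in for data.txt held in memory (the raw string and the shortened header list are illustrative, not the original file):

```python
import io

import pandas as pd

# Shortened, in-memory stand-in for data.txt (tab-delimited, quoted header row).
raw = (
    '"ID_final"\t"Value01"\t"Value02"\n'
    "724\t0.00332\t0.00224\n"
    "1029\t0.00317\t0.00221\n"
)

# A list keeps the names in the order they are written.
header = ["ID", "January", "February"]

# Variant 1: treat row 0 as the header, then replace it with `names`.
a = pd.read_csv(io.StringIO(raw), sep="\t", header=0, names=header)

# Variant 2: skip the old header row and supply `names` directly.
b = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=1, names=header)

print(list(a.columns))
print(a.equals(b))
```

Either way the old header row is dropped and the new names are applied in list order.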

EDIT: As @roganjosh mentioned, in your own solution you only need to assign a list (not a set) to columns:

dataframe = pd.read_csv('data.txt', sep='\t') 

header = [
        'ID',
        'January',
        'February',
        'March',
        'April',
        'May',
        'June',
        'July',
        'August',
        'September',
        'October',
        'November',
        'December']

dataframe.columns = header
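To catch this kind of mistake early, the assignment could be wrapped in a small guard. This is only a sketch: set_columns_in_order is a hypothetical helper, not part of pandas, and the sample data is a shortened stand-in for data.txt.

```python
import io

import pandas as pd


def set_columns_in_order(df, names):
    """Hypothetical helper: assign column names from an ordered sequence only."""
    # Sets have no defined order, so refuse them outright.
    if isinstance(names, (set, frozenset)):
        raise TypeError("column names must be a list or tuple, not a set")
    names = list(names)
    if len(names) != df.shape[1]:
        raise ValueError(
            f"expected {df.shape[1]} column names, got {len(names)}"
        )
    df.columns = names
    return df


# Shortened, in-memory stand-in for data.txt.
raw = '"ID_final"\t"Value01"\t"Value02"\n724\t0.00332\t0.00224\n'
df = pd.read_csv(io.StringIO(raw), sep="\t")

header = ["ID", "January", "February"]
df = set_columns_in_order(df, header)
print(list(df.columns))
```

Passing a set such as {'ID', 'January', 'February'} to this helper raises a TypeError instead of silently shuffling the names.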

They could rename the columns afterwards, the issue is that the iterable of names that they passed is a `set` and not a `list` so the ordering isn't preserved. — roganjosh, Aug 16 '19 at 11:12
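If the columns are renamed after loading, as the comment suggests, one option is an explicit old-to-new mapping passed to DataFrame.rename, which pairs names by label rather than by position. A minimal sketch, again assuming a shortened stand-in for data.txt:

```python
import io

import pandas as pd

# Shortened, in-memory stand-in for data.txt.
raw = '"ID_final"\t"Value01"\t"Value02"\n724\t0.00332\t0.00224\n'
df = pd.read_csv(io.StringIO(raw), sep="\t")

# Each old name is tied to its new name, so ordering cannot get mixed up.
mapping = {
    "ID_final": "ID",
    "Value01": "January",
    "Value02": "February",
}
renamed = df.rename(columns=mapping)
print(list(renamed.columns))
```

The mapping is longer to write than a plain list, but it does not depend on the names being supplied in the same order as the file's columns.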
