20

I see that Pandas has read_fwf, but does it have something like DataFrame.to_fwf? I'm looking for support for field width, numerical precision, and string justification. It seems that DataFrame.to_csv doesn't do this. numpy.savetxt does, but I wouldn't want to do:

numpy.savetxt('myfile.txt', mydataframe.to_records(), fmt='some format')

That just seems wrong. Your ideas are much appreciated.

jkmacc
  • 5,433
  • 3
  • 27
  • 27
  • 5
    take a look at the `to_string` method so see if you can do what you want. – zach May 13 '13 at 01:02
  • This looks close. It seems that I'd have to give a formatter function for each column if any two float or string columns had different formats. It would do the trick, it just looks a little unwieldy. I'd hoped I was missing something. Thanks! – jkmacc May 13 '13 at 18:35
  • 1
    pandas *df.to_csv* has a *sep =" "* parameter that changes the comma to anything else in this case a space or empty string. That in conjunction with the formatter of the method should do it. – Joop Jun 14 '13 at 10:52
  • 1
    @Joop Actually with the `df.to_csv()` method using empty string as delimiter gives the error `TypeError: delimiter must be set`. – pbreach Jan 16 '15 at 21:48
  • 1
    true.. passing empty string to method would create a mess do ignore my reference to empty string. maybe trying pandas "to_string" method would help. if has formatters parameter that is pretty good – Joop Feb 02 '15 at 11:00

7 Answers7

13

Until someone implements this in pandas, you can use the tabulate package:

import pandas as pd
from tabulate import tabulate

def to_fwf(df, fname):
    content = tabulate(df.values.tolist(), list(df.columns), tablefmt="plain")
    open(fname, "w").write(content)

pd.DataFrame.to_fwf = to_fwf
Matt Kramer
  • 564
  • 7
  • 9
9

For custom format for each column you can set format for whole line. fmt param provides formatting for each line

with open('output.dat') as ofile:
     fmt = '%.0f %02.0f %4.1f %3.0f %4.0f %4.1f %4.0f %4.1f %4.0f'
     np.savetxt(ofile, df.values, fmt=fmt)
Brian Burns
  • 17,878
  • 8
  • 77
  • 67
Amir Uteuov
  • 116
  • 1
  • 3
7

Python, Pandas : write content of DataFrame into text File

The question aboves answer helped me. It is not the best, but until to_fwf exists this will do the trick for me...

np.savetxt(r'c:\data\np.txt', df.values, fmt='%d')

or

np.savetxt(r'c:\data\np.txt', df.values, fmt='%10.5f')
Community
  • 1
  • 1
brandog
  • 1,253
  • 3
  • 17
  • 27
  • IMO this is better than `tabulate` since numpy is included with pandas so doesn't require an additional library – maxymoo Mar 09 '17 at 03:55
5

pandas.DataFrame.to_string() is all you need. The only trick is how to manage the index.

# Write
# df.reset_index(inplace=True)  # uncomment if the index matters
df.to_string(filepath, index=False)

# Read
df = pd.read_fwf(filepath)
# df.set_index(index_names, inplace=True)  # uncomment if the index matters

If the index is a pandas.Index that has no name, reset_index() should assign it to column "index". If it is a pandas.MultiIndex that has no names, it should be assigned to columns ["level_0", "level_1", ...].

Alexandre Huat
  • 540
  • 8
  • 11
4

I'm sure you found a workaround for this issue but for anyone else who is curious... If you write the DF into a list, you can write it out to a file by giving the 'format as a string'.format(list indices) eg:

df=df.fillna('')
outF = 'output.txt'      
dbOut = open(temp, 'w')
v = df.values.T.tolist()        
for i in range(0,dfRows):       
    dbOut.write(( \
    '{:7.2f}{:>6.2f}{:>2.0f}{:>4.0f}{:>5.0f}{:6.2f}{:6.2f}{:6.2f}{:6.1f {:>15}{:>60}'\
    .format(v[0][i],v[1][i],v[2][i],v[3][i],v[4][i],v[5][i],v[6][i],v[7][i],v[8][i],\
    v[9][i],v[10][i]) ))
    dbOut.write("\n")
dbOut.close

Just make sure to match up each index with the correct format :)

Hope that helps!

leon yin
  • 751
  • 2
  • 7
  • 22
1

found a very simple solution! (Python). In the code snapped I am trying to write a DataFrame to a positional File. "finalDataFrame.values.tolist()" will return u a list in which each row of the DataFrame is turn into an another list just a [['Camry',2019,'Toyota'],['Mustang','2016','Ford']]. after that with the help of for loop and if statement i am trying to set its fix length. rest is obvious!

 with open (FilePath,'w') as f:
    for i in finalDataFrame.values.tolist():
        widths=(0,0,0,0,0,0,0)
        if i[2] == 'nan':
            i[2]=''
            for h in range(7):
                i[2]= i[2] + ' '
        else:
            x=7-len(str(i[2]))
            a=''
            for k in range(x):
               a=a+' '
            i[2]=str(i[2])+a

        if i[3] == '':
            i[3]=''
            for h in range(25):
                i[3]=i[3]+' '
        else:
            x = 25 - len(i[3])
            print(x)
            a = ''
            for k in range(x):
                a = a + ' '
            print(a)
            i[3] = i[3] + a


        i[4] = str(i[4])[:10]

        q="".join("%*s" % i for i in zip(widths, i))
        f.write(q+'\n')
0

Based on others' answer, here is the snippet I wrote, not the best in coding and performance:

import pandas as pd
import pickle
import numpy as np
from tabulate import tabulate


left_align_gen = lambda length, value: eval(r"'{:<<<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))
right_align_gen = lambda length, value: eval(r"'{:><<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))

# df = pd.read_pickle("dummy.pkl")
with open("df.pkl", 'rb') as f:
    df = pickle.load(f)

# field width defines here, width of each field
widths=(22, 255, 14, 255, 14, 255, 255, 255, 255, 255, 255, 22, 255, 22, 255, 255, 255, 22, 14, 14, 255, 255, 255, 2, )

# format datetime
df['CREATED_DATE'] = df['CREATED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['LAST_MODIFIED_DATE'] = df['LAST_MODIFIED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['TERMS_ACCEPTED_DATE'] = df['TERMS_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['PRIVACY_ACCEPTED_DATE'] = df['PRIVACY_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))


# print(type(df.iloc[0]['CREATED_DATE']))
# print(df.iloc[0])
record_line_list = []
# for row in df.iloc[:10].itertuples():
for row in [tuple(x) for x in df.to_records(index=False)]:
    record_line_list.append("".join(left_align_gen(length, value) for length, value in zip(widths, row)))

with open('output.txt', 'w') as f:
    f.write('\n'.join(record_line_list))

Github gist

Chen Du
  • 99
  • 2
  • 7