I have a Python script that reads a number of CSV files and creates a new CSV file containing the last line of each file read. The script is this:
import pandas as pd
import glob
import os

path = r'Directory of the files read\*common_file_name_part.csv'
r_path = r'Directory where the resulting file is saved.'

# Remove results from a previous run, if any
if os.path.exists(r_path + 'csv'):
    os.remove(r_path + 'csv')
if os.path.exists(r_path + 'txt'):
    os.remove(r_path + 'txt')

files = glob.glob(path)

# 44 numbered column names: '1', '2', ..., '44'
column_list = [str(i + 1) for i in range(44)]

df = pd.DataFrame(columns=column_list)
for name in files:
    df_n = pd.read_csv(name, names=column_list)
    # Keep only the last row of each file
    df = df.append(df_n.iloc[-1], ignore_index=True)
    del df_n

df.to_csv(r_path + 'csv', index=False, header=False)
del df
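As a side note, `DataFrame.append` was removed in pandas 2.0; the last-row collection above can be sketched with `pd.concat` instead. This is only a minimal illustration with made-up in-memory data standing in for the real files:

```python
import io
import pandas as pd

# Hypothetical stand-ins for the headerless CSV files
csv_texts = [
    "1,2,3\n4,5,6\n",
    "7,8\n9,10\n11,12\n",
]

last_rows = []
for text in csv_texts:
    # header=None stops pandas from treating the first data row as column names
    df_n = pd.read_csv(io.StringIO(text), header=None)
    last_rows.append(df_n.iloc[-1])

# pd.concat replaces the removed DataFrame.append pattern;
# shorter rows are padded with NaN to the widest row seen
df = pd.concat(last_rows, axis=1).T.reset_index(drop=True)
```

Here `df` ends up with one row per input, and missing trailing fields become NaN.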
The files all share a common name ending and a unique name beginning. The resulting file deliberately has no extension so I can do some checks on it first. My problem is that the files have a variable number of lines and columns, even within the same file, and I can't read them properly. If I don't specify the column names, pandas takes the first line as the column names, and that causes a lot of columns to be lost from some of the files. I've also tried reading the files without headers, by writing:
df = pd.read_csv(name, header=None)
but it doesn't seem to work. I wanted to upload some example files, but I don't know how; if someone can tell me, I'll be happy to do it.
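Since pandas expects a rectangular table, one workaround I've been considering for grabbing just the last line is the standard `csv` module, which tolerates rows of different lengths. A minimal sketch, again with made-up data in place of the real files:

```python
import csv
import io

# Hypothetical file contents: headerless rows with differing field counts
sample = "a,b,c\nd,e\nf,g,h,i\n"

# csv.reader makes no assumption about a fixed column count,
# so each row keeps exactly the fields it actually contains
rows = list(csv.reader(io.StringIO(sample)))
last_row = rows[-1]
```

For a real file this would be `list(csv.reader(open(name, newline='')))` per file, and the last rows could then be written out with `csv.writer`.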