-3

Can anyone help me? I'm trying to append a few CSV files from a folder to a DataFrame using a basic loop, but for some reason I'm getting an error that I have never seen before:

import pandas as pd

import os

path = r"C:\Users\Bernardo Faria\OneDrive\xxx\yyy\Data"

files_list = os.listdir(path)

# [
#     ".ipynb_checkpoints",
#     "sales_2021-12-01_2021-12-31 (1).csv",
#     "sales_2021-12-01_2021-12-31 (2).csv",
#     "sales_2021-12-01_2021-12-31 (3).csv",
#     "sales_2021-12-01_2021-12-31 (4).csv",
#     "sales_2021-12-01_2021-12-31 (5).csv",
#     "sales_2021-12-01_2021-12-31 (6).csv",
#     "sales_2021-12-01_2021-12-31 (7).csv",
#     "sales_2021-12-01_2021-12-31 (8).csv",
#     "sales_2021-12-01_2021-12-31.csv",
# ]

df = pd.DataFrame()
for i in files_list:
    df_1 = pd.read_csv(path + "\\" + i)
    df = df.append(df_1)

PermissionError: [Errno 13] Permission denied: 'C:\\Users\\Bernardo Faria\\OneDrive\\xxx\\yyy\\Data\\.ipynb_checkpoints'

AKX
BernardoF
  • Why is `.ipynb_checkpoints` in the files list? It's not a csv file, in fact it appears not to be a file at all. – John Gordon Dec 31 '21 at 19:46
  • @JohnGordon Sorry, that was part of my bad reformat. OP was actually calling `os.listdir()`. – AKX Dec 31 '21 at 19:46
  • Any possibility that the file is open already? Windows does that if the file is opened by other program(s). – MSH Dec 31 '21 at 19:48

3 Answers

2

You're trying to open the .ipynb_checkpoints directory as a CSV. That won't work.

Only open CSVs in your loop:

df = pd.DataFrame()
for i in files_list:
    if not i.endswith(".csv"):  # Skip anything not ending in .csv, such as .ipynb_checkpoints.
        continue
    df_1 = pd.read_csv(os.path.join(path, i))
    df = pd.concat([df, df_1])  # DataFrame.append was removed in pandas 2.0.
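
A slightly stricter variant of the same idea, as a minimal sketch: besides checking the extension, it also checks that each entry is a regular file, so a directory like .ipynb_checkpoints is skipped even if it happened to be named something.csv. The helper name `list_csv_files` and the temporary demo folder are made up for illustration and are not part of the original question.

```python
import os
import tempfile


def list_csv_files(folder):
    """Return sorted full paths of regular files in `folder` ending in .csv."""
    paths = []
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        # os.path.isfile() is False for directories such as .ipynb_checkpoints.
        if os.path.isfile(full_path) and name.lower().endswith(".csv"):
            paths.append(full_path)
    return sorted(paths)


# Self-contained demo in a temporary folder (a stand-in for the real data folder).
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, ".ipynb_checkpoints"))
    for name in ("a.csv", "b.csv", "notes.txt"):
        with open(os.path.join(demo_dir, name), "w") as fh:
            fh.write("x\n1\n")
    demo_names = [os.path.basename(p) for p in list_csv_files(demo_dir)]
    print(demo_names)
```

Each path returned by the helper could then be passed to pd.read_csv() in the loop.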
AKX
  • I think you would get `pandas.errors.EmptyDataError: No columns to parse from file` if the file is not formatted correctly. – MSH Dec 31 '21 at 19:51
  • @MSH That'd be a different problem. OP is using `.listdir()` without any filtering, so they end up passing the `.ipynb_checkpoints` directory to `pd.read_csv()`, which won't work. – AKX Dec 31 '21 at 19:52
  • right just saw that ipynb_checkpoints is a folder – gilf0yle Dec 31 '21 at 20:05
  • It worked AKX, thanks so much :) – BernardoF Dec 31 '21 at 21:33
0

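Try using glob with a wildcard pattern, like so:

import glob
import os
import pandas as pd

df = pd.DataFrame()
path = r"C:\Users\Bernardo Faria\OneDrive\xxx\yyy\Data"
# Build the pattern with os.path.join; '\*' in a normal string is an invalid escape sequence.
files = glob.glob(os.path.join(path, "*.csv"))
for file in files:
    df_1 = pd.read_csv(file)
    df = pd.concat([df, df_1])  # DataFrame.append was removed in pandas 2.0.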

Also, are you the user "Bernardo Faria"? If you are a standard user trying to access a different user's directory on Windows, you will get an access-denied error.

Another method I have used in the past to stack dataframes on top of each other is like so:

df = None
for file in files:
    if df is not None:
        df_other = pd.read_csv(file, low_memory=False)
        df = pd.concat([df, df_other])
    else:
        df = pd.read_csv(file, low_memory=False)

Note: I used low_memory=False, because these csv files were pretty large.
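
For what it's worth, the stacking loop above calls pd.concat once per file, and each call copies the rows accumulated so far. A common alternative is to collect the frames in a list and concatenate once at the end. The sketch below is only an illustration: it uses in-memory CSV text via io.StringIO in place of real files, and the column names `id` and `amount` are made up.

```python
import io

import pandas as pd

# In-memory stand-ins for CSV files; real code would pass file paths instead.
csv_sources = [
    io.StringIO("id,amount\n1,10\n2,20\n"),
    io.StringIO("id,amount\n3,30\n"),
]

# Read every source into its own DataFrame, then concatenate a single time.
frames = [pd.read_csv(src) for src in csv_sources]

# ignore_index=True renumbers the rows 0..n-1 instead of repeating each file's index.
combined = pd.concat(frames, ignore_index=True)
print(combined.shape)
```

Note that pd.concat raises a ValueError on an empty list, so it is worth checking that at least one file matched before concatenating.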

Thomas
-1

I think the error is due to the file path: the "Bernardo Faria" part may be what breaks it. I suggest moving the folder directly under C:\ and trying again.

For example:

path = r"C:\Data"
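
Before moving any files, one way to test this hypothesis is to check what Python itself reports for a path that contains a space. This is only a rough sketch: the helper `describe_path` and the temporary folder with the "Bernardo Faria " prefix are invented for the demo and stand in for the real folder.

```python
import os
import tempfile


def describe_path(path):
    """Classify a path as 'missing', 'directory', or 'file'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "directory"
    return "file"


# Create a temporary folder whose name contains a space, plus one small CSV inside it.
with tempfile.TemporaryDirectory(prefix="Bernardo Faria ") as spaced_dir:
    csv_path = os.path.join(spaced_dir, "sales.csv")
    with open(csv_path, "w") as fh:
        fh.write("id\n1\n")
    results = (
        describe_path(spaced_dir),
        describe_path(csv_path),
        describe_path(os.path.join(spaced_dir, "nope.csv")),
    )
    print(results)
```

If the real path is reported as a directory rather than a file, the problem is what is being opened, not where it lives.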