18

I want to get a list of the file names of all PDF files in the folder where my Python script is.

Now I have this code:

files = [f for f in os.listdir('.') if os.path.isfile(f)]
for f in files:

e = (len(files) - 1)

The problem is that this code finds all files in the folder (including .py files), so my "fix" is to name my script so that it is the last file in the folder (zzzz.py) and then subtract the last entry of the list, which is my script.

I have tried many snippets to find only .pdf files, but this is the closest I have got.

Bhargav Rao
Xavier Villafaina

6 Answers

18

Use the glob module:

>>> import glob
>>> glob.glob("*.pdf")
['308301003.pdf', 'Databricks-how-to-data-import.pdf', 'emr-dg.pdf', 'gfs-sosp2003.pdf']
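The question asks about the folder that contains the script, which is not always the current working directory. A minimal sketch of that variant follows; the helper name `list_pdfs` is an assumption made for illustration, not part of the glob module.

```python
import glob
import os


def list_pdfs(folder):
    """Return the sorted base names of entries in `folder` matching *.pdf."""
    # glob.escape keeps characters such as '[' in the folder name literal.
    pattern = os.path.join(glob.escape(folder), "*.pdf")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


# From inside a script file, the script's own folder could be passed like this:
#   list_pdfs(os.path.dirname(os.path.abspath(__file__)))
```

Whether "*.pdf" also matches names such as "A.PDF" depends on the platform, so this sketch should not be relied on for case-insensitive matching.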
vy32
16

Use glob on the directory directly to find all your pdf files:

from os import path
from glob import glob  
def find_ext(dr, ext):
    return glob(path.join(dr,"*.{}".format(ext)))

Demo:

In [2]: find_ext(".","py")
Out[2]: 
['./server.py',
 './new.py',
 './ffmpeg_split.py',
 './clean_download.py',
 './bad_script.py',
 './test.py',
 './settings.py']

If you want the option of ignoring case:

from os import path
from glob import glob
def find_ext(dr, ext, ig_case=False):
    if ig_case:
        ext = "".join(["[{}]".format(
                ch + ch.swapcase()) for ch in ext])
    return glob(path.join(dr, "*." + ext))

Demo:

In [4]: find_ext(".","py",True)
Out[4]: 
['./server.py',
 './new.py',
 './ffmpeg_split.py',
 './clean_download.py',
 './bad_script.py',
 './test.py',
 './settings.py',
 './test.PY']
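To see what the case-insensitive option does, it may help to look at the pattern it builds. The sketch below isolates that step in a hypothetical helper, `case_insensitive_ext`, and checks the result with `fnmatch.fnmatchcase`, which applies shell-style matching to a plain string without touching the filesystem.

```python
import fnmatch


def case_insensitive_ext(ext):
    """Turn 'py' into '[pP][yY]' so a glob pattern accepts either case."""
    return "".join("[{}]".format(ch + ch.swapcase()) for ch in ext)


pattern = "*." + case_insensitive_ext("py")
print(pattern)                                     # *.[pP][yY]
print(fnmatch.fnmatchcase("test.PY", pattern))     # True
print(fnmatch.fnmatchcase("notes.txt", pattern))   # False
```

Each character of the extension becomes a two-character class, so mixed-case names such as "test.Py" match as well.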
Padraic Cunningham
  • I believe you have one extra closing parenthesis after ch.swapcase on line 6 of your 2nd example. This is really great, thanks! – Paul Nov 08 '18 at 14:38
8

You can use endswith:

files = [f for f in os.listdir('.') if os.path.isfile(f) and f.endswith('.pdf')]
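One caveat with this pattern: os.listdir returns bare names, so os.path.isfile(f) resolves them against the current working directory. That works for '.', but for any other folder each name has to be joined with the folder first. A minimal sketch, where `pdf_names` and its `folder` parameter are illustrative names:

```python
import os


def pdf_names(folder="."):
    """Return names of regular files in `folder` that end with '.pdf'."""
    return [
        name
        for name in os.listdir(folder)
        # Join with the folder so isfile() checks the intended path.
        if name.endswith(".pdf") and os.path.isfile(os.path.join(folder, name))
    ]
```

The isfile check also skips a directory that happens to be named like "something.pdf".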
Ahsanul Haque
8

You simply need to filter the names of files, looking for the ones that end with ".pdf", right?

files = [f for f in os.listdir('.') if os.path.isfile(f)]
files = list(filter(lambda f: f.endswith(('.pdf', '.PDF')), files))

Now files contains only the names of files ending with .pdf or .PDF :)

Maciek
5

To get all PDF files recursively:

import os

all_files = []
for dirpath, dirnames, filenames in os.walk("."):
    for filename in [f for f in filenames if f.endswith(".pdf")]:
        all_files.append(os.path.join(dirpath, filename))
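The same walk can be wrapped in a reusable generator. The sketch below assumes a hypothetical function name, `find_pdfs`, and adds a lowercase comparison so that ".PDF" is matched too, which goes slightly beyond the loop above.

```python
import os


def find_pdfs(root="."):
    """Yield paths of files under `root` whose names end with '.pdf' in any case."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower().endswith(".pdf"):
                yield os.path.join(dirpath, filename)
```

Calling list(find_pdfs(".")) gives the same kind of list as all_files, with each path prefixed by the directory it was found in.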
Martin Thoma
0

You can also use the following:

files = filter(
    lambda f: os.path.isfile(f) and f.lower().endswith(".pdf"),
    os.listdir(".")
)
file_list = list(files)

Or, in one line:

list(filter(lambda f: os.path.isfile(f) and f.lower().endswith(".pdf"), os.listdir(".")))

Converting the filter object to a list with list() is optional; you can also iterate over it directly.

Georgios Syngouroglou