1041

How can I find all the files in a directory having the extension .txt in python?

Andrea Ianni ௫
usertest
25 Answers

2920

You can use glob:

import glob, os
os.chdir("/mydir")
for file in glob.glob("*.txt"):
    print(file)

or simply os.listdir:

import os
for file in os.listdir("/mydir"):
    if file.endswith(".txt"):
        print(os.path.join("/mydir", file))

or if you want to traverse the directory tree, use os.walk:

import os
for root, dirs, files in os.walk("/mydir"):
    for file in files:
        if file.endswith(".txt"):
            print(os.path.join(root, file))
Ma0
ghostdog74
  • Using solution #2, how would you create a file or list with that info? – Merlin Oct 19 '10 at 03:48
  • Don't understand. Elaborate more. – ghostdog74 Oct 19 '10 at 04:13
  • `glob.glob(..)` is just `list(glob.iglob(..))`. `os.chdir()` is unnecessary: http://stackoverflow.com/questions/3964681/find-all-files-in-directory-with-extension-txt-with-python/3971553#3971553 – jfs Oct 19 '10 at 18:53
  • @ghostdog74: In my opinion it would be more appropriate to write `for file in f` than `for files in f`, since what is in the variable is a single filename. Even better would be to change the `f` to `files`, and then the for loops could become `for file in files`. – martineau Oct 26 '10 at 14:18
  • @ghostdog74: is there any performance difference between these options? – benregn Sep 03 '11 at 09:05
  • @martineau `file` is a reserved word and cannot be used as a variable. Probably clearer, however, to do `for f in files` and switch around the rest of the code so that `f` is a single file and `files` is the list. – computermacgyver Oct 14 '12 at 12:39
  • @computermacgyver: No, `file` is not a reserved word, just the name of a predefined function, so it's quite possible to use it as a variable name in your own code. Although it's true that generally one should avoid collisions like that, `file` is a special case because there's hardly ever any need to use it, so it is often considered an exception to the guideline. If you don't want to do that, PEP 8 recommends appending a single underscore to such names, i.e. `file_`, which you'd have to agree is still quite readable. – martineau Oct 14 '12 at 19:04
  • Thanks, martineau, you're absolutely right. I jumped too quickly to conclusions. – computermacgyver Oct 15 '12 at 19:53
  • Really cool answer; you could replace `r, d, f` by `r, _, f` to avoid an unused variable declaration. – AsTeR Mar 08 '13 at 20:16
  • I fell for that `for file in f` suggestion in my head, too. Perhaps `for fileName in f` would be best, since we're iterating through a list of file names and not file objects. – SimonT Apr 15 '13 at 04:02
  • My five cents: Python's glob() is probably different from the system glob. When the system glob fails with "Argument list too long", Python's glob works OK. – Nick May 20 '13 at 19:06
  • Why the invocation of `chdir()`? While you're at it using `glob`, why not just `glob('/mydir/*.txt')`? – FooBar Aug 22 '14 at 12:42
  • @computermacgyver - Per PEP 8, you should append an underscore `_` to the end of a name if it would otherwise shadow a builtin (unless of course your intent is to shadow a builtin). So in this case, you should use `for file_ in files`. Or `f` works too, of course, and it's a common choice of variable name for files in Python, so I think that's acceptable even though, as a normal rule of thumb, extremely short variable names like that should be avoided. – ArtOfWarfare Oct 23 '14 at 18:12
  • I stuck with 'fyle' from my [FORTRAN days](https://books.google.ch/books?id=2GPNBQAAQBAJ&lpg=PA324&ots=d6RzaA8FQp&dq=fyle%20fortran&pg=PA324#v=onepage&q=fyle%20fortran&f=false). – philshem Apr 09 '15 at 07:05
  • How are the files fetched? For example, if a directory has n files with the extension .txt, which file is fetched first when the function runs? Based on file size or some other criterion? If I need the largest file first, how do I get that? – Jay Venkat Nov 19 '15 at 06:45
  • A more Pythonic way for #2 can be **for file in [f for f in os.listdir('/mydir') if f.endswith('.txt')]:** – ozgur Jan 18 '16 at 11:46
  • @Merlin, before the for loop, initialize an empty list, say `fileList = []`, and then replace the last statement of the loop with `fileList.append(os.path.join(root, file))`. – Gathide Nov 18 '16 at 16:42
  • I found `text_files = [f for f in os.listdir("/mydir") if os.path.isfile(os.path.join("/mydir", f)) and f.endswith(".txt")]` to work way faster than the second option. – Montmons Mar 23 '17 at 11:38
  • os.listdir also lists directories, so if some cruel person were to name a directory blabla.txt, it would also show that directory (not only .txt files). Use isfile to verify it is actually a file. – Do-do-new Jul 12 '17 at 08:53
  • @ghostdog74 Hey, this is brilliant. Worked nicely for me. Thanks! – bFig8 Sep 16 '17 at 08:19
  • Why not use just `glob.glob('./*.txt')` instead of looping? – Paradox Mar 18 '18 at 19:37
  • How to walk in a non-recursive manner, i.e. only the files in the dir, not in its child dirs? – Michael IV Jun 26 '20 at 11:45
  • Does this only apply to a local drive, or to a folder online (e.g. Google Drive, GitHub) as well? – Paw in Data Jul 15 '20 at 06:47
  • This is how to answer a question! No blah blah blah. No challenging the asker on why they would want to do that. Code includes the import statement, ready to copy. – ChuckZ Jul 31 '20 at 19:12
  • The answer has a few issues that make it not-so-great: 1. The first solution uses `os.chdir`, which is a stateful operation (moving to a relative directory twice won't work - think what happens if it's wrapped in a function). 2. pathlib.Path isn't mentioned despite solving this with less code. – Roee Shenberg Nov 19 '20 at 11:12
  • As pointed out further down, the file match is **case-sensitive**, so use `lower()`. – Timo Nov 24 '20 at 19:40
  • These are great answers! All of them are 100% correct; it's just up to the OP or other people who see this on the internet which one to use as the answer to their problem. – Ice Bear Jan 14 '21 at 08:12
  • `file` is IMO not a great name for that loop variable - filename (verbose) or fn (if understood by convention) would seem better to me. – bytepusher Feb 07 '22 at 14:55
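A minimal sketch tying several of the comments above together - collecting matches into a list, skipping directories with `isfile`, and matching case-insensitively with `lower()` (the current directory stands in for `/mydir` here):

```python
import os

directory = "."  # stand-in for "/mydir"; substitute your own path
txt_files = []
for name in os.listdir(directory):
    full_path = os.path.join(directory, name)
    # isfile() skips sub-directories that happen to end in ".txt";
    # lower() makes the match case-insensitive (.TXT, .Txt, ...)
    if os.path.isfile(full_path) and name.lower().endswith(".txt"):
        txt_files.append(full_path)
```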
320

Use glob.

>>> import glob
>>> glob.glob('./*.txt')
['./outline.txt', './pip-log.txt', './test.txt', './testingvim.txt']
Muhammad Alkarouri
  • Not only is this easy, it is also case-insensitive. (At least, it is on Windows, as it should be. I'm not sure about other OSes.) – Jon Coombs Jan 30 '14 at 04:17
  • Beware that `glob` can't find files **recursively** if your Python is under 3.5. [more info](http://stackoverflow.com/questions/2186525/use-a-glob-to-find-files-recursively-in-python) – qun Apr 27 '16 at 11:26
  • The best part is you can use a wildcard pattern like test*.txt. – Alex Punnen Dec 05 '17 at 05:28
  • @JonCoombs nope. At least not on Linux. – Karuhanga Dec 15 '18 at 17:53
  • This only finds files in the current top-level directory, not within the entire tree. – Cerin Feb 15 '22 at 00:23
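As the comments note, a plain pattern like `./*.txt` only matches the top level. On Python 3.5+ the `**` wildcard with `recursive=True` descends into subdirectories too; a small sketch:

```python
import glob

# top level only
top_level = glob.glob("./*.txt")

# every .txt under the current directory, any depth (Python 3.5+)
everywhere = glob.glob("./**/*.txt", recursive=True)
```

Since `**` also matches zero directories, every top-level match appears in the recursive result as well.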
207

Something like this should do the job:

import os

for root, dirs, files in os.walk(directory):
    for file in files:
        if file.endswith('.txt'):
            print(file)
greybeard
Adam Byrtek
  • +1 for naming your variables `root, dirs, files` instead of `r, d, f`. Much more readable. – Clément Jan 04 '13 at 18:31
  • Note that this is case-sensitive (it won't match .TXT or .Txt), so you'll probably want to do `if file.lower().endswith('.txt'):` – Jon Coombs Jan 30 '14 at 03:17
  • Your answer deals with subdirectories. – Sam Liao Mar 06 '15 at 03:34
  • As a list comprehension: `text_file_list = [file for root, dirs, files in os.walk(folder) for file in files if file.endswith('.txt')]` – Nir Oct 07 '21 at 09:59
147

Something like this will work:

>>> import os
>>> path = '/usr/share/cups/charmaps'
>>> text_files = [f for f in os.listdir(path) if f.endswith('.txt')]
>>> text_files
['euc-cn.txt', 'euc-jp.txt', 'euc-kr.txt', 'euc-tw.txt', ... 'windows-950.txt']
Seth
  • 1
    How would i save the path to the text_files? ['path/euc-cn.txt', ... 'path/windows-950.txt'] – IceQueeny Nov 07 '17 at 15:12
  • 8
    You could use [`os.path.join`](https://docs.python.org/2/library/os.path.html#os.path.join) on each element of `text_files`. It could be something like `text_files = [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.txt')]`. – Seth Nov 07 '17 at 21:38
128

You can simply use pathlib's glob¹:

import pathlib

list(pathlib.Path('your_directory').glob('*.txt'))

or in a loop:

for txt_file in pathlib.Path('your_directory').glob('*.txt'):
    # do something with "txt_file"
    ...

If you want it recursive, you can use `.glob('**/*.txt')`.
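A quick sketch of the recursive variant (using the current directory, since `your_directory` above is just a placeholder): `Path.rglob('*.txt')` is shorthand for `Path.glob('**/*.txt')`.

```python
import pathlib

base = pathlib.Path(".")  # stand-in for your own directory

# these two produce the same files: rglob('*.txt') == glob('**/*.txt')
recursive_a = sorted(base.glob("**/*.txt"))
recursive_b = sorted(base.rglob("*.txt"))
```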


¹ The pathlib module was included in the standard library in Python 3.4, but you can install back-ports of that module even on older Python versions (e.g. using conda or pip): pathlib and pathlib2.

Jeril
MSeifert
  • `**/*.txt` is not supported by older Python versions. So I solved this with: `foundfiles = subprocess.check_output("ls **/*.txt", shell=True)` `for foundfile in foundfiles.splitlines():` `print foundfile` – Roman Jun 22 '17 at 14:05
  • @Roman Yes, it was just a showcase of what `pathlib` can do and I already included the Python version requirements. :) But if your approach hasn't been posted already, why not just add it as another answer? – MSeifert Jun 22 '17 at 16:02
  • Yes, posting an answer would have given me better formatting possibilities, definitely. I posted it [there](https://stackoverflow.com/questions/2186525/use-a-glob-to-find-files-recursively-in-python/44719006#44719006) because I think this is a more appropriate place for it. – Roman Jun 23 '17 at 10:24
  • Note that you can also use `rglob` if you want to look for items recursively. E.g. `.rglob('*.txt')` – Bram Vanroy Jun 12 '19 at 13:39
49
import os

path = 'mypath/path' 
files = os.listdir(path)

files_txt = [i for i in files if i.endswith('.txt')]
user3281344
35

I like os.walk():

import os

for root, dirs, files in os.walk(dir):
    for f in files:
        if os.path.splitext(f)[1] == '.txt':
            fullpath = os.path.join(root, f)
            print(fullpath)

Or with generators:

import os

fileiter = (os.path.join(root, f)
    for root, _, files in os.walk(dir)
    for f in files)
txtfileiter = (f for f in fileiter if os.path.splitext(f)[1] == '.txt')
for txt in txtfileiter:
    print(txt)
TamaMcGlinn
hughdbrown
34

Here are more versions of the same that produce slightly different results:

glob.iglob()

import glob
for f in glob.iglob("/mydir/*/*.txt"):  # generator, searches immediate subdirectories
    print(f)

glob.glob1()

print(glob.glob1("/mydir", "*.tx?"))  # literal_directory, basename_pattern

fnmatch.filter()

import fnmatch, os
print(fnmatch.filter(os.listdir("/mydir"), "*.tx?"))  # includes dot-files
jfs
  • 3
    For the curious, `glob1()` is a helper function in the `glob` module which isn't listed in the Python documentation. There's some inline comments describing what it does in the source file, see `.../Lib/glob.py`. – martineau Oct 26 '10 at 14:03
  • 1
    @martineau: `glob.glob1()` is not public but it is available on Python 2.4-2.7;3.0-3.2; pypy; jython http://github.com/zed/test_glob1 – jfs Oct 26 '10 at 23:15
  • 1
    Thanks, that's good additional information to have when deciding whether to use a undocumented private function in a module. ;-) Here's a little more. The Python 2.7 version is only 12 lines long and looks like it could easily be extracted from the `glob` module. – martineau Oct 27 '10 at 00:30
23

Try this; it will find all your files recursively:

import glob, os

os.chdir("H:\\wallpaper")  # use whatever directory you want; note the double backslashes, not single

for file in glob.glob("**/*.txt", recursive=True):
    print(file)
pyDdev
mayank
  • 1
    not with recursive version (double star: `**`). Only available in python 3. What I don't like is the `chdir` part. No need for that. – Jean-François Fabre May 17 '19 at 17:41
  • 2
    well, you could use the os library to join the path, e.g., `filepath = os.path.join('wallpaper')` and then use it as `glob.glob(filepath+"**/*.psd", recursive = True)`, which would yield the same result. – Mitalee Rao Aug 19 '19 at 05:12
  • note that should rename `file` assignment to something like `_file` to not conflict with saved type names – ganski Jul 29 '20 at 21:14
  • I noticed that it is case insensitive (on windows at least). How to make the pattern matching case sensitive? – qqqqq Apr 01 '21 at 17:14
  • _glob_ acts differently in ipython than in running code and is generally surprising. I have told myself to excise it in the past and keep being stubborn, coming back to it, and paying for it. – WestCoastProjects Jun 16 '21 at 18:14
23

Python v3.5+

Fast method using os.scandir in a recursive function. Searches for all files with a specified extension in a folder and its sub-folders. It is fast, even for finding tens of thousands of files.

I have also included a function to convert the output to a Pandas Dataframe.

import os
import re
import pandas as pd
import numpy as np


def findFilesInFolderYield(path,  extension, containsTxt='', subFolders = True, excludeText = ''):
    """  Recursive function to find all files of an extension type in a folder (and optionally in all subfolders too)

    path:               Base directory to find files
    extension:          File extension to find.  e.g. 'txt'.  Regular expression. Or  'ls\d' to match ls1, ls2, ls3 etc
    containsTxt:        List of Strings, only finds file if it contains this text.  Ignore if '' (or blank)
    subFolders:         Bool.  If True, find files in all subfolders under path. If False, only searches files in the specified folder
    excludeText:        Text string.  Ignore if ''. Will exclude if text string is in path.
    """
    if type(containsTxt) == str: # if a string and not in a list
        containsTxt = [containsTxt]
    
    myregexobj = re.compile(r'\.' + extension + '$')    # Makes sure the file extension is at the end and is preceded by a .
    
    try:   # Trapping a OSError or FileNotFoundError:  File permissions problem I believe
        for entry in os.scandir(path):
            if entry.is_file() and myregexobj.search(entry.path): # 
    
                bools = [True for txt in containsTxt if txt in entry.path and (excludeText == '' or excludeText not in entry.path)]
    
                if len(bools)== len(containsTxt):
                    yield entry.stat().st_size, entry.stat().st_atime_ns, entry.stat().st_mtime_ns, entry.stat().st_ctime_ns, entry.path
    
            elif entry.is_dir() and subFolders:   # if its a directory, then repeat process as a nested function
                yield from findFilesInFolderYield(entry.path, extension, containsTxt, subFolders, excludeText)
    except OSError as ose:
        print('Cannot access ' + path +'. Probably a permissions error ', ose)
    except FileNotFoundError as fnf:
        print(path +' not found ', fnf)

def findFilesInFolderYieldandGetDf(path,  extension, containsTxt, subFolders = True, excludeText = ''):
    """  Converts returned data from findFilesInFolderYield and creates and Pandas Dataframe.
    Recursive function to find all files of an extension type in a folder (and optionally in all subfolders too)

    path:               Base directory to find files
    extension:          File extension to find.  e.g. 'txt'.  Regular expression. Or  'ls\d' to match ls1, ls2, ls3 etc
    containsTxt:        List of Strings, only finds file if it contains this text.  Ignore if '' (or blank)
    subFolders:         Bool.  If True, find files in all subfolders under path. If False, only searches files in the specified folder
    excludeText:        Text string.  Ignore if ''. Will exclude if text string is in path.
    """
    
    fileSizes, accessTimes, modificationTimes, creationTimes, paths = zip(*findFilesInFolderYield(path, extension, containsTxt, subFolders, excludeText))
    df = pd.DataFrame({
            'FLS_File_Size':fileSizes,
            'FLS_File_Access_Date':accessTimes,
            'FLS_File_Modification_Date':np.array(modificationTimes).astype('timedelta64[ns]'),
            'FLS_File_Creation_Date':creationTimes,
            'FLS_File_PathName':paths,
                  })
    
    df['FLS_File_Modification_Date'] = pd.to_datetime(df['FLS_File_Modification_Date'],infer_datetime_format=True)
    df['FLS_File_Creation_Date'] = pd.to_datetime(df['FLS_File_Creation_Date'],infer_datetime_format=True)
    df['FLS_File_Access_Date'] = pd.to_datetime(df['FLS_File_Access_Date'],infer_datetime_format=True)

    return df

ext = 'txt'   # regular expression
containsTxt = []
path = r'C:\myFolder'
df = findFilesInFolderYieldandGetDf(path, ext, containsTxt, subFolders=True)
DougR
22

path.py is another alternative: https://github.com/jaraco/path.py

from path import path
p = path('/path/to/the/directory')
for f in p.files(pattern='*.txt'):
    print(f)
Anuvrat Parashar
19

Python has all the tools to do this:

import os

the_dir = 'the_dir_that_want_to_search_in'
all_txt_files = filter(lambda x: x.endswith('.txt'), os.listdir(the_dir))
Xxxo
  • 1
    If you want all_txt_files to be a list: `all_txt_files = list(filter(lambda x: x.endswith('.txt'), os.listdir(the_dir)))` – Ena Jun 11 '18 at 15:39
19

To get all '.txt' file names inside 'dataPath' folder as a list in a Pythonic way:

from os import listdir
from os.path import isfile, join
path = "/dataPath/"
onlyTxtFiles = [f for f in listdir(path) if isfile(join(path, f)) and  f.endswith(".txt")]
print(onlyTxtFiles)
Arsen Khachaturyan
ewalel
10

I did a test (Python 3.6.4, W7x64) to see which solution is the fastest for one folder, no subdirectories, to get a list of complete file paths for files with a specific extension.

In short, for this task os.listdir() is the fastest: 1.7x as fast as the next best, os.walk() (with a break!), 2.7x as fast as pathlib, 3.2x as fast as os.scandir() and 3.3x as fast as glob.
Please keep in mind that these results will change when you need recursive results. If you copy/paste one of the methods below, please add a .lower(), otherwise .EXT would not be found when searching for .ext.

import os
import pathlib
import timeit
import glob

def a():
    path = pathlib.Path().cwd()
    list_sqlite_files = [str(f) for f in path.glob("*.sqlite")]

def b(): 
    path = os.getcwd()
    list_sqlite_files = [f.path for f in os.scandir(path) if os.path.splitext(f)[1] == ".sqlite"]

def c():
    path = os.getcwd()
    list_sqlite_files = [os.path.join(path, f) for f in os.listdir(path) if f.endswith(".sqlite")]

def d():
    path = os.getcwd()
    os.chdir(path)
    list_sqlite_files = [os.path.join(path, f) for f in glob.glob("*.sqlite")]

def e():
    path = os.getcwd()
    list_sqlite_files = [os.path.join(path, f) for f in glob.glob1(str(path), "*.sqlite")]

def f():
    path = os.getcwd()
    list_sqlite_files = []
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith(".sqlite"):
                list_sqlite_files.append( os.path.join(root, file) )
        break



print(timeit.timeit(a, number=1000))
print(timeit.timeit(b, number=1000))
print(timeit.timeit(c, number=1000))
print(timeit.timeit(d, number=1000))
print(timeit.timeit(e, number=1000))
print(timeit.timeit(f, number=1000))

Results:

# Python 3.6.4
0.431
0.515
0.161
0.548
0.537
0.274
user136036
  • 1
    The Python 3.6.5 documentation states : The os.scandir() function returns directory entries along with file attribute information, giving better performance [ than os.listdir() ] for many common use cases. – Bill Oldroyd Apr 16 '18 at 08:23
  • 1
    I am missing the scaling extent of this test how many files did you use in this test? how do they compare if you scale the number up/down? – N4ppeL Nov 11 '19 at 12:47
9
import os
import sys

if len(sys.argv) != 3:
    print('usage: script.py <dir> <mask>')
    sys.exit(1)

dir = sys.argv[1]
mask = sys.argv[2]

files = os.listdir(dir)

res = filter(lambda x: x.endswith(mask), files)

print(list(res))
mrgloom
7

To get a list of ".txt" file names from a folder called "data" in the same directory, I usually use this simple line of code:

import os
fileNames = [fileName for fileName in os.listdir("data") if fileName.endswith(".txt")]
Kamen Tsvetkov
6

This code makes my life simpler.

import os
fnames = ([file for root, dirs, files in os.walk(dir)
    for file in files
    if file.endswith('.txt') #or file.endswith('.png') or file.endswith('.pdf')
    ])
for fname in fnames: print(fname)
praba230890
5

Use fnmatch: https://docs.python.org/2/library/fnmatch.html

import fnmatch
import os

for file in os.listdir('.'):
    if fnmatch.fnmatch(file, '*.txt'):
        print(file)
yucer
5

A copy-pastable solution similar to ghostdog's:

def get_all_filepaths(root_path, ext):
    """
    Search all files which have a given extension within root_path.

    This ignores the case of the extension and searches subdirectories, too.

    Parameters
    ----------
    root_path : str
    ext : str

    Returns
    -------
    list of str

    Examples
    --------
    >>> get_all_filepaths('/run', '.lock')
    ['/run/unattended-upgrades.lock',
     '/run/mlocate.daily.lock',
     '/run/xtables.lock',
     '/run/mysqld/mysqld.sock.lock',
     '/run/postgresql/.s.PGSQL.5432.lock',
     '/run/network/.ifstate.lock',
     '/run/lock/asound.state.lock']
    """
    import os
    all_files = []
    for root, dirs, files in os.walk(root_path):
        for filename in files:
            if filename.lower().endswith(ext):
                all_files.append(os.path.join(root, filename))
    return all_files

You can also use yield to create a generator and thus avoid assembling the complete list:

def get_all_filepaths(root_path, ext):
    import os
    for root, dirs, files in os.walk(root_path):
        for filename in files:
            if filename.lower().endswith(ext):
                yield os.path.join(root, filename)
Martin Thoma
  • The main flaw in the @ghostdog answer is case-sensitivity. The use of `lower()` here is critical in many situations. Thanks! But I'm guessing the doctest won't work, right? A solution using `yield` might also be better in many situations. – nealmcb Jul 15 '21 at 19:09
  • @nealmcb I don't know how to write a brief doctest for a function that makes use of the local file system. For me, the primary purpose of the docstring is communication to a human. If the docstring helps to understand what the function is doing, it's a good docstring. – Martin Thoma Jul 15 '21 at 20:58
  • About yield: yes, that's a good idea for sure! Adjusting it to use `yield` is trivial. I would like to keep the answer beginner-friendly, which means avoiding yield... maybe I'll add it later. – Martin Thoma Jul 15 '21 at 21:00
4

I suggest using fnmatch and the upper method. This way you can find any of the following:

  1. Name.txt
  2. Name.TXT
  3. Name.Txt

import fnmatch
import os

for file in os.listdir("/Users/Johnny/Desktop/MyTXTfolder"):
    if fnmatch.fnmatch(file.upper(), '*.TXT'):
        print(file)
Nicolaesse
4

Here's one with extend():

import glob, os

path = '.'  # wherever your images live
types = ('*.jpg', '*.png')
images_list = []
for files in types:
    images_list.extend(glob.glob(os.path.join(path, files)))
Efreeto
3

Functional solution with sub-directories:

from fnmatch import filter
from functools import partial
from itertools import chain
from os import path, walk

print(*chain(*(map(partial(path.join, root), filter(filenames, "*.txt")) for root, _, filenames in walk("mydir"))))
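For readability, the same pipeline unrolled step by step (a sketch using the current directory in place of "mydir"):

```python
from fnmatch import filter as fnfilter
from os import path, walk

# same logic as the one-liner above: walk the tree, keep *.txt names,
# and join each back onto the directory it was found in
matches = []
for root, _, filenames in walk("."):
    for name in fnfilter(filenames, "*.txt"):
        matches.append(path.join(root, name))
print(*matches)
```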
2

In case the folder contains a lot of files or memory is a constraint, consider using generators:

import os

def yield_files_with_extensions(folder_path, file_extension):
    for _, _, files in os.walk(folder_path):
        for file in files:
            if file.endswith(file_extension):
                yield file

Option A: Iterate

for f in yield_files_with_extensions('.', '.txt'): 
    print(f)

Option B: Get all

files = list(yield_files_with_extensions('.', '.txt'))
tashuhka
1

Use the Python os module to find files with a specific extension.

A simple example:

import os

# This is the path where you want to search
path = r'd:'  

# this is extension you want to detect
extension = '.txt'   # this can be : .jpg  .png  .xls  .log .....

for root, dirs_list, files_list in os.walk(path):
    for file_name in files_list:
        if os.path.splitext(file_name)[-1] == extension:
            file_name_path = os.path.join(root, file_name)
            print(file_name)
            print(file_name_path)   # This is the full path of the filtered file
Rajiv Sharma
1

Many users have replied with os.walk answers, which include all files, but also descend into all directories and subdirectories and include their files.

import os


def files_in_dir(path, extension=''):
    """
       Generator: yields all of the files in <path> ending with
       <extension>

       \param   path       Absolute or relative path to inspect,
       \param   extension  [optional] Only yield files matching this,

       \yield              [filenames]
    """


    for _, dirs, files in os.walk(path):
        dirs[:] = []  # do not recurse directories.
        yield from [f for f in files if f.endswith(extension)]

# Example: print all the .py files in './python'
for filename in files_in_dir('./python', '.py'):
    print("-", filename)

Or for a one off where you don't need a generator:

path, ext = "./python", ".py"
for _, _, dirfiles in os.walk(path):
    matches = (f for f in dirfiles if f.endswith(ext))
    break

for filename in matches:
    print("-", filename)

If you are going to use matches for something else, you may want to make it a list rather than a generator expression:

    matches = [f for f in dirfiles if f.endswith(ext)]
kfsone