9

How do I check whether a particular file is present inside a particular directory in my S3 bucket? I use Boto3 and tried this code (which doesn't work):

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
key = 'dootdoot.jpg'
objs = list(bucket.objects.filter(Prefix=key))
if len(objs) > 0 and objs[0].key == key:
    print("Exists!")
else:
    print("Doesn't exist")
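
The prefix filter matches any key that merely starts with the filename, so an exact comparison against the returned keys is the safer check. A minimal sketch of that matching logic, as pure Python (here `listed_keys` stands in for the keys returned by `bucket.objects.filter(Prefix=key)`; the `.bak` sibling key is hypothetical):

```python
def key_exists(listed_keys, key):
    """Exact-match check over a prefix listing.

    A prefix match alone is not enough: the prefix 'dootdoot.jpg'
    would also match a hypothetical sibling key 'dootdoot.jpg.bak'.
    """
    return any(k == key for k in listed_keys)

print(key_exists(['dootdoot.jpg.bak'], 'dootdoot.jpg'))  # False
print(key_exists(['dootdoot.jpg'], 'dootdoot.jpg'))      # True
```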
Sairam Krish
AshuGG

6 Answers

10

When checking for an S3 folder, there are two scenarios:

Scenario 1

import boto3

def folder_exists_and_not_empty(bucket: str, path: str) -> bool:
    '''
    Folder should exist.
    Folder should not be empty.
    '''
    s3 = boto3.client('s3')
    if not path.endswith('/'):
        path = path + '/'
    resp = s3.list_objects(Bucket=bucket, Prefix=path, Delimiter='/', MaxKeys=1)
    return 'Contents' in resp
  • The above code uses MaxKeys=1, which makes it more efficient: even if the folder contains a lot of files, S3 responds quickly with just one of them.
  • Observe that it checks for Contents in the response.
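
The trailing-slash normalization in Scenario 1 can be pulled out and exercised on its own (a sketch, independent of any AWS call; the folder names are illustrative):

```python
def ensure_trailing_slash(path):
    # Scenario 1 lists the contents *inside* the folder, so the
    # prefix must end with '/'; a bare 'reports' prefix would also
    # match an unrelated folder such as 'reports_old/'.
    return path if path.endswith('/') else path + '/'

print(ensure_trailing_slash('reports'))   # reports/
print(ensure_trailing_slash('reports/'))  # reports/
```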

Scenario 2

import boto3

def folder_exists(bucket: str, path: str) -> bool:
    '''
    Folder should exist.
    Folder could be empty.
    '''
    s3 = boto3.client('s3')
    path = path.rstrip('/')
    resp = s3.list_objects(Bucket=bucket, Prefix=path, Delimiter='/', MaxKeys=1)
    return 'CommonPrefixes' in resp
  • Observe that it strips the trailing / from path. This prefix matches just that folder and doesn't look inside it.
  • Observe that it checks for CommonPrefixes in the response, not Contents.
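
The CommonPrefixes check can be made stricter by comparing the returned prefix against the folder you asked about. A sketch using hypothetical response dicts shaped like list_objects output (no AWS call involved):

```python
def folder_reported(resp, path):
    # With Delimiter='/', S3 groups a matching "subfolder" under
    # CommonPrefixes even when it contains no regular objects
    # (e.g. only a zero-byte 'folder/' placeholder exists).
    wanted = path.rstrip('/') + '/'
    return any(cp['Prefix'] == wanted
               for cp in resp.get('CommonPrefixes', []))

resp = {'CommonPrefixes': [{'Prefix': 'reports/2021/'}]}  # stand-in response
print(folder_reported(resp, 'reports/2021'))  # True
print(folder_reported({}, 'reports/2021'))    # False
```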
Sairam Krish
  • @AshuGG By mistake I rejected your edit request. Your correction was right. I tried to accept that edit again but there was no such options. Thanks for the input – Sairam Krish Aug 27 '21 at 13:38
  • Checking whether a given file is present and checking whether a directory is not empty are not the same thing – Cr4zyTun4 Jan 31 '22 at 11:39
4
import boto3

client = boto3.client('s3')

def checkPath(file_path):
    result = client.list_objects(Bucket="Bucket", Prefix=file_path)
    # S3 includes 'Contents' only when at least one key matches the prefix
    return 'Contents' in result

If the provided file_path exists, it will return True. Example: for the object 's3://bucket/dir1/dir2/dir3/file.txt', file_path could be 'dir1/dir2' or 'dir1/'. Note: the file path should start with the first directory just after the bucket name.
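
That note about where the path starts can be sketched as a small helper that splits a full s3:// URL into the Bucket and Prefix arguments (the helper name is illustrative, not part of boto3):

```python
from urllib.parse import urlparse

def s3_url_to_bucket_prefix(url):
    # 's3://bucket/dir1/dir2/dir3/file.txt' ->
    # ('bucket', 'dir1/dir2/dir3/file.txt')
    parsed = urlparse(url)
    return parsed.netloc, parsed.path.lstrip('/')

print(s3_url_to_bucket_prefix('s3://bucket/dir1/dir2/dir3/file.txt'))
```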

2

Basically, a directory/file in S3 is an object. I have created a method for this (IsObjectExists) that returns True or False. If the directory/file doesn't exist, the loop body is never entered and the method returns False; otherwise it returns True.

import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('<givebucketnamehere>')

def IsObjectExists(path):
    for object_summary in bucket.objects.filter(Prefix=path):
        return True
    return False

if IsObjectExists("<giveobjectnamehere>"):
   print("Directory/File exists")
else:
   print("Directory/File doesn't exist")

Note that if you are checking for a folder, make sure the string ends with /. One use case: you check for a folder called Hello, and that folder doesn't exist, but a folder called Hello_World does. In that case the method will return True, since Hello is a prefix of Hello_World. So add the / character to the end of the folder name. You can see how this is handled in the example below:

foldername = "Hello/"
if IsObjectExists(foldername):
    print("Directory/File exists")
Sarath KS
2
import boto3

client = boto3.client('s3')

result = client.list_objects_v2(Bucket='athenards', Prefix='cxdata')

# use .get() so a prefix with no matches doesn't raise a KeyError
for obj in result.get('Contents', []):
    if obj['Key'] == 'cxdata/':
        print("true")
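
When nothing matches the prefix, S3 omits the 'Contents' key entirely, so indexing result['Contents'] directly raises a KeyError. The lookup step can be isolated and tested with stand-in response dicts (hypothetical data):

```python
def prefix_contains_key(result, key):
    # result.get() avoids a KeyError when S3 omits 'Contents'
    # because nothing matched the prefix.
    return any(obj['Key'] == key for obj in result.get('Contents', []))

hit = {'Contents': [{'Key': 'cxdata/'}, {'Key': 'cxdata/a.csv'}]}  # stand-in
print(prefix_contains_key(hit, 'cxdata/'))  # True
print(prefix_contains_key({}, 'cxdata/'))   # False
```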
0

Please try the following code (note that bucket.list is from the legacy boto library, not boto3):

Get subdirectory info

folders = bucket.list("","/")
for folder in folders:
    print (folder.name)

PS: reference URL: How to use python script to copy files from one bucket to another bucket at the Amazon S3 with boto

Willie Cheng
  • Thanks @Willie Cheng – AshuGG Sep 20 '19 at 11:49
  • The way to get a 'folder' list in boto3 is ```objects = s3.list_objects_v2(Bucket=BUCKET_NAME, Delimiter='/', Prefix='')``` – Vinayak Mar 23 '20 at 02:30
  • @Vinayak for more detail please check my reference [URL](https://stackoverflow.com/questions/53664405/how-to-use-python-script-to-copy-files-from-one-bucket-to-another-bucket-at-the) – Willie Cheng Mar 24 '20 at 09:52
  • I saw that. It's for boto and not boto3. Since this question is about boto3, my earlier comment seems to be the way to get it to work. – Vinayak Mar 24 '20 at 15:38
0

The following code should work...

import boto3
import botocore

def does_exist(bucket_name, folder_name):
    s3 = boto3.resource(
        service_name='s3',
        region_name='us-east-2',
        aws_access_key_id='********************',
        aws_secret_access_key='********************'
    )
    objects = s3.meta.client.list_objects_v2(Bucket=bucket_name, Delimiter='/', Prefix='')
    # print(objects)
    folders = objects['CommonPrefixes']

    folders_in_bucket = []
    for f in folders:
        print(f['Prefix'])
        folders_in_bucket.append(f['Prefix'])
    return folder_name in folders_in_bucket

print("does it exist?", does_exist('images-bucket','ddd/'))

As @Vinayak mentioned in a comment on one of the answers in March 2020...

The way to get a 'folder' list in boto3 is objects = s3.list_objects_v2(Bucket=BUCKET_NAME, Delimiter='/', Prefix='')

Running this with the latest versions of boto3 and botocore as of August 2021 ('1.18.27' and '1.21.27', respectively) gives the following error:

AttributeError: 's3.ServiceResource' object has no attribute 'list_objects_v2'

This happens because you created the object with s3 = boto3.resource(...), and an s3.ServiceResource has no list_objects_v2() method. Instead, ServiceResource has a meta attribute that holds a Client object, through which you can call the Client's methods from the resource, like this: s3.meta.client.list_objects_v2()

Hope that helps!