I want to find the size of each object in my AWS S3 account or, alternatively, list only the objects that are larger than 2 GB.

I have tried listing by bucket, and I am able to get the total size:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket-name')
size = 0

# Sum the size (in bytes) of every object in the bucket
for o in bucket.objects.all():
    size += o.size
print('s3 size = %.3f GB' % (size / 1024 / 1024 / 1024))

I would like output similar to the AWS CLI command, which gives each object's name and size.

I know S3 returns up to 1,000 objects per request (paginated), so I would have to parse each page. Also, if the bucket is huge (high millions to billions of objects), listing is going to be really rough.

Would really appreciate any inputs here.

Thanks

Ron
  • Can you save yourself the trouble of doing this in Python and use [S3 Inventory](https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-inventory.html) instead to get the size of all your objects? – Marcin Oct 17 '20 at 03:46
  • Full code and IAM role can be found here : https://stackoverflow.com/a/58220730/9931092 – Amit Baranes Oct 17 '20 at 10:16
  • Yes we have been considering using S3 inventory too. – Ron Oct 19 '20 at 21:03
  • Thanks Amit for the code link. Will look into it and respond for further questions. – Ron Oct 19 '20 at 21:04

1 Answer

Print all objects and their size:

for o in bucket.objects.all():
    print(o.key, o.size)

To only print objects larger than 2GB:

for o in bucket.objects.all():
    if o.size > 2 * 1024 * 1024 * 1024:
        print(o.key, o.size)
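
The resource collection above pages through the listing for you. If you prefer to handle the 1,000-key pages explicitly, a rough sketch with the low-level client and its `list_objects_v2` paginator could look like the following. The helper names `large_objects` and `list_pages` are made up for this illustration, and `bucket-name` is a placeholder; treat it as a starting point rather than a tested tool.

```python
from typing import Iterable, Iterator, Tuple

TWO_GB = 2 * 1024 ** 3  # same 2 GB threshold as the loop above, in bytes


def large_objects(pages: Iterable[dict], min_size: int = TWO_GB) -> Iterator[Tuple[str, int]]:
    """Yield (key, size) for every listed object larger than min_size bytes."""
    for page in pages:
        # A page has no "Contents" entry when nothing matched (e.g. empty bucket)
        for obj in page.get("Contents", []):
            if obj["Size"] > min_size:
                yield obj["Key"], obj["Size"]


def list_pages(bucket_name: str) -> Iterator[dict]:
    """Yield raw list_objects_v2 response pages (up to 1,000 keys each)."""
    # Imported here so large_objects() can be tried without boto3 or AWS access
    import boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    yield from paginator.paginate(Bucket=bucket_name)
```

To print the matches you would loop over `large_objects(list_pages('bucket-name'))` and print each `key, size` pair. Because both helpers are generators, only one page is held in memory at a time, although a bucket with billions of objects still needs millions of list requests.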

However, if you have millions of objects, I would recommend Amazon S3 Inventory, which can provide a daily or weekly CSV file listing all objects (including their size).
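
Once an inventory report exists, filtering it is plain CSV work. Here is a minimal sketch, assuming a CSV-format inventory whose column order is given by the `fileSchema` string in the report's `manifest.json` (the data files themselves have no header row) and assuming the `Size` field was enabled. The function name is hypothetical, and the key decoding is an assumption you should check against your own report.

```python
import csv
from typing import Iterable, Iterator, Tuple
from urllib.parse import unquote


def large_objects_from_inventory(
    lines: Iterable[str], file_schema: str, min_size: int = 2 * 1024 ** 3
) -> Iterator[Tuple[str, int]]:
    """Yield (key, size) for inventory rows whose size exceeds min_size bytes."""
    # file_schema is e.g. "Bucket, Key, Size", copied from manifest.json
    columns = [name.strip() for name in file_schema.split(",")]
    key_idx, size_idx = columns.index("Key"), columns.index("Size")

    for row in csv.reader(lines):
        if not row or not row[size_idx]:
            continue  # skip blank lines and rows without a size value
        size = int(row[size_idx])
        if size > min_size:
            # Assumption: keys are percent-encoded in CSV reports;
            # use unquote_plus instead if spaces show up as "+"
            yield unquote(row[key_idx]), size
```

The report files are gzip-compressed, so in practice you would pass in something like `gzip.open(path, 'rt', newline='')`, where `path` is one of the data files listed in the manifest.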

Jonny5
John Rotenstein
  • I think I tried that before and got this error: AttributeError: 's3.ObjectSummary' object has no attribute 'content_length' – Ron Oct 19 '20 at 21:09
  • Oops! Should be `size`. Fixed! – John Rotenstein Oct 19 '20 at 23:31
  • I haven't seen `AllAccessDisabled` before. Is it happening on just one object, or is it a whole bucket? I wonder if there might be a Bucket Policy that is denying access? – John Rotenstein Oct 22 '20 at 21:22
  • Hi John, that was an error on my side. I fixed it with a for loop. The bucket name I was giving was incorrect. The IAM policy is fine. – Ron Oct 22 '20 at 22:17
  • Thank you for your inputs! Really appreciate it. – Ron Oct 22 '20 at 22:17