Can AWS Athena partitions or columns be created from S3 bucket sub-directories?

Question

Our S3 buckets generally have a number of sub-directories, so that the path to an object is something like s3:functional-group/service/org-tenant-company-id/entity-id/actual-data

We're looking into Athena to query the data at that /actual-data level, but scoped to a given org-tenant-company-id. So it seems like we need a way to create either a column or a partition for that org-tenant-company-id. Is this possible?

I've read the page on partitions in the Athena docs. Seems like we may have to manually create partitions via the JDBC driver?

score 0 · Answer 1 · answered Jun 14 '17 at 18:40

Yes, you can create the partitions manually. But if you set up your folder structure in Hive format, for example (s3:functional-group/service/org-tenant-company-id=xxxx/), then you can simply run the "MSCK REPAIR TABLE" command on a table declared with that partition column, and Athena will automatically create all the partitions for you.
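If the existing folders can't be renamed into Hive format, the manual route can be scripted. Below is a minimal sketch that builds an ALTER TABLE ... ADD PARTITION statement from a list of object keys. It assumes the bucket is named functional-group, that keys start with service/<tenant>/, and that the table is already declared with a partition column called org_tenant_company_id (underscores, since hyphens would need quoting). The table name events, the sample keys, and both function names are hypothetical. The resulting string would still have to be submitted to Athena by whatever client you use.

```python
def tenant_ids(keys, service="service"):
    """Return the distinct tenant IDs found in keys shaped like '<service>/<tenant>/...'."""
    found = set()
    for key in keys:
        parts = key.split("/")
        # Require at least '<service>/<tenant>/<something>' to count as tenant data.
        if len(parts) >= 3 and parts[0] == service and parts[1]:
            found.add(parts[1])
    return sorted(found)


def add_partition_sql(table, bucket, tenants, service="service",
                      column="org_tenant_company_id"):
    """Build one ALTER TABLE statement that registers a partition per tenant."""
    clauses = []
    for tenant in tenants:
        # Assumption: tenant IDs never contain quotes; refuse them rather than escape.
        if "'" in tenant:
            raise ValueError(f"unsupported character in tenant id: {tenant!r}")
        clauses.append(
            f"PARTITION ({column} = '{tenant}') "
            f"LOCATION 's3://{bucket}/{service}/{tenant}/'"
        )
    return f"ALTER TABLE {table} ADD IF NOT EXISTS\n  " + "\n  ".join(clauses)


# Hypothetical keys, e.g. as returned by an S3 listing.
example_keys = [
    "service/acme-1/e1/data.json",
    "service/acme-1/e2/data.json",
    "service/beta-2/e9/data.json",
]
print(add_partition_sql("events", "functional-group", tenant_ids(example_keys)))
```

With the Hive-style layout none of this is needed: MSCK REPAIR TABLE discovers the org-tenant-company-id=xxxx folders on its own.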

score 0 · Answer 2 · answered Apr 15 '20 at 13:53


You can expose the file path as a column (see "How to get input file name as column in AWS Athena external tables") and then use CTAS to create a partitioned table from it.
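As a rough sketch of that approach: Athena exposes each row's source file through the "$path" pseudo-column, and a CTAS query can pull the tenant segment out of it with regexp_extract and partition on the result. The helper below only builds the query text. The table names, column names, output location, and the service path segment are all hypothetical placeholders, and the Parquet output format is an arbitrary choice.

```python
import re


def tenant_pattern(service="service"):
    """Regex capturing the folder that follows '/<service>/' in a full S3 path."""
    return f"/{service}/([^/]+)/"


def tenant_from_path(path, service="service"):
    """Local check of what the pattern captures from a given path."""
    match = re.search(tenant_pattern(service), path)
    return match.group(1) if match else None


def ctas_by_tenant_sql(source_table, target_table, external_location,
                       data_columns, service="service",
                       column="org_tenant_company_id"):
    """Build a CTAS statement partitioned by the tenant ID taken from "$path"."""
    select_list = ", ".join(data_columns)
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (\n"
        f"  format = 'PARQUET',\n"
        f"  external_location = '{external_location}',\n"
        f"  partitioned_by = ARRAY['{column}']\n"
        f")\n"
        # The partition column has to come last in the SELECT list.
        f"AS SELECT {select_list},\n"
        f"  regexp_extract(\"$path\", '{tenant_pattern(service)}', 1) AS {column}\n"
        f"FROM {source_table}"
    )


print(ctas_by_tenant_sql(
    "raw_events",                               # hypothetical unpartitioned table
    "events_by_tenant",                         # hypothetical new table
    "s3://example-bucket/events_by_tenant/",    # hypothetical output location
    ["entity_id", "payload"],                   # hypothetical data columns
))
```

Note that CTAS writes a new copy of the data in Hive-style folders, and Athena limits how many partitions a single CTAS query may create, so a large number of tenants may need several queries.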


Cornelius Roemer

