1

Our S3 buckets generally have several levels of sub-directories, so the path to the data looks something like s3://functional-group/service/org-tenant-company-id/entity-id/actual-data

We're looking into Athena to query the data at the /actual-data level, but scoped to a given org-tenant-company-id. So it seems like we need a way to expose that org-tenant-company-id as either a column or a partition. Is this possible?

I've read the page on partitions in the Athena docs. Seems like we may have to manually create partitions via the JDBC driver?

John Rotenstein
user26270

2 Answers

0

Yes, you can create the partitions manually. But if you set up your folder structure in Hive format, for example s3://functional-group/service/org-tenant-company-id=xxxx/, then you can simply run the "MSCK REPAIR TABLE" command and Athena will create all the partitions for you automatically.
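To make the two options concrete, here is a minimal Python sketch that only builds the DDL strings; it does not talk to Athena. The table name `actual_data`, the column name `org_tenant_company_id` (underscores instead of hyphens, since that is safer as an unquoted column name), and the tenant ids are all hypothetical placeholders, not anything from the question.

```python
def add_partition_statements(table, base_location, partition_column, tenant_ids):
    """Build one ALTER TABLE statement per tenant prefix.

    This is the manual route for a layout that is NOT in Hive format,
    i.e. .../service/<tenant-id>/ rather than .../service/<column>=<tenant-id>/.
    """
    base = base_location.rstrip("/")
    statements = []
    for tenant_id in tenant_ids:
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({partition_column} = '{tenant_id}') "
            f"LOCATION '{base}/{tenant_id}/'"
        )
    return statements


def repair_statement(table):
    """Hive-format layout: a single command discovers every partition."""
    return f"MSCK REPAIR TABLE {table}"


# Hypothetical names and tenant ids, for illustration only.
statements = add_partition_statements(
    table="actual_data",
    base_location="s3://functional-group/service",
    partition_column="org_tenant_company_id",
    tenant_ids=["acme-001", "globex-002"],
)
for statement in statements:
    print(statement)
print(repair_statement("actual_data"))
```

The generated statements can then be run from the Athena console, over JDBC, or from any other client you already use to submit queries. Either way the table has to be declared with a matching `PARTITIONED BY` column first.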

Ted
0

You can expose the object path as a column (see "How to get input file name as column in AWS Athena external tables") and then use CTAS to create a partitioned table from it.
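As a rough sketch of that idea: Athena exposes each row's source object through the `"$path"` pseudo-column, so a CTAS query can pull the tenant id out of it with `regexp_extract` and partition on the result. The Python below only assembles the query text and checks the regex locally. The table names, output location, column list, and the assumption that the tenant id is the third path segment are all hypothetical.

```python
import re

# Assumes the layout s3://<bucket>/<service>/<tenant-id>/..., so the tenant id
# is the third path segment. Adjust the pattern if your layout differs.
TENANT_PATTERN = r"^s3://[^/]+/[^/]+/([^/]+)/"
PATH_COLUMN = '"$path"'  # Athena pseudo-column holding the source object path


def tenant_from_path(path):
    """Local check of the regex that the CTAS query applies to "$path"."""
    match = re.match(TENANT_PATTERN, path)
    return match.group(1) if match else None


def build_ctas(source_table, target_table, target_location, data_columns):
    """Build a CTAS statement partitioned by the tenant id taken from the path.

    The partition column is placed last in the SELECT list, as CTAS requires.
    """
    select_list = ", ".join(data_columns)
    return (
        f"CREATE TABLE {target_table} "
        f"WITH (format = 'PARQUET', "
        f"external_location = '{target_location}', "
        f"partitioned_by = ARRAY['org_tenant_company_id']) AS "
        f"SELECT {select_list}, "
        f"regexp_extract({PATH_COLUMN}, '{TENANT_PATTERN}', 1) AS org_tenant_company_id "
        f"FROM {source_table}"
    )


# Hypothetical names, for illustration only.
query = build_ctas(
    source_table="actual_data_raw",
    target_table="actual_data_by_tenant",
    target_location="s3://my-athena-output/actual_data_by_tenant/",
    data_columns=["entity_id", "payload"],
)
print(query)
```

Note that a single CTAS query is limited in how many partitions it can write (100 at the time of writing, as far as I know), so with many tenants you may need to run it in batches.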

Cornelius Roemer