Following the answers to the question "Load S3 Data into AWS SageMaker Notebook", I tried to load data from an S3 bucket into a SageMaker Jupyter notebook.

I used this code:

import pandas as pd

bucket='my-bucket'
data_key = 'train.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)

pd.read_csv(data_location)

I replaced 'my-bucket' with the ARN (Amazon Resource Name) of my S3 bucket (e.g. "arn:aws:s3:::name-of-bucket") and 'train.csv' with the name of the CSV file stored in the bucket. I did not change anything else. This is the ValueError I got:

ValueError: Failed to head path 'arn:aws:s3:::name-of-bucket/name_of_file_V1.csv': Parameter validation failed:
Invalid bucket name "arn:aws:s3:::name-of-bucket": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"

What did I do wrong? Do I have to modify the name of my S3 bucket?

Tobitor
  • I found it: I just had to replace `my-bucket` with `name-of-bucket`, i.e. the plain bucket name without the `arn:aws:s3:::` prefix. :-D – Tobitor Feb 17 '21 at 10:39

1 Answer

The path should be:

data_location = 's3://{}/{}'.format(bucket, data_key)

where bucket is the plain bucket name (<bucket-name>), not the ARN. For example, bucket = 'my-bucket-333222'.
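
If the ARN is all you have at hand, the bucket name can be recovered by dropping the `arn:aws:s3:::` prefix. Below is a minimal sketch of that idea. The helper names `to_bucket_name` and `s3_uri` are hypothetical, not part of pandas or boto3; the name pattern is copied from the error message in the question, and the prefix assumes the standard `aws` partition.

```python
import re

# Pattern for plain bucket names, copied from the error message in the question.
BUCKET_NAME_RE = re.compile(r"^[a-zA-Z0-9.\-_]{1,255}$")

# Assumption: bucket ARNs in the standard "aws" partition start with this prefix.
ARN_PREFIX = "arn:aws:s3:::"


def to_bucket_name(bucket_or_arn):
    """Return the plain bucket name, stripping a leading bucket ARN prefix if present."""
    name = bucket_or_arn
    if name.startswith(ARN_PREFIX):
        name = name[len(ARN_PREFIX):]
    # A leftover ':' (or any other disallowed character) fails this check.
    if not BUCKET_NAME_RE.match(name):
        raise ValueError("Not a valid bucket name: {!r}".format(name))
    return name


def s3_uri(bucket_or_arn, data_key):
    """Build the 's3://<bucket-name>/<key>' path used in the question."""
    return "s3://{}/{}".format(to_bucket_name(bucket_or_arn), data_key)


print(s3_uri("arn:aws:s3:::name-of-bucket", "name_of_file_V1.csv"))
# s3://name-of-bucket/name_of_file_V1.csv
```

The resulting string can then be passed to `pd.read_csv` as in the question, which relies on the `s3fs` package being available in the notebook environment.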

Marcin