
How should the files on my server be uploaded to Google Cloud Storage?

The code I have tried is given below; however, it throws a TypeError, saying that the expected type is not bytes, for:

blob.upload_from_file(file.file.read())

even though upload_from_file() requires a binary type.

@app.post("/file/")
async def create_upload_file(files: List[UploadFile] = File(...)):
    storage_client = storage.Client.from_service_account_json(path.json)
    bucket_name = 'data'
    try:
        bucket = storage_client.create_bucket(bucket_name)
    except Exception:
        bucket = storage_client.get_bucket(bucket_name)
    for file in files:        
        destination_file_name = f'{file.filename}'
        new_data = models.Data(
            path=destination_file_name
        )
        try:
            blob = bucket.blob(destination_file_name)
        blob.upload_from_file(file.file.read())
        except Exception:
            raise HTTPException(
                status_code=500,
                detail="File upload failed"
            )

1 Answer


Option 1

As per the documentation, upload_from_file() supports a file-like object; hence, you could use the .file attribute of UploadFile (which represents a SpooledTemporaryFile instance). For example:

blob.upload_from_file(file.file)  
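Put together, a minimal version of your endpoint using this option might look like the sketch below. This is only an illustration: the 'data' bucket and 'path.json' service-account path are placeholders taken from your snippet, the bucket is assumed to already exist, and the models.Data step is omitted.

from typing import List
from fastapi import FastAPI, File, HTTPException, UploadFile
from google.cloud import storage

app = FastAPI()
storage_client = storage.Client.from_service_account_json('path.json')
bucket = storage_client.bucket('data')  # assumes the bucket already exists

@app.post("/file/")
async def create_upload_file(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            blob = bucket.blob(file.filename)
            blob.upload_from_file(file.file)  # pass the file-like object, not bytes
        except Exception:
            raise HTTPException(status_code=500, detail="File upload failed")
    return {"uploaded": [f.filename for f in files]}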

Option 2

You could read the contents of the file and pass them to upload_from_string(), which supports data in bytes or string format. For instance:

blob.upload_from_string(file.file.read())

or, since you defined your endpoint with async def (see this answer for def vs async def):

contents = await file.read()
blob.upload_from_string(contents)
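Note that upload_from_string() is a blocking call; if uploads are large and you want to keep the event loop responsive inside your async def endpoint, one option might be to run the upload in an external thread pool. A minimal sketch using run_in_threadpool (re-exported by FastAPI from Starlette):

from fastapi.concurrency import run_in_threadpool

contents = await file.read()
# upload_from_string() performs blocking I/O; running it in a thread pool
# keeps the event loop free while the upload is in progress
await run_in_threadpool(blob.upload_from_string, contents)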

Option 3

For the sake of completeness, upload_from_filename() expects a filename that represents the path to the file. Hence, the No such file or directory error was thrown when you passed file.filename (as mentioned in your comment), as this is not a path to the file. To use that method (as a last resort), you should save the file contents to a NamedTemporaryFile, which "has a visible name in the file system" that "can be used to open the file", and, once you are done with it, delete it. The file is created with delete=False, so that it can be reopened by name (which is not possible on Windows while the file is still open) and then removed manually. Example:

from tempfile import NamedTemporaryFile
import os

contents = await file.read()  # or, contents = file.file.read()

# delete=False, so that the file can be reopened by name further below
temp = NamedTemporaryFile(delete=False)
try:
    with temp as f:
        f.write(contents)
    blob.upload_from_filename(temp.name)
finally:
    temp.close()
    os.unlink(temp.name)  # remove the temporary file manually

Note:

If you are uploading a rather large file to Google Cloud Storage that may require some time to completely upload, and you have encountered a timeout error, please consider increasing the amount of time to wait for the server response by changing the timeout value, which, as shown in upload_from_file() for example, is set to timeout=60 seconds by default. To change that, use e.g. blob.upload_from_file(file.file, timeout=180), or you could also set timeout=None (meaning that it will wait until the connection is closed).
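As an illustration only (the client, bucket and file names below are hypothetical):

from google.cloud import storage

storage_client = storage.Client.from_service_account_json('path.json')
bucket = storage_client.bucket('data')
blob = bucket.blob('large-file.bin')

# Wait up to 3 minutes per request instead of the default 60 seconds;
# timeout=None would instead wait until the connection is closed.
with open('large-file.bin', 'rb') as f:
    blob.upload_from_file(f, timeout=180)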

Chris
  • Thank you very much for all the mentioned options, and for your time. However, there is still an "Internal server error" when I try to upload large files (up to 1 GB) with the provided solutions. Do you think I should send them as chunks, or is the reason something else? – Random aier Apr 17 '22 at 07:22
  • It was related to the default timeout limit of upload_from_file(). Initializing it to None solved the problem. Thank you very much for your response again. – Random aier Apr 19 '22 at 07:36