15

I want to run a simple background task in FastAPI that does some computation before writing the result to the database. However, the computation blocks the server from handling any further requests.

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
db = Database()

async def task(data):
    otherdata = await db.fetch("some sql")
    newdata = somelongcomputation(data,otherdata) # this blocks other requests
    await db.execute("some sql",newdata)
   


@app.post("/profile")
async def profile(data: Data, background_tasks: BackgroundTasks):
    background_tasks.add_task(task, data)
    return {}

What is the best way to solve this issue?

Chris
Gary Ong
  • If the computation is heavy and does not involve IO, it is better to use multiprocessing. – alex_noname May 19 '21 at 12:03
  • I am using the Docker FastAPI image for deployment, and it uses all CPU cores for the server by default. I don't want to use another service like Celery, as the product is still in the prototyping phase and has no users. – Gary Ong May 19 '21 at 13:31

2 Answers

43

Your task is defined as async, which means FastAPI (or rather Starlette) runs it in the asyncio event loop. Because somelongcomputation is synchronous (it performs computation rather than awaiting I/O), it blocks the event loop for as long as it runs.
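
To make the blocking visible, here is a minimal sketch using only the standard library, no FastAPI involved. The names ticker and count_ticks_during are made up for this illustration, and time.sleep is only a stand-in for the real somelongcomputation. A small "ticker" coroutine plays the role of other requests: it can only make progress while the event loop is free.

```python
import asyncio
import time


def somelongcomputation(seconds: float) -> str:
    # Stand-in for synchronous work (assumption): it never awaits,
    # so it occupies whichever thread calls it until it returns.
    time.sleep(seconds)
    return "done"


async def ticker(ticks: list) -> None:
    # Plays the role of "other requests": records a tick whenever
    # the event loop gives it a chance to run.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def count_ticks_during(blocking: bool) -> int:
    ticks: list = []
    ticker_task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # yield once so the ticker gets started
    before = len(ticks)
    if blocking:
        # Called directly inside a coroutine: the loop is stalled.
        somelongcomputation(0.2)
    else:
        # Handed to the default thread pool: the loop keeps running.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, somelongcomputation, 0.2)
    after = len(ticks)
    ticker_task.cancel()
    return after - before


if __name__ == "__main__":
    print("ticks while blocking the loop:", asyncio.run(count_ticks_during(True)))
    print("ticks while offloaded:", asyncio.run(count_ticks_during(False)))
```

In the blocking case the ticker records nothing while the computation runs, which is the same thing that happens to incoming requests in the question.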

I see a few ways of solving this:

  • Use more workers (e.g. uvicorn main:app --workers 4). This allows up to 4 somelongcomputation calls to run in parallel, one per worker process.

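  • Rewrite your task to not be async (i.e. define it as def task(data): ... etc). Then Starlette will run it in a separate thread. Note that a plain def cannot use await, so the db.fetch and db.execute calls would need synchronous equivalents.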

  • Use fastapi.concurrency.run_in_threadpool, which will also run it in a separate thread. Like so:

    from fastapi.concurrency import run_in_threadpool
    async def task(data):
        otherdata = await db.fetch("some sql")
        newdata = await run_in_threadpool(lambda: somelongcomputation(data, otherdata))
        await db.execute("some sql", newdata)
    
    • Or use asyncio's run_in_executor directly (older Starlette versions of run_in_threadpool used it under the hood; newer ones go through anyio):
      import asyncio
      async def task(data):
          otherdata = await db.fetch("some sql")
          loop = asyncio.get_running_loop()
          newdata = await loop.run_in_executor(None, lambda: somelongcomputation(data, otherdata))
          await db.execute("some sql", newdata)
      
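      You could even pass in a concurrent.futures.ProcessPoolExecutor as the first argument to run_in_executor to run it in a separate process. In that case pass the function and its arguments directly (e.g. via functools.partial) instead of a lambda, since a lambda cannot be pickled and sent to another process.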
  • Spawn a separate thread or process yourself, e.g. using concurrent.futures.

  • Use something more heavy-handed like Celery. (This is also mentioned in the FastAPI docs.)
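
As a rough sketch of the "own executor" option from the list above: create one executor at module level and hand it to run_in_executor. The body of somelongcomputation and the extra otherdata parameter on task are placeholders for this example (the real task would fetch otherdata from the database), and max_workers=2 is an arbitrary choice.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# One shared executor for the whole application (assumption: two
# workers are enough; tune this for your workload).
executor = ThreadPoolExecutor(max_workers=2)


def somelongcomputation(data, otherdata):
    # Placeholder for the real computation (hypothetical body).
    return sum(data) + sum(otherdata)


async def task(data, otherdata):
    loop = asyncio.get_running_loop()
    # partial() bundles the function with its arguments. Unlike a
    # lambda, it can be pickled when the function is defined at module
    # level, so the same call also works with a ProcessPoolExecutor.
    job = partial(somelongcomputation, data, otherdata)
    return await loop.run_in_executor(executor, job)


if __name__ == "__main__":
    print(asyncio.run(task([1, 2], [3])))
```

With threads on standard CPython, pure-Python CPU-bound code still competes with the event loop for the GIL, so this keeps the server responsive but does not make the computation itself faster. Swapping ThreadPoolExecutor for ProcessPoolExecutor is the variant to try if you need real parallelism.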

mihi
  • I am facing the same problem here and I wonder why not just use `asyncio.create_task(task(data))`? I am doing some tests and it seems to be the solution. – Misael Alarcon Dec 16 '21 at 22:26
  • You mean instead of using `BackgroundTasks`? Are you sure that works? Because `asyncio.create_task` will run the task (and therefore `somelongcomputation`) in the event loop, which will then be blocked, just like in the question. The reason that `run_in_threadpool` works is that it runs the computation in the underlying threadpool directly, sidestepping the event loop. – mihi Dec 18 '21 at 16:31
  • If not using `async` spawns another thread, isn't this better than using `async`? – Crashalot Jan 25 '22 at 08:42
  • @Crashalot depends on the situation. Have a look at some of the answers here: https://stackoverflow.com/questions/27435284/multiprocessing-vs-multithreading-vs-asyncio-in-python-3, and maybe here: https://discuss.python.org/t/what-are-the-advantages-of-asyncio-over-threads/2112/6. – mihi Jan 27 '22 at 16:22
  • Where exactly would one need to pass `concurrent.futures.ProcessPoolExecutor` in? In `newdata = await loop.run_in_executor(ProcessPoolExecutor(), lambda: somelongcomputation(data, otherdata))`? – ben Feb 04 '22 at 10:00
  • @ben yep, have a look at the documentation for some examples (https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor) – mihi Feb 04 '22 at 21:48
0

Read this issue.

Also in the example below, my_model.function_b could be any blocking function or process.

TL;DR

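from fastapi import FastAPI
from starlette.concurrency import run_in_threadpool

app = FastAPI()

@app.get("/long_answer")
async def long_answer():
    # Run the blocking call in a worker thread so the event loop stays free.
    # my_model, arg_1 and arg_2 come from the surrounding application.
    rst = await run_in_threadpool(my_model.function_b, arg_1, arg_2)
    return rst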
Zhivar Sourati