13

I have a Rust async server based on the Tokio runtime. It has to process a mix of latency-sensitive I/O-bound requests, and heavy CPU-bound requests.

I don't want to let the CPU-bound tasks monopolize the Tokio runtime and starve the I/O bound tasks, so I'd like to offload the CPU-bound tasks to a dedicated, isolated threadpool (isolation is the key here, so spawn_blocking/block_in_place on one shared threadpool are insufficient). How can I create such a threadpool in Tokio?

A naive approach of starting two runtimes runs into an error:

thread 'tokio-runtime-worker' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like block_on) attempted to block the current thread while the thread is being used to drive asynchronous tasks.'

use tokio; // 0.2.20

fn main() {
    let mut main_runtime = tokio::runtime::Runtime::new().unwrap();
    let cpu_pool = tokio::runtime::Builder::new().threaded_scheduler().build().unwrap();
    let cpu_pool = cpu_pool.handle().clone(); // this is the fix/workaround!

    main_runtime.block_on(main_runtime.spawn(async move {
        cpu_pool.spawn(async {}).await
    }))
    .unwrap().unwrap();
}

Can Tokio allow two separate runtimes? Is there a better way to create an isolated CPU pool in Tokio?

Kornel
  • 94,425
  • 32
  • 211
  • 296
  • I'm aware of `block_in_place`, but it doesn't give isolation guarantee I'm looking for. – Kornel May 12 '20 at 13:25
  • *isolation guarantee I'm looking for* and what guarantee is that? Please [edit] your question to state **all** of your requirements. – Shepmaster May 12 '20 at 13:33
  • 2
    @Shepmaster I think it's already there: "I don't want to let the CPU-bound task monopolize the Tokio runtime and starve the I/O bound tasks." – Sven Marnach May 12 '20 at 13:34

3 Answers3

13

While Tokio already has a threadpool, the documentation of Tokio advises:

If your code is CPU-bound and you wish to limit the number of threads used to run it, you should run it on another thread pool such as rayon. You can use an oneshot channel to send the result back to Tokio when the rayon task finishes.

So, if you want to create a threadpool to make heavy use of CPU, a good way is to use a crate like Rayon and send the result back to the Tokio task.

Shepmaster
  • 326,504
  • 69
  • 892
  • 1,159
Stargateur
  • 20,831
  • 8
  • 51
  • 78
  • 3
    [`spawn_blocking`](https://docs.rs/tokio/0.2.20/tokio/task/fn.spawn_blocking.html) likewise says: *to run your CPU-bound computations on only a few threads, you should use a separate thread pool such as rayon rather than configuring the number of blocking threads.* – Shepmaster May 12 '20 at 13:51
  • 2
    Use of rayon requires sending results back via async channels, which is quite cumbersome ;( – Kornel May 12 '20 at 14:10
  • 1
    @Kornel I don't think channel is cumbersome, but whatever what difference with tokio ? I *think* tokio spawn to something quite similar. You need a way to send back the result. Don't forget that the point of multi threading is to use more CPU to do more work, but the overhead exist. The point is that people should only use multithreading if this overheat is worse the gain with multi threading. – Stargateur May 12 '20 at 14:18
5

Starting a Tokio runtime already creates a threadpool. The relevant options are

Roughly speaking, core_threads controls how many threads will be used to process asynchronous code. max_threads - core_threads is how many threads will be used for blocking work (emphasis mine):

Otherwise as core_threads are always active, it limits additional threads (e.g. for blocking annotations) as max_threads - core_threads.

You can also specify these options through the tokio::main attribute.

You can then annotate blocking code with either of:

See also:

spawn_blocking can easily take all of the threads available in the one and only runtime, forcing other futures to wait on them

You can make use of techniques like a Semaphore to restrict maximum parallelism in this case.

Shepmaster
  • 326,504
  • 69
  • 892
  • 1,159
  • 2
    This doesn't answer my question. I'm looking for a **dedicated** thread pool, which doesn't share threads with my main runtime. I'm already using `block_in_place`, but this ruins my latency and blocked threads starve network connections. – Kornel May 12 '20 at 13:42
  • @Kornel please also explain why `spawn_blocking` doesn't accomplish the goal? – Shepmaster May 12 '20 at 13:47
  • 3
    because I have many very long CPU-bound tasks, so `spawn_blocking` can easily take all of the threads available in the one and only runtime, forcing other futures to wait on them. – Kornel May 12 '20 at 14:07
4

Tokio's error message was misleading. The problem was due to Runtime object being dropped in an async context.

The workaround is to use Handle, not Runtime directly, for spawning tasks on the other runtime.

fn main() {
    let mut main_runtime = tokio::runtime::Runtime::new().unwrap();
    let cpu_pool = tokio::runtime::Builder::new().threaded_scheduler().build().unwrap();

    // this is the fix/workaround:
    let cpu_pool = cpu_pool.handle().clone(); 

    main_runtime.block_on(main_runtime.spawn(async move {
        cpu_pool.spawn(async {}).await
    }))
    .unwrap().unwrap();
}
Kornel
  • 94,425
  • 32
  • 211
  • 296