
Say I want to make parallel API post requests.

In a for loop I can add each HTTP POST call to a list of tasks (each task started with Task.Run) and then wait for all of them to finish using await Task.WhenAll. Control returns to the caller while the network requests are pending, so the API requests are effectively made in parallel.

Similarly, I can use Parallel.ForEachAsync, which will automatically do the WhenAll and return control to the caller. So my question is: is ForEachAsync a replacement for a plain for loop that builds a list of tasks (async/await with Task.Run) followed by WhenAll?
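For reference, the two patterns being compared can be sketched roughly as follows. This is a minimal sketch, not code from the question: `FakePostAsync` is a hypothetical stand-in for the real HTTP POST (it only simulates latency with `Task.Delay`), the class and method names are made up for illustration, and `Parallel.ForEachAsync` requires .NET 6 or later. The simulated call already returns a `Task`, so this sketch does not wrap it in `Task.Run`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PostDemo
{
    // Hypothetical stand-in for an HTTP POST call; it only simulates
    // network latency so the sketch runs without a server.
    public static async Task<int> FakePostAsync(int payload, CancellationToken ct = default)
    {
        await Task.Delay(50, ct);
        return payload * 2;
    }

    // Pattern 1: start every request in a loop, then await them all.
    public static async Task<int[]> WithWhenAllAsync(IEnumerable<int> payloads)
    {
        var tasks = new List<Task<int>>();
        foreach (var p in payloads)
        {
            tasks.Add(FakePostAsync(p)); // the request starts here
        }
        // Results come back in the same order as the tasks in the list.
        return await Task.WhenAll(tasks);
    }

    // Pattern 2: let Parallel.ForEachAsync drive the loop (.NET 6+).
    // It returns a plain Task, so this sketch only counts completions.
    public static async Task<int> WithForEachAsync(IEnumerable<int> payloads)
    {
        int completed = 0;
        await Parallel.ForEachAsync(payloads, async (p, ct) =>
        {
            await FakePostAsync(p, ct);
            Interlocked.Increment(ref completed);
        });
        return completed;
    }
}
```

Both methods return a task that the caller can await, so neither blocks the calling thread while the simulated requests are in flight.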

Theodor Zoulias
variable
  • No, it's not. `Parallel.ForEach` does a *lot* more than just use multiple tasks: it partitions the data so that each worker task won't have to synchronize with others to access the data. Then it uses as many workers as there are cores to process those partitions. There's little point in starting 100 workers if there are only 4 cores. The other 96 workers will simply do nothing except add to the scheduling overhead – Panagiotis Kanavos Jul 27 '21 at 11:57
  • `which will automatically do the WaitAll` that's not what happens. `Parallel` will use the current thread to process data, and since all cores are busy crunching data, it appears as if the thread is "blocked". It's not – Panagiotis Kanavos Jul 27 '21 at 11:58
  • A threading equivalent to `Parallel.ForEachAsync` is an `ActionBlock` with a DOP equal to the number of cores, using an async lambda. Even then, an `ActionBlock` doesn't deal with partitioning, nor does it dynamically alter the number of workers, or handle load balancing the way `Parallel.ForEach` does – Panagiotis Kanavos Jul 27 '21 at 12:01
  • In fact, an `ActionBlock` would be a *lot* better than a loop and WaitAll. With an `ActionBlock` you can limit the number of concurrent connections easily. Neither servers nor clients have infinite bandwidth or CPU, so trying to send 100 HTTP requests concurrently can easily be *slower* than making just 10 at a time – Panagiotis Kanavos Jul 27 '21 at 12:03
  • PS: `ForEachAsync` returns a `Task`, so it behaves as if you called `WhenAll`, not `WaitAll` – Panagiotis Kanavos Jul 27 '21 at 12:04
  • I found the [Github issue where ForEachAsync was discussed](https://github.com/dotnet/runtime/issues/1946) and it sounds like partitioning is *not* used. `ForEach` and `ForEachAsync` pass state between workers and iterations though, something not possible with either a loop of tasks or `ActionBlock`. And as [the source shows](https://github.com/dotnet/runtime/blob/57bfe474518ab5b7cfe6bf7424a79ce3af9d6657/src/libraries/System.Threading.Tasks.Parallel/src/System/Threading/Tasks/Parallel.ForEachAsync.cs#L88) it doesn't just start some tasks. – Panagiotis Kanavos Jul 27 '21 at 12:21
  • Somewhat related: [Parallel.ForEach vs Task.Run and Task.WhenAll](https://stackoverflow.com/questions/19102966/parallel-foreach-vs-task-run-and-task-whenall) – Theodor Zoulias Jul 29 '21 at 18:55
  • So Parallel.ForEach uses 1 core per task, whereas async/await makes use of 1 thread per task? – variable Aug 21 '21 at 03:35

1 Answer

No, the Parallel.ForEachAsync API has quite a lot of differences compared to a trivial use of the Task.WhenAll API:

  1. The elephant in the room: awaiting Task.WhenAll returns an array with the results of the asynchronous operations. In contrast, Parallel.ForEachAsync returns a naked Task. If you want the results you must rely on side effects, like updating a ConcurrentQueue<T> as part of the asynchronous operation.

  2. Parallel.ForEachAsync invokes the supplied asynchronous delegate in parallel, on ThreadPool threads (configurable). In contrast, the common pattern of using Task.WhenAll is to create the tasks sequentially, on the current thread. This raises concerns about using Parallel.ForEachAsync in ASP.NET applications, where offloading work to the ThreadPool might have scalability implications.

  3. Parallel.ForEachAsync invokes the asynchronous delegate and awaits the generated tasks while enforcing a maximum level of concurrency, which by default equals Environment.ProcessorCount. This behavior is configurable through the MaxDegreeOfParallelism option. In contrast, the common pattern of using Task.WhenAll is to create all the tasks at once, imposing no concurrency limit.

  4. The common pattern of using Task.WhenAll assumes that creating all the tasks cannot fail midway, and so takes no precautions against this possibility. If it does happen, the tasks already created might be leaked as fire-and-forget tasks. This is not possible with the Parallel.ForEachAsync API.

  5. Parallel.ForEachAsync stops invoking the asynchronous delegate as soon as the first error occurs, either in a delegate invocation or in a created Task. After awaiting all the already created tasks, it propagates a failure containing all the errors that have occurred so far. It also provides a mechanism for canceling the other tasks that are in flight when the error occurs (the CancellationToken that is passed as the second argument of the lambda). In contrast, Task.WhenAll invariably waits for all the tasks to complete. This means that you might have to wait a lot longer before the returned task eventually faults with an AggregateException containing the errors of all the tasks that have failed.
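Points 1, 3 and 5 can be illustrated with a short sketch. It rests on several assumptions: `FakePostAsync` is a hypothetical stand-in for the real HTTP call, the class and method names are invented for this example, and the peak-concurrency counter exists only to make the limit observable. Results are collected as a side effect in a `ConcurrentQueue<int>`, the limit is set through `ParallelOptions.MaxDegreeOfParallelism`, and the `CancellationToken` received by the lambda is forwarded to the simulated request.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncDemo
{
    // Hypothetical stand-in for an HTTP POST; it simulates latency only.
    private static async Task<int> FakePostAsync(int payload, CancellationToken ct)
    {
        await Task.Delay(20, ct);
        return payload * 2;
    }

    // Returns the collected results (in completion order) and the highest
    // number of delegate invocations observed running at the same time.
    public static async Task<(int[] Results, int PeakConcurrency)> RunAsync(int count, int maxDop)
    {
        var results = new ConcurrentQueue<int>(); // side-effect result store (point 1)
        var gate = new object();
        int running = 0, peak = 0;

        // Explicit concurrency limit (point 3).
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDop };

        await Parallel.ForEachAsync(Enumerable.Range(1, count), options, async (item, ct) =>
        {
            lock (gate) { running++; peak = Math.Max(peak, running); }
            try
            {
                // Forward the token so in-flight work can be canceled
                // when another invocation fails (point 5).
                results.Enqueue(await FakePostAsync(item, ct));
            }
            finally
            {
                lock (gate) { running--; }
            }
        });

        return (results.ToArray(), peak);
    }
}
```

A call such as `await ForEachAsyncDemo.RunAsync(20, 3)` should yield 20 results in no particular order, with the observed peak concurrency never exceeding 3.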

Theodor Zoulias