I wanted to parallelize a for loop and found out about std::for_each and its execution policies. Surprisingly, the loop was not parallelized when compiled with GCC:

#include <iostream>
#include <algorithm>
#include <execution>
#include <chrono>
#include <thread>
#include <random>
#include <vector>

int main() {
    std::vector<int> foo;
    foo.reserve(1000);
    for (int i = 0; i < 1000; i++) {
        foo.push_back(i);
    }

    std::for_each(std::execution::par_unseq,
                  foo.begin(), foo.end(),
                  [](auto &&item) {
                      std::cout << item << std::endl;
                      std::random_device dev;
                      std::mt19937 rng(dev());
                      std::uniform_int_distribution<std::mt19937::result_type> dist6(10, 100);
                      std::this_thread::sleep_for(std::chrono::milliseconds(dist6(rng)));
                      std::cout << "Thread ID: " << std::this_thread::get_id() << std::endl;
                  });
}

This code still runs sequentially.

With MSVC, the same code is parallelized and finishes much faster.

GCC:

$ gcc --version
gcc (Ubuntu 10.1.0-2ubuntu1~18.04) 10.1.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

MSVC:

>cl.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

CMakeLists.txt:

cmake_minimum_required(VERSION 3.17)
project(ParallelTesting)

set(CMAKE_CXX_STANDARD 20)

add_executable(ParallelTesting main.cpp)

Is there anything specific I need to do to enable parallelization with GCC as well?

ldd output of my binary:

$ ldd my_binary
    linux-vdso.so.1 (0x00007ffe9e6b9000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f79efaa0000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f79ef881000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f79ef4ad000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f79ef295000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f79eeea4000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f79f041a000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f79eeb06000)

The debug and release versions of the binary have the same ldd output.

BullyWiiPlaza
  • Have you tried changing GCC's optimization level? My reading of the cppreference docs indicates that parallelization is permitted, not required, when `par_unseq` is specified. – Dr. Watson Dec 29 '20 at 17:37
  • Libstdc++ (likely used by your GCC) doesn't have its own version of parallelized algorithms. Instead, it uses Intel TBB as a backend. Is TBB linked to your program? Libstdc++ may be also configured to use a serial backend, for more details, look here: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/pstl/parallel_backend.h. – Daniel Langr Dec 29 '20 at 17:37
  • @Dr.Watson: Compiling with full optimizations does not help either. Daniel, I'm using a "regular" `GCC 10` version so I'm not sure how it is configured specifically and whether `TBB` is linked to it. I added the `ldd` output of my binary to the question for further investigation. – BullyWiiPlaza Dec 30 '20 at 21:08
  • @BullyWiiPlaza, I think Daniel's got the right idea. The cppreference library feature support [page](https://en.cppreference.com/w/cpp/compiler_support) suggests that GCC only supports the Parallelism TS when compiled with -ltbb. So you'll want the `tbb` library somewhere in your project, and then you'll want to `add_library` and `target_link_libraries`, as with any other lib. – Dr. Watson Dec 31 '20 at 06:25
  • `sudo apt-get install libtbb-dev` and then link with `-ltbb`, otherwise you'll get undefined symbols. – metalfox Dec 31 '20 at 11:41

1 Answer

I solved it by first upgrading my WSL Ubuntu distribution from version 18.04 to 20.04. On 18.04, even after running sudo apt install gcc libtbb-dev to install TBB, compilation still failed with the following error: #error Intel(R) Threading Building Blocks 2018 is required; older versions are not supported. The TBB version installed there was too old.

Now, with TBB 2020.1-2 installed, it works as expected:

$ sudo apt install libtbb-dev
[sudo] password for ubuntu:
Reading package lists... Done
Building dependency tree
Reading state information... Done
libtbb-dev is already the newest version (2020.1-2).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.

This answer describes all the details very well.

Since I'm using CMake I also had to add the following line to my CMakeLists.txt:

# Link against the dependency of Intel TBB (for parallel C++17 algorithms)
target_link_libraries(${PROJECT_NAME} tbb)
BullyWiiPlaza