5

Recently, I discovered void __builtin_assume(bool) for clang, which can provide additional information about the state of the program to the compiler. This can make a huge difference, like for example:

#include <cstddef>

// compiles to about 80 instructions at -O3
unsigned sum(unsigned data[], size_t count) {
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

// compiles to about 10 instructions at -O3
unsigned sum_small(unsigned data[], size_t count) {
    __builtin_assume(count <= 4);
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

I am forced to use GCC at this time and I am curious whether there exists an equivalent builtin. Unfortunately I could not find __builtin_assume in the GCC documentation. Maybe there exists a builtin but it just has a different name?

If there doesn't exist an equivalent builtin, is there maybe a way to produce the same result without __builtin_assume, such as intentionally invoking undefined behavior when the condition is not true?

Ideally, I would like a macro which is always safe to call like:

#if ... // detect clang
#define MY_ASSUME(condition) __builtin_assume(condition)
#elif ... // detect GCC
#define MY_ASSUME(condition) __gcc_builtin_assume_equivalent(condition)
#else
#define MY_ASSUME(condition)
#endif

Whatever the solution is, it should also work in a constexpr function.

Jan Schultke
  • 5,808
  • 1
  • 22
  • 57
  • 1
    Related: [How to guide GCC optimizations based on assertions without runtime cost?](https://stackoverflow.com/questions/44054078/how-to-guide-gcc-optimizations-based-on-assertions-without-runtime-cost) – Evg Aug 19 '20 at 19:51

2 Answers2

6

I've used __builtin_unreachable() which indicates that it is Undefined Behavior for control flow to reach here. You can wrap it in an if to essentially write an assertion. The condition can be any invariant that is false, so in your case you would put the opposite condition.

Example :

// Basically `assert(count <= 4);`
if ( !(count <= 4) ) {
    __builtin_unreachable();
}

Edit : In response to the comment, you can convert this into an assertion macro like this :

// Line break for readability
#define my_assert( condition ) \
    { if(!(condition)) __builtin_unreachable(); }

Based on the code in the question, you would use it like this :

unsigned sum_small(unsigned data[], size_t count) {
    my_assert(count <= 4); // <--- Changed here
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}
François Andrieux
  • 26,465
  • 6
  • 51
  • 83
  • Could you make that work inside of a safe macro as mentioned in the question? If it's in a macro then it can't use `if` because it would interfere with `if-else` chains.. – Jan Schultke Aug 19 '20 at 19:58
  • @J.Schultke • `#define ASSUME(c) do { if (!(c)) __builtin_unreachable(); } while(false)` won't interfere with if-else chains. – Eljay Aug 19 '20 at 20:00
  • this is insufficient to match `__builtin_assume`, because when doing this gcc will often attempt to evaluate the expression, no matter what amount of effort your pour in to tell it to assume the expression has no side effects. this should be fine if you don't perform function calls in the assertion, but may really harm performance if you have involved checks the compiler cannot manage to eliminate!! – Asu Aug 09 '21 at 14:08
  • @Asu Can you please elaborate or provide an example? When I try it [here](https://godbolt.org/z/a6dT9joMK) the macro seems to be optimized out as expected and the condition is not evaluated at runtime. – François Andrieux Sep 21 '21 at 13:58
  • @FrançoisAndrieux https://godbolt.org/z/6crecvx8a in this case it works as expected, even if you make `compare` `gnu::noinline`. but add a call to an undefined `bar();` inside of `compare` and it won't get optimized away. (i believe this could be due to exceptions/longjmp that could skip the if branch) as it turns out though clang's implementation of `__builtin_assume` seems worse than i thought: in that usecase it will not use the assumption, and if you annotate `compare` as `gnu::pure` it causes a function call to be emitted... urgh. – Asu Sep 22 '21 at 11:05
1

I feel like going trough undefined behavior here is completely unneeded. The very simple if check couple with abort is well-defined and gives optimizer enough food for thought:

#include <cstddef>
#include <cstdlib>

// compiles to about 10 instructions at -O3
unsigned sum_small(unsigned data[], size_t count) {
    if (count > 4)
        std::abort();
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

No need to summon nasal demons when none are required.

SergeyA
  • 59,974
  • 5
  • 72
  • 130
  • 1
    [That doesn't work on clang unfortunately.](https://godbolt.org/z/sh54xc). The compiler just checks the condition and calls `abort`. Otherwise this is also a good solution. I wonder why clang treats `abort` differently though. On second look it does the same on GCC, but it optimizes the rest of the function. – Jan Schultke Aug 19 '20 at 20:41
  • @J.Schultke I believe you asked about a replacement for gcc? I feel it is not hard to incorporate that into macro and use for clang. – SergeyA Aug 19 '20 at 21:03