I have come across a strange behavior on signed bit-fields:

#include <stdio.h>

struct S {
    long long a31 : 31;
    long long a32 : 32;
    long long a33 : 33;
    long long : 0;
    unsigned long long b31 : 31;
    unsigned long long b32 : 32;
    unsigned long long b33 : 33;
};

long long f31(struct S *p) { return p->a31 + p->b31; }
long long f32(struct S *p) { return p->a32 + p->b32; }
long long f33(struct S *p) { return p->a33 + p->b33; }

int main() {
    struct S s = { -2, -2, -2, 1, 1, 1 };
    long long a32 = -2;
    unsigned long long b32 = 1;
    printf("f31(&s)       => %lld\n", f31(&s));
    printf("f32(&s)       => %lld\n", f32(&s));
    printf("f33(&s)       => %lld\n", f33(&s));
    printf("s.a31 + s.b31 => %lld\n", s.a31 + s.b31);
    printf("s.a32 + s.b32 => %lld\n", s.a32 + s.b32);
    printf("s.a33 + s.b33 => %lld\n", s.a33 + s.b33);
    printf("  a32 +   b32 => %lld\n",   a32 +   b32);
    return 0;
}

Using Clang on OS/X, I get this output:

f31(&s)       => -1
f32(&s)       => 4294967295
f33(&s)       => -1
s.a31 + s.b31 => 4294967295
s.a32 + s.b32 => 4294967295
s.a33 + s.b33 => -1
  a32 +   b32 => -1

Using GCC on Linux, I get this:

f31(&s)       => -1
f32(&s)       => 4294967295
f33(&s)       => 8589934591
s.a31 + s.b31 => 4294967295
s.a32 + s.b32 => 4294967295
s.a33 + s.b33 => 8589934591
  a32 +   b32 => -1

The above output shows 3 types of inconsistencies:

  • different behavior for different compilers;
  • different behavior for different bit-field widths;
  • different behavior for inline expressions and equivalent expressions wrapped in a function.
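A note on the numbers themselves: 4294967295 is 2^32 - 1 and 8589934591 is 2^33 - 1, which is what -2 + 1 yields when the addition is carried out in an unsigned type of 32 or 33 bits. The sketch below reproduces that arithmetic only. `wrap_sum` is a hypothetical helper introduced for illustration; it models the wraparound and says nothing about how either compiler actually picks the operand type.

```c
#include <stdint.h>

/* Hypothetical helper (not part of any compiler or library):
   add a signed and an unsigned value the way an unsigned type of
   `width` bits would, i.e. modulo 2^width. */
uint64_t wrap_sum(int64_t a, uint64_t b, unsigned width)
{
    /* Shifting by 64 would be undefined, so handle full width separately. */
    uint64_t mask = (width >= 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << width) - 1);

    /* Converting a negative value to uint64_t is well defined (modular). */
    return ((uint64_t)a + b) & mask;
}
```

With this helper, `wrap_sum(-2, 1, 32)` gives 4294967295 and `wrap_sum(-2, 1, 33)` gives 8589934591. A width of 64 gives 18446744073709551615, which prints as -1 once converted to `long long` on the usual two's-complement targets.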

The C Standard has this language:

6.7.2 Type specifiers

...

Each of the comma-separated multisets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.
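To make the quoted rule concrete, here is a minimal sketch that probes how a given compiler treats a plain `int` bit-field. The type `Probe` and the function `plain_int_bitfield_is_signed` are hypothetical names used only in this example, and the result is implementation-defined by design.

```c
/* Hypothetical probe type: a 2-bit field declared with plain int. */
struct Probe {
    int f : 2;
};

/* Returns 1 if plain-int bit-fields are signed on this implementation,
   0 if they are unsigned. Storing -1 is well defined either way:
   a signed 2-bit field holds -1, while an unsigned one holds 3 after
   the usual modular conversion. */
int plain_int_bitfield_is_signed(void)
{
    struct Probe p;
    p.f = -1;
    return p.f < 0;
}
```

Declaring the field as `signed int` or `unsigned int` instead of plain `int` removes the dependency on this implementation choice.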

Bit-fields are notoriously broken in many older compilers...
Is the behavior of Clang and GCC conformant or are these inconsistencies the result of one or more bugs?

chqrlie
  • You might want to take a step back, and just print the value of the six bit fields. – user3386109 Nov 13 '19 at 22:57
  • The code also has two warnings about mismatched arguments and format specifiers in the `printf`s. Until those are fixed, the code has undefined behavior, and is therefore allowed to do anything. – user3386109 Nov 13 '19 at 23:04
  • [godbolt](https://godbolt.org/z/4PAAnE) `gcc` appears to mask every calculation on `long long : 33` bit-fields with a `2<<33-1` mask before and after the calculation. `clang` just sign-extends `a33` and uses `rax` for the calculation, without applying the `2<<33-1` mask. I don't know which is correct: should a `long long : 33` bit-field be promoted to `long long`, or may it be promoted to some implementation-supported `__uint33_t` type? – KamilCuk Nov 14 '19 at 00:10
  • One issue that's confusing this is that you're ignoring the warnings about incompatible arguments for your format specifiers. You need to cast the results of your inline additions to `long long` to get consistent results in the 4th, 5th, and 6th `printf` calls, e.g. `(long long) (s.a31 + s.b31)`. Fixing this gives consistent results for the function calls vs. the inline computations, at least with `gcc`. – Tom Karzes Nov 14 '19 at 00:31
  • The compiler behavior seems unintuitive to me. One thing I noticed is that, with `gcc`, `sizeof(s.a31 + 0)` and `sizeof(s.a32 + 0)` are both 4 on my system, but `sizeof(s.a33 + 0)` is 8. I would have thought they would all be unpacked into `long long` and have size 8, but apparently not. – Tom Karzes Nov 14 '19 at 00:44
  • Do not use bit field types other than `signed int`, `unsigned`, `_Bool`. Anything else is trouble. "A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type." C11 §6.7.2.1 ¶5 – chux - Reinstate Monica Nov 14 '19 at 04:21
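The `sizeof` observation in the comments can be checked with a small probe like the one below. This is only a sketch: the three `promoted_size_*` functions are hypothetical names, the struct repeats just the signed fields from the question, and the values returned depend on the compiler, so none are stated here.

```c
#include <stddef.h>

/* Signed fields copied from the question's struct. */
struct S {
    long long a31 : 31;
    long long a32 : 32;
    long long a33 : 33;
};

/* sizeof may not be applied to a bit-field itself, but it may be applied
   to an expression that uses one. Adding 0 exposes the size of the type
   the operand has after conversion. The operand of sizeof is not
   evaluated here, so the values stored in s do not matter. */
size_t promoted_size_a31(void) { struct S s = { 0, 0, 0 }; return sizeof(s.a31 + 0); }
size_t promoted_size_a32(void) { struct S s = { 0, 0, 0 }; return sizeof(s.a32 + 0); }
size_t promoted_size_a33(void) { struct S s = { 0, 0, 0 }; return sizeof(s.a33 + 0); }
```

Printing the three results with `%zu` shows whether the compiler in use converts the narrower fields to `int` before the addition, which is what decides whether passing such a sum to `%lld` is valid.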

1 Answer

Please take a look at the code below, which gives the expected results.

As a practical matter, I would suggest simply making sure that

  • the operands being added have compatible types,
  • the function returns the correct type, and
  • the arguments passed to printf match their format specifiers.

That's it.

For more information, see also references [1] and [2] below.

#include <stdio.h>

struct S {
    long long a31 : 31;
    long long a32 : 32;
    long long a33 : 33;
    
    unsigned long long b31 : 31;
    unsigned long long b32 : 32;
    unsigned long long b33 : 33;
};

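/* Cast both operands to long long before adding, so the sum is always
   computed in long long regardless of how each bit-field is promoted. */
long long f31(struct S *p) { return ((long long)p->a31 + (long long)p->b31); }
long long f32(struct S *p) { return ((long long)p->a32 + (long long)p->b32); }
long long f33(struct S *p) { return ((long long)p->a33 + (long long)p->b33); }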

int main() {
    struct S s = { -2, -2, -2, 1, 1, 1 };
    long long a32 = -2;
    unsigned long long b32 = 1;
    
    printf("p->a31       => %lld\n", (long long)(s.a31));
    printf("p->a32       => %lld\n", (long long)(s.a32));
    printf("p->a33       => %lld\n", (long long)(s.a33));
    
    printf("p->b31       => %lld\n", (long long)(s.b31));
    printf("p->b32       => %lld\n", (long long)(s.b32));
    printf("p->b33       => %lld\n", (long long)(s.b33));
    
    
    printf("f31(&s)       => %lld\n", (long long)(f31(&s)));
    printf("f32(&s)       => %lld\n", (long long)(f32(&s)));
    printf("f33(&s)       => %lld\n", (long long)(f33(&s)));
    printf("s.a31 + s.b31 => %lld\n", ((long long)s.a31 + (long long)s.b31));
    printf("s.a32 + s.b32 => %lld\n", ((long long)s.a32 + (long long)s.b32));
    printf("s.a33 + s.b33 => %lld\n", ((long long)s.a33 + (long long)s.b33));
    printf("  a32 +   b32 => %lld\n", (long long) (a32 +   b32));
    return 0;
}

p->a31       => -2
p->a32       => -2
p->a33       => -2
p->b31       => 1
p->b32       => 1
p->b33       => 1
f31(&s)       => -1
f32(&s)       => -1
f33(&s)       => -1
s.a31 + s.b31 => -1
s.a32 + s.b32 => -1
s.a33 + s.b33 => -1
  a32 +   b32 => -1

References

[1] Signed to unsigned conversion in C - is it always safe?

[2] https://www.geeksforgeeks.org/bit-fields-c/ "We cannot have pointers to bit field members as they may not start at a byte boundary."

sidcoder