I found this code on StructToIEEE754. The line [ unsigned int mantissa:23; ] has a part (:23) that I had not come across until now. It most probably means that the length of mantissa is fixed at 23 bits, but I have no clue how one can cram exactly 23 bits into a type int.

union flt {
    struct ieee754 {
        unsigned int mantissa : 23;
        unsigned int exponent : 8;
        unsigned int sign     : 1;
    } raw;      // 23 + 8 + 1 = 32 bits
    float f;    //              32 bits
};

https://www.cplusplus.com makes no reference to it, nor does anything else I could find on the web. Can someone please point me to where I can read the details of this newly acquired programming trick?
Is type identifier : odd_size ; when used in a union definition, defined behavior?

Fred Cailloux
  • Read up on *bit-fields*. – Eugene Sh. Nov 17 '21 at 19:46
  • Read about [Bit fields](https://en.cppreference.com/w/cpp/language/bit_field) on cppreference.com (cplusplus.com is not widely regarded as a decent reference site). "*obviously the meaning must most probably mean that the length of mantissa is determined to be 23 bits*" - yes, it does. "*I have no clue as to how one can cram exactly 23 bits into a type int*" - you don't. That is not what is happening here. The `mantissa`, `exponent`, and `sign` are not individual `int`s of different sizes in memory; they share different bits of a single 32-bit `int` in memory. – Remy Lebeau Nov 17 '21 at 19:48
  • The Internet is often useless when looking up C++ syntax. Books help a lot because they will teach you what the symbol is named. – user4581301 Nov 17 '21 at 19:49
  • That question is about C. You can't use the same technique in C++. It might work for you when you try it, but it is Undefined Behavior and it could break at any time for no apparent reason. – François Andrieux Nov 17 '21 at 19:49
  • Incidental to the question, I believe this kind of type punning with a `union` is allowed in **C**, but is not allowed in **C++**. – Eljay Nov 17 '21 at 19:49
  • And of course one of the valid behaviours of undefined behaviour is that you get exactly what you wanted. It's just not guaranteed. Use UB carefully, after a lot of research and testing, and only if you have to. – user4581301 Nov 17 '21 at 19:51
  • Remy Lebeau, are you sure about that? mantissa, exponent and sign are part of a structure. The structure is part of a union, which implements memory sharing. But the members of the structure itself do not share the same memory. All 3 identifiers have their own memory location, which is shared with the float. At least, that is what I understand. Please correct me on this one if I'm wrong. – Fred Cailloux Nov 17 '21 at 19:52
  • @FredCailloux These are indeed bit fields. They do not have their own address; if you try to take their address, the compiler should complain. This is because their storage is not byte-aligned and the compiler has to insert special instructions to read from or write to them. Some compilers support them and will pack the members as intended; some won't, and will silently make the struct bigger than expected. In either case, this whole strategy doesn't work in C++: you are using a C solution in C++. – François Andrieux Nov 17 '21 at 19:53
  • @FredCailloux yes, I am quite sure. The `mantissa`, `exponent`, and `sign` are bit fields, which are stuffed inside a single `unsigned int` inside the `struct`, which then shares the same memory as the `float` due to the `union`. So, effectively, the bit fields are overlaid on top of the `float` to provide easy access to its separate components. – Remy Lebeau Nov 17 '21 at 19:54
  • @FredCailloux those are entities called bit fields, not "identifiers". They are specifically described as sharing a memory location: a maximal sequence of adjacent bit fields of non-zero width, with a total size no larger than that of the specified type, counts as one location. You can have an (unnamed) field of zero width; it snaps the following fields to the next location. – Swift - Friday Pie Nov 17 '21 at 20:05
  • @RemyLebeau, OK, I get it now. I read your first comment a bit too fast. From what I understand now, mantissa, exponent, and sign each occupy some bits of memory, all crammed into one single int. That same memory location can also be used to store one float. However, the information user4581301 shared makes me doubt that this technique is safe to use. – Fred Cailloux Nov 17 '21 at 20:16
  • @FredCailloux type-punning in this manner is *technically* UB in C++ (but not in C). In C++, assigning to `f` and then reading from `mantissa`/etc. is illegal, since `mantissa`/etc. is not the active member; `f` still is. You can only read from the active member, the one that was last written to. But, in *practice*, this will *usually* work in most C++ compilers. Just be aware of it. The *safe* and *portable* way to handle this is to `memcpy()` the `float` into an `unsigned int` and then bit-shift the values out of it as needed, not using bit fields (whose layout is implementation-defined and not portable). – Remy Lebeau Nov 17 '21 at 20:23
  • Quick and unrelated question: How do you color the background of a word when commenting ? – Fred Cailloux Nov 17 '21 at 21:23
  • @FredCailloux https://stackoverflow.com/editing-help#comment-formatting – François Andrieux Nov 17 '21 at 21:54
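
The `memcpy()` approach Remy Lebeau describes in the comments could look roughly like the sketch below. It is only an illustration, not code from the original post: the names `FloatParts` and `decompose` are made up here, and it assumes that `float` is a 32-bit IEEE 754 binary32 value (the `static_assert` checks only the size, not the format).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type (not from the original post): the three
// binary32 components held as ordinary integers instead of bit fields.
struct FloatParts {
    std::uint32_t mantissa;  // low 23 bits (fraction, without the implicit leading 1)
    std::uint32_t exponent;  // next 8 bits (still biased by 127)
    std::uint32_t sign;      // top bit
};

// Assumption: float occupies exactly 32 bits on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Hypothetical function name. Copies the float's bytes into an integer,
// then masks and shifts the fields out of it.
FloatParts decompose(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copying bytes avoids the union read

    FloatParts p;
    p.mantissa = bits & 0x7FFFFFu;        // bits 0..22
    p.exponent = (bits >> 23) & 0xFFu;    // bits 23..30
    p.sign     = bits >> 31;              // bit 31
    return p;
}
```

Under the binary32 assumption, `decompose(-6.25f)` should give sign 1, exponent 129 (that is, 2 + 127) and mantissa 0x480000, since -6.25 is -1.5625 × 2². Because the fields are extracted with shifts and masks, the result does not depend on how a given compiler lays out bit fields.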

0 Answers