30

I'm studying C++ and I have a doubt about type conversion. Could you explain what happens in an expression like this:

unsigned int u = 10; 
int a = -42; 
std::cout << u - a << std::endl;

I know the result will be 52 if I apply the rules for an arithmetic operator with two operands. But I wonder what happens when the compiler, in order to convert a to an unsigned value, creates a temporary of unsigned type. What happens after that? The expression should now be 10 - 4294967254.

Piero Borrelli

3 Answers

43

In simple terms, if you mix types of the same rank (in the sequence of int, long int, long long int), the unsigned type "wins" and the calculations are performed within that unsigned type. The result is of the same unsigned type.

If you mix types of different rank, the higher-ranked type "wins", if it can represent all values of lower-ranked type. The calculations are performed within that type. The result is of that type.

Finally, if the higher-ranked type cannot represent all values of lower-ranked type, then the unsigned version of the higher ranked type is used. The result is of that type.

In your case you mixed types of the same rank (int and unsigned int), which means that the whole expression is evaluated within the unsigned int type. The expression, as you correctly stated, is now 10 - 4294967254 (for a 32-bit int). Unsigned types obey the rules of modulo arithmetic with 2^32 (4294967296) as the modulus. If you carefully calculate the result (which can be expressed arithmetically as 10 - 4294967254 + 4294967296), it will turn out to be the expected 52.

AnT
  • Sorry, I lost myself. When the expression becomes unsigned int temporary = 10 - 4294967254 (OK, I've understood this), I can't understand why it then becomes 10 - 4294967254 + 4294967296 (why do you add the modulus to the expression?). – Piero Borrelli Sep 02 '14 at 13:37
  • @Piero Borrelli: One way to calculate the `modulo N` equivalent of a negative value `V` is to add `N` to it as many times as necessary (`V + N`, `V + 2N`, `V + 3N` and so on) until you hit the first non-negative value. In the case of C++ additive operations, a mathematically negative result needs the modulus added only once to arrive at the proper unsigned result. – AnT Sep 02 '14 at 14:12
  • @Piero Borrelli: Of course, this is a purely arithmetic rule. The compiler does not have to do anything like that. It does not have to worry about it at all. If negative values are represented in 2's complement, simply reinterpreting that representation as an unsigned one immediately provides the correct result. – AnT Sep 02 '14 at 16:17
  • Can you define what you mean by "rank"? C++ doesn't use rank in that way, making this answer ambiguous at best, nonsensical at worst. – Adrian Nov 18 '19 at 15:59
  • @Adrian: Actually, it does. I'm referring to the concept of *integer conversion rank*, as it is used in the description of *usual arithmetic conversions*. The description in my answer is not the exact quote from the standard, since it is intended to be tailored to the specific case of `u - a` from the original question. – AnT Nov 18 '19 at 16:28
8

1) Due to the usual arithmetic conversions, the signed operand a is converted to an unsigned type prior to the subtraction. That conversion happens according to this rule (C++ standard 4.7/2):

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).

Algebraically, a becomes a very large positive number, certainly larger than u.

2) u - a is an anonymous temporary and will be of an unsigned type. (You can verify this by writing auto t = u - a and inspecting the type of t in your debugger.) Mathematically this will be a negative number, but on implicit conversion to the unsigned type a wraparound rule similar to the one above is invoked.

In short, the two conversion operations have equal and opposite effects and the result will be 52. In practice, the compiler might optimise out all these conversions.

Bathsheba
-2

Here is what the disassembled code says:

It first stores -42 in its two's complement form (0xffffffd6) and then does the sub operation, so the result is 10 + 42.

0x0000000000400835 <+8>:  movl $0xa,-0xc(%rbp)
0x000000000040083c <+15>: movl $0xffffffd6,-0x8(%rbp)
0x0000000000400843 <+22>: mov -0x8(%rbp),%eax
0x0000000000400846 <+25>: mov -0xc(%rbp),%edx
0x0000000000400849 <+28>: sub %eax,%edx
0x000000000040084b <+30>: mov %edx,%eax

  • 6
    In the general case, disassembled code cannot serve as a meaningful source for understanding language-level semantics. Code generation is a one-way function. It is not possible to "trace it back", i.e. to figure out what the compiler was actually trying to do by looking at the generated code. – AnT Sep 01 '14 at 16:23
  • 1
    Thanks for your comment. – Jinghui.You Sep 02 '14 at 02:03