
I'm using an x86-64 (AMD64) computer (Intel Pentium Gold 4415U) to compare the assembly instructions generated from C code (strictly speaking, the disassembly).

On Windows 10, I used Visual Studio 2017 (15.2) with its C compiler. My example code is shown below:

int main() {
    int i = 0;
    if(++i == 4);
    if(i++ == 4);
    return 0;
}

The disassembly is shown below:

mov         eax,dword ptr [i]  // if (++i == 4);
inc         eax  
mov         dword ptr [i],eax  

mov         eax,dword ptr [i]  // if (i++ == 4);
mov         dword ptr [rbp+0D4h],eax    ; save old i to a temporary
mov         eax,dword ptr [i]  
inc         eax  
mov         dword ptr [i],eax  
cmp         dword ptr [rbp+0D4h],4      ; compare with previous i
jne         main+51h (07FF7DDBF3601h)  
mov         dword ptr [rbp+0D8h],1  
jmp         main+5Bh (07FF7DDBF360Bh)  
*mov         dword ptr [rbp+0D8h],0

The jump target 07FF7DDBF3601 is the last instruction above (marked with *).
The jump target 07FF7DDBF360B is 'return 0;'.

In if (++i == 4), the program doesn't check whether the incremented i satisfies the condition.

However, in if (i++ == 4), the program saves the previous value of i to the stack and then does the increment. Afterwards, the program compares the previous value of i with the integer constant 4.

What causes the difference between the two C statements? Is it just a compiler mechanism? Will it be different with more complex code?

I tried searching Google for this, but I couldn't find where the difference comes from. Do I have to accept that this is just compiler behavior?

Peter Cordes
  • 286,368
  • 41
  • 520
  • 731

1 Answer


Like Paul says, the program has no observable side-effects, and with optimization enabled MSVC or any of the other major compilers (gcc/clang/ICC) will compile main to simply xor eax,eax / ret.

i's value never escapes the function (it's not stored to a global or returned), so it can be optimized away entirely. And even if it did escape, constant-propagation is trivial here.
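To see the comparison survive optimization, the value has to escape and the input has to be unknown at compile time. The following is a minimal sketch under those assumptions; the function names pre_inc_is_4 and post_inc_is_4 are hypothetical and not from the question. Taking i through a pointer and returning the comparison result makes both the increment and the compare observable.

```c
/* Hypothetical helpers (not from the original post).
   The pointer argument means the starting value is unknown at compile time,
   and the return value makes the comparison result observable. */

/* Pre-increment: the comparison sees the NEW value of *i. */
int pre_inc_is_4(int *i)
{
    return ++*i == 4;
}

/* Post-increment: the comparison sees the OLD value of *i. */
int post_inc_is_4(int *i)
{
    return (*i)++ == 4;
}
```

Starting from 3, pre_inc_is_4 returns 1 and post_inc_is_4 returns 0, and in both cases the pointed-to int ends up as 4. Compiling these with optimization enabled should show a real compare in each function, since nothing here can be constant-folded.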


It's just a quirk / implementation detail that MSVC's debug-mode anti-optimized code-gen decides not to emit a cmp/jcc over an empty if body; even in debug mode that wouldn't be helpful for debugging at all. It would be a branch instruction that jumps to the same address it falls through to.

The point of debug-mode code is that you can single-step through source lines and modify C variables with a debugger, not that the asm is a literal and faithful transliteration of C into asm. (And also that the compiler generates it quickly, without spending any effort on quality, to speed up edit/compile/run cycles.) See also: Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?

Exactly how braindead the compiler's code-gen is doesn't depend on any language rules; there are no actual standards that define what compilers have to do in debug-mode as far as actually using a branch instruction for an empty if body.


Apparently with your compiler version, the i++ post-increment was enough to make the compiler forget that the if body was empty?

I can't reproduce your result with MSVC 19.0 or 19.10 on the Godbolt compiler explorer, with 32 or 64-bit mode. (VS2015 or VS2017). Or any other MSVC version. I get no conditional branches at all from MSVC, ICC, or gcc.

MSVC does implement i++ with an actual store to memory for the old value, like you show, though. So terrible. GCC -O0 makes significantly more efficient debug-mode code. Still pretty braindead of course, but within a single statement it's sometimes a lot less bad.
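As a rough C-level picture of the debug-mode sequence in the question, i++ == 4 behaves as if it were spelled out with an explicit temporary. This is only a sketch: the function name post_inc_spelled_out and the variable names old and cond are hypothetical, and the asm in the comments is quoted from the question's disassembly, not regenerated.

```c
/* Hypothetical spelled-out equivalent of `i++ == 4` (names are illustrative).
   Comments map each step to the MSVC debug-mode asm shown in the question. */
int post_inc_spelled_out(int *i)
{
    int old = *i;   /* mov eax,[i] / mov [rbp+0D4h],eax : save old i to a temporary */
    *i = *i + 1;    /* mov eax,[i] / inc eax / mov [i],eax : reload, increment, store */

    int cond;       /* the slot at [rbp+0D8h] in the question's listing */
    if (old != 4)   /* cmp dword ptr [rbp+0D4h],4 / jne */
        cond = 0;
    else
        cond = 1;
    return cond;
}
```

With ++i the new value is the one being compared, so no second copy is needed, which is why that case needs no temporary in either compiler's output.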

I can reproduce it with clang, though! (But it branches for both ifs):

# clang8.0 -O0
main:                                   # @main
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 0       # default return value

        mov     dword ptr [rbp - 8], 0       # int i=0;

        mov     eax, dword ptr [rbp - 8]
        add     eax, 1
        mov     dword ptr [rbp - 8], eax
        cmp     eax, 4                       # uses the ++i result still in a register
        jne     .LBB0_2                      # jump over if() body
        jmp     .LBB0_2                      # jump over else body, I think.
.LBB0_2:

        mov     eax, dword ptr [rbp - 8]
        mov     ecx, eax
        add     ecx, 1                       # i++ uses a 2nd register
        mov     dword ptr [rbp - 8], ecx
        cmp     eax, 4
        jne     .LBB0_4
        jmp     .LBB0_4
.LBB0_4:

        xor     eax, eax                     # return 0

        pop     rbp                          # tear down stack frame.
        ret