Why are values passed through useless copies?

Question

So, say that I have the following code, which gives three examples of what I believe to be unnecessary copies of values.

mov    QWORD PTR [rbp-0x18],rdi
mov    rdx,QWORD PTR [rbp-0x18]
lea    rax,[rbp-0x10]
mov    rsi,rdx
mov    rdi,rax
call   4003e0 <strcpy@plt>

Why is the value in rdi stored to memory at rbp-0x18 and then loaded back into rdx? It's then copied to rsi (2 extra copies).

Finally, why the lea + mov to get rbp-0x10 into rax and then into rdi? Is there any reason the following code wasn't generated?

mov    rsi,rdi
lea    rdi,[rbp-0x10]
call   4003e0 <strcpy@plt>

(My guess is that this is just an artifact of the code generation in the compiler, but I'm making sure there's not some rules of x86-64 that I'm missing.)

yaspr · Accepted Answer · 2014-06-05T06:08:43.440

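These are not artifacts, and the compiler (GCC here) can certainly generate better and faster code if told to. The code in your question is unoptimized, either because the -O0 flag (optimization level 0, i.e. no optimizations) was specified, or because no optimization flag was given at all, in which case GCC defaults to -O0. At -O0 GCC gives every variable, including incoming arguments, a home on the stack and compiles each statement independently, so values are stored and reloaded rather than kept in registers. That is where the extra copies come from; no x86-64 rule requires them.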
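As a minimal sketch of the kind of C function that could compile to the snippet in the question: the original source was not posted, so the name copy_arg, the 16-byte buffer size and the return value are all assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical reconstruction; name, buffer size and return value are assumed. */
size_t copy_arg(const char *arg)   /* arg arrives in rdi */
{
    char buf[16];                  /* a local array, e.g. at [rbp-0x10] */

    /* At -O0 the parameter is first spilled to its own stack slot
       (e.g. [rbp-0x18]) and reloaded before the call. With optimization
       enabled it can be moved from rdi straight into rsi. */
    strcpy(buf, arg);              /* no bounds check, as in the original call */

    return strlen(buf);            /* added only so the copy is observable */
}
```

Compiling this with gcc -O0 and again with gcc -O2, then disassembling with objdump -d -M intel, should show the spill and reload disappearing in the optimized build. The exact output depends on the GCC version and on distribution defaults such as stack protection.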

Below you'll find two builds of the same code: version 1 compiled with the -O0 flag, version 2 with the -O2 flag.

Version 1:

 55                      push   rbp
 48 89 e5                mov    rbp,rsp
 48 81 ec 10 04 00 00    sub    rsp,0x410
 89 bd fc fb ff ff       mov    DWORD PTR [rbp-0x404],edi
 48 89 b5 f0 fb ff ff    mov    QWORD PTR [rbp-0x410],rsi
 48 8b 85 f0 fb ff ff    mov    rax,QWORD PTR [rbp-0x410]
 48 83 c0 08             add    rax,0x8
 48 8b 10                mov    rdx,QWORD PTR [rax]
 48 8d 85 00 fc ff ff    lea    rax,[rbp-0x400]
 48 89 d6                mov    rsi,rdx
 48 89 c7                mov    rdi,rax
 e8 40 fe ff ff          call   400400 <strcpy@plt>
 48 8d 85 00 fc ff ff    lea    rax,[rbp-0x400]
 48 89 c7                mov    rdi,rax
 e8 41 fe ff ff          call   400410 <puts@plt>
 b8 00 00 00 00          mov    eax,0x0
 c9                      leave
 c3                      ret
 66 2e 0f 1f 84 00 00    nop    WORD PTR cs:[rax+rax*1+0x0]
 00 00 00

Version 2:

 48 81 ec 08 04 00 00    sub    rsp,0x408
 48 8b 76 08             mov    rsi,QWORD PTR [rsi+0x8]
 48 89 e7                mov    rdi,rsp
 e8 ad ff ff ff          call   400400 <strcpy@plt>
 48 89 e7                mov    rdi,rsp
 e8 b5 ff ff ff          call   400410 <puts@plt>
 31 c0                   xor    eax,eax
 48 81 c4 08 04 00 00    add    rsp,0x408
 c3                      ret
 0f 1f 00                nop    DWORD PTR [rax]
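For reference, here is a sketch of C source that would plausibly produce both listings above. It is inferred from the disassembly, not taken from the original program: the function is presumably main there, and the name echo_first_arg is a hypothetical stand-in so it can be called from other code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the program behind both listings. */
int echo_first_arg(int argc, char **argv)
{
    char buf[1024];          /* 0x400 bytes: [rbp-0x400] at -O0, [rsp] at -O2 */

    (void)argc;              /* stored to [rbp-0x404] at -O0, ignored at -O2 */

    strcpy(buf, argv[1]);    /* argv[1] is the load from [rsi+0x8]; no bounds check */
    puts(buf);

    return 0;                /* mov eax,0x0 at -O0, xor eax,eax at -O2 */
}
```

In version 1 every value takes a detour through rax, rdx or a stack slot. In version 2 the same two calls remain, but the arguments are built directly in rdi and rsi and the frame pointer is not set up at all.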

If you're interested in the optimizations performed by GCC you should read this link, and this one too. You can also check the GCC summit publications.

Ok, seems to be what I was assuming. I was posting code from a wargame challenge, and I suppose no optimization makes sense there. — David, Jun 05 '14 at 17:32
