
I'm practicing Windows disassembly on the x64 notepad.exe.

I've managed to find the buffer allocation and the string resource loading. But the code that follows is something I can't fully understand.
Before this part, rbx = 0 and rdx = 0.

            mov     r9, cs:str_res_guid_18 ;  dq 12h
            lea     rcx, some_data         ;  some address
            mov     edx, 28h
            sub     r9, rcx        
            mov     r8, rdx

loc_100003150: 
            lea     rax, [r8+7FFFFFD6h]    ; what is this number?
            cmp     rax, rbp
            jz      short loc_100003173
            movzx   eax, word ptr [rcx+r9] ; str_res_guid_18
            cmp     ax, bp
            jz      short loc_100003173
            mov     [rcx], ax
            add     rcx, 2
            sub     r8, 1
            jnz     short loc_100003150

loc_100003173: 
            cmp     r8, rbp
            jz      loc_1000058A0

What does 7FFFFFD6h stand for? And why is it compared against null?

2 Answers


This seems to be an optimized wchar_t string copy. The compiler took advantage of the fact that there is a fixed offset between source and destination. Ignoring the lea check, the assembly could be represented by this pseudocode:

wchar_t *dst = &some_data;  // rcx
wchar_t *src = ?;           // r9
delta = src - dst;          // r9 = r9 - rcx
maxc = 0x28;                // r8

loop:
 wchar_t ch = dst[delta];   // dst + (src - dst) == src, so ch = src[i]
 if ( ch==0 ) break;
 *dst = ch;
 dst++;                     // delta is constant, so the source position advances too
 maxc--;
 if ( maxc!=0 ) goto loop;
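
To make the loop easier to experiment with, here is a minimal Python sketch of the same logic. It is only an illustration, not decompiled code: the function name bounded_wcopy is hypothetical, Python lists of 16-bit code units stand in for the two buffers, and it assumes the register compared against (rbp/bp) holds 0, as the question suggests. The lea check is left out, just as in the pseudocode above.

```python
def bounded_wcopy(src, dst, maxc=0x28):
    """Copy 16-bit code units from src to dst until a 0 unit is read
    or maxc units have been copied. Returns the remaining count."""
    i = 0
    while maxc != 0:
        ch = src[i]        # movzx eax, word ptr [rcx+r9]
        if ch == 0:        # cmp ax, bp / jz (assumes bp == 0)
            break          # the terminator itself is not stored by the loop
        dst[i] = ch        # mov [rcx], ax
        i += 1             # add rcx, 2
        maxc -= 1          # sub r8, 1 / jnz
    return maxc


# Hypothetical input: a short zero-terminated string and a 0x28-unit buffer.
src = [ord(c) for c in "Untitled"] + [0]
dst = [0] * 0x28
left = bounded_wcopy(src, dst)
print(left)  # 0x28 minus the 8 units copied
```

A return value of 0 corresponds to the case the listing tests right after the loop (cmp r8, rbp), meaning the source did not end within the 0x28 units available.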

Now let's look at the mysterious lea. The first thing to remember is that it's basically just a fancy mov, and its operands are not necessarily addresses.

So let's take lea rax, [r8+7FFFFFD6h] and do some math:

rax = r8+0x7FFFFFD6

Multiply both sides by 2:

2*rax = 2*r8+0xFFFFFFAC

Treat 0xFFFFFFAC as a signed 32-bit number:

2*rax = 2*r8-0x54

Now divide by 2:

rax = r8 - 0x2a

So, under the wrap-around arithmetic used above, testing rax==0 is the same as testing r8==0x2a. Possibly the code is checking that the number of characters to copy is not greater than the buffer size.
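
The arithmetic can be checked with a few lines of Python. This is a sketch for verifying the numbers only; the helper names to_signed32 and lea64 are hypothetical, and lea64 models nothing more than a 64-bit add of the constant. It also shows one caveat worth keeping in mind when reading the listing: the identity with r8 - 0x2a holds when the result is truncated (modulo 2^31), while the full 64-bit value produced for r8 = 0x2a is 0x80000000 rather than 0.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
K = 0x7FFFFFD6  # the constant from lea rax, [r8+7FFFFFD6h]


def to_signed32(x):
    """Reinterpret the low 32 bits of x as a signed 32-bit integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def lea64(r8):
    """Model of lea rax, [r8+K]: a plain 64-bit addition with wrap-around."""
    return (r8 + K) & MASK64


doubled = (2 * K) & MASK32
as_signed = to_signed32(doubled)

print(hex(doubled))      # 0xffffffac
print(hex(as_signed))    # -0x54
print(K == 0x80000000 - 0x2A)  # True: K is 0x2a short of 2**31
print(hex(lea64(0x28)))  # 0x7ffffffe, the value of rax on the first iteration
print(hex(lea64(0x2A)))  # 0x80000000: zero only in its low 31 bits
```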

Igor Skochinsky
  • Thanks Igor, that seems to be the case and it all fits. However, can you clarify why there is a multiplication in the lea? – Vlad Fedyaev Mar 02 '18 at 22:37
  • @VladFedyaev There is no multiplication; it's been factored out by the compiler. Basically, 0x7FFFFFD6 is 0xFFFFFFAC divided by two without sign extension. I multiplied the equation so it's easier to see where 0x2A is coming from. – Igor Skochinsky Mar 02 '18 at 22:49
  • See also a similar trick here: https://reverseengineering.stackexchange.com/a/6272 – Igor Skochinsky Mar 02 '18 at 22:52

Ignore the first part: 7FF..FF is a sequence of 1s. The interesting part is the D6, which equals 1101 0110 in binary, or 214 in decimal. So it checks whether a couple of bytes after the address in R8 is zero.

Also, have a look at section OPERATING SYSTEMS here: https://software.intel.com/en-us/articles/introduction-to-x64-assembly

dns43