Why is the RDI register missing in this "Hello world" assembly program?

Question

I found this "Hello" (shellcode) assembly program:

SECTION .data
SECTION .text
  global main
main:
  mov rax, 1
  mov rsi, 0x6f6c6c6548 ; "Hello" is stored in reverse order "olleH"
  push rsi
  mov rsi, rsp
  mov rdx, 5
  syscall
  mov rax, 60
  syscall

I noticed that mov rdi, 1 is missing. Other "hello world" programs include that instruction, so I would like to understand why this one works without it.

It probably depends on a functioning value for `rdi` to be initialised by the caller (refer to calling convention), which could be `0`, `1`, or `2` (stdin, stdout, stderr). Besides, you can just write `mov rsi, "Hello"` in NASM assembly source and the assembler will do the right thing. (Unlike MASM family assemblers, which "reverse" the string as written to memory.) — ecm, Mar 19 '22 at 17:15

score 2 · Answer 1 · answered Mar 19 '22 at 17:41

I was going to say it's an intentional trick or hack to save code bytes, using argc as the file descriptor. (1 if you run it from the shell without extra command line args). main(int argc, char**argv) gets its args in EDI and RSI respectively, in the x86-64 SysV calling convention used on Linux.

But given the other choices, like mov rax, 1 instead of mov eax, edi, it's probably just a bug that got overlooked because the code happened to work.

It would not work in real shellcode for a code-injection attack, where execution would probably reach this code with garbage other than 0, 1, or 2 in EDI. The shellcode test program on the tutorial you linked calls a const char[] of machine code as the only thing in main, which will normally compile to asm that doesn't touch RDI.

This code wouldn't work for code-injection attacks based on strcpy or other C-string overflows either, since the machine code contains 00 bytes: in the 32-bit immediates of mov eax, 1 and mov edx, 5, and in the zero padding at the end of that character string.
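To make the 00-byte problem concrete, here is a small sketch in Python (not NASM, just used as a byte calculator). The machine code is hand-encoded from the question's source on the assumption that NASM shortens mov rax, 1 to mov eax, 1 as described in this answer; check it against objdump -d rather than trusting it. The helper strcpy_copy is a hypothetical stand-in for what a C-string copy would transfer.

```python
# Original program, hand-encoded (an assumption to verify with objdump -d).
original = bytes.fromhex(
    "b801000000"            # mov eax, 1   (NASM's shortening of mov rax, 1)
    "48be48656c6c6f000000"  # mov rsi, 0x6f6c6c6548
    "56"                    # push rsi
    "4889e6"                # mov rsi, rsp
    "ba05000000"            # mov edx, 5
    "0f05"                  # syscall
    "b83c000000"            # mov eax, 60
    "0f05"                  # syscall
)

def strcpy_copy(payload: bytes) -> bytes:
    """Model of strcpy: copy up to, but not including, the first 00 byte."""
    return payload.split(b"\x00", 1)[0]

copied = strcpy_copy(original)
print(len(original), len(copied), copied.hex())  # 33 2 b801
```

Only the first two bytes would survive the copy, so the injected code would be cut off in the middle of its first instruction.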

Also, modern linkers don't link .rodata into an executable segment, and -zexecstack only affects the actual stack, not all readable memory. So that shellcode test won't work, although I expect it did when written. See How to get c code to execute hex machine code? for working ways, like using a local array and compiling with -zexecstack.

That tutorial is overall not great, probably something this guy wrote while learning. (But not as bad as I expected based on this bug and the use of Kali; it's at least decently written, just missing some tricks.)

Since you're using NASM, you don't need to manually waste time looking up ASCII codes and getting the byte order correct. Unlike some assemblers, mov rsi, "Hello" / push rsi results in those bytes being in memory in source order.
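The byte-order point can be checked outside the assembler. The sketch below is Python rather than NASM, used only as a calculator: it packs the question's constant 0x6f6c6c6548 as a little-endian 64-bit value, the way push rsi stores it on x86-64, and shows that memory ends up holding the characters in source order.

```python
import struct

# The 64-bit immediate from the question's `mov rsi, 0x6f6c6c6548`.
imm = 0x6F6C6C6548

# x86-64 is little-endian: a push stores the least-significant byte first.
in_memory = struct.pack("<Q", imm)
print(in_memory)  # b'Hello\x00\x00\x00'

# Going the other way: treating "Hello" as a little-endian integer, which is
# how NASM evaluates a character constant, gives back the same immediate.
as_integer = int.from_bytes(b"Hello", "little")
print(hex(as_integer))  # 0x6f6c6c6548
```

So the hex constant is not really "reversed"; it is the same number that NASM computes from "Hello" on its own.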

You also don't need an empty .data section, especially when making shellcode which is just a self-contained snippet of machine code which can't reference anything outside itself.

Writing a 32-bit register implicitly zero-extends to 64-bit. NASM optimizes mov rax,1 into mov eax,1 for you (as you can see in the objdump -d AT&T disassembly; use objdump -drwC -Mintel to get Intel-syntax disassembly similar to NASM).

The following should work:

  global main
main:
  mov   rax, `Hello\n  `  ; non-zero padding to fill 8 bytes
  push  rax
  mov rsi, rsp

  push   1                ; push imm8
  pop    rax              ; __NR_write
  mov    edi, eax         ; STDOUT_FD is also 1
  lea    edx, [rax-1 + 6]    ; EDX = 6;  using 3 bytes with no zeros
  syscall

  mov    al, 60    ; assuming write success, RAX = 6, zero outside the low byte
  ;lea    eax, [rdi-1 + 60]    ; the safe way that works even with ./hello >&-  to return -EBADF
  syscall

This is fewer bytes of machine code than the original, and avoids \x00 bytes which strcpy would stop on. I changed the string to end with a newline, using NASM backticks to support C-style escape sequences like \n as 0x0a byte.
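As a rough check of the size and zero-byte claims, the following Python sketch compares hand-encoded machine code for both versions. The encodings are my own assumption of what NASM emits (including its mov rax, imm to mov eax, imm shortening), so treat the numbers as something to confirm with objdump -d, not as assembler output.

```python
# Original version from the question, hand-encoded (assumption).
original = bytes.fromhex(
    "b801000000" "48be48656c6c6f000000" "56" "4889e6"
    "ba05000000" "0f05" "b83c000000" "0f05"
)

# Rewritten version from this answer, hand-encoded (assumption).
rewritten = bytes.fromhex(
    "48b848656c6c6f0a2020"  # mov rax, `Hello\n  `
    "50"                    # push rax
    "4889e6"                # mov rsi, rsp
    "6a01"                  # push 1
    "58"                    # pop rax
    "89c7"                  # mov edi, eax
    "8d5005"                # lea edx, [rax-1 + 6]
    "0f05"                  # syscall
    "b03c"                  # mov al, 60
    "0f05"                  # syscall
)

for name, code in (("original", original), ("rewritten", rewritten)):
    print(name, len(code), code.count(0))
# original 33 12
# rewritten 28 0
```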

Running normally (I linked it into a static executable without CRT using ld foo.o -o foo, despite the entry point being called main instead of _start):

$ strace ./foo > /dev/null
execve("./foo", ["./foo"], 0x7ffecdc70a20 /* 54 vars */) = 0
write(1, "Hello\n", 6)                  = 6
exit(1)                                 = ?

Running with stdout closed to break the mov al, 60 __NR_exit hack:

$ strace ./foo >&-
execve("./foo", ["./foo"], 0x7ffe3d24a240 /* 54 vars */) = 0
write(1, "Hello\n", 6)                  = -1 EBADF (Bad file descriptor)
syscall_0xffffffffffffff3c(0x1, 0x7ffd0b37a988, 0x6, 0, 0, 0) = -1 ENOSYS (Function not implemented)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffffffffffffffda} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)

To still exit cleanly, use lea eax, [rdi-1 + 60] (3 bytes) instead of mov al, 60 (2 bytes) to set RAX according to the unmodified EDI, instead of depending on the upper bytes of RAX being zero which they aren't after an error return.
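The bogus syscall number in the strace output can be reproduced with plain integer arithmetic. This is a Python sketch modelling the registers as 64-bit values, assuming the Linux value EBADF = 9; the variable names are only illustrative.

```python
MASK64 = (1 << 64) - 1
EBADF = 9  # Linux errno value for "Bad file descriptor"

# write() on a closed fd leaves -EBADF in RAX (two's complement).
rax = -EBADF & MASK64
print(hex(rax))  # 0xfffffffffffffff7

# mov al, 60 replaces only the low byte; the upper 56 bits stay all-ones.
rax_after_mov_al = (rax & ~0xFF & MASK64) | 60
print(hex(rax_after_mov_al))  # 0xffffffffffffff3c

# lea eax, [rdi-1 + 60] ignores the old RAX: it computes from RDI (still 1),
# and writing EAX zero-extends the result into the full RAX.
rdi = 1
rax_after_lea = (rdi - 1 + 60) & 0xFFFFFFFF
print(rax_after_lea)  # 60
```

The middle value matches the syscall_0xffffffffffffff3c line that strace printed, while the lea version yields __NR_exit regardless of what write returned.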

See also https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code
