0

I'm trying to rewrite the C code below in Assembly x86

int myFn( char * v, char c, int size ) {
  int i;
  for(i=0; i < size; i++ )
    if( v[i] == c )
      return i;
  return -1;
}

I've tried to use this code in x86:

myFn:
  mov esi, 0
  mov ebx, [esp + 8]
  mov ecx, [esp + 12]

  FOR:
    mov eax, -1
    cmp esi, [esp + 4]
    jge ENDFOR

    cmp [ecx + esi], ebx
    je EQUAL

    inc esi
    jmp FOR
  
  EQUAL:
    mov eax, [esi]
  
  ENDFOR:

  ret

I've also created this program to test the function:

section .data
  fmt: db "strfind: %d", 10, 0
  str: db "test", 0

section .text
  global main
  extern printf

main:
  mov eax, 's'
  mov ebx, 4

  push str
  push eax
  push ebx
  call myFn
  add esp, 12

  push eax
  push fmt
  call printf
  add esp, 8

  ret

myFn:
  mov esi, 0
  mov ebx, [esp + 8]
  mov ecx, [esp + 12]

  FOR:
    mov eax, -1
    cmp esi, [esp + 4]
    jge ENDFOR

    cmp [ecx + esi], ebx
    je EQUAL

    inc esi
    jmp FOR
  
  EQUAL:
    mov eax, [esi]
  
  ENDFOR:

  ret

I'm getting Segmentation Fault error or the wrong result when trying to test it. I believe the problem is when comparing the character of the string with the character I want to find

Rafa
  • 1
  • 1
    Multiple problems in there but the immediate cause is likely the `mov eax, [esi]`. You want `mov eax, esi` since `esi` is an index not an address. You ar also comparing 4 bytes instead of 1, clobber callee saved registers and misalign the stack to name a few additional problems. – Jester Feb 02 '22 at 22:18
  • 1
    There's two comparisons here: `cmp esi, [esp + 4]` which is comparing `i < size` and should be an `int` sized compare.  But the other one `cmp [ecx + esi], ebx` for `v[i] == c` should be a `char`/byte-sized compare. – Erik Eidt Feb 02 '22 at 22:18
  • Yes, so the major problem is with the second comparison, how can I fix it? – Rafa Feb 02 '22 at 22:27
  • This is memchr; look for simple examples of memchr (although most of what you find if you search for that will be highly optimized). Or look at compiler output from a C compiler for your C code, although that won't use NASM syntax. Still, should make it clear when it's using byte operand-size. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) – Peter Cordes Feb 03 '22 at 03:46

1 Answers1

0

ASCII chars are all 8 bits so you do not need 32 bit registers to store the chars. You should use 8 bit registers like al to hold the char. Below is a program that I wrote in nasm 64-bit that works fine.

global    _start

section   .data
str:  db  "test", 0    

section   .text

_start:
    mov al, 's'
    mov bl, 4
    mov cl, 0

    mov rsi, str

; iterate over the string
L1:
    cmp [rsi], al
    je ENDL1 ; if found, jmp to ENDL1 label
    inc rsi
    inc cl
    cmp cl, 4
    jbe L1
    mov cl, -1 ; if not found


ENDL1:
    xor rax, rax ; set rax register to 0
    mov al, cl ; rax equal index in string i.e rax = 0x2

The reason you are getting the SEGMENTATION FAULT is probably because you didn't use the exit syscall at the end of the program. To fix this, write the following at the end of your main function:

; exit syscall
mov rax, 60                 ; system call for exit
xor rdi, rdi                ; exit code 0
syscall

Adjust the code for 32-bit mode as per your requirement.

  • 1
    There are various ways this could be more efficient, e.g. only increment a pointer inside the loop, and subtract outside. Also, [avoid writing partial registers](https://stackoverflow.com/questions/41573502/why-doesnt-gcc-use-partial-registers), e.g. `mov ebx, 4` instead of `mov bl, 4`. (Or remove that because you hard-coded `cmp cl,4` instead of `bl`). And `xor ecx,ecx` instead of `mov cl,0`. But the simplest and most obvious improvement would be to replace `xor`-zero and `mov al,cl` with `movzx eax, cl` to zero-extend CL into RAX with one instruction instead of two. Otherwise good, though – Peter Cordes Feb 04 '22 at 19:28