
I have a bit scan function that uses inline assembly and clang is producing weird output.

Code:

uint64_t bit_scan_forward(uint64_t input) {
    uint64_t result;
    asm("MOVQ %[immediate], %[result];"
        "BSF %[input], %[result];"
        :[result] "=r" (result)
        :[input] "r"  (input)
        ,[immediate] "N" (64));
    return result;
}

Output:

sub  rsp, 0x10
mov  qword ptr [rsp+0x8], rcx
mov  rax, qword ptr [rsp+0x8] //set input
mov  rax, 0x40                //set result(which uses the same register as input for some reason)
bsf  rax, rax                 //do bsf
mov  qword ptr [rsp], rax
mov  rax, qword ptr [rsp]
add  rsp, 0x10
ret

It is mixing up the input and result registers, which produces a wrong result.

First time writing inline assembly, am I doing something wrong?

Is it not safe to assume that it will use different registers? If so, how do I tell it to use different registers?

  • I guess you're assuming that if the input is 0, the result register will be left unchanged with the value 64; but AFAIK the result in such a case is "undefined", not "unchanged", unless you know something special about your machine. I would just precede the asm with `if (input == 0) return 64;` and then omit the initial `MOVQ`. – Nate Eldredge Jan 27 '21 at 16:44
  • That will also fix your other problem, since then it will not matter if the input and result go in the same register. – Nate Eldredge Jan 27 '21 at 16:45
  • @NateEldredge: AMD actually documents that case as well-defined: the destination is left unchanged. Intel implements the same behaviour. It's somewhat questionable to rely on it in portable / future-proof code, because Intel could in theory change it without violating their documented behaviour, but it is basically safe on current CPUs. Except maybe for 32-bit operand size if you care about the high half of the destination: https://en.wikipedia.org/wiki/X86-64#Differences_between_AMD64_and_Intel_64. Still, definitely safe on AMD which documents it, so maybe change the function name to `bsf_amd`. – Peter Cordes Jan 27 '21 at 18:35
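
The first comment's suggestion could look roughly like the sketch below. It assumes x86-64 with GCC or clang in the default AT&T syntax; the name `bit_scan_forward_checked` is made up for illustration and does not come from the question. Because the zero case is handled in C, the asm no longer depends on what BSF does to the destination for a zero input, and it no longer matters whether input and result share a register.

```c
#include <stdint.h>

/* Hypothetical variant of the question's function (name is an assumption).
 * Zero is handled in C, so the asm never sees input == 0. */
static inline uint64_t bit_scan_forward_checked(uint64_t input) {
    if (input == 0)
        return 64;                      /* same sentinel the question used */

    uint64_t result;
    /* AT&T operand order: source first, destination second.
     * BSF reads its source in the same instruction that writes the
     * destination, so a plain "=r" output is fine here even if the
     * compiler picks the same register for both operands. */
    __asm__("bsfq %[input], %[result]"
            : [result] "=r"(result)
            : [input] "r"(input)
            : "cc");                    /* BSF modifies the flags */
    return result;
}
```

For a nonzero input this returns the index of the lowest set bit, for example 3 for an input of 8.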

2 Answers


Could you try marking the output operand as early-clobber with `&`?

:[result] "=&r" (result)

Jeremy

clang follows the same rules for inline asm as gcc. The GCC documentation says:

Use the & constraint modifier (see Modifiers) on all output operands that must not overlap an input. Otherwise, GCC may allocate the output operand in the same register as an unrelated input operand, on the assumption that the assembler code consumes its inputs before producing outputs.
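
Applied to the function from the question, that might look like the following sketch (x86-64, GCC or clang, AT&T syntax assumed). The only functional change is `"=&r"` on the output; the `"cc"` clobber and the lowercase mnemonics with explicit `q` suffixes are additions of this sketch, not part of the original code. It still relies on BSF leaving the destination unchanged for a zero input, which the comments above note is documented by AMD but not by Intel.

```c
#include <stdint.h>

/* Sketch of the question's function with an early-clobber output. */
static inline uint64_t bit_scan_forward(uint64_t input) {
    uint64_t result;
    /* "=&r": result is written (by movq) before input is read (by bsfq),
     * so the compiler must not allocate both to the same register. */
    __asm__("movq %[immediate], %[result]\n\t"
            "bsfq %[input], %[result]"
            : [result] "=&r"(result)
            : [input] "r"(input),
              [immediate] "N"(64)
            : "cc");                    /* BSF modifies the flags */
    return result;
}
```

With the early-clobber modifier the compiler keeps `input` in a register distinct from `result`, so the `movq` no longer overwrites the value that `bsfq` is about to scan.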

Nate Eldredge