
In short, GCC appears to be allocating registers incorrectly for this code.

typedef intptr_t Int;

Int add(Int *x, Int n, Int *y, Int m) {
    Int r0, r1, r2;
    asm (
    R"(xor %k0, %k0
    xor %k1, %k1
L0%=:
    mov %2, [%[y] + %0 * 8]
    neg %1
    adc [%[x] + %0 * 8], %2
    sbb %1, %1
    .
    .
    .)"
        : "=r"(r0), "=r"(r1), "=r"(r2)
        : [x]"r"(x), [n]"r"(n), [y]"r"(y), [m]"r"(m)
        : "memory"
    );
    printf("%ld %ld\n", n, r0);
    return r0;
}

/*
add:
        push    rbx
        xor ebx, ebx
        xor edx, edx ; killing `y`
L017:
        mov rcx, [rdx + rbx * 8] ; segfault
        neg rdx
        adc [rdi + rbx * 8], rcx
        sbb rdx, rdx
*/

Even stranger, removing the `printf` line makes the code compile correctly.

The code is a basic implementation of bignum addition. The code itself may also contain a bug, but that is separate from this problem.
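For reference, the intended behavior can be sketched in portable C roughly as follows. This is only an illustrative sketch, not the code under discussion: the name `add_ref`, the unsigned limb type, and the assumptions that `m <= n` and that `x` has room for `n + 1` limbs are all mine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: adds the m-limb number y into the n-limb number x
   (least significant limb first) and returns the resulting limb count.
   Assumes m <= n and that x has room for n + 1 limbs. */
size_t add_ref(uint64_t *x, size_t n, const uint64_t *y, size_t m) {
    uint64_t carry = 0;
    size_t i;

    /* Add the limbs of y into x, tracking the carry between limbs. */
    for (i = 0; i < m; i++) {
        uint64_t s = x[i] + y[i];
        uint64_t c1 = s < y[i];   /* x[i] + y[i] wrapped around */
        s += carry;
        uint64_t c2 = s < carry;  /* adding the carry-in wrapped around */
        x[i] = s;
        carry = c1 | c2;
    }

    /* Propagate a remaining carry through the upper limbs of x. */
    while (carry && i < n) {
        x[i] += 1;
        carry = (x[i] == 0);
        i++;
    }

    /* A carry out of the top limb grows the number by one limb. */
    if (carry) {
        x[n] = 1;
        return n + 1;
    }
    return n;
}
```

The asm version keeps the carry in the CPU carry flag via `adc`/`sbb` instead of recomputing it with comparisons.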

link to Godbolt


I read the answers to this question, and now I understand that the `&` is necessary to tell GCC that an input is still read after an output has been written. I also read the manual, where it states:

GCC may allocate the output operand in the same register as an unrelated input operand, on the assumption that the assembler code consumes its inputs before producing outputs.

But I still don't get why GCC thinks it's okay to overwrite y before it's "consumed", when there is no &. You can see in the code above that GCC is zeroing y before any value is ever read from it.
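To make the manual's wording concrete, here is a minimal sketch of a two-instruction asm statement that writes its output before it has read all of its inputs. The function name `sum_early` is hypothetical. The `{att|intel}` alternatives in the template are only there so the snippet assembles with or without `-masm=intel`, and the plain C fallback for non-x86-64 targets is an assumption added for portability.

```c
#include <stdint.h>

/* Hypothetical example: returns a + b using two instructions. */
int64_t sum_early(int64_t a, int64_t b) {
    int64_t out;
#if defined(__x86_64__)
    __asm__ (
        /* out = a: the output is written here ... */
        "mov {%1, %0|%0, %1}\n\t"
        /* ... but input b is only read now. */
        "add {%2, %0|%0, %2}"
        : "=&r"(out)      /* & forbids sharing a register with a or b */
        : "r"(a), "r"(b)
        : "cc");
#else
    out = a + b;          /* fallback so the sketch builds elsewhere */
#endif
    return out;
}
```

With plain `"=r"` the compiler would be free to place `out` in the same register as `b`. The `mov` would then destroy `b` before the `add` reads it, which is the same pattern as `xor %k1, %k1` destroying `y` in the code above.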


Full code:

#include <stdio.h>
#include <stdint.h>

#define asm __asm__ volatile

typedef intptr_t Int;

Int add(Int *x, Int n, Int *y, Int m) {
    Int r0, r1, r2;
    asm (
    R"(xor %k0, %k0
    xor %k1, %k1
L0%=:
    mov %2, [%[y] + %0 * 8]
    neg %1
    adc [%[x] + %0 * 8], %2
    sbb %1, %1
    inc %0
    cmp %0, %[m]
    jl L0%=
    neg %1
    jz end%=
carry%=:
    inc %0
    add qword ptr [%[x] + %0 * 8 - 8], 1
    jc carry%=
    cmp %0, %[n]
    cmovl %0, %[n]
end%=:)"
        : "=&r"(r0), "=&r"(r1), "=&r"(r2)
        : [x]"r"(x), [n]"r"(n), [y]"r"(y), [m]"r"(m)
        : "memory"
    );
    printf("%ld %ld\n", n, r0);
    return r0;
}
Peter Cordes
xiver77
  • This looks like the kind of code that would be better placed in an independent `.S` file. Is there a reason you're not doing that? Anyway, I can't give a full answer without a MCVE but I suspect your problem is you didn't tell GCC that the input and output registers cannot overlap. Reread the section of the manual where it talks about "early clobbers". – zwol Feb 28 '22 at 13:12
  • @4386427 I'm suspecting GCC has a bug, but I could have done something wrong. That's why I'm asking here. I still don't know a lot about GCC's inline asm syntax. – xiver77 Feb 28 '22 at 13:13
  • @zwol There's no special reason for not writing in a separate `.S` file, it's just some code that came out from writing random things in my personal time, I'll read the manual while waiting for some answers. – xiver77 Feb 28 '22 at 13:16
  • Oh, now I see you posted an un-abbreviated example on Godbolt. Please copy that here -- we want all questions to be self-contained. Looking at that, does changing all three instances of `"=r"` to `"=&r"` fix the problem? If it does, I will write a proper answer with an explanation. – zwol Feb 28 '22 at 13:20
  • @zwol Yes that does solve the problem. Quoting from the manual, "GCC may allocate the output operand in the same register as an unrelated input operand, on the assumption that the assembler code consumes its inputs before producing outputs", but I still don't get why GCC thinks it's okay to overwrite `y` before it's "consumed", when there is no `&`. Yes, an answer with explanation would be great. – xiver77 Feb 28 '22 at 13:28
  • You just quoted the reason. gcc by default thinks you use up all the inputs before you modify any outputs. It's not gcc overwriting `y`, it's you. gcc just allocated `y` and `r1` to the same register. – Jester Feb 28 '22 at 13:42
  • @Jester Oh, so GCC assumes that I'm using *all* inputs before modifying *any* output. Okay, seems strange.., but I get it now. – xiver77 Feb 28 '22 at 13:46
  • *But I still don't get why GCC thinks it's okay to overwrite y before it's "consumed", when there is no &* - It doesn't read your asm, it *only* looks at the constraints. And you didn't tell it that some of the outputs are written before all the inputs are read, so it assumed that wasn't the case. Inline asm is designed to wrap a single instruction with maximum efficiency, using the same syntax as machine-description files. If you lie to the compiler about how your asm works, that's UB, and the results can obviously be broken. – Peter Cordes Feb 28 '22 at 13:54
  • Coming back to find that Peter Cordes has already said basically what I was going to say, I just want to re-emphasize the sentence "Inline asm is designed to wrap a *single instruction*." This is why the default assumption that all the inputs are read, before any output is written, makes sense -- that's how single instructions behave, most of the time. This is also why I said your code looked like it belonged in an `.S` file. – zwol Feb 28 '22 at 14:50
