I am new to assembly. Could anyone show me how to convert strings to 64-bit integers?

The program should read integers as strings and convert them to 64-bit integers held in two registers. When I input 32-bit integers, the program loads the values correctly, but when I input a 64-bit integer, e.g. 4294967295, it doesn't.

The commented clang output for strtou64 was copied from another SO answer; I only commented out the ret instructions to inline it.

    .eqv SYS_EXITO, 10
    .eqv CON_PRTSTR, 4
    .eqv CON_RDSTR, 8
    .eqv BUFSIZE, 100
    .data
prompt:
    .asciz "Input 64 bit integer:"
result:
    .asciz "Output 64 bit added:"
    
buf:
    .space BUFSIZE
    .text

main:
    la a0, prompt
    li a7, CON_PRTSTR
    ecall 

    la a0, buf
    li a1, BUFSIZE
    li a7, CON_RDSTR
    ecall

    # rv32gc clang 14.0  -O3
strtou64:
        mv      a2, a0
        mv      s1, a0
        lbu     a0, 0(a0)         # load the first char
        addi    a3, a0, -48       # *p - '0' a3 is the each digit
        li      a0, 9
        bltu    a0, a3, .LBB0_4   # return 0 if the first char is a non-digit
        li      a0, 0               # should have done these before the branch
        li      a1, 0               # so a separate ret wouldn't be needed
        addi    a2, a2, 1           # p++
        li      a6, 10              # multiplier constant
.LBB0_2:                            # do{
        mulhu   a5, a0, a6            #  high half of (lo(total) * 10)
        mul     a1, a1, a6            # hi(total) * 10 
        add     a1, a1, a5            # add the high-half partial products
        mul     a5, a0, a6            # low half of  (lo(total) * 10)
        lbu     a4, 0(a2)                # load *p 
        add     a0, a5, a3            # lo(total) =  lo(total*10) + digit
        sltu    a3, a0, a5            # carry-out from that
        add     a1, a1, a3            # propagate carry into hi(total)
        addi    a3, a4, -48             # digit = *p - '0'
        addi    a2, a2, 1                # p++ done after the load; clang peeled one pointer increment before the loop
        bltu    a3, a6, .LBB0_2     # }while(digit < 10)
        #ret
.LBB0_4:
        li      a0, 0               # return 0 special case
        li      a1, 0               # because clang was dumb and didn't load these regs before branching
        #ret
fin:
    la a0, result
    li a7, CON_PRTSTR
    ecall
    
    la a0, buf
    la a1, strtou64
    li a7, CON_PRTSTR
    ecall
    
    li a7, SYS_EXITO
    ecall
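
As a cross-check on the arithmetic in the `.LBB0_2` loop, here is a minimal C sketch of the same multiply-accumulate done on two 32-bit halves. It is not the original C source that clang compiled; the type `u64_pair` and the function name `strtou64_halves` are hypothetical names chosen for illustration. It models only the loop, so it returns whatever the loop accumulated.

```c
#include <stdint.h>

/* Hypothetical helper type: the 64-bit total as two 32-bit halves,
   mirroring the a1:a0 register pair on RV32. */
typedef struct { uint32_t lo, hi; } u64_pair;

/* Sketch of the loop: total = total * 10 + digit, using 32-bit pieces. */
u64_pair strtou64_halves(const char *p)
{
    u64_pair total = {0, 0};
    uint32_t digit = (uint32_t)(unsigned char)*p - '0';

    while (digit < 10) {                            /* unsigned compare, like bltu */
        uint64_t wide = (uint64_t)total.lo * 10u;   /* 32x32 -> 64 widening multiply */
        uint32_t prod_lo = (uint32_t)wide;          /* what mul produces */
        uint32_t prod_hi = (uint32_t)(wide >> 32);  /* what mulhu produces */

        total.hi = total.hi * 10u + prod_hi;        /* hi(total)*10 + high partial product */
        total.lo = prod_lo + digit;                 /* lo(total*10) + digit */
        total.hi += (total.lo < prod_lo);           /* carry-out, like sltu + add */

        p++;
        digit = (uint32_t)(unsigned char)*p - '0';
    }
    return total;
}
```

With this sketch, `strtou64_halves("4294967296\n")` returns `hi == 1` and `lo == 0`, which is the kind of value to look for in `a1` and `a0` when single-stepping the assembly.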
  • I don't see how this could ever work for any number but zero. With both `ret` instructions commented out, execution falls through the `return 0` part that sets the 64-bit return value in `a1:a0` to zero. BTW, I edited your question to give attribution to my answer where you copied the commented code from, and edited my linked answer to point out that it returns in a1:a0, not a single 64-bit register if run on RV64. – Peter Cordes May 18 '22 at 13:11
  • You say it worked for numbers up to `4294967295` (which is the max 32-bit unsigned value, only needing 64-bit if treated as signed), so maybe you used a different version of this where you only looked at the `a0` low half of the return value. So it's not a [mcve] of anything that would break only for large numbers. – Peter Cordes May 18 '22 at 13:13
  • I also don't see any code that even tries to print a1:a0 or even the a0 low half: the first thing you do after it is overwrite `a0` with a message you could have printed before even running the conversion. And then you try to pass two pointers(?) to the print-string system call, a0 = the input text buffer, and a1 = a pointer to the machine code in the middle of your main function. But a1 isn't an input to print_string, it'll ignore it, fortunately. So this should just output whatever text the user typed, after zeroing out the str->int results. – Peter Cordes May 18 '22 at 13:16
  • BTW, printing a 64-bit integer back to an ASCII string of decimal digits is one of the harder things you could do with it, because you'd have to do it yourself with extended-precision division. Much easier to output the two halves separately in hex. (32-bit register width is a multiple of base-16's 4-bit groups, and base 16 is a power of 2, so each hex digit depends only on a few bits, not all bits in the 64-bit register pair). **You already are converting strings to uint64_t; single-step with a debugger to see**. You're just zeroing the results. You didn't say what you want to do next. – Peter Cordes May 18 '22 at 13:19
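
The point about hex output can be illustrated with a short C sketch: each half of the register pair is formatted on its own, because every hex digit depends on only 4 bits of one half. The function names `hex32` and `format_hex64` are hypothetical, and this is only one way to do it, not code from the question or the linked answer.

```c
#include <stdint.h>

/* Write one 32-bit half as 8 hex digits, most significant digit first.
   Each digit is just a 4-bit field: no division, no cross-register carries. */
void hex32(uint32_t half, char *out)
{
    static const char digits[] = "0123456789abcdef";
    for (int i = 0; i < 8; i++) {
        out[i] = digits[(half >> (28 - 4 * i)) & 0xFu];
    }
}

/* Hypothetical helper: format the hi:lo pair as 16 hex digits plus a NUL. */
void format_hex64(uint32_t hi, uint32_t lo, char out[17])
{
    hex32(hi, out);        /* high half fills digits 0..7  */
    hex32(lo, out + 8);    /* low half fills digits 8..15  */
    out[16] = '\0';
}
```

For example, calling `format_hex64(1, 0, buf)` fills `buf` with `0000000100000000`, the hex form of 4294967296. The same shift-and-mask loop translates to assembly with `srl`, `andi`, and a table lookup per digit.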

0 Answers