Hello reverse engineers,

I'm analysing a fat Mach-O binary that contains an ADRP instruction followed by an ADD instruction:

__text:00000001002E050C                 ADRP            X8, #some_function@PAGE
__text:00000001002E0510                 ADD             X8, X8, #some_function@PAGEOFF

The ADRP instruction has the bytes "08 00 00 90".

The ADD instruction has the bytes "08 61 0D 91".

How can I get the value out of these two instructions? Below is my program to calculate the address of some_function. It should sign-extend a 21-bit offset, shift it left by 12, and add it to the PC with the bottom 12 bits cleared. Then it should take the last 12 bits from the ADD instruction and add them to this value.

    int instr = 0x90000008;
    //int instr = 0x80000090;
    int value = 0x1fffff & instr;
    int mask = 0x100000;
    if(mask & instr)
    {
            value += 0xffe00000;
    }
    printf("value : %08x\n", value);
    value = value << 12;
    printf("value : %08x\n", value);
    int instr2 = 0x910d6108;
    //int instr2 = 0x08610d91;
    value += (instr2 & 0xfff); //get the last 12 bits from instr2
    printf("value : %08x\n", value);

After executing the two instructions, X8 should hold the value 00000001002E0358, because that is the address of some_function. The output of my program is:

value : 00000008
value : 00008000
value : 00008108

What am I doing wrong?

Conclusion: I was reading the wrong ARM manual. The official AArch64 manual from ARM is the one to use.

The final code:

    const int tab32[32] = {
     0,  9,  1, 10, 13, 21,  2, 29,
    11, 14, 16, 18, 22, 25,  3, 30,
     8, 12, 20, 28, 15, 17, 24,  7,
    19, 27, 23,  6, 26,  5,  4, 31};

    int log2_32 (uint32_t value)
    {
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;
        return tab32[(uint32_t)(value*0x07C4ACDD) >> 27];
    }

    uint64_t get_page_address_64(uint64_t addr, uint32_t pagesize)
    {
            int bits_page_offset;
            bits_page_offset = log2_32(pagesize);
            //clear the low log2(pagesize) bits, e.g. 12 bits for a 4 KiB page
            return (addr >> bits_page_offset) << bits_page_offset;
    }

    uint64_t get_adrp_add_va(unsigned char *adrp_loc, uint64_t va){
        uint32_t instr, instr2, immlo, immhi;
        int64_t value_64;
        //imm12 64 bits if sf = 1, else 32 bits
        uint64_t imm12;
        instr = *(uint32_t *)adrp_loc;
        //immlo is bits 30:29, immhi is bits 23:5; together they form a signed 21-bit page count
        immlo = (0x60000000 & instr) >> 29;
        immhi = (0xffffe0 & instr) >> 3;
        //sign extend the 21-bit value to 64 bits (bit 20 is the sign), then scale by 4096
        value_64 = (int64_t)((immlo | immhi) ^ 0x100000) - 0x100000;
        value_64 *= 4096;
        //get the imm value from add instruction
        instr2 = *(uint32_t *)(adrp_loc + 4);
        imm12 = (instr2 & 0x3ffc00) >> 10;
        //bit 22 set means the immediate is shifted left by 12
        if(instr2 & 0x400000)
        {
                imm12 <<= 12;
        }
        //ADRP always works on 4 KiB pages, independent of the OS page size
        return get_page_address_64(va, 4096) + value_64 + imm12;
    }
exploiter

1 Answer

The first instruction (0x90000008) matches the encoding below for PC-relative addressing instructions.

pc-relative addressing opcode

0x90000008 = 0b10010000000000000000000000001000, so we have op=1 (ADRP), immlo=0, immhi=0 and Rd=8 (X8). The instruction decodes to ADRP X8, #0. It takes the 4 KiB page that contains the current PC, adds 0 << 12 to it, and stores the result in X8, so you get

X8 = page_address_of(0x00000001002E050C) + (0 << 12) = 0x00000001002E0000
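
The ADRP decoding can be sketched in C roughly as follows. This is only an illustration of the field extraction described in this answer; the function name `adrp_target` is made up for the example, and the caller is assumed to pass the instruction word already read as a little-endian 32-bit value together with the address the instruction lives at.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): compute the value
   that ADRP writes to Rd. insn is the 32-bit instruction word, pc is
   the address at which the instruction is located. */
uint64_t adrp_target(uint32_t insn, uint64_t pc)
{
    /* op (bit 31) = 1 and bits 28:24 = 0b10000 identify ADRP. */
    assert((insn & 0x9F000000u) == 0x90000000u);

    uint32_t immlo = (insn >> 29) & 0x3u;     /* bits 30:29 */
    uint32_t immhi = (insn >> 5) & 0x7FFFFu;  /* bits 23:5  */
    uint32_t imm21 = (immhi << 2) | immlo;    /* signed 21-bit page count */

    /* Sign-extend from bit 20 without relying on signed shifts. */
    int64_t pages = (int64_t)(imm21 ^ 0x100000u) - 0x100000;

    /* Clear the low 12 bits of PC, then add pages * 4096
       (unsigned arithmetic wraps, so negative counts work too). */
    return (pc & ~(uint64_t)0xFFF) + (uint64_t)pages * 4096u;
}
```

With the values from the question, `adrp_target(0x90000008, 0x1002E050C)` gives 0x1002E0000, matching the result above.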

The next instruction (0x910d6108) matches the encoding for ADD/SUBTRACT (immediate).

add/sub opcode

0x910d6108 = 0b10010001000011010110000100001000, so we have sf=1 (64-bit variant), op=0 (add), S=0 (condition flags not updated), shift=0 (LSL #0), imm12=0x358, Rn=8 (X8), Rd=8 (X8).

It decodes to ADD X8, X8, #0x358, which adds 0x358 to X8, so you get

X8 = X8 + 0x358 = 0x00000001002E0358
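
The ADD (immediate) step can be sketched the same way. Again this is only a minimal illustration under assumptions: `add_imm_value` is a name invented for the example, it handles only the 64-bit non-flag-setting ADD form discussed here, and the instruction word is assumed to be already read as a little-endian 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): return the amount
   that a 64-bit ADD (immediate) adds to Rn. */
uint64_t add_imm_value(uint32_t insn)
{
    /* sf=1, op=0, S=0, bits 28:24 = 0b10001 and bit 23 = 0 identify
       the 64-bit ADD (immediate) form handled by this sketch. */
    assert((insn & 0xFF800000u) == 0x91000000u);

    uint64_t imm12 = (insn >> 10) & 0xFFFu;   /* bits 21:10 */

    /* Bit 22 set means the immediate is applied as LSL #12. */
    if (insn & 0x400000u)
        imm12 <<= 12;

    return imm12;
}
```

For the instruction in the question, `add_imm_value(0x910d6108)` is 0x358, and adding it to the ADRP result 0x1002E0000 gives 0x1002E0358.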

cimarron