Consider the following instruction:

8D 8C 4E B0 2F FF FF    LEA ECX, [ESI+ECX*2-0xD050]

Using IDAPython, how can I extract the structure of the second operand? I'd like to know things like:

  • ESI is the base register
  • ECX is the index register
  • 2 is the index constant
  • -0xD050 is the displacement constant

It's OK if I have to chain a bunch of IDAPython API calls together. So far I've had to resort to string parsing, and I'd really like to get rid of that.


The most relevant API function I've found is idautils.DecodeInstruction(), yet it doesn't seem to completely cover the structure of the second operand. See below for my exploration:

i = idautils.DecodeInstruction(<ea from above>)

# operand type
assert i.Op2.type == idc.o_displ

# operand value type
assert i.Op2.dtyp == idc.dt_dword

# operand flags
assert i.Op2.flags == idc.OF_SHOW

# structure of o_displ operand is like:
#
#    Memory Reg [Base Reg + Index Reg + Displacement].

def get_reg_const(reg):
    '''
    fetch register number from string name.
    '''
    ri = idaapi.reg_info_t()
    idaapi.parse_reg_name(reg, ri)
    return ri.reg

# we probably expect to find these constants in the operand structure
assert get_reg_const('ecx') == 1
assert get_reg_const('esi') == 6

# the operand structure
assert unsigned2signed(i.Op2.addr)  == -0xD050  # displacement (signed)
assert i.Op2.n          == 1   # operand number
assert i.Op2.phrase     == 4   # "number of register phrase", don't know what this means
assert i.Op2.reg        == 4   # "number of register", don't see how this applies
assert i.Op2.specflag1  == 1   # unknown interpretation, could be "ecx"!?!
assert i.Op2.specflag2  == 78  # 0x4E, unknown interpretation
assert i.Op2.specflag3  == 0   # probably empty
assert i.Op2.specflag4  == 0   # probably empty
assert i.Op2.specval    == 0x200000  # unknown interpretation
assert i.Op2.value      == 0   # "outer displacement" (none here)
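
The `unsigned2signed` helper used above is not defined in the snippet. A minimal sketch of what it might look like, assuming the displacement is a 32-bit two's-complement value (the `bits` parameter is my own addition, not part of any IDA API):

```python
def unsigned2signed(value, bits=32):
    """Reinterpret an unsigned integer as a two's-complement signed value."""
    value &= (1 << bits) - 1        # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)      # mask for the sign bit
    # If the sign bit is set, the value wraps around to a negative number.
    return value - (1 << bits) if value & sign_bit else value
```

With this definition, the raw displacement bytes `B0 2F FF FF` (0xFFFF2FB0 as a little-endian dword) come out as -0xD050.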
Willi Ballenthin

1 Answer


There doesn't seem to be an elegant solution to this. If you were writing a plugin in C, you could call sib_base, sib_index, and sib_scale to get this information.

Here's how you could do it in Python.

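import idaapi
from idautils import DecodeInstruction

ea = 0x20AC5  # assuming the instruction at this ea is a lea
i = DecodeInstruction(ea)

hasSIB = i.Op2.specflag1  # non-zero when the operand uses a SIB byte
sib = i.Op2.specflag2     # the raw SIB byte

if hasSIB:
    base = sib & 7          # bits 0-2: base register number
    index = (sib >> 3) & 7  # bits 3-5: index register number
    scale = (sib >> 6) & 3  # bits 6-7: log2 of the scale factor
    size = 4 if i.Op2.dtyp == idaapi.dt_dword else 8  # register width in bytes
    print('[{} + {}{} + {:x}]'.format(
        idaapi.get_reg_name(base, size),
        idaapi.get_reg_name(index, size),
        '*{}'.format(2**scale) if scale else '',
        i.Op2.addr  # displacement, printed as unsigned hex
    ))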

Example Output: [ebx + eax*4 + 8c]
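
To check the bit arithmetic without IDA, here is a standalone sketch of the same SIB decoding. It is only an illustration under some assumptions: `REGS32`, `decode_sib`, and `format_sib_operand` are names I made up, the register table covers 32-bit registers only, and it ignores the ModRM `mod` special cases (such as base 5 with `mod == 0` meaning "no base") and any 64-bit REX extensions.

```python
# 32-bit general-purpose registers, indexed by their x86 encoding number.
REGS32 = ['eax', 'ecx', 'edx', 'ebx', 'esp', 'ebp', 'esi', 'edi']

def decode_sib(sib):
    """Split a SIB byte into (base number, index number, scale factor)."""
    base = sib & 7                  # bits 0-2
    index = (sib >> 3) & 7          # bits 3-5
    scale = 1 << ((sib >> 6) & 3)   # bits 6-7 hold log2 of the factor
    return base, index, scale

def format_sib_operand(sib, disp, bits=32):
    """Render a SIB byte plus a raw displacement as '[base+index*scale+disp]'."""
    base, index, scale = decode_sib(sib)
    # Reinterpret the raw displacement as a signed two's-complement value.
    disp &= (1 << bits) - 1
    if disp & (1 << (bits - 1)):
        disp -= 1 << bits
    parts = [REGS32[base]]
    if index != 4:                  # index number 4 encodes "no index register"
        parts.append(REGS32[index] + ('*%d' % scale if scale > 1 else ''))
    text = '+'.join(parts)
    if disp:
        text += '%s0x%X' % ('-' if disp < 0 else '+', abs(disp))
    return '[%s]' % text
```

For the instruction in the question, `specflag2` is 0x4E and `addr` is 0xFFFF2FB0, so `format_sib_operand(0x4E, 0xFFFF2FB0)` gives `[esi+ecx*2-0xD050]`, which matches the disassembly.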

Jongware
Bambu