I have absolutely no experience in reversing real-world binaries, so I wonder how obfuscated code actually hinders reversers. I suspect that reversers always find some way to understand what is hidden inside, even in heavily obfuscated code, but I do not know how they reason about it.
This partly comes from this question on detecting recursive calls, where both answers give a static approach: starting from the original function, look recursively through the functions it calls and check whether the original function is called again.
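To make the static approach concrete, here is a minimal sketch of it as a search over a call graph. The `CallGraph` type and the function names are hypothetical, and the graph is assumed to contain only direct calls (`call <name>`), which is roughly what a disassembler gives you without further analysis.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> functions it calls directly.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search: is `target` reachable from `current` via direct calls?
bool reaches(const CallGraph& g, const std::string& current,
             const std::string& target, std::set<std::string>& visited)
{
    auto it = g.find(current);
    if (it == g.end()) return false;            // unknown or leaf function
    for (const auto& callee : it->second) {
        if (callee == target) return true;      // found a cycle back to target
        if (visited.insert(callee).second &&    // explore each callee once
            reaches(g, callee, target, visited))
            return true;
    }
    return false;
}

// A function is (directly or mutually) recursive if it can reach itself.
bool is_recursive(const CallGraph& g, const std::string& f)
{
    std::set<std::string> visited;
    return reaches(g, f, f, visited);
}
```

With a graph such as `main -> f`, `f -> g`, `g -> f`, this reports `f` and `g` as recursive and `main` as not. An indirect call contributes no edge at all to such a graph, so a cycle that passes through one is invisible to this check.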
In theory at least, this approach can be bypassed if the programmer uses continuation-passing style, because there is no longer any explicit
call myself
inside the code. I have implemented the following program to test this idea:
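#include <functional>

// Branch-free select: returns a when p is true, b otherwise.
// Only pts[0] and pts[1] are ever read; the other two entries are decoys.
template<typename T>
auto obf_if(bool p, T a, T b) -> T
{
    T* pts[4] = { &b, &a, &b + 1, &a + 1 };
    return *pts[int{ p }];
}

template<typename T>
auto obf_cmp(T a, T b) -> int
{
    return obf_if<int>(a == b, 0, obf_if<int>(a < b, -1, 1));
}

using obf_strcmp_t = std::function < int(char*, char*) >;
using thunk_t = std::function < int() >;

// The recursion goes through func only, never through h_strcmp by name.
// Both branches are wrapped in thunks so that only the selected one runs;
// passing func(str1 + 1, str2 + 1) as a plain argument would evaluate it
// on every call and the recursion would never stop.
auto h_strcmp(obf_strcmp_t func, char* str1, char* str2) -> int
{
    return obf_if<thunk_t>((*str1 == *str2) && (*str1 != 0),
        [&] { return func(str1 + 1, str2 + 1); },
        [&] { return obf_cmp<int>(*str1, *str2); })();
}

using h_strcmp_t = decltype(h_strcmp);

// Ties the knot: the result calls func with a fresh y_strcmp(func)
// as its first argument.
obf_strcmp_t y_strcmp(h_strcmp_t func)
{
    return std::bind(func, std::bind(y_strcmp, func),
        std::placeholders::_1, std::placeholders::_2);
}

int main(int argc, char* argv[])
{
    char str1[] = "ab";
    char str2[] = "ac";
    return y_strcmp(h_strcmp)(str1, str2);
}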
This is a trivial implementation of strcmp using the Y combinator. The point is that the implementation no longer contains any direct recursive call (nor even an explicit conditional branch in the source); the only direct call left is the initial one:
y_strcmp(h_strcmp)(str1, str2)
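The same trick is perhaps easier to see on a smaller function. The following is only a sketch on factorial; the names `open_fact` and `fix` are made up for this illustration and are not part of the program above.

```cpp
#include <functional>

using fact_t = std::function<unsigned(unsigned)>;

// "Open" factorial: the recursive step goes through `self`,
// never through the name open_fact.
unsigned open_fact(const fact_t& self, unsigned n)
{
    return n == 0 ? 1u : n * self(n - 1);
}

// Ties the knot: the returned callable hands a fresh fix(f) to f as `self`
// each time it is invoked, so f ends up calling itself indirectly.
fact_t fix(unsigned (*f)(const fact_t&, unsigned))
{
    return [f](unsigned n) { return f(fix(f), n); };
}
```

Here `fix(open_fact)(5)` evaluates to 120, yet the body of `open_fact` contains no call to `open_fact`. The recursive step is an indirect call through a `std::function`, which is the kind of call that typically shows up as a `call` through a register in the disassembly.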
As an amateur, I loaded the binary (compiled with VS2013) into IDA and saw a big mess in which the calls are replaced by
call edx
However, because I wrote it, I know how to detect this: the implicit recursive calls can be found by tracing the arguments passed into the function, since the value of edx can only be one of those arguments. I assume reversers can do the same. So my question is:
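The argument-tracing idea can be sketched as an extension of a call-graph search: an indirect call through a parameter is resolved to whatever function addresses were seen being passed in. This is a toy model under my own assumptions (the `FuncInfo` fields and all names are hypothetical, and the per-function summaries are filled in by hand); it is not a description of what IDA or any other tool does.

```cpp
#include <map>
#include <set>
#include <string>

// Toy summary of one function, as a reverser might recover it by hand.
struct FuncInfo {
    std::set<std::string> direct_calls;      // targets of `call <name>`
    std::set<std::string> fn_args_received;  // function addresses seen passed in
    bool calls_its_argument = false;         // has an indirect call via a parameter
};

using Program = std::map<std::string, FuncInfo>;

// Call edges of `name`, with indirect calls resolved to the passed-in functions.
std::set<std::string> resolved_callees(const Program& p, const std::string& name)
{
    auto it = p.find(name);
    if (it == p.end()) return std::set<std::string>();
    std::set<std::string> out = it->second.direct_calls;
    if (it->second.calls_its_argument)
        out.insert(it->second.fn_args_received.begin(),
                   it->second.fn_args_received.end());
    return out;
}

// Depth-first search along the resolved edges.
bool reaches_resolved(const Program& p, const std::string& from,
                      const std::string& target, std::set<std::string>& seen)
{
    for (const auto& c : resolved_callees(p, from)) {
        if (c == target) return true;
        if (seen.insert(c).second && reaches_resolved(p, c, target, seen))
            return true;
    }
    return false;
}

bool is_recursive_resolved(const Program& p, const std::string& name)
{
    std::set<std::string> seen;
    return reaches_resolved(p, name, name, seen);
}
```

For a function `step` that only calls its own parameter, the check reports nothing until the analyst records that `main` passes `step` itself as that parameter; once `fn_args_received` contains `step`, the hidden cycle appears.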
Suppose that you do not know this trick: does it prevent you from understanding the binary code?
NB: w-s has suggested that this question is opinion-based, so it will probably be closed sooner or later, but I would very much appreciate it if someone could share an idea.