
What is the largest program that has been analyzed by a semantics-based static binary analysis?

By semantics-based, I mean an analysis that examines the meaning of the program, and does not simply perform a computation on the syntax of the program, such as computing an md5sum.
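To make the distinction concrete, here is a minimal sketch in Python. The toy instruction set, the register name `r0`, and the helpers `syntactic_fingerprint` and `constant_values` are all hypothetical and invented for illustration; no real binary analysis tool works on tuples like these. The hash sees only the text of the program, while the small constant propagation looks at what the instructions compute.

```python
import hashlib

# Hypothetical toy programs: each instruction is (opcode, destination, constant).
# They differ in syntax but leave the same value in r0.
PROG_A = [("mov", "r0", 2), ("add", "r0", 3)]
PROG_B = [("mov", "r0", 5)]


def syntactic_fingerprint(program):
    """Hash the textual form of the program; its meaning is never examined."""
    return hashlib.md5(repr(program).encode("utf-8")).hexdigest()


def constant_values(program):
    """Tiny constant propagation: map each register to its known constant.

    A register maps to None when its value cannot be determined.
    """
    regs = {}
    for opcode, dest, constant in program:
        if opcode == "mov":
            regs[dest] = constant
        elif opcode == "add":
            if regs.get(dest) is not None:
                regs[dest] = regs[dest] + constant
            else:
                regs[dest] = None  # adding to an unknown value stays unknown
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return regs


if __name__ == "__main__":
    print(syntactic_fingerprint(PROG_A) == syntactic_fingerprint(PROG_B))
    print(constant_values(PROG_A) == constant_values(PROG_B))
```

The two programs hash differently, yet the semantic pass finds the same final register state for both. Real semantics-based analyses do the same kind of reasoning over actual machine instructions, which is where the scalability question comes from.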

Ed McMan

1 Answer


The largest static analysis example I am aware of covers 32,405 PPC instructions, from this paper on CFG recovery. That work performs no analysis beyond recovering the control-flow graph (CFG).

Ed McMan
  • Is decompilation considered a static analysis? It should be, seeing as it does not execute the program. And in which case, Hex-Rays does bigger jobs than that every day. – Rolf Rolles Mar 27 '13 at 22:14
  • Also, in the deobfuscation example in my recent presentation [1], which is a static analysis, I have applied it to huge binaries on a very regular basis. I think in order for this question or its answers to have any meaning, the term "static binary analysis" needs to be precisely defined. – Rolf Rolles Mar 27 '13 at 22:32
  • @Syzygy Decompilation seems like fair game to me. The link to your presentation does not seem to work, by the way. – Ed McMan Mar 27 '13 at 22:37
  • Whoops ... that reference is [1] http://www.ruxconbreakpoint.com/assets/Uploads/bpx/semantics-based-methods-ruxcon.pdf ;-) – Rolf Rolles Mar 27 '13 at 22:41
  • @Syzygy So how big is "huge"? – Ed McMan Mar 28 '13 at 02:00
  • 1 MB binaries, no problem. I've run it on larger (20-90 MB) obfuscated binaries as well, with some performance tweaking. You can probably tell from the description of the algorithm that there's nothing inherently slow about it; it's about as fast as you might imagine a static analysis could be. Hence my request for clarification of what we mean by "scalability of static binary analysis". – Rolf Rolles Mar 28 '13 at 04:55