What data-structure is best suited for escape analysis?

Question

I'm making a compiler using LLVM and LLVM's shadow-stack. I thought I might easy the load of the GC by using escape analysis and allocate some variables on the stack. Can I use my typed AST for this algorithm, or must I use a graph data-structure?

score 5 · Accepted Answer · answered Oct 13 '17 at 16:03

Escape analysis is a static analysis that determines whether the lifetime of data exceeds its static scope. As such, it is not really about stack vs heap allocation, just about lifetimes.

I don't think you can use your typed AST for escape analysis, because what you are analyzing is the data flow of the program. The appropriate structure for that is a graph, not a tree, because the data flow is cyclic. So what you want is to build a control flow graph (cfg), and perform data flow analysis on it.

Data flow analysis is a method or family of methods used for implementing a whole range of compiler optimizations. Among them, escape analysis. In it, you keep track of all objects, variables and references of your program in a points-to graph. Each object and variable is a node in the graph and, depending on how you implement it, references between them can either be nodes or edges.

For example, consider this Java method:

public Pair myMethod(Pair[] pairs) {
    Foo f = new Foo();
    Pair p1 = new Pair(1, 2);
    Pair p2 = new Pair(3, 4);
    p1.value = f;
    pairs[0] = p1;
    return new Pair(5, 6);
}

Escape analysis would work roughly like this:

Line 1: An array of type Pair is passed into the method and a Pair is returned from it. Add two nodes to the graph, one labelled pairs and another labelled RETURN both with flagged foreigner.
Line 2: An object called f is created in the myMethod method- Add a node labelled f with flag native to the graph.
Line 3-4: Add two more native nodes labelled p1 and p2.
Line 5: The field value of p1 is set to refer to f. Add an edge from node p1 to f.
Line 6: First element of the pairs array is set to refer to p1. Add an edge from pairs to p1.
Line 7: A new pair object is returned. Add an anonymous native node representing the pair object to the graph. Then add an edge from the RETURN node to this anonymous node.

After this step, traverse the graph beginning at the foreigner nodes and flag all nodes visited as foreigner. It is exactly the same task as the mark step in a mark and sweep garbage collector.

Then all objects that weren't reached by the traversal retain their native flag and are eligible for stack allocation provided that the compiler can statically determine their sizes. The rest must be allocated on the heap.

I highly recommend watching the Youtube lectures of this professor:

The paper Escape analysis for Java also provides a fairly good description on how the algorithm works.

What data-structure is best suited for escape analysis?

1 Answers1