What is the best way to implement closures over goto targets/labels in a compiler?

Question

Consider the following Common Lisp code:

(let ((closure))
  (defun weird ()
    (tagbody
        (go :start)
      :here
        (write-line "Got :here")
        (return-from weird)
      :start
        (setf closure (lambda ()
                        (go :here)))
        (activate)))

  (defun activate ()
    (funcall closure)))

A call to weird results in Got :here being printed. Thus the function stored in closure closes over the go tag :here. Common Lisp's tagbody and go have the restriction that the target of the transfer must be in a lexically visible tagbody form; for example, the following is invalid:

(defun wrong ()
  (go :inner)
  (tagbody :inner
    (write-line "Can't happen")))

Of course, with Lisp's syntax, this makes a lot of sense; we can't go to :inner because the go is not within its scope. However, languages like Algol and C do not have this restriction:

void right(void)
{
        goto inner;
        {
inner:          printf("Can happen");
        }
}

Suppose that I wanted to implement a compiler for an Algol-like language with closures, and that I am producing byte code for a stack-based virtual machine. (The point is that I'm not limited by some real computer's architecture or even the architecture of an existing virtual machine.) This language requires declaring labels as in Pascal, although the declaration can be anywhere, not just at the beginning of a function/procedure, and labels are symbolic, not numeric.

The “binding” of a label/tag to a location in the code has dynamic extent in Common Lisp, and this should be the case in this language as well; the transfer point becomes invalid once the block containing the declaration of the label is exited. For example, a call to activate after the tagbody in weird would cause this constraint to be violated. Otherwise, however, a label may be placed anywhere within the lexical scope of its declaration, including in inner blocks.

For an actual use of this facility, see the sample expansion of Common Lisp's handler-case given in the language's standard (in the Notes section), or that of restart-case (If you aren't already familiar with Common Lisp's condition system, then this won't be very meaningful.) I'm not sure of a reasonable use for closed-over goto's into an inner block…however, to omit the functionality would be a random restriction.

Transferring control into a closure is not meaningful, nor is transferring into the body of any procedure.

What is the best way to support closing over goto labels, in a situation like this? Closing over variable bindings is simple enough, but what is necessary to allow transfer points to be closed over as well?

I suppose that there would be a label environment along with the lexical environment. What would this environment look like (if this is the way to go)? The complications seem to arise when the goto is targeting a block at a deeper level than the point of the closure's definition. What about making the transfer point invalid after the label has become unreachable? Perhaps CLISP's bytecode (see the instructions for tagbody and go) can serve as a model; it implements the desired semantics, save for the ability to go to an inner label.

Common Lisp also has the block operator, which establishes a named exit point that can also be closed over. However, I believe that this can be implemented using the same mechanism as for closed-over goto.

(Incidentally, I'm pretty sure an equivalent of weird is possible to define in Algol 68, with only syntactic differences. In fact, I'm pretty sure that an Algol 68 implementation needs to solve partially the problem described in this question, although Algol 68 does not have closures in general.)

The Wikipedia article on goto states the following:

In Scheme, continuations can even move control from an outer context to an inner one if desired. This almost limitless control over what code is executed next makes complex control structures such as coroutines and cooperative multitasking relatively easy to write.

This suggests to me that continuations are definitely a viable strategy. I was hesitant before, mostly out of ignorance, but now that I've done some further research the idea is more appealing, especially because coroutines sound nice to have for “free”.

My first thought is to have a label environment that consists of continuations; a transfer to a label is performed by executing (activating?) the corresponding continuation. These continuations must be set up dynamically at run time. A closure that closes over a set of labels will get a copy of the label environment.

However, there seem to be a few problems with this idea. As it stands, these continuations would have indefinite extent—I think this can be fixed by keeping in the environment mere references to the continuations, which can be invalidated when the label environment is torn down after the declarations go out of scope. A more pressing issue is that to make this work a substantial rethinking of the execution model appears to be required, in order to accommodate the inclusion of continuations. I should mention that I don't want first-class continuations; they are far scarier than goto.

Without closed-over labels, this language could be implemented with a virtual machine similar to the classic Algol machine: There is a “display” to keep track of the lexical environment, and a closure stores a copy of the display at the time of its creation. With closed-over labels, everything gets much more complicated, in my eyes. I feel as though I am missing a reasonable method for dealing with them, one that doesn't bring something as heavy as continuations into the mix. I would also like for it to be possible to compile normal goto statements into simple transfers.

(Note:I am fully willing to accept that my assumption is wrong, and there is no easy way out. I also understand that this is really a needless “problem”, since the language does not need a goto statement. It's just a snag I hit while trying to graft Common Lisp semantics onto Algol and I want to solve it for the sake of consistency.)

Do you have label variables (label references) or just label constants? — Erik Eidt, May 31 '21 at 15:51
@ErikEidt At the language level? Label constants (if I understand your question correctly), because closures over labels can take the place of label references. — texdr.aft, May 31 '21 at 15:58
But isn't a closure it's own scope, one which execution can't fall into on its own? In a C++ lambda, you can't goto from inside to outside or vice versa, I don't think. (I haven't read the full question yet, maybe that's no relevant). But Erik was asking if you had something like GNU C `void *target = &&label;` / `goto *target` https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html that would make jumps from any scope to any other scope within the same function possible, without necessarily knowing at compile-time which scope the jump-target was going to be. — Peter Cordes, May 31 '21 at 22:09
@PeterCordes This language and Common Lisp allow transferring out of a closure, so long as the transfer target is active (i.e., the surrounding block is currently being executed). Transferring *into* a closure is something I had not considered; I think it's best to leave the consequences of it undefined, because I can't think of any reasonable meaning. This language does not have first-class labels, as with standard C. The way to achieve the same effect would be to define a closure for each target label, and choose one, e.g., by assigning it to a variable (analogous to `target`). — texdr.aft, May 31 '21 at 22:23
@PeterCordes This page gives a good overview of the way Common Lisp's closures work: https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node43.html. (Search for the definition of `contorted-example` to see some truly ridiculous code.) I am trying to achieve the same semantics, more or less. — texdr.aft, May 31 '21 at 22:39

What is the best way to implement closures over goto targets/labels in a compiler?

0 Answers0

Linked