13

For a correctness proof, I'm looking for a usable notion of program equivalence $\cong$ for Barendregt's pure type systems (PTSs); failing that, for enough specific type systems. My goal is simply to use the notion, not to investigate it for its own sake.

This notion should be "extensional" — in particular, to prove that $t_1 \cong t_2$, it should be enough to prove that $t_1\; v \cong t_2\; v$ for all values $v$ of the appropriate type.
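
In other words (a symbolic restatement of the property I want, for $t_1, t_2$ of some function type $\sigma \to \tau$):
$$\bigl(\forall v : \sigma.\; t_1\; v \cong t_2\; v\bigr) \implies t_1 \cong t_2.$$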

Denotational equivalence

Denotational equivalence easily satisfies all the right lemmas, but a denotational semantics for arbitrary PTSs seems rather challenging — it already appears hard for System F.

Contextual/observational equivalence

The obvious alternatives are then the various forms of contextual equivalence (two terms are equivalent if no ground context can distinguish them), but the definition is not immediately usable; the lemmas I need aren't trivial to prove. Have they been proved for PTSs? Alternatively, would the theory be an "obvious extension" of the existing ones, or is there reason to believe it would be significantly different?

EDIT: I didn't explain above what's hard.

Easy part: the definition

Defining the equivalence is not too hard, and the definition appears in many papers (starting at least from Plotkin's 1977 study of PCF, if not earlier — the source might be Morris's 1968 PhD thesis). We say $t_1 \cong t_2$ if for all ground contexts $C$ we have $C[t_1] \simeq C[t_2]$ — that is, $C[t_1]$ and $C[t_2]$ give the same result. There are a few choices in what counts as a "result": for instance, in a strongly normalizing language with a ground type of naturals, you can take ground contexts to be the ones returning naturals, and let $a \simeq b$ mean that $a$ and $b$ evaluate to the same numeral. With nontermination, for reasonable languages it is enough to use "X terminates" as the observation, because if two programs are equivalent when observing termination, they're also equivalent when observing the result.
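
Spelled out for the strongly normalizing case with a ground type of naturals (notation mine, not taken from any particular reference):
$$t_1 \cong t_2 \;:\Longleftrightarrow\; \text{for all closing ground contexts } C \text{ and all } n \in \mathbb{N},\;\; C[t_1] \Downarrow \overline{n} \iff C[t_2] \Downarrow \overline{n},$$
where $\overline{n}$ is the numeral for $n$. With nontermination, one instead takes $C[t_1] \Downarrow \iff C[t_2] \Downarrow$ as the right-hand side.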

Hard part: the proofs

However, those papers often don't explain how hard it is to actually use this definition. All the references below show how to deal with this problem, but the needed theory is harder than one might think. How do we prove that $t_1 \cong t_2$? Do we actually do case analysis and induction on contexts? You don't want to do that.

As Martin Berger points out, you want to use, instead, either bisimulation (as done by Pitts) or a logical equivalence relation (that Harper simply calls "logical equivalence").

Finally, how do you prove extensionality as defined above?
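
For concreteness, the logical relation has the following shape on a simply-typed fragment (a sketch in my notation, not the exact relation of any of the references below):
$$t_1 \sim_{\mathsf{nat}} t_2 \;:\Longleftrightarrow\; \exists n.\; t_1 \Downarrow \overline{n} \wedge t_2 \Downarrow \overline{n} \qquad\qquad t_1 \sim_{\sigma \to \tau} t_2 \;:\Longleftrightarrow\; \forall s_1 \sim_\sigma s_2.\; t_1\; s_1 \sim_\tau t_2\; s_2.$$
With such a definition, extensionality at function types holds essentially by construction; the real work is proving that $\sim$ coincides with $\cong$ (a fundamental theorem plus congruence and consistency arguments).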

Harper solves these questions in 10 pages for System T, through considerable cleverness and logical relations. Pitts takes more. Some languages are yet more complex.

How to deal with this

I'm actually tempted to make my proofs conditional on a conjectured theory of equivalence for PTSs, but the existing theories require nontrivial arguments, so I'm not sure how likely such a conjecture would be to hold.

I'm aware (though not in detail) of the following works:

  • Andrew Pitts (for instance in ATTAPL for an extended System F, and in a few papers, such as the 58-page "Operationally-Based Theories of Program Equivalence").
  • Practical Foundations of Programming Languages (chapters 47-48), which is inspired by Pitts (but claims to have simpler proofs).
  • A logical study of program equivalence. I can't find an English abstract, but it seems to spend a great deal of effort on side effects (references), which seems an orthogonal complication.
Blaisorblade
  • For type theories, defining an operational contextual congruence should be easy, since all programs terminate. Define a notion of observation at a base type (e.g. termination, written $\Downarrow$, at type Unit), and then say $P \cong Q$ if for all well-typed and closing contexts $C[\cdot]$ of base type, we have $C[P] \Downarrow$ iff $C[Q] \Downarrow$. With PTSs it's a bit more complicated, as you may get non-termination. – Martin Berger Mar 06 '15 at 15:47
  • @MartinBerger: that's the idea I'm hinting at, but proving it directly is surprisingly hard, because you need to do proofs for all C (I'll explain that better in the question). Also, if all programs terminate, the definition you use, as given, identifies all programs. – Blaisorblade Mar 06 '15 at 19:36
  • Does your PTS only have functions as computational types? If so, then this excellent question (and answers) seems to indicate that $\beta\eta$-equivalence is enough for terminating pure type systems -- and it nicely explains how to define contextual equivalence for terminating calculi. I think using ground values is the right way to define contextual equivalence, and termination is only a convenient shortcut of dubious merit. – gasche Mar 06 '15 at 19:41
  • @gasche: thanks for the compliment to my question, but I'll need naturals and more (I'm trying to generalize ILC to a language bigger than STLC, and Cai is approaching the question using PTSs). But the problem is not the definition of contextual equivalence — see my edits clarifying what's hard. – Blaisorblade Mar 06 '15 at 20:21
  • @Blaisorblade Sorry yes, if you use termination as observable, then you are right. I was cutting and pasting the definition for Turing-complete deterministic languages. If you have terminating functions you can use a different basic observable, for example at booleans: ... $C[P] \Downarrow true$ iff $C[Q] \Downarrow true$. The quantification over all contexts is always a problem. The standard way of dealing with it is to define a second relation that (1) is sound w.r.t. $\cong$ and (2) is easy to handle, e.g. some notion of bisimilarity, or a logical relation. Depends on the application. – Martin Berger Mar 06 '15 at 20:50
  • In my experience showing equivalences between programs is considerably easier if everything terminates, because then you can use induction on the length of derivations as a proof technique. But it's always a bit of a struggle. – Martin Berger Mar 06 '15 at 20:52
  • To prove that $t_1~s_1 \equiv t_2~s_2$, you use transitivity: first $t_1~s_1 \equiv t_1~s_2$ (context $t_1~\square$) then $t_1~s_2 \equiv t_2~s_2$ (context $\square~s_2$). – gasche Mar 06 '15 at 21:31
  • @gasche: you're right of course — and I did that yesterday in a minute! I shouldn't try to make up examples on the spot. – Blaisorblade Mar 06 '15 at 22:12
  • @MartinBerger: When you say "the standard way", it sounds like we'd probably have to do this on our own for PTS :-(. I'll take a look. – Blaisorblade Mar 06 '15 at 22:54
  • @Blaisorblade Probably. Concurrency theorists have been doing this intensively for a long time, because with concurrent processes it's a lot less clear what notion of equivalence to use. This has led to a division of labour: use a reduction-based semantics with quantification over contexts to define the notion of equivalence, and then use bisimulations or traces over labelled transitions to prove equivalence (or the absence thereof). A big open research question in concurrency theory is how to go from the former to the latter algorithmically. – Martin Berger Mar 06 '15 at 23:01
  • @Blaisorblade In my experience it's worthwhile to tailor the sound approximation of the canonical congruence to the theorem to be proved: use as weak an approximation as you can get away with. That tends to simplify proofs. – Martin Berger Mar 07 '15 at 01:25
  • Thanks. Luckily I'm not dealing with concurrency (my language is very boring) — in fact, it seems that the logical equivalence I need is the one used in the theory of parametricity (PFPL, Chapter 49). Since parametricity has been extended to (consistent) PTSs (by Bernardy and Lasson), it should be possible to get the lemmas I need. – Blaisorblade Mar 07 '15 at 02:13

3 Answers

4

A compositional denotational semantics $[\![ {-} ]\!]$ of a programming language (a domain-theoretic or game-theoretic one, for instance) is adequate if semantic equality implies observational equivalence: $$[\![ t_1 ]\!] = [\![ t_2 ]\!] \implies t_1 \cong t_2.$$ It often happens that it is far easier to calculate denotations than to prove observational equivalence directly. This is a common technique with many known variants. Adequacy is already defined in Plotkin's PCF paper.
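
As a toy illustration (a standard example): in PCF with the usual domain semantics, both $\lambda x{:}\mathsf{nat}.\; x + 0$ and $\lambda x{:}\mathsf{nat}.\; x$ denote the identity on the flat domain $\mathbb{N}_\bot$ (addition being strict), so adequacy immediately gives
$$\lambda x{:}\mathsf{nat}.\; x + 0 \;\cong\; \lambda x{:}\mathsf{nat}.\; x$$
without any quantification over contexts.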

Andrej Bauer
  • There is no particular need for the adequate semantics to use some other formalism. You can also soundly approximate $\cong$ within the programming language under discussion, e.g. using operational methods. – Martin Berger Mar 07 '15 at 20:22
  • Thanks for the answer, but -1: While I agree, the question mentions pure type systems — AFAICS, a denotational semantics for pure type systems is an open problem, so I think an answer should point to some denotational semantics. (In fact, if I had a denotational semantics, I'd dispense with the operational one altogether, as mentioned in the question). (But sorry for the overly long question.) – Blaisorblade Mar 08 '15 at 00:08
  • @MartinBerger, I do not understand your criticism. The OP asks for methods of showing observational equivalence, I mention a common one, and then you object that other ways exist which avoid the method? – Andrej Bauer Mar 08 '15 at 10:02
  • @Blaisorblade, well then you'll just have to invent a denotational semantics for pure type systems, won't you? :-) But more seriously, I'll ask Alex Simpson; he would know better about denotational semantics for such things. – Andrej Bauer Mar 08 '15 at 10:03
  • @AndrejBauer It wasn't meant as a criticism, more as an addendum. – Martin Berger Mar 16 '15 at 18:27
  • @AndrejBauer Conjecture: you can collapse arbitrary PTSs to $\lambda^*$ and use the existing denotational semantics (based on domain theory) for the latter. I assume the result won't be fully abstract most of the time, but that's probably still useful enough for some purposes. – Blaisorblade Mar 18 '16 at 13:24
2

The domain and model theory of PTSs hasn't been explored all that much, I'm afraid. One detailed source is Thomas Streicher's PhD work, Semantics of Type Theory. He gives category-theoretic semantics for all PTSs, though I'm not sure he addresses $\eta$-conversion.

cody
  • I don't think Streicher's PhD is about PTS. It's about semantics of the Calculus of Constructions and independence results via reliability semantics. See here. – Martin Berger Mar 16 '15 at 18:47
  • Thanks for the clarification! I'm afraid the link is broken though (and hard to fix with the minified link). – cody Mar 16 '15 at 19:41
  • The link was to the table of contents of the book, here. I hope this one works better. – Martin Berger Mar 16 '15 at 19:43
  • @MartinBerger: thanks for the clarification! I'll go back to assuming that a semantics for arbitrary PTSs (ranging to System U and $\lambda^*$ and all their pathologies) is beyond our reach. – Blaisorblade May 01 '15 at 19:47
  • @MartinBerger: do you mean realizability semantics? – Dominique Devriese Apr 25 '18 at 09:49
  • @DominiqueDevriese Yes, of course. Autocorrect always catches me off-guard. Thanks for pointing this out. – Martin Berger Apr 25 '18 at 11:12
  • @Blaisorblade just to be clear, domain theoretic semantics are somewhat impervious to non-termination issues such as occur in system U and its ilk. In general, there are models of polymorphic theories in domains, see e.g. Coquand et al "Domain theoretic models of polymorphism". I don't think there is any theoretical objection to extending this to arbitrary PTSes, just the work hasn't really been done. Of course, I would be more interested in models capable of implying normalization, but those do run into problems for system U. – cody Apr 25 '18 at 14:38
  • cody: I wasn't expecting difficulties for any given PTS due to non-termination, but giving one semantics parametric in multiple ones seems harder — how many universes are you going to need? What's more worrisome is that your semantics must be impervious to impredicativity issues. Luckily domain theory lets you solve arbitrary mixed-variance fixpoint equations, and I guess that might help; I'm currently using the other approach to those equations (based on Birkedal et al.'s work on ultrametric spaces/step-indexing) for a different impredicative system. – Blaisorblade Apr 26 '18 at 15:37
0

This answer suggests an approach to the problem. (Feedback is welcome).

PFPL chapter 49 discusses, at once, the equivalent notions of observational equivalence and logical equivalence. Logical equivalence is the same relation used to state parametricity, so the core of the chapter is a proof of parametricity for System F.

Work on parametricity for PTS, AFAICT, does not discuss the relation with observational equivalence. In fact, to even define observational equivalence, unless you have nontermination, you need some positive ground type (naturals, booleans) to use for observations.

However, the key theorem (PFPL 47.6, 48.3, 49.2) relating the two relations is proved independently of the specific language:

Observational equivalence is the coarsest consistent congruence on expressions.

Then, to show that logical equivalence implies observational equivalence, one need only show that logical equivalence is a consistent congruence. However, the other direction requires some more work: in particular, to show that logical equivalence is a congruence, one does proceed by induction on contexts.
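
Concretely, the obligations amount to roughly the following (my informal paraphrase of PFPL's consistency and compatibility conditions, showing only the rules for application and abstraction):
$$t_1 \sim_{\mathsf{nat}} t_2 \implies t_1 \simeq t_2 \qquad\quad \frac{t_1 \sim t_2 \quad s_1 \sim s_2}{t_1\; s_1 \sim t_2\; s_2} \qquad\quad \frac{t_1 \sim t_2}{\lambda x.\, t_1 \sim \lambda x.\, t_2}.$$
Each term former contributes one compatibility rule, so the argument is still, in effect, an induction over contexts, just packaged lemma by lemma.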

EDIT: there's a big problem with this approach: you wouldn't get $n + 1 \cong 1 + n$. Let VecN n be the type of vectors of naturals of length n (assuming that can be defined or encoded), and consider using VecN as a context (well, you'd in fact need a context ending in VecN, but bear with me). Since $n + 1 = 1 + n$ is not a definitional equivalence, the types VecN (n + 1) and VecN (1 + n) are incompatible, hence n + 1 and 1 + n are not observationally equivalent.
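
A minimal sketch of what goes wrong (my example, assuming a hypothetical $\mathsf{cons} : \mathsf{nat} \to \Pi m{:}\mathsf{nat}.\, \mathsf{VecN}\; m \to \mathsf{VecN}\;(m+1)$, with $n : \mathsf{nat}$ and $w : \mathsf{VecN}\; n$ free): take
$$C[\cdot] \;=\; (\lambda m{:}\mathsf{nat}.\; \lambda v{:}\mathsf{VecN}\; m.\; 0)\;\; [\cdot]\;\; (\mathsf{cons}\; 0\; n\; w).$$
Then $C[n+1]$ typechecks, since $\mathsf{cons}\; 0\; n\; w : \mathsf{VecN}\;(n+1)$, but $C[1+n]$ does not, because $1+n$ and $n+1$ are not convertible. So a typed context can separate the two terms already at typechecking time, which is exactly the problem.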

Blaisorblade