How can relational parametricity be motivated?

Question

Is there some natural way to understand the essence of relational semantics for parametric polymorphism?

I have just started reading about the notion of relational parametricity, a la John Reynolds' "Types, Abstraction and Parametric Polymorphism", and I am having trouble understanding how the relational semantics is motivated. Set semantics makes perfect sense to me, and I realise that set semantics is insufficient to describe parametric polymorphism, but the leap to relational semantics seems to be magic, coming completely out of nowhere.

Is there some way of explaining it along the lines "Assume relations on the base types and terms, and then the interpretation of the derived terms is just the natural relationship between ...such and such a natural thing... in your programming language"? Or some other natural explanation?

score 26 · Answer 1 · edited Nov 27 '19 at 16:15

Well, relational parametricity is one of the most important ideas introduced by John Reynolds, so it shouldn't be too much of a surprise that it looks like magic. Here is a fairy tale about how he might have invented it.

Suppose you're trying to formalize the idea that certain functions (identity, map, fold, reversal of lists) act "the same way on many types", i.e., you have some intuitive ideas about parametric polymorphism, and you have formulated some rules for creating such maps, i.e., the polymorphic $\lambda$-calculus or some early variant of it. You notice that naive set-theoretic semantics is not working.

For instance, we stare at the type $$\forall X : \mathsf{Type} . X \to X,$$ which should contain only the identity map, but naive set-theoretic semantics allows unwanted functions such as $$\lambda X : \mathsf{Type} . \lambda a : X . \mathsf{if}\ (X = \lbrace 0, 1 \rbrace) \ \mathsf{then} \ 0 \ \mathsf{else}\ a.$$ To eliminate this sort of thing, we need to impose some further conditions on functions. For instance, we could try some domain theory: equip each set $X$ with a partial order $\leq_X$ and require that all functions be monotone. But that doesn't quite cut it because the above unwanted function is either constant or identity, depending on $X$, and those are monotone maps.
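To make the unwanted function concrete, here is a minimal Python sketch of the naive set-theoretic reading. It assumes that a type is modelled as a finite `frozenset` and that a polymorphic function is a Python function taking the set `X` and returning a map on `X`. The names `poly_id`, `ad_hoc`, `is_well_typed` and `is_monotone` are illustrative and not part of the answer.

```python
# Naive set semantics: a "type" is a finite set, and an element of
# (forall X. X -> X) is a family of maps, one for each set X.

def poly_id(X):
    """The intended inhabitant: the identity map on every set X."""
    return lambda a: a

def ad_hoc(X):
    """The unwanted inhabitant: it inspects X and changes behaviour."""
    return (lambda a: 0) if X == frozenset({0, 1}) else (lambda a: a)

def is_well_typed(family, X):
    """Check that the component at X really maps X into X."""
    return all(family(X)(a) in X for a in X)

def is_monotone(family, X):
    """Check monotonicity with respect to the usual order <= on numbers."""
    f = family(X)
    return all(f(a) <= f(b) for a in X for b in X if a <= b)

samples = [frozenset({0, 1}), frozenset({0, 1, 2}), frozenset({5, 7})]

# Both families are legitimate elements of the naive set-theoretic product.
well_typed = all(is_well_typed(fam, X) for fam in (poly_id, ad_hoc) for X in samples)

# Monotonicity does not help: ad_hoc is constant or the identity at each X.
monotone = all(is_monotone(ad_hoc, X) for X in samples)

# Yet ad_hoc is not the identity at X = {0, 1}.
differs = ad_hoc(frozenset({0, 1}))(1) != poly_id(frozenset({0, 1}))(1)
```

Nothing in the set-theoretic model stops `ad_hoc` from comparing `X` with a fixed set, which is exactly the kind of type inspection a parametric function should not be able to perform.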

A partial order $\leq$ is reflexive, transitive and antisymmetric. We can try to alter the structure; for instance, we could try to use a strict partial order, or a linear order, or an equivalence relation, or just a symmetric relation. However, in each case some unwanted examples creep in. For instance, symmetric relations eliminate our unwanted function but allow other unwanted functions (exercise).

And then you notice two things:

The wanted examples are never eliminated, whatever relations you use in place of partial orders $\leq$.
For each particular unwanted example you look at, you can find a relation that eliminates it, but there is no single relation that eliminates all of them.

So, you have the brilliant thought that the wanted functions are those that preserve all relations, and the relational model is born.
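The "preserve all relations" condition can be tried out on small finite sets. The sketch below is only an illustration under the same assumptions as the naive model (types are finite sets, a polymorphic function is a family indexed by sets); `all_relations` and `preserves` are hypothetical helper names. It checks that the identity family preserves every relation between a few small sets, and exhibits one relation that rejects the unwanted function.

```python
from itertools import chain, combinations, product

def poly_id(X):
    return lambda a: a

def ad_hoc(X):
    return (lambda a: 0) if X == frozenset({0, 1}) else (lambda a: a)

def all_relations(X, Y):
    """Every relation between X and Y, i.e. every subset of X x Y."""
    pairs = list(product(sorted(X), sorted(Y)))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1))
    return [frozenset(s) for s in subsets]

def preserves(family, X, Y, R):
    """Related inputs must be sent to related outputs."""
    f, g = family(X), family(Y)
    return all((f(x), g(y)) in R for (x, y) in R)

X, Y = frozenset({0, 1}), frozenset({"a", "b"})

# The wanted example survives every relation we throw at it.
id_ok = all(preserves(poly_id, A, B, R)
            for A, B in [(X, X), (X, Y), (Y, Y)]
            for R in all_relations(A, B))

# One particular relation is enough to eliminate the unwanted example:
# 1 is related to "a", but ad_hoc sends 1 to 0, and 0 is unrelated to "a".
witness = frozenset({(1, "a")})
ad_hoc_rejected = not preserves(ad_hoc, X, Y, witness)
```

The check is exhaustive only for the three pairs of sets listed, so it illustrates the two observations above without proving them.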

edited Nov 27 '19 at 16:15

Radu GRIGore

answered Oct 27 '13 at 21:10

Andrej Bauer

Thanks Andrej. This raises the further question: is there any smaller subclass of relations that eliminate all the unwanted examples? – Tom Ellis Oct 27 '13 at 21:32
Well, we can probably limit the logical complexity of the relations because we only have to worry about computable maps. But I am not enough of an expert to answer. I summon @UdayReddy. – Andrej Bauer Oct 27 '13 at 22:07
@TomEllis. Yes, in special cases, a subclass of relations might suffice. The most immediate special case is that, if all operations are first-order, then functions (total, single-valued relations) are enough. For fields, partial isomorphisms are enough. Recall that Reynolds's leading example is the field of complex numbers, and his logical relation between Bessel and Descartes is a partial isomorphism. – Uday Reddy Nov 23 '13 at 21:53
@AndrejBauer. Note that $\forall X.\, X \to X$ has exactly one parametric element, but the ad hoc elements are too many to form a set! So, there is a lot of cutting to do. An alternative theory of how Reynolds might have gotten parametricity appears in our upcoming "Essence of Reynolds". – Uday Reddy Nov 23 '13 at 22:13
You show that if you interpret types as sets then there are unwanted functions. Doesn't the same apply to relations? \X:Type. \a:X. if X = {(0,0), (1,0), (0,1), (1,1)} then 0 else a – Jules Dec 03 '14 at 18:27
In the counterexample given, the first lambda (binding X) is a type binding, and the second lambda (binding a) is a value binding. That's trivial for most readers, but not for all. – nicolas Oct 21 '15 at 16:57
To answer my question about whether any smaller subclass of relations is enough, Harper seems to suggest that the "zig zag complete" relations are enough (https://www.cs.cmu.edu/~rwh/courses/chtt/pdfs/reynolds.pdf). I'm somewhat surprised I haven't seen that elsewhere (unless I'm misunderstanding or misremembering something). – Tom Ellis Jul 03 '21 at 08:45
@TomEllis Applying zig-zag complete relations to parametricity was first done by me, in my paper (with Derek Dreyer) Internalizing Relational Parametricity in the Extensional Calculus of Constructions. – Neel Krishnaswami Sep 30 '22 at 08:22
Thanks Neel. I'm trying to piece together my thought process from nine years ago and not having much luck ... On the same day that I asked this question I also asked "What are relations with this property called?" on MathOverflow, and noted that I was investigating them in the context of relational parametricity. I can't have read your paper by then or I'd already have known a name for them! https://mathoverflow.net/questions/146018/is-there-a-name-for-relations-with-this-property-and-the-category-of-them/357749 – Tom Ellis Oct 04 '22 at 15:44
I really can't recall what I was thinking at the time. It took until last year (8 years thence) for me to come across the idea mentioned in Harper's note (which references your paper). I find the notion of "zig zag relation" (or Quasi Partial Equivalence Relation) a much more natural setting for parametricity than arbitrary relation, yet I confess I still do not understand the role even QPERs play in parametricity. In particular, it's surprising to me that there are any meaningful parametricity results for relations that are not QPERs. – Tom Ellis Oct 04 '22 at 15:51

Uday Reddy · Answer 2 · 2013-11-23T17:29:25.220

The answer to your question is really there in Reynolds's fable (Section 1). Let me try and interpret it for you.

In a language or formalism in which types are treated as abstractions, a type variable can stand for any abstract concept whatsoever. We don't assume that types are generated via some syntax of type terms, or some fixed collection of type operators, or that we can test two types for equality etc. In such a language, if a function involves a type variable then the only thing the function can do to values of that type is to shuffle around the values it has been given. It cannot invent new values of that type, because it doesn't "know" what that type is! That is the intuitive idea of parametricity.

Then Reynolds thought about how to capture this intuitive idea mathematically, and noticed the following principle. Suppose we instantiate the type variable, say $t$, to two different concrete types, say $A$ and $A'$, in separate instantiations, and keep in our mind some correspondence $R : A \leftrightarrow A'$ between the two concrete types. Then we can imagine that, in one instance, we provide a value $x \in A$ to the function and, in the other instance, a corresponding value $x' \in A'$ (where "corresponding" means that $x$ and $x'$ are related by $R$). Then, since the function knows nothing about the types we are supplying for $t$ or the values of that type, it has to treat $x$ and $x'$ in exactly the same way. So, the results we get from the function should again correspond by the relation $R$ we have kept in our mind, i.e., wherever the element $x$ appears in the result of one instance, the element $x'$ must appear in the other instance. Thus, a parametrically polymorphic function should preserve all possible relational correspondences between possible instantiations of type variables.

This idea of preservation of correspondences is not new. Mathematicians have known about it for a long time. In the first instance, they thought that polymorphic functions should preserve isomorphisms between type instantiations. Note that isomorphism means some idea of a one-to-one correspondence. Apparently, isomorphisms were originally called "homomorphisms". Then they realized that what we now call "homomorphisms", i.e., some idea of many-to-one correspondences, would be preserved too. Such preservation goes by the name of natural transformation in category theory. But, if we think about it keenly, we realize that preservation of homomorphisms is utterly unsatisfying. The types $A$ and $A'$ we mentioned are completely arbitrary. If we pick $A$ as $A'$ and $A'$ as $A$, we should get the same property. So, why should "many-to-one correspondence", an asymmetric concept, play a role in formulating a symmetric property? Thus, Reynolds took the big step of generalizing from homomorphisms to logical relations, which are many-to-many correspondences. The full impact of this generalization is not yet fully understood. But the underlying intuition is fairly clear.

There is one further subtlety here. Whereas the instantiations of type variables can be arbitrarily varied, constant types should stay fixed. So, when we formulate the relational correspondence for a type expression with both variable types and constant types, we should use the chosen relation $R$ wherever the type variable appears and the identity relation $I_K$ wherever a constant type $K$ appears. For instance, the relation expression for the type $t \times Int \to Int \times t$ would be $R \times I_{Int} \to I_{Int} \times R$. So, if $f$ is a function of this type, it should map a pair $(x,n)$ and a related $(x',n)$ to some pair $(m,x)$ and related $(m,x')$. Note that we are obliged to test the function by putting the same values for constant types in the two cases, and we are guaranteed to get the same values for constant types in the outputs. So, in formulating relational correspondences for type expressions, we should make sure that, by plugging in identity relations (representing the idea that those types are going to be constant), we get back identity relations, i.e., $F(I_{A_1},\ldots,I_{A_n}) = I_{F(A_1,\ldots,A_n)}$. This is the crucial identity extension property.
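The relation $R \times I_{Int} \to I_{Int} \times R$ can be checked mechanically on test data. The following Python sketch is an illustration only: the functions `f` and `g`, the helper `respects`, and the sample relation `R` are made up for this example. It feeds each function two inputs whose `t` components are related by `R` and whose `Int` components are equal, then checks that the `Int` outputs are equal and the `t` outputs are again related by `R`.

```python
def f(pair):
    """A parametric function of type t x Int -> Int x t."""
    x, n = pair
    return (n + 1, x)

def g(pair):
    """Not parametric: the Int result depends on what the t value looks like."""
    x, n = pair
    return (len(str(x)), x)

def respects(func, R, ints):
    """Check func against R x I_Int -> I_Int x R on the given test data."""
    for (x, x2) in R:          # t components related by R
        for n in ints:         # Int components related by the identity
            m, out = func((x, n))
            m2, out2 = func((x2, n))
            if m != m2 or (out, out2) not in R:
                return False
    return True

# A many-to-many correspondence between A = {"red", "blue"} and A' = {0, 1}.
R = {("red", 0), ("red", 1), ("blue", 1)}

f_respects_R = respects(f, R, range(3))
g_respects_R = respects(g, R, range(3))
```

Here `f` passes because it only moves its `t` argument around, while `g` fails because `len(str("red"))` and `len(str(0))` differ, so the two runs disagree on a constant type.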

To understand parametricity intuitively, all you need to do is to pick some sample function types, think of what functions can be expressed of those types, and think about how those functions behave if you plug in different instantiations of type variables and different values of those instantiation types. Let me suggest a few function types to get you started: $t \to t$, $t \to Int$, $Int \to t$, $t \times t \to t \times t$, $(t \to t) \to t$, $(t \to t) \to (t \to t)$.
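For two of these types the experiment can be brute-forced. The sketch below is a rough illustration, assuming a single instantiation $t = \{0, 1\}$ and only relations from that set to itself; the helper names are hypothetical. It enumerates every set-theoretic function of type $t \to t$ and $t \times t \to t \times t$ and keeps those that preserve every such relation.

```python
from itertools import chain, combinations, product

X = (0, 1)

def all_relations(xs):
    """Every relation on xs, i.e. every subset of xs x xs."""
    pairs = list(product(xs, xs))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1))
    return [frozenset(s) for s in subsets]

def functions(dom, cod):
    """All functions dom -> cod, each represented as a dict."""
    dom = list(dom)
    return [dict(zip(dom, outs)) for outs in product(cod, repeat=len(dom))]

RELS = all_relations(X)

# --- type t -> t: related inputs must give related outputs ---
def preserves_t_to_t(f, R):
    return all((f[x], f[y]) in R for (x, y) in R)

survivors_t_to_t = [f for f in functions(X, X)
                    if all(preserves_t_to_t(f, R) for R in RELS)]

# --- type t x t -> t x t: pairs are related componentwise ---
PAIRS = list(product(X, X))

def pair_related(R, p, q):
    return (p[0], q[0]) in R and (p[1], q[1]) in R

def preserves_pairs(f, R):
    return all(pair_related(R, f[p], f[q])
               for p in PAIRS for q in PAIRS if pair_related(R, p, q))

survivors_pairs = [f for f in functions(PAIRS, PAIRS)
                   if all(preserves_pairs(f, R) for R in RELS)]
```

Out of 4 candidates for $t \to t$ only the identity survives, and out of 256 candidates for $t \times t \to t \times t$ only four survive: the identity, the swap, and the two maps that duplicate one component. These are the functions one can write by merely shuffling the given values.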

@AndrejBauer. Hmm, I didn't get a summon actually. It may be that the @ UdayReddy incantation works only at the beginning of the comment. In any case, no summons needed. "Parametricity" is among my filters. — Uday Reddy, Nov 23 '13 at 21:45
"the only thing the function can do to values of that type is to shuffle around the values it has been given" - actually, apart from the shuffling, the function can erase the given value (weakening) and copy it (contraction). Since these operations are always available, the value is not as abstract as it may seem. — Łukasz Lew, Jun 07 '18 at 20:30
@ŁukaszLew, you are right. I don't know if that can be characterized as loss of "abstraction" though. — Uday Reddy, Jun 09 '18 at 18:06
@UdayReddy I've removed the comment and asked this as a stand-alone question. — Łukasz Lew, Jun 09 '18 at 20:32

score 3 · Answer 3 · answered Oct 28 '13 at 18:19

Another possible answer, different from Andrej's, is given by the example of the $\omega$-set model of polymorphism. Since every function in the polymorphic calculus is computable, it's natural to interpret a type by a set of numbers which represent the computable functions of that type.

Furthermore, it's tempting to identify functions with the same extensional behavior, thus leading to an equivalence relation. The relation is partial if we exclude the "undefined" functions, that is the functions which "loop" for some well-formed input.

The PER models are a generalization of this.

Another way to see these models is as a (very) special case of the simplicial set models of Homotopy Type Theory. In that framework, types are interpreted as (a generalization of) sets with relations, and relations between those relations, etc. At the lowest level, we simply have the PER models.

Finally, the field of constructive mathematics has seen the appearance of related notions; in particular, the Set Theory of Bishop involves describing a set by giving both elements and an explicit equality relation, which must be an equivalence. It's natural to expect some principles of constructive mathematics to make their way into type theory.

Ah, but the PER models are not very nice and can contain unwanted polymorphic functions. One has to pass to the relational PER models to get rid of them. — Andrej Bauer, Oct 28 '13 at 21:01
@cody. I agree. I think of PERs as a way of building in relations into the "set theory" so that we can get impredicative models in the first place. Thanks for mentioning Homotopy type theory. I didn't know it had similar ideas. — Uday Reddy, Nov 23 '13 at 22:24
@UdayReddy: the ideas are very similar! In particular, the idea of "compatible dependent implementations" which relate abstract types with dependencies can be understood through the lens of the univalent equality. — cody, Nov 24 '13 at 00:50

winitzki · Answer 4 · 2023-01-06T14:15:41.823

Here is a straightforward, practice-driven motivation for relational parametricity.

What is relational parametricity?

Relational parametricity is a technique for proving "free theorems". This technique is based on using arbitrary (many-to-many) binary relations instead of ordinary functions (which are equivalent to many-to-one relations).

"Free theorems" are laws that are automatically satisfied by polymorphic functions when the code of those functions is "fully polymorphic". Knowing that these laws hold, the programmer can refactor code safely, as the result values are guaranteed to remain the same.

Code is "fully polymorphic" when all types other than Unit are treated as type parameters, there are no side effects, no external libraries or "built-in" functions, no run-time type identification or reflection. The code must treat all types as "opaque type parameters", that is, as arbitrary types about which nothing is known. The code may not even assume that those types are inhabited.

For instance, any lambda-term from System F is "fully polymorphic code" by this definition. So, any given lambda-term from System F will automatically satisfy a certain law. The form of that law depends only on the type of the term.

If we want to prove that this property is true, the technique of relational parametricity is pretty much the only known technique that is powerful enough and works in all cases. This technique can be generalized to many programming languages other than System F (lambda calculus with linear types, or with dependent types, etc.).

Details

In the practice of functional programming, one often uses polymorphic functions, that is, functions parameterized by a type, for example $\textrm{safeListHead}: \textrm{List}\,a \to \textrm{Maybe}\,a $. We notice that any function $\phi$ of type $\forall a.\,F\, a\to G\,a$, where $F$ and $G$ are covariant functors, will be automatically a natural transformation as long as the code is written in a purely functional, fully parametric manner: no side effects, no runtime type reflection, no external libraries. Such a function $\phi$ will automatically satisfy a naturality law:

$$ \forall (f:a\to b).\,\, \phi \circ \textrm{fmap}_F\, f = \textrm{fmap}_G\, f \circ \phi $$
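As a quick sanity check of this law for $\textrm{safeListHead}$, here is a small Python sketch. It assumes that `Maybe a` is modelled as a list of length 0 or 1, so that the same code serves as `fmap` for both functors; the function names are illustrative only.

```python
def safe_list_head(xs):
    """List a -> Maybe a, with Maybe modelled as a list of length 0 or 1."""
    return xs[:1]

def fmap_list(f, xs):
    """fmap for lists: apply f to every element."""
    return [f(x) for x in xs]

# Maybe is modelled as a short list here, so its fmap is the same code.
fmap_maybe = fmap_list

def naturality_holds(f, xs):
    """phi . fmap_F f == fmap_G f . phi, checked at one input xs."""
    return safe_list_head(fmap_list(f, xs)) == fmap_maybe(f, safe_list_head(xs))

checks = [naturality_holds(f, xs)
          for f in (str, lambda n: n * n, lambda n: n % 2 == 0)
          for xs in ([], [3], [4, 5, 6])]
```

Every entry of `checks` is `True`: taking the head and then applying `f` gives the same result as applying `f` everywhere and then taking the head. Testing a few inputs is of course not a proof, which is what the rest of this answer is about.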

Most of the examples in Wadler's paper "Theorems for free" are naturality laws of this kind. However, Wadler does not formulate the "free theorem" property in that way and does not point out which of his "free theorems" are naturality laws. The reason is that, for more complicated type signatures, the "free theorem" is not a naturality law but a more complicated law that is not necessarily written as a single equation satisfied by functions.

Now, we ask: how can we prove that any lambda-term of type $\forall a.\,F\, a\to G\,a$ satisfies the appropriate naturality law?

It turns out that the proof is quite hard unless we first generalize the statement we are proving. The proof must go by induction on the structure of the lambda-term. But when we look at some arbitrary code for a function $\phi$, we will usually find sub-expressions that are not themselves natural transformations with type signatures $\forall a.\,F\, a\to G\,a$. Sub-expressions can have arbitrary types, say, of the form $$\forall a.\,F\, a\to G\,a$$ where $F$ and $G$ are type constructors that are not necessarily covariant.

If we want to prove the naturality law by induction on the term, we have to use the inductive assumption that the naturality law already holds for all sub-expressions of that term. But, so far, we cannot formulate a naturality law for sub-expressions that are not themselves natural transformations.

So, we are obliged to ask: what is the generalization of the naturality law for a function of type $\forall a.\,F\, a\to G\,a$ where $F$ and $G$ are arbitrary type constructors (not necessarily covariant or contravariant)?

The naturality law for functors involves lifting an arbitrary function $f: a \to b$ to a covariant functor $F$, which gives us a function $\textrm{fmap}_F\,f$ of type $F \,a \to F \,b $. The type signature of $\textrm{fmap}_F$ is:

$$ \textrm{fmap}_F: (a \to b) \to (F\,a \to F\,b) $$

But we cannot do this lifting when $F$ is an arbitrary type constructor because $\textrm{fmap}_F$ only exists for covariant type constructors $F$.

Now, the key technique in the Reynolds-Wadler approach is to replace an arbitrary function $f: a\to b$ by an arbitrary binary relation $r$ between values of types $a$ and $b$. A binary relation $r$ is in general a many-to-many relation and is not equivalent to any function of type $a\to b$ or $b\to a$. Let us denote the relation types by $r: a \leftrightarrow b$.

Why does it help to replace functions by relations? Because, as it turns out, there is a function $\textrm{rmap}_F$ that lifts a relation $r: a \leftrightarrow b$ to a relation between values of types $F\,a$ and $F\,b$. The type signature of $ \textrm{rmap}_F$ is:

$$ \textrm{rmap}_F: (a \leftrightarrow b) \to (F\, a \leftrightarrow F \,b)$$

The $\textrm{rmap}_F$ is well-defined even if $F$ is neither covariant nor contravariant (unlike $\textrm{fmap}_F$ that exists only when $F$ is covariant). The function $\textrm{rmap}_F$ can be constructed for any $F$ that is built up from fully parametric type constructions (such as, in System F, the unit type, free type parameters, products, co-products, function types, recursive types, universally quantified types).

Now, the lifted relations $\textrm{rmap}_F\,r$ and $\textrm{rmap}_G\,r$ can be used to formulate the "relational naturality law":

$$ \forall (r: a\leftrightarrow b).\,\,\textrm{ if } (x,y)\in \textrm{rmap}_F\,r\textrm{ then } (\phi(x),\phi(y))\in \textrm{rmap}_G\,r $$

When $F$ and $G$ are covariant and when we choose the relation $r$ to be a function graph then the "rmap" operation reduces to "fmap". So, the relational naturality law reproduces the usual law of natural transformations.
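For the covariant case $F = \textrm{List}$, $G = \textrm{Maybe}$, the lifting is easy to write down. The Python sketch below assumes the usual elementwise lifting for lists (same length, related elements) and models `Maybe` as a list of length 0 or 1; relations are represented as two-argument predicates, and all names are illustrative.

```python
def rmap_list(r):
    """Lift r : a <-> b to lists: same length, related elementwise."""
    return lambda xs, ys: (len(xs) == len(ys)
                           and all(r(x, y) for x, y in zip(xs, ys)))

# Maybe is modelled as a list of length 0 or 1, so the same lifting applies.
rmap_maybe = rmap_list

def safe_list_head(xs):
    return xs[:1]

# A many-to-many relation between ints and strings: n is related to s
# when n and len(s) have the same parity. It is not the graph of a function.
def r(n, s):
    return n % 2 == len(s) % 2

xs, ys = [2, 0, 3], ["abcd", "zz", "x"]

# Relational naturality: related inputs give related outputs.
premise = rmap_list(r)(xs, ys)
conclusion = rmap_maybe(r)(safe_list_head(xs), safe_list_head(ys))

# When r is the graph of a function f, rmap_list relates xs to fmap f xs
# and to no other list of images.
f = lambda n: n + 1
graph_f = lambda x, y: y == f(x)
relates_to_fmap = rmap_list(graph_f)([1, 2], [f(1), f(2)])
rejects_other = not rmap_list(graph_f)([1, 2], [2, 4])
```

With a function graph in place of `r`, the premise says `ys` equals `fmap f xs` and the conclusion says the head of `ys` equals `fmap f` of the head of `xs`, which is the ordinary naturality law.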

But now we have a law that applies to any type signature, not necessarily restricted to covariant or contravariant type constructors. At this point, one can prove by straightforward induction that any lambda-term satisfies the relational naturality law.

The main technical difficulty in this approach is the definition of the operation $\textrm{rmap}_F$ for an arbitrary type constructor $F$. This definition must proceed by induction on the type structure of $F$. The paper by Wadler does not contain an adequately detailed explanation of how $\textrm{rmap}_F$ must be defined; neither does the paper by Reynolds. Those papers hide the details and may create an impression that $\textrm{rmap}_F$ is somehow already defined, or that it is obvious how it should be defined. But actually it is neither obvious nor easy to define $\textrm{rmap}_F$.
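To give a feel for what such an inductive definition looks like, here is a minimal Python sketch for a small fragment only: a type parameter, the unit type, products and function types, all interpreted over finite sets. It uses the standard logical-relation clauses, which is an assumption on my part rather than something taken from the papers mentioned; recursive and universally quantified types are left out entirely. The encoding of types as tuples and of functions as tables is hypothetical.

```python
from itertools import product

VAR, UNIT = ("var",), ("unit",)
def Prod(a, b): return ("prod", a, b)
def Fun(a, b): return ("fun", a, b)

def elements(ty, X):
    """All values of type ty when the type parameter is the finite set X.
    Functions are represented as tuples of (argument, result) pairs."""
    tag = ty[0]
    if tag == "var":
        return list(X)
    if tag == "unit":
        return [()]
    if tag == "prod":
        return list(product(elements(ty[1], X), elements(ty[2], X)))
    if tag == "fun":
        dom, cod = elements(ty[1], X), elements(ty[2], X)
        return [tuple(zip(dom, outs)) for outs in product(cod, repeat=len(dom))]
    raise ValueError(tag)

def rmap(ty, r):
    """Lift a relation r (a set of pairs) to type ty, by recursion on ty."""
    tag = ty[0]
    if tag == "var":
        return lambda u, v: (u, v) in r
    if tag == "unit":
        return lambda u, v: u == v
    if tag == "prod":
        ra, rb = rmap(ty[1], r), rmap(ty[2], r)
        return lambda u, v: ra(u[0], v[0]) and rb(u[1], v[1])
    if tag == "fun":
        ra, rb = rmap(ty[1], r), rmap(ty[2], r)
        # Related arguments must be mapped to related results.
        return lambda f, g: all(rb(fx, gy)
                                for (x, fx) in f for (y, gy) in g if ra(x, y))
    raise ValueError(tag)

def twice(X):
    """Table of the parametric term h |-> h . h at the instantiation X."""
    rows = []
    for h in elements(Fun(VAR, VAR), X):
        table = dict(h)
        rows.append((h, tuple((x, table[table[x]]) for x in X)))
    return tuple(rows)

X, Y = (0, 1), ("a", "b")
T = Fun(Fun(VAR, VAR), Fun(VAR, VAR))      # neither covariant nor contravariant
relations = [set(), {(0, "a"), (1, "b")}, {(0, "a"), (1, "a"), (1, "b")}]
twice_related = all(rmap(T, r)(twice(X), twice(Y)) for r in relations)

# A pair of functions that the lifted relation at t -> t rejects.
const0 = ((0, 0), (1, 0))
id_Y = (("a", "a"), ("b", "b"))
const0_vs_id = rmap(Fun(VAR, VAR), relations[2])(const0, id_Y)
```

The function clause is the one that has no counterpart in `fmap`: it mentions the relation in both the argument and the result position, which is why it works for types such as $(t \to t) \to (t \to t)$ where no functorial lifting exists.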

In my understanding, when we want to define the core construction of relational parametricity (mapping a type to a binary relation) for arbitrary type constructors, an explicit use of $\textrm{rmap}_F$ is required. But neither Reynolds nor Wadler talk about arbitrary type constructors. Wadler only gives some examples with List but does not explain what to do with other type constructors.

The definition of $\textrm{rmap}_F$ is especially nontrivial when $F$ is a recursively defined type constructor, or when $F\,x$ contains universally quantified types inside, for example: $$F\,x = \forall a. ((a\to x)\to a)\to a$$

After looking through many papers and tutorials and blog posts on relational parametricity, I did not find any sources that work out the details of "rmap" or explain why it is required in formulating the parametricity theorems.

Here is an example of a property one can derive via the parametricity theorem:

Given any (covariant) polynomial functor $L$, any parametrically polymorphic function $\phi$ with type signature: $$\phi: \forall a.\, L (a \to a) \to a \to a$$ satisfies the following law: for all $p: L(a \to a)$, we have: $$ \phi\, p = \phi (\textrm{fmap}_L\, j\, p)\, \textrm{id} $$ where the function $j: (a \to a) \to (a \to a) \to (a \to a)$ is the function composition: $j\, h\, k = h\, \circ\, k$
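A single instance of this law can be checked by hand or by machine. The Python sketch below assumes the particular choices $L\,x = x \times x$ and $\phi\,(f, g) = f \circ g \circ f$, which are my own examples and not from any of the papers; it compares both sides of the law on a few integers.

```python
def phi(p):
    """A parametric function of type L (a -> a) -> a -> a for L x = (x, x)."""
    f, g = p
    return lambda x: f(g(f(x)))

def fmap_L(h, p):
    """fmap for the polynomial functor L x = (x, x)."""
    return (h(p[0]), h(p[1]))

def j(h):
    """Curried composition: j h k = h . k."""
    return lambda k: lambda x: h(k(x))

identity = lambda x: x

p = (lambda n: n + 1, lambda n: 2 * n)

lhs = phi(p)                           # phi at type a = int
rhs = phi(fmap_L(j, p))(identity)      # phi at type a = (int -> int)

law_holds = all(lhs(n) == rhs(n) for n in range(-5, 6))
```

On the right-hand side, `phi` is used at the instantiation $a \to a$ in place of $a$, so its "values" are themselves functions; the law says that feeding it `identity` recovers the original behaviour.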

To prove this property, one needs to be able to handle an arbitrary polynomial type constructor $L$. The papers of Reynolds and Wadler, or any other papers I have seen, do not demonstrate adequate techniques for proving properties of this kind.

Summary

The main motivation for considering relational parametricity is to obtain a powerful technique for deriving properties and laws for functions whose types involve universally quantified types. In particular, one can prove the naturality laws and other such laws that apply to fully polymorphic code with type parameters.

Without relational parametricity, the proof of naturality laws becomes extremely complicated and cannot be easily generalized to different kinds of lambda calculus.

The main reason relational parametricity works is that one can define a function $\textrm{rmap}_F$ transforming a relation of type $a \leftrightarrow b$ into a relation of type $(F\, a \leftrightarrow F \,b)$. The transformation $\textrm{rmap}_F$ can be defined for an arbitrarily complicated type constructor $F$, not necessarily covariant or contravariant. Using $\textrm{rmap}_F$, one can define a general "relational naturality law" for terms of arbitrarily complicated types. Then it becomes straightforward to prove that any fully parametric code will automatically satisfy that law.

The price paid for this power is that the result of the relational parametricity theorem is a law for relations, not for functions. A programmer needs somehow to derive an equation for functions from the relational law. This additional step is not trivial; practitioners need to build intuition and learn specific techniques for doing that in various cases.

There are currently no tutorials or books that explain this material in detail.

To understand the logic used in Wadler's Theorems for Free, you need to read Plotkin and Abadi's A Logic for Parametric Polymorphism. To understand the model construction justifying Plotkin-Abadi logic, you need to read Bainbridge, Freyd, Scedrov and Scott's Functorial Polymorphism. However, the first half of this paper is about an idea that didn't really work out, so you need to focus on the second half of the paper, which is about the relational PER semantics (which does work). — Neel Krishnaswami, Sep 30 '22 at 08:20
@NeelKrishnaswami Thank you for the references. I already spent time reading those papers but failed to see how those theories would help me as a programmer to motivate the usefulness of parametricity. So, I motivate parametricity from a practical perspective: Relational parametricity is a powerful tool for proving a large number of laws that are useful in programming. I think Wadler's paper (and certainly Bainbridge's and Plotkin's papers) are largely incomprehensible to newcomers and even to advanced programmers who want to learn more theory. Is there a paper that defines "rmap" in detail? — winitzki, Sep 30 '22 at 09:08
