23

I think I'm not understanding it, but $\eta$-conversion looks to me as a $\beta$-conversion that does nothing, a special case of $\beta$-conversion where the result is just the term in the lambda abstraction because there is nothing to do, kind of a pointless $\beta$-conversion.

So maybe $\eta$-conversion is something really deep and different from this, but, if it is, I don't get it, and I hope you can help me with it.

(Thank you and sorry, I know this is part of the very basics in lambda calculus)

Charles Stewart
  • 4,316
  • 28
  • 45
Trylks
  • 604
  • 4
  • 9

4 Answers4

24

Update [2011-09-20]: I expanded the paragraph about $\eta$-expansion and extensionality. Thanks to Anton Salikhmetov for pointing out a good reference.

$\eta$-conversion $(\lambda x . f x) = f$ is a special case of $\beta$- conversion only in the special case when $f$ is itself an abstraction, e.g., if $f = \lambda y . y y$ then $$(\lambda x . f x) = (\lambda x . (\lambda y . y y) x) =_\beta (\lambda x . x x) =_\alpha f.$$ But what if $f$ is a variable, or an application which does not reduce to an abstraction?

In a way $\eta$-rule is like a special kind of extensionality, but we have to be a bit careful about how that is stated. We can state extensionality as:

  1. for all $\lambda$-terms $M$ and $N$, if $M x = N x$ then $M = N$, or
  2. for all $f, g$ if $\forall x . f x = g x$ then $f = g$.

The first one is a meta-statement about the terms of the $\lambda$-calculus. In it $x$ appears as a formal variable, i.e., it is part of the $\lambda$-calculus. It can be proved from $\beta\eta$-rules, see for example Theorem 2.1.29 in "Lambda Calculus: its Syntax and Semantics" by Barendregt (1985). It can be understood as a statement about all the definable functions, i.e., those which are denotations of $\lambda$-terms.

The second statement is how mathematicians usually understand mathematical statements. The theory of $\lambda$-calculus describes a certain kind of structures, let us call them "$\lambda$-models". A $\lambda$-model might be uncountable, so there is no guarantee that every element of it corresponds to a $\lambda$-term (just like there are more real numbers than there are expressions describing reals). Extensionality then says: if we take any two things $f$ and $g$ in a $\lambda$-model, if $f x = g x$ for all $x$ in the model, then $f = g$. Now even if the model satisfies the $\eta$-rule, it need not satisfy extensionality in this sense. (Reference needed here, and I think we need to be careful how equality is interpreted.)

There are several ways in which we can motivate $\beta$- and $\eta$-conversions. I will randomly pick the category-theoretic one, disguised as $\lambda$-calculus, and someone else can explain other reasons.

Let us consider the typed $\lambda$-calculus (because it is less confusing, but more or less the same reasoning works for the untyped $\lambda$-calculus). One of the basic laws that should holds is the exponential law $$C^{A \times B} \cong (C^B)^A.$$ (I am using notations $A \to B$ and $B^A$ interchangably, picking whichever seems to look better.) What do the isomorphisms $i : C^{A \times B} \to (C^B)^A$ and $j : (C^B)^A \to C^{A \times B}$ look like, written in $\lambda$-calculus? Presumably they would be $$i = \lambda f : C^{A \times B} . \lambda a : A . \lambda b : B . f \langle a, b \rangle$$ and $$j = \lambda g : (C^B)^A . \lambda p : A \times B . g (\pi_1 p) (\pi_2 p).$$ A short calculation with a couple of $\beta$-reductions (including the $\beta$-reductions $\pi_1 \langle a, b \rangle = a$ and $\pi_2 \langle a, b \rangle = b$ for products) tells us that, for every $g : (C^B)^A$ we have $$i (j g) = \lambda a : A . \lambda b : B . g a b.$$ Since $i$ and $j$ are inverses of each other, we expect $i (j g) = g$, but to actually prove this we need to use $\eta$-reduction twice: $$i(j g) = (\lambda a : A . \lambda b : B . g a b) =_\eta (\lambda a : A . g a) =_\eta g.$$ So this is one reason for having $\eta$-reductions. Exercise: which $\eta$-rule is needed to show that $j (i f) = f$?

Andrej Bauer
  • 28,823
  • 1
  • 80
  • 133
  • "Also, I have heard it said that η-rule is about extensionality of functions. This is false" If you augment the $\beta$-equality axioms and inference rules with the extensionality rule (see my answer), this set of inference rules captures exactly $\beta\eta$-equality, doesn't it? (i.e two terms are equal in this theory iff they are $\beta\eta$-equal) – Marcin Kotowski Sep 19 '11 at 00:03
  • @Marcin: yes, extensionality implies $\eta$-rule, but not vice versa. How would you derive extensionality from $\beta$- and $\eta$-rules? – Andrej Bauer Sep 19 '11 at 00:06
  • 1
    let $=$ denote the smallest congruence that contains $={\beta}$ and satisfies extensionality (if $Mx=Nx$, then $M=N$). Then $M=N$ iff $M={\beta\eta}N$ (see e.g. first chapter of Urzyczyn, Sorensen "Lecture on Curry-Howard isomorphism) and in this sense $\eta$-rule captures the notion of extensionality. – Marcin Kotowski Sep 19 '11 at 00:22
  • I see, you're thinking of extensionality as a schema, i.e., we prove that it holds for every particular pair of terms $M$ and $N$. I was thinking of extensionality as a statement. I think. Now I have to think about it. – Andrej Bauer Sep 19 '11 at 04:57
  • Ok, I think I know why I am unhappy about Marcin's statement. It gives the wrong idea that $\eta$-rule proves extensionality of all functions, when it only proves extensionality of those functions which happen to be denoted by a term. So this is cheating. – Andrej Bauer Sep 19 '11 at 12:53
  • 1
    @AndrejBauer I agree that η-rule isn't full extensionality, but don't you think it is still a limited form of extensionality, i.e. it represents a class of obvious cases of extensionality. The original question is looking for motivations and concepts, and in this case I believe that thinking in terms of extensionality is useful (with some care taken of course not to go too far). – Marc Hamann Sep 20 '11 at 13:06
  • @Marc: Yes, I agree that it is a form of extensionality. Nontheless, equating $\eta$ with extensionality goes too far. – Andrej Bauer Sep 20 '11 at 14:23
  • @Andrej, regarding the exercise, it might be something like this: $$\begin{eqnarray} j\ (i\ f) & = & j\ (\lambda a : A . \lambda b : B . f\ \langle a, b \rangle)\ & = & \lambda p : A \times B . (\lambda a : A . \lambda b : B . f\ \langle a, b \rangle)\ (\pi_1\ p)\ (\pi_2\ p)\ & = & \lambda \langle x, y \rangle : A \times B . (\lambda a : A . \lambda b : B . f\ \langle a, b \rangle)\ x\ y\ & =\beta & \lambda \langle x, y \rangle : A \times B . f\ \langle x, y \rangle\ & =\eta & f \end{eqnarray}$$

    It's simplified, due to the shortness of comments.

    – Giorgio Marinelli Dec 23 '20 at 15:37
  • @GiorgioMarinelli: no that's wrong, you used the pattern $\langle x, y \rangle$ in the $\lambda$-abstraction. Such patterns do not exist in the pure $\lambda$-calculus and in general present a non-trivial extension of the calculus. – Andrej Bauer Dec 23 '20 at 23:56
  • 1
    @AndrejBauer, thanks. I had doubts about that solution. Another way would be to consider an η-reduction like $(\pi_1\ p)\ (\pi_2\ p) = p$, and we would have: $$\begin{eqnarray} j\ (i\ f) & = & j\ (\lambda a : A . \lambda b : B . f\ \langle a, b \rangle)\ & = & \lambda p : A \times B . (\lambda a : A . \lambda b : B . f\ \langle a, b \rangle)\ (\pi_1\ p)\ (\pi_2\ p)\ & =\beta & \lambda p : A \times B . f\ \langle (\pi_1\ p), (\pi_2\ p) \rangle\ & =\eta & \lambda p : A \times B . f\ p\ & =_\eta & f \end{eqnarray}$$ – Giorgio Marinelli Dec 27 '20 at 17:35
10

In order to answer this question, we can provide the following quotation from the corresponding monograph “Lambda Calculus. Its Syntax and Semantics“ (Barendregt, 1981):

The point of introducing of $\beta\eta$-reduction is to provide an axiom for provable statements in extensional $\lambda$-calculus [the $\lambda + \text{ext}$ theory, where $\text{ext}$ stands for the rule $M x = N x \Rightarrow M = N$] such that it would have Church–Rosser property.

Proposition. $M =_{\beta\eta} N \Leftrightarrow \lambda\eta \vdash M = N \Leftrightarrow \lambda + \text{ext} \vdash M = N$.

[Its proof is based on the following theorem.]

Theorem (by Curry). The theories $\lambda + \text{ext}$ and $\lambda\eta$ are equivalent… [Its proof consists of two parts: $(\text{ext}) \Rightarrow (\eta)$ and vice versa.]

One of the reasons to consider the system $\lambda\eta$ is that it has a certain property of completeness… [Namely, in the sense of the following theorem.]

Theorem. Let $M$ and $N$ both have a normal form. Then, either $\lambda\eta \vdash M = N$, or $\lambda\eta + M = N$ is inconsistent…

HP-complete [after Hilbert–Post] theories correspond to maximum consistent theories in theory of models for the first-order logic.

8

Just to add to Andrej's very good answer: the theory of untyped $\lambda$-calculus with $\beta$ and $\eta$ reduction rules satisfies some very nice properties:

  • It is consistent in the sense that there are two terms, for example $\lambda x y.x$ and $\lambda x y.y$ that are not $\beta\eta$ equivalent. This is a consequence of the confluence theorem for $\beta\eta$ reduction.

  • It is a maximal consistent theory in the following sense: if $\iota$ is an
    equivalence relation on terms such that:

    1. It is closed by congruence: $u\ =_\iota\ v \Rightarrow t\ u\ =_\iota\ t\ v$ etc.

    2. It equates 2 non $\beta\eta$ equivalent terms that are normal forms: there exists $t$ and $u$ in normal form such that $t=_\iota u$ and $t\neq_{\beta\eta}u$.

Then the theory is inconsistent: for every term $t$, $u$ in normal form, $t=_{\beta\eta\iota}u$.

This is a consequence of Böhm's theorem.

cody
  • 13,861
  • 1
  • 49
  • 103
7

$\eta$-reduction captures the notion of extensionality - two functions are considered equal iff they give the same outputs on the same inputs.

One way of formalizing this notion is the following: if we consider the relation $=_{\beta \eta}$, the transitive-reflexive closure of relation $\rightarrow_{\beta\eta}$, it is natural to characterize this relation in terms of inference rules of an equational theory (e.g.: rules of the form: if $M=N$, then $\lambda x.M = \lambda x.N$ and so on - characterizing $=_{\beta}$ requires about 7 rules of this kind).

Now, replacing $=_{\beta}$ with $=_{\beta\eta}$ amounts to introducing the axiom $\lambda x. Mx = M$, which is equivalent to the extensionality rule: if $Mx=Nx$, then $M=N$. This is exactly the notion that two functions equal on all input arguments should be considered identical.

Marcin Kotowski
  • 2,806
  • 19
  • 25
  • It is false that extensionality follows from $\eta$-rule. – Andrej Bauer Sep 19 '11 at 00:00
  • See Theorem 2.1.29 in the monograph by Barendregt (Lambda Calculus and its Semantics, 1985). –  Sep 19 '11 at 05:17
  • 2
    @Anton: I suppose I am not too happy about the $\xi$-rule. – Andrej Bauer Sep 19 '11 at 13:02
  • And I am in turn not too happy that happiness and “heard of”-like answers gain more attention than direct relevant quotes with the corresponding references. –  Sep 20 '11 at 11:34
  • @Anton: It's a popularity contest, didn't you know? ;-) Anyhow, what's with the $\xi$-rule that gets used in Barendregt. I don't recall anyone dragging the $\xi$-rule into discussion. We only have $\alpha$ and $\beta$. – Andrej Bauer Sep 20 '11 at 14:24
  • I have updated my answer so that it includes Anton's reference to the proof, and points out that the $\xi$-rule gets used in it. Thanks to Anton. – Andrej Bauer Sep 20 '11 at 14:36
  • @AndrejBauer: (ξ) is not a conversion rule, it is nothing to do with (β) and (η). (ξ) says only that if M = N, then λx.M = λx.N. The other similar construction rule is not named but is as follows: if M = P and N = Q then M N = P Q. –  Sep 20 '11 at 15:09
  • @Anton: Yes, I figure that out. I have nothing against $\xi$. I updated my answer again, hopefully now I explained what I wanted to say clearly enough. Thanks for your comments and for not just upvoting my answer! – Andrej Bauer Sep 20 '11 at 21:19