40

Recently I was preparing an undergrad-level proof of (a form of) the Jordan Curve Theorem, and I had forgotten just how much work is involved in it. The proof stored my head was just using Alexander duality plus some sanity-checks on the topology of the curve in question, which is a fine approach but does require a bit of algebraic topology machinery my audience didn't have access to. The more elementary proof (or at least the one I landed on) has a straightforward idea behind it, but turning that into a proper argument required slogging through quite a few details about polygons, regular neighborhoods, etc. Similarly, I couldn't help noticing just how much of a pain it is to prove the 2- and 3-dimensional versions of Stokes' theorem without some notion of manifolds, let alone the usual Stokes' theorem in some suitable setting.

Those are both elementary examples, but it got me thinking about the general topic of results that have very rough proofs from more elementary principles but have much clearer and smoother proofs with some more advanced background. Specifically, what are some examples of results from more advanced or narrow topics of mathematics can vastly simplify or explain in retrospect theorems that are encountered and proved laboriously in less specialized or more common areas of math? (If it helps clarify what I'm trying to get at, another example in my mind is May's "Concise Course in Algebraic Topology," which I think of as having the premise of, "So, now that you've gone through the standard intro algebraic topology course, here's what was secretly going on behind the scenes the whole time.")

Wojowu
  • 27,379
anomaly
  • 432
  • 10
    Szemerédi's theorem: the original proof is a very complicated, but elementary, induction; Furstenberg's ergodic theory based proof is much shorter but uses some existing theory. – Sam Hopkins Nov 05 '21 at 15:09
  • 5
    I think many topological results will fit this bill. For instance the fact that $\mathbb R^n$ are not homeomorphic for distinct $n$ has several "elementary" proofs which are rather technical, but can be also proven easily with help of homology. – Wojowu Nov 05 '21 at 15:16
  • 2
    See also https://mathoverflow.net/questions/95837/examples-of-theorems-with-proofs-that-have-dramatically-improved-over-time – Sam Hopkins Nov 05 '21 at 15:36
  • 1
    The existence of free groups can be proven by Freyd's adjoint functor theorem in one line or one can give a construction that is elementary but requires some verifications. – Benjamin Steinberg Nov 05 '21 at 15:36
  • 2
    And also see https://mathoverflow.net/questions/43820/extremely-messy-proofs – Benjamin Steinberg Nov 05 '21 at 15:37
  • 1
    @BenjaminSteinberg, do you know if the "bare-hands" construction of free groups has been formally proved correct? I've always had a hard time writing down a proof that 'feels' rigorous because it's so hard not to gesture at the test for associativity and say: c'mon, see, it's obvious! – LSpice Nov 05 '21 at 15:38
  • 2
    Fundamental theorem of algebra ... see https://mathoverflow.net/questions/10535/ways-to-prove-the-fundamental-theorem-of-algebra Of course some of those "short" proofs may be circular: the FTA was used (implicitly) in building the machinery. – Gerald Edgar Nov 05 '21 at 15:39
  • 3
    @LSpice, yes it can be done. You would never want to prove associativity of $Z/nZ$ by defining it to be ${0,\ldots, n-1}$ and the binary operation of addition followed by taking remainder. Instead, you should take the monoid of all words over X\cup X^{-1} and define an equivalence relation generated by deleting or inserting occurences xx^{-1} and x^{-1}x. Then you can prove that the equivalence classes form a group with the correct universal property in a few lines. Then you can observe that every word has a reduced form and prove by induction on the number of reductions uniqueness. – Benjamin Steinberg Nov 05 '21 at 15:42
  • 2
    @BenjaminSteinberg, thanks! I certainly didn't mean to doubt that it could be done—in fact, I've read it done in 'informal' style. I was just wondering if you knew whether it had been formalised. – LSpice Nov 05 '21 at 15:44
  • 3
    I’m pretty sure you can find it formalized somewhere using the Diamond Lemma, aka Newman's Lemma, aka Church-Rosser condition, aka complete rewriting systems. I have a typed version I give students when I cover free groups – Benjamin Steinberg Nov 05 '21 at 15:48
  • 1
    I don't know if this one really counts, but the original proof of Hopf invariant one theorem by Adams, is very long and complicated, whereas the second proof by Adams-Atiyah is much simpler. Now, in the sense that the original one only uses ordinary cohomology whereas the second one uses K-theory, one could argue that the original proof is "more elementary", but then again, Adams developed a whole machinery for the first one... – user43326 Nov 05 '21 at 15:56
  • @user43326: No, that's exactly the sort of thing I was looking for. I would expect ordinary cohomology but (probably) not $K$-theory to be part of the standard graduate curriculum, and the Hopf invariant one problem is exactly the sort of thing topological $K$-theory was (initially, anyway) desigend to solve. It's also related to the problem of which $\mathbb{R}^n$ are real normed algebras (and other incarnations of the same idea), which is another of those results that, like the Jordan Curve Theorem, are often introduced well before they're actually proved. – anomaly Nov 05 '21 at 16:01
  • 2
    Another example would be a small theorem of Fermat. Well, this isn't too complicated with an "elementary" method, but it is completely trivial if we use the fact that the order of an element of the group divides that of the group. – user43326 Nov 05 '21 at 16:36
  • 1
    A couple more examples from algebraic topology
    1. Computation of stable homotopy groups of spheres in certain range. Toda, Cartan et al. used "elementary" methods to do a long computation, which more or less becomes trivial with Adams/Novikov-Adams spectral sequence. (Of course, in lower dimensions "elementary" methods is just as quick, and in high enough dimensions, "modern" methods aren't simple.

    2. computation of (co)homology of certain classifying space/loop space. Sometimes one can do with Serre SS, which is more elementary but longer than computation by bar/cobar SS.

    – user43326 Nov 05 '21 at 16:42
  • 1
    Here is one example of which I am not sure if there really is an "elementary proof", but there should be. The canonical surjection of groups $Gl(2,Z)\to Gl(2,Z/p)$ doesn't admit a section unless $p=2$. I don't know how one can prove this without using representation theory. (But of course, even with the representation theory, one still has to work out the conjugacy classes of $Gl(2,Z/p)$ anyway). – user43326 Nov 05 '21 at 16:48
  • @user43326 That map is only a surjection for $p=2,3$. I guess you meant $SL_2$ instead of $GL_2$? – Will Sawin Nov 05 '21 at 18:07
  • @SamHopkins Don't we have shorter elementary proofs via the density Hales-Jewitt theorem nowadays? – Will Sawin Nov 05 '21 at 18:57
  • 1
    @WillSawin Actually I was thinking of $Gl(2,Z _p)→Gl(2,Z/p)$ where $Z _p$ denotes the $p$-adics. This time have I got it right? – user43326 Nov 05 '21 at 18:59
  • 1
    Another example. Burnside's theorem on solvable groups. – user43326 Nov 05 '21 at 19:02
  • 2
    @user43326 One can prove the limit doesn't exist for $p-1$ since the lift of a unipotent element must have order $p$, which means a nontrivial $p$th root of unity must satisfy a quadratic polynomial over $\mathbb Z_p$, which is impossible for $p>3$ because the cyclotomic polynomial is irreducible (Eisenstein's criterion). For $p=3$ I think the lift does exist. – Will Sawin Nov 05 '21 at 19:04
  • @WillSawin Thank you, I didn't think of using ring/field theory... For p=3, if the lift exists, then it "complexifies" to a discrete series representation, I thought this was impossible, but I am no longer so sure now. – user43326 Nov 05 '21 at 19:39
  • @user43326: I actually did use that in the same context as the original post, as an explanation of why I didn't bother providing many examples of solvable groups. (I also mentioned the Feit-Thompson theorem, which is even further in that direction.) – anomaly Nov 05 '21 at 20:02
  • 6
    Maybe Hindman's theorem? https://mathoverflow.net/questions/360924/proofs-where-higher-dimension-or-cardinality-actually-enabled-much-simpler-proof/360949#360949 – Will Brian Nov 05 '21 at 20:54
  • 3
    It seems to me that the question is problematic. It considers a proof that starts with some deep theory as only including the final step where the theory is applied, discounting the very long proofs required to establish the theory. However, the 'elementary' proof counts everything (in the extreme case, starting with the axioms). Apples and oranges. There is also a danger of circularity; one has to establish that the proof of the deep theory doesn't include a step where this theorem is required. – Brendan McKay Nov 08 '21 at 01:48
  • 1
    There's also the converse: Fermat's Last theorem has a long and advanced proof and a short and elementary one. Or at least so someone claimed... :) – Alessandro Della Corte Apr 21 '22 at 11:34
  • 1
    A simple one: the factorization of $x^3+y^3+z^3-3xyz$ into degree 1 polynomials with the Frobenius Determinant theorem. It's hard to come up with the factorization by trial and error, but straightforward with the theorem applied to $Z/3Z$. – Andrea Marino Apr 21 '22 at 14:04

11 Answers11

44

The associativity of the group law on an elliptic curve can be proved in an elementary way by explicitly manipulating algebraic expressions, but this is not very enlightening. By using more advanced geometric ideas, one can prove associativity more conceptually.

Timothy Chow
  • 78,129
  • 1
    Thanks, that's a great example. If I remember correctly, the approach in Silverman derives it from Riemann-Roch (which I think is what the top answer in the linked question is referring to, though I don't have my copy of Silverman at hand at the moment). Definitely more motivated than just presenting a formula as a fait accompli, and I think elliptic curves are often encountered well before general algebraic geometry. – anomaly Nov 05 '21 at 23:07
16

A famous example is the Abel-Ruffini theorem.

In this video, Fields medallist Richard Borcherds introduces Galois Theory and its background, mentioning that the proof by Ruffini was quite cumbersome and not quite clear, while Abel's was neat and short:

Ruffini had a sort of 500 page proof of it, except no one's really quite sure whether it was a proof of it or not and they sort of suspect it wasn't, and a bit later Abel came along and gave a very clear 6-page proof of it.

Instead of more cumbersome, geometry-based traditional methods (see e.g. this enjoyable YouTube video by Veritasium covering the Italian history of the cubic equation), knowledge on Galois theory makes much easier to see that there is no solution to quintic equations and upwards, by the analogy that their corresponding Galois group is not solvable, i.e. can't be split into Abelian groups.

fr_andres
  • 113
  • 2
    Did V. Arnold not give a nice elementary exposition to the insolubility of the 'general' quintic (i.e. without pinning a concrete quintuple of coefficients)? https://web.williams.edu/Mathematics/lg5/394/ArnoldQuintic.pdf – 5th decile Nov 07 '21 at 16:38
  • Nice! feel free to edit that into the answer, and thanks a lot for sharing – fr_andres Nov 07 '21 at 17:55
  • 1
    My comment is intended to argue that your answer does not give an example where the elementary proof is long or tedious. – 5th decile Nov 07 '21 at 21:24
  • Fair enough. I was thinking chronollogically and not considering elementary proofs that came afterwards. My bad – fr_andres Nov 07 '21 at 21:42
  • Abel’s paper was very short as he paid for the publication fees himself, and was impoverished at the time. He later published a longer version. – Carl-Fredrik Nyberg Brodda Apr 21 '22 at 11:32
  • 1
    Abel also did not use Galois theory, or field extensions, or anything of the sort, in his proof! – Carl-Fredrik Nyberg Brodda Apr 21 '22 at 11:32
  • I didn't read either version, and I can see how my post implies that Abel used Galois theory in his solution. Thanks for the remark – fr_andres Apr 22 '22 at 02:27
11

The famous example of this in my neck of the woods is Gelfand's slick proof, using Banach algebras (which should really be "Gelfand algebras"), of Wiener's theorem that the reciprocal of a nowhere-vanishing function on $\mathbb{T}$ with an absolutely summable Fourier series also has an absolutely summable Fourier series.

Nik Weaver
  • 42,041
6

The number of spanning trees of an $n$-dimensional hypercube graph is $$2^{2^n-n-1}\prod_{j=1}^nj^{n\choose j}.$$ This is a quite tricky graph theory problem to try to prove directly. But it dies almost immediately to Kirchoff's matrix tree theorem, after computing the Laplacian eigenvalues using the discrete Radon transform.

  • Could you say a bit more or give a reference for how the discrete Radon transform is used to compute the Laplacian eigenvalues? I wouldn't recognize a discrete Radon transform if it bit me... – Jim Conant Oct 15 '22 at 07:22
  • 1
    There's a good exposition in the algebraic combinatorics book by Richard Stanley. – Yonah Borns-Weil Oct 15 '22 at 07:26
6

Silver's proof that GCH cannot first fail at a singular cardinal of uncountable cofinality is a great example of this. The original one of Silver uses generic ultrapowers, so requires some nontrivial knowledge of forcing and comfortability with ultrapowers of models of set theory. For example, if we want to prove this for singular cardinals of cofinality $\omega_1$, then we take the forcing notion $P$ of stationary subsets of $\omega_1$, $G$ a $P$-generic over $V$, and so in $V[G]$ we can form the corresponding generic ultrapower $M$, from which we get an embedding $j:V\to M$. A careful, but quite short analysis gives the theorem. It's not too bad.

Alternatively, there is a longer proof that requires more bookkeeping, but requires only the most basic ideas about stationary sets and ultrafilters. A decent bit more work, but requires no knowledge of forcing, ultrapowers, working in different models of set theory, or embeddings.

For a little bit of context (though one would have to check the exact presentation), the proof using generic ultrapowers is less than a page in Jech, and the elementary proof is two and a half. Not an insane difference, but one I find memorable.

Connor W
  • 113
  • I wish I lived in a place with a first-semester undergraduate set theory course which covers stationary sets and ultrafilters! – Wojowu May 06 '23 at 05:03
  • @Wojowu I don't think it's unreasonable. I also hope the implication isn't that the inclusion of this remark in my answer is frowned upon. To be fair, I can count on two hands the number of people I know who even took an undergrad set theory class, and almost all of them did both of these things. – Connor W May 06 '23 at 05:17
  • 2
    @ConnorW: I have to admit that I'm also one of those people who didn't take any undergrad set theory classes, but I did run into ultrafilters as an undergrad in the topological context. – anomaly May 06 '23 at 05:55
  • 2
    @ConnorW I'm not arguing whether it's reasonable or not, but your comment seems to include such a course is a norm, which it definitely isn't. In my first year undergrad I had some basic overview of set theory, up to things like axiom of choice and Zorn's lemma, but no course I took, undergrad or grad, covered ultrafilters or stationary sets. I don't mean to frown upon the remark, but I do think it is just factually incorrect. – Wojowu May 06 '23 at 18:16
  • @Wojowu Fair enough! I'll edit for that reason. Thanks for the clarification. – Connor W May 06 '23 at 19:20
5

Resolution of singularity of curves (over algebraically closed fields). A somewhat long elementary proof is given e.g. in chapter 7 of Fulton's Algebraic Curves (which he made available online for free). Basically you use the primitive element theorem to reduce it to the planar case. And then for a planar curve you track the changes in multiplicity of a point under blow-ups via explicit computation. A more abstract approach, given for example in chapter I.6 of Hartshorne's, identifies a nonsingular curve with the set of discrete valuations of a function field, and shows that any projective morphism from an open subset of a nonsingular curve can be extended to the whole curve.

pinaki
  • 5,064
  • 1
    nice example, and maybe Zariski's approach, that taking the integral closure of the affine ring, desingularizes the curve in one step. this is also an illustration of how much information is lost in the short proof. i only felt I understood something about singularities of curves after going through Walker's detailed proof by quadratic transforms. – roy smith May 07 '23 at 04:26
5

The original proof of the Stone-Weierstrass theorem was elementary enough that Stone could present it in a lengthy article in Mathematics Magazine. There is a much shorter proof due to de Branges that uses the Krein-Milman theorem, the Hahn-Banach theorem, and the Riesz representation theorem.

  • 2
    Eh, I don't think the elementary proof needs to be as complicated as Stone makes it out to be. I give a two-page elementary proof in this book, which I think is actually simpler than de Branges' proof. – Nik Weaver Apr 21 '22 at 21:41
  • @NikWeaver: At the elementary level, the proof is also simpler if you restrict the result to compact metric spaces, where it follows reasonably straightforwardly from the ordinary Weierstrass Approximation Theorem. – anomaly Apr 22 '22 at 15:50
5

results that have very rough proofs from more elementary principles but have much clearer and smoother proofs with some more advanced background. Specifically, what are some examples of results from more advanced or narrow topics of mathematics can vastly simplify or explain in retrospect theorems that are encountered and proved laboriously in less specialized or more common areas of math

One alternative to proofs being long and complicated because they use only elementary methods is proofs being numerous because they use only elementary methods.

enter image description here

In this image,

  • $AB$ is any chord of a parabola,
  • $AC$ is parallel to the axis of the parabola,
  • $BC$ is tangent to the parabola at $B.$

$D$ is the midpoint of $AC$ and we view it as the fulcrum of a lever $JB.$

$D$ is also the midpoint of $JB.$

$EH$ is any straight section of the triangle $ABC$ that is parallel to $AC.$

Proposition: If the weight of the segment $EH$ rests upon the lever at $G$, and the weight of the segment $EF$ rests at $J,$ then the lever is in equilibrium. (This is just the equation of the parabola.)

Therefore, if the weight of the region bounded by the triangle $ABC$ rests at its center of gravity $I$ and the weight of the region bounded by the curve and the chord $AB$ rests at $J,$ then the lever is in equilibrium.

Bottom-line conclusion: Since it is known that the center of gravity $I$ is one-third of the way from $D$ to $B,$ the area of the triangle is exactly three times the area of the region bounded by the curve and the chord.

Archimedes wrote that argument. Today we write $\int_0^1 x^2\,dx=\frac13$ and we use an antiderivative to show that. I don't think Archimedes used antiderivatives.

By similar methods Archimedes showed that the center of gravity of the northern hemisphere is $5/8$ of the way down from the north pole to the center of the earth. For this he considered the line through the center of the earth and the north pole to be a lever with its fulcrum at the north pole, and two of the infinitesimal weights that he had resting on it were a cross-section of the sphere parallel to the equator, and the cross-section in that same plane, of a cone whose apex is the north pole and whose base is bounded by the equator. (I don't remember the details but I think he let the cross-section of the sphere rest at its intersection with the lever and that smaller cross-section rest at a point whose distance above the north pole is the diameter of the sphere.)

In his "mechanical method" he proved 15 (thus "numerous" (to quote the term I used above)) propositions in geometry by doing things like these, finding areas and volumes and centers of gravity.

(I think maybe no one before Archimedes ever thought about centers of gravity.)

Michael Hardy
  • 11,922
  • 11
  • 81
  • 119
5

Perron-Frobenius theorem. Each of both version answers the question.

Weak version : If $A\in{\bf M}_n({\mathbb R})$ has non-negative entries, then the spectral radius $\rho(A)$ is an eigenvalue, associated with an eigenvector in ${\mathbb R}_+^n$.

The elementary proof is a consequence of of the strong version, applied to $A+\epsilon In$, passing to the limit as $\epsilon\to0+$. The short proof observes that the set $$K=\{x\in{\mathbb R}_+^n|\sum_ix_i=1,\,Ax\ge\rho(A)x\}$$ is non-void, convex, compact, and the application $$x\mapsto\frac1{\sum_i(Ax)_i}\,Ax$$ maps $K$ into itself (if the map is ill-defined then $\rho(A)=0$ and the result is trivial). Then apply Brouwer's theorem.

Strong version : If the entries are $>0$, then $\rho(A)$ is a simple eigenvalue and the eigenvector has $>0$ coordinates.

Short proof: the same map as above is a strict contraction of the simplex $$K=\{x\in{\mathbb R}_+^n|\sum_ix_i=1\}$$ for the Hilbert distance !

Jochen Glueck
  • 11,625
Denis Serre
  • 51,599
  • Hmm, I don't know. I'm aware that those nonlinear proofs of PF are popular and I also find they provide an interesting perspective, but I'm not convinced that they are really significantly shorter or easier than more elementary and linear proofs. For instance, here is a very simple proof of the weak version that does not use the strong version: [...] – Jochen Glueck May 08 '23 at 08:22
  • Let $\lambda$ be an eigenvalue of maximal modulus and $z \in \mathbb{C}^n$ a corresponding eigenvector. The Neumann series gives $\lvert (r\lambda - A)^{-1}z \rvert \le (r \rho(A)-A)^{-1} \lvert z \rvert$ for each $r > 1$, so the resolvent of $A$ explodes at $\rho(A)$; hence $\rho(A)$ is an eigenvalue. To find a nonnegative eigenvector, choose a vector $y \in \mathbb{R}^n_+$ and numbers $s_n \downarrow \rho(A)$ such that the norm of $(s_n - A)^{-1} y$ converges to $\infty$. Normalizing this sequence and choosing a convergent subsequence gives an eigenvector for $\rho(A)$ in $\mathbb{R}^n_+$. – Jochen Glueck May 08 '23 at 08:23
  • 1
    By the way (my apologies for being nitpicky), the proof of the strong version via Hilbert's distance needs two additional steps in the end to give the result as stated: the Hilbert distance argument only yields that there are no other eigenvectors in $\mathbb{R}^n_+$; to get geometric simplicity one needs an additional argument to rule out the existence of other eigenvectors. A further argument (for instance, using the growth of the matrix powers) is needed to also get algebraic simplicity. – Jochen Glueck May 08 '23 at 08:37
4

Brouwer's theorem is immediately obtained as a 'by-product' of the development of integration on manifolds. The 'elementary' alternative is to prove and use Sperner's Lemma, which (to me at least) seems a more tedious road.

5th decile
  • 1,461
  • 9
  • 19
  • 4
    While Sperner's Theorem certainly works, isn't the standard proof of Brouwer via noting that $\text{id}:H^{n-1}(S^{n-1}) \to H^{n-1}(S^{n-1})$ can't factor through $H^n(D^n) = 0$? I'm considering singular rather than de Rham cohomology above, but I think most students encounter de Rham cohomology and differential forms roughly around the same time. – anomaly Nov 07 '21 at 18:11
  • @anomaly That approach and the one via Stokes theorem are closely related. – 5th decile Nov 07 '21 at 18:33
  • @anomaly Students going towards applied analysis will know Brouwer's theorem, as it leads them to the Schauder fix point theorem, which is used a lot to prove existence of solutions for all kinds of problems. But usually they know nothing about cohomology (at least not under the name) and may only have seen differential forms once from a distance. So to them the standard proof often is the one via Stokes theorem, though written explicitly, without differential forms. – mlk Nov 07 '21 at 18:59
  • @ThibautDemaerel: Of course--- and maybe the equivalence of de Rham and singular cohomology is another example of the sort of thing I was talking about the original post, at least in terms of integration of forms to detect nonzero cohomology classes. – anomaly Nov 07 '21 at 23:15
  • 1
    While slightly longer, the proof via Sperner's lemma is of surpreme beauty. And I say this as an algebraic topologists without any particular combinatorial leanings. – Lennart Meier May 08 '23 at 07:40
3

I think a good example is the 1955/1958 Adian-Rabin theorem. This says that "given a finite presentation of a group, one can deduce almost nothing about the properties of the group". For example, the problem of deciding whether a group is trivial from its presentation is undecidable.

The first proof, by Adian, of this theorem is 60 pages long and uses many intricate -- but entirely elementary -- reasoning from combinatorial semigroup theory. By contrast, more modern proofs, which go via HNN-extensions, and which can arguably be called "advanced" (at the time), are much shorter, requiring little more than 2 pages to work out the full details (at least modulo the Novikov-Boone theorem).

Adian's proof, comprising 4 papers, can be found in English translation here. This also contains my summary of the articles and some historical background.

It is worth noting that both Adian's and Rabin's proofs use HNN-extensions implicitly, and the three proofs (Adian, Rabin, modern) are all the same in spirit -- the toolbox of general results available to a modern proof makes this more advanced as well as shorter.