
Note: I myself am not a math educator, though I plan to be one someday.

In this letter, Donald Knuth suggests an alternate way of teaching calculus, based on big-O (introduced via a related big-A notation). He says "it would be a pleasure for both students and teacher if calculus were taught in this way." Is his suggestion a good idea? Would it be easier for students to understand? How much of an improvement, if any, is this over the traditional way of teaching it?

(Since I am new to this site, please forgive me if this question is too subjective or if it doesn't meet the site's standards.)

  • Note the letter was sent March 1998, see published version here: http://www.ams.org/notices/199806/commentary.pdf –  Jun 08 '15 at 00:35
  • @MattF. That is a shorter version; at the end it gives a link to "further details". The link is slightly broken, but if you get rid of the / at the end it takes you to the thing in the OP. – Akiva Weinberger Jun 08 '15 at 00:37
  • More of a question than a comment, but, did anyone ever write Knuth's book? – James S. Cook Jun 08 '15 at 02:14
  • @JamesS.Cook Don't believe so. (I suppose the textbook needs to be written before this can be taught. You'd need to, for example, know how to prove the Fundamental Theorem of Calculus using the $o$ or $O$ notation — probably doable, but you'd want to know how to do it when starting the course. That would be much easier if you had a textbook with you.) – Akiva Weinberger Jun 08 '15 at 02:30
  • I TAed an accelerated calculus course and they introduced Big-O notation when doing L'Hospital's rule. It's somewhat useful, I think, for these purposes, but overall, it is not incredibly useful in the grand scheme of a calculus sequence. However, it is very useful for application purposes - gives you an idea of how big your remainder is after approximation, etc. I think students haphazardly absorb this idea, even without specifically mentioning Big-O notation, particularly when doing series. I'm not sure that going out of your way to mention it makes all that big of a difference. – Cameron Williams Jun 08 '15 at 03:15
  • Asymptotic analysis is a dying art form these days since mathematics has moved further and further away from the somewhat handwavy foundations of asymptotic analysis. It's been somewhat relegated to the realms of computer science (algorithm complexity) and numerical analysis. I've used Big-O ideas in proofs myself in pure math (analysis) papers but it's not incredibly common in a math curriculum or in practice. – Cameron Williams Jun 08 '15 at 03:20
  • Advice for "someone who plans to be a math educator" ... Follow the textbook. Any small addition or change you make will confuse far more students than it will help. That said: experienced teachers of beginning calculus may have useful input to this question. – Gerald Edgar Jun 08 '15 at 14:08
  • @GeraldEdgar Thank you for the advice! – Akiva Weinberger Jun 08 '15 at 14:25
  • @columbus8myhw What is the goal of a calculus class? We cannot answer whether teaching big-O assists that goal without knowing what it is. Many courses have many different goals. Some are focused on assimilating the content as efficiently as possible. Some are focused more on learning how to learn, and the particular content is not so important. I am sure there are many other goals as well (perhaps the most common being "act as a filter to med school"). – Steven Gubkin Jun 08 '15 at 17:13
  • This is not an answer to the question, but alternative approaches that avoid limits tend just to replace one difficulty with another, usually at least as great. If I remember correctly, Marsden and Weinstein's book Calculus Unlimited uses an approach that defines the derivative of a function $f(x)$ at a point by comparing the function to linear ones in a neighbourhood of that point. The technical difficulty ends up being about the same as if you worked rigorously with limits. During the New Math period in France, the definition of the derivative was along Knuth's lines... – Keith Jun 13 '15 at 03:59
  • and from a pedagogical perspective it was unsuccessful. A typical 1971 textbook, by Cossart and Théron, has the following definition: "Let $f$ be a function defined on an open interval containing a number $x_0$. If there exists a real number $a$ and a function $\epsilon$ such that $f(x_0 + h) - f(x_0) = (a + \epsilon(h))h$ where $\epsilon(h)$ tends to zero when $h$ tends to zero, we say that the function $h \mapsto ah$ is the tangent linear function to $f$ at $x_0$." An excerpt from the textbook can be found on page 162 of this document: http://jpdaubelcour.pagesperso-orange.fr/chapitre4.pdf – Keith Jun 13 '15 at 04:08
  • Probably, the point of this was to emphasize the idea of linear approximation, and also to pave the way for multivariable calculus, where a similar definition of differential is used. Just a correction: this is not exactly what Knuth is talking about, as it uses $o$, not $O$. – Keith Jun 13 '15 at 04:09
  • @Keith (By the way, I'm curious if I could get a more intuitive feel for his "strong derivative" — the type with big $O$. What sort of functions have a derivative but not Knuth's strong derivative?) – Akiva Weinberger Jun 14 '15 at 01:42
  • @columbus8myhw One thing I would point out is that having a strong derivative in Knuth's sense at a point is something intermediate between being differentiable at that point and being twice differentiable at that point. An example of a function that is differentiable at $0$, but not strongly differentiable at $0$, would be $f(x) = x^{3/2} \sin(1/x)$ for $x > 0$, with $f(x) = 0$ for $x \leq 0$. – Keith Jun 14 '15 at 04:45
  • This Coursera course uses O notation (first appears in video 1. Derivatives around 4:20). It's hard to say how much the approach helps there because the course seems targeted at people who already know some calculus. – Beni Cherniavsky-Paskin Jun 22 '18 at 13:14
  • More likely to hurt than to help (more complicated). Motivation for this stuff is from logicians wanting to be fussy and exact, not from teachers wanting to teach. Neglects any insight into human learning patterns. Furthermore, it will not gibe well with other courses in diffyQs or engineering or the like, that don't use this notation. – guest Oct 12 '18 at 05:17

2 Answers


This is not a complete answer, but who can say for sure what is "the" best way to teach calculus, or if there is a best way?

I am very tempted myself to use big-O and small-o notations, and I have done so recently (I had to teach Taylor series to first-year undergraduates). However, I tend to think it might be better to ensure that students have a firm grip on simpler notation first.

The main quality of these notations is also what makes them difficult for students to understand: an expression like $o(x^2)$ is written in exactly the same form as $f(x^2)$ where $f$ is a function, but has a quite different interpretation. It is in fact pretty difficult to grasp that it means there is some function $\varepsilon$ with certain properties (partly implicit, unless one uses the more cumbersome $o_{x\to 0}(x^2)$) for which $o(x^2)$ stands, and that each time the notation appears, the function $\varepsilon$ may be different.

The following confusion can then easily appear: $\sin x= x + o(x^2)$ and $\cos x= 1-x^2/2 + o(x^2)$ so $$\sin x -\cos x = x +o(x^2) -1+x^2/2 -o(x^2) = x^2/2 + x -1,$$ right?
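
One way to resolve the confusion, keeping in mind that each $o(x^2)$ stands for a possibly different function, is to note that the two error terms combine rather than cancel:
$$\sin x - \cos x = \bigl(x + o(x^2)\bigr) - \bigl(1 - \tfrac{x^2}{2} + o(x^2)\bigr) = x - 1 + \tfrac{x^2}{2} + o(x^2),$$
since the difference of two functions that are each negligible compared to $x^2$ is still negligible compared to $x^2$.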

Now, this could be overcome, and it might be a good idea to teach students how to handle such a tricky notation; but one should keep in mind that they tend to have difficulties with the notions of variable, unknown, function, image of a number under a numerical function, etc., and that has to be taken into account before burdening them with an additional, subtle notation.

Benoît Kloeckner
  • Yes, I suppose that's true. There's a risk that $o$ and $O$ become "this weird symbol with inconsistent rules that make absolutely no sense" in their minds. (Same thing with Knuth's $A$.) And if the notation is so central to the subject… – Akiva Weinberger Jun 08 '15 at 11:22
  • I'd add that Knuth's explanation of the derivative includes the phrase "whenever $\epsilon$ is sufficiently small", which also creates difficulties. –  Jun 08 '15 at 12:54
  • I think it's a mistake to throw out the big-O / little-o concept just because some people use terrible notation for it! Big-O reasoning is all about pre-ordering functions by their growth rates, so you can use notation like $x^3 \preccurlyeq e^x$ where one might traditionally write $x^3 \in O(e^x)$. Little-o notation is all about whether functions stay roughly the same size as you approach a certain point, so you can say "$\sin x \underset{0}{\sim} x$ (compared to $x^2$)" instead of "$\sin x - x \in o(x^2)$ at zero." – Vectornaut Jun 11 '15 at 17:43
  • You can see a trivial example of the $\preccurlyeq$ notation in use at http://math.stackexchange.com/a/1114348/16063. I use the $\sim$ notation in private when thinking about asymptotic series, and I find it very convenient. – Vectornaut Jun 11 '15 at 17:44
  • @columbus8myhw, I think the awful notation is totally irrelevant to the subject. My previous comments attempt to justify this... – Vectornaut Jun 11 '15 at 17:45
  • @Vectornaut True enough. What about his $A(x)$, which he uses to introduce $O(x)$ (or $\preccurlyeq$)? – Akiva Weinberger Jun 11 '15 at 18:21
  • @columbus8myhw: Personally, I find the $A$ notation rather silly. At the heart of it, Knuth is doing something which I think is useful: getting students ready for $\preccurlyeq$ by recalling the familiar relation $\le$. But he has a problem, because he's not going to use the symbol $\preccurlyeq$; he's going to use the weird and confusing $O$ notation. As a result, to make the analogy he wants, he has to introduce the weird and confusing $A$ notation for the familiar $\le$ relation. If anything, I think $A$ notation is a great demonstration of how ridiculous $O$ notation is. – Vectornaut Jun 11 '15 at 19:15
  • @Vectornaut I'm not so sure. On the first page, he has some examples that are easier to write in terms of $A$ than $\le$. – Akiva Weinberger Jun 11 '15 at 19:18
  • @Vectornaut: big-O and small-o notations are far from ridiculous; they are very efficient notations. They are, in my opinion, far better than other notations like $\preccurlyeq$ exactly because of the possibility of adding an $o(f(x))$ term to an expression (writing and computing Taylor series without them would be too cumbersome for me to even think about). This use as terms makes them both extremely powerful in computations, and very subtle at first. – Benoît Kloeckner Jun 11 '15 at 19:37
  • @BenoîtKloeckner "… adding a $o(f(x))$ term…" For example, Knuth's definition of the derivative. – Akiva Weinberger Jun 11 '15 at 19:40
  • @BenoîtKloeckner, I've never found writing Taylor series without $o$ to be cumbersome. In fact, as I mentioned earlier, I privately use $\sim$ in place of $o$ because for me it's the easiest way to write asymptotic series: saying that $e^x \underset{0}{\sim} 1 + x + \frac{1}{2}x^2$ (mod $x^2$) feels very natural to me. – Vectornaut Jun 11 '15 at 23:35
  • @BenoîtKloeckner, I agree that the ability to shuffle copies of $o(f)$ around in an expression is very useful. As far as I can tell, it works because, for any point $p$, the set $o_p(f)$ is an ideal in the ring of smooth functions on whatever manifold you're on. As you said, this is an extremely powerful and subtle point of view, and it's not one I would attempt to inflict on first-year calculus students. – Vectornaut Jun 11 '15 at 23:38
  • @BenoîtKloeckner: Incidentally, though I say Knuth's $A$ notation is ridiculous, I often use it in calculations. I think it's silly pedagogically because I've learned that what seems efficient and straightforward to me may not always seem so to my students. – Vectornaut Jun 11 '15 at 23:47
  • @Vectornaut: it seems we agree more than I thought: the notation is very good when mastered, and very misleading when poorly understood. I would advise against your use of $\sim$, which at least in France is used to denote equivalence (i.e. $f\sim_x g $ when $f=g+o_x(g)$). – Benoît Kloeckner Jun 12 '15 at 07:00
  • @BenoîtKloeckner: Oh, that's a good point! I see the same use of $\sim$ on English Wikipedia, so it must be common in English-speaking places too. Fortunately, there's no shortage of symbols for equivalence relations. In a teaching context, I would likely use $\approx$ anyway, since students may be familiar with it, and it will mean essentially what they think it means. – Vectornaut Jun 12 '15 at 08:21

I'm skeptical Knuth's scheme will be much of an improvement.

I think the problem with teaching any kind of rigorous definition of limits or derivatives to the vast majority of North American students isn't going to go away by streamlining the proofs a bit.

The problem isn't the technical difficulty of the proofs so much as the idea of proof itself, and even the idea of working from precise definitions. Students have nothing that prepares them for this in their background, because from elementary school all through high school the idea of justifying statements is treated as unimportant. This is true even when the justification would be simple, such as for the rule $\log AB = \log A + \log B$. I've taught from textbooks where the authors ease their consciences by having a page titled "Proofs in Mathematics" at the end of each chapter, as if mathematics without proofs could have any meaning.
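
For instance, the simple justification of the logarithm rule mentioned above amounts to one line:
$$AB = e^{\ln A}\,e^{\ln B} = e^{\ln A + \ln B}, \qquad\text{so}\qquad \ln AB = \ln A + \ln B,$$
and the same argument works in any base.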

As for the technical merits of having a long discussion of $O$ notation as a prelude, let me first say that Knuth misstates things slightly. $O(f(x))$ would have to be defined as a function that is $CA(f(x))$ for some constant $C$ on some $x$-interval around the point of interest. Perhaps this is what he means when he later says "for $\epsilon$ sufficiently small." In any case, that introduces the dependence on the variable $x$, which makes this concept not much easier than defining the statement $\lim_{h \to 0} g(h) = 0$ (which itself leads directly to the definition of finite limits in general, by the condition $\lim_{h \to 0} [g(h)- l] = 0$).
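
To make the comparison concrete, Knuth's strong derivative is defined (paraphrasing the letter) by
$$f(x+\epsilon) = f(x) + f'(x)\,\epsilon + O(\epsilon^2),$$
and unwinding the $O$ in the corrected sense just described, this says: there exist constants $C > 0$ and $\delta > 0$ such that $|f(x+\epsilon) - f(x) - f'(x)\,\epsilon| \le C\epsilon^2$ whenever $|\epsilon| \le \delta$. The hidden quantifiers over $C$ and $\delta$ are exactly the same kind of dependence that appears in the $\epsilon$–$\delta$ definition of a limit.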

And even if one were to present rigorous proofs of the laws on strong derivatives, is this really better than the traditional proofs using the limit laws? Students always find the limit laws very plausible, and once those are accepted, then the proofs of the differentiation laws become accessible, and can be particularly convincing if you write $\Delta x$ for $x - a$, $y$ for $f(x)$ and $\Delta y$ for $f(x) - f(a)$. The proofs of the differentiation laws might become more rigorous in Knuth's scheme, but I am inclined to believe that they would also be made considerably less transparent.
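
For instance (a sketch of the kind of argument meant): writing $u = f(a)$, $v = g(a)$, $\Delta u = f(x) - f(a)$ and $\Delta v = g(x) - g(a)$, the product rule is the short computation
$$f(x)g(x) - f(a)g(a) = (u + \Delta u)(v + \Delta v) - uv = u\,\Delta v + v\,\Delta u + \Delta u\,\Delta v;$$
dividing by $\Delta x$ and applying the limit laws (together with $\Delta v \to 0$, i.e. continuity of $g$ at $a$) gives $(fg)'(a) = f(a)\,g'(a) + g(a)\,f'(a)$.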

Ultimately, presenting proofs that are meaningful to students at their level of understanding will do more to foster a healthy respect for the idea of proof than giving proofs that are technically irreproachable but too demanding. I also believe the few students who could handle the $O$-proofs would probably manage with the traditional rigorous approach anyway.

If there is any advantage to Knuth's method, it might be that of avoiding a discussion of limits early in a calculus course, when you would otherwise bore students with things that seem pointless to them, like $\lim_{x \to 3} (x^2 + 5x) = 24$. You derive these results using the limit laws, even though every example they see at first is obtained by simply substituting a value of $x$ into the expression (perhaps after simplification).

But limits need to be dealt with eventually; statements such as $\lim_{x \to 0} \frac{\sin x}{x} = 1$ or $\lim_{x \to +\infty} \frac{\ln x}{x} = 0$ are intrinsically interesting and necessary.

The real problem with limits is how to teach them in such a way that students will understand from the start that there's more to them than just substituting a value in.

The answer is probably to de-emphasize limits initially by treating them very briefly, with few exercises on them, developing only enough technical proficiency to compute accurately the derivatives of rational functions, expressed as limits of the form $\lim_{h \to 0} \frac{1}{h}[R(x+h) - R(x)]$, and without requiring much theory to be understood. Then return to limits when it is time to talk about less trivial ones, after students have already seen the usefulness of the concept demonstrated in the theory of derivatives. The meaning of the limit laws becomes much easier to grasp when they are used in a way that is not obvious. Continuity can be postponed until this point as well.
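
For example, at this stage a computation like the following is all that is asked of the limit concept:
$$\frac{1}{h}\left[\frac{1}{x+h} - \frac{1}{x}\right] = \frac{1}{h}\cdot\frac{-h}{x(x+h)} = \frac{-1}{x(x+h)} \longrightarrow -\frac{1}{x^2} \quad\text{as } h \to 0,$$
where the limit is evaluated by algebraic simplification followed by substitution, with no delicate estimates required.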

Keith