User:IssaRice/Proof that assumes the trick

Many proofs in mathematics depend on one or two "tricks". Some proofs (or ways of writing them) seem to deliberately hide or assume the trick, so that the proof, while valid, is not very useful. (To the expert mathematician the proof is obvious, and they could have produced it themselves, so what is the point of writing it down? The novice is assumed to know the trick even though they might not; but the trick is what makes the proof work, so what is the point of seeing a proof that does not explain it?)

Consider Pugh's proof of the product rule for differentiation:

Since \Delta(f\cdot g) = \Delta f\cdot g(x + \Delta x) + f(x) \cdot \Delta g, continuity of g at x implies that
\frac{\Delta(f \cdot g)}{\Delta x} = \frac{\Delta f}{\Delta x} g(x+\Delta x) + f(x)\frac{\Delta g}{\Delta x} \to f'(x)g(x) + f(x)g'(x)

What Pugh means is this:

On the one hand, \Delta(f\cdot g) := (f\cdot g)(x + \Delta x) - (f\cdot g)(x) = f(x + \Delta x)g(x + \Delta x) - f(x)g(x).

We also have \Delta f := f(x + \Delta x) - f(x) and \Delta g := g(x+\Delta x) - g(x) so on the other hand we have

\begin{align}\Delta f\cdot g(x + \Delta x) + f(x) \cdot \Delta g &= (f(x + \Delta x) - f(x))\cdot g(x + \Delta x) + f(x) \cdot (g(x+\Delta x) - g(x)) \\ &= f(x + \Delta x)g(x + \Delta x) - f(x)g(x + \Delta x) + f(x)g(x+\Delta x) - f(x)g(x) \\ &= f(x + \Delta x)g(x + \Delta x) - f(x)g(x)\end{align}

so the two are indeed equal.
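
With the identity verified, the limit step that Pugh compresses into a single arrow can be spelled out. The following is only a sketch, assuming that f and g are both differentiable at x (so that g is continuous at x and g(x + \Delta x) \to g(x) as \Delta x \to 0); the explicit \lim notation is mine, not Pugh's.

```latex
\begin{align}
\lim_{\Delta x \to 0} \frac{\Delta(f \cdot g)}{\Delta x}
  &= \lim_{\Delta x \to 0} \left( \frac{\Delta f}{\Delta x}\, g(x + \Delta x) + f(x)\, \frac{\Delta g}{\Delta x} \right) \\
  &= \left( \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} \right) \left( \lim_{\Delta x \to 0} g(x + \Delta x) \right) + f(x) \lim_{\Delta x \to 0} \frac{\Delta g}{\Delta x} \\
  &= f'(x)\, g(x) + f(x)\, g'(x)
\end{align}
```

Splitting the limit in the second line is justified because each of the three limits exists; the middle one is exactly where continuity of g at x is used.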

What is the "trick" in this proof? Well, how would we have discovered that \Delta(f\cdot g) = \Delta f\cdot g(x + \Delta x) + f(x) \cdot \Delta g if we didn't have Pugh to tell us? The trick is to add and subtract the same thing. We have

\begin{align}f(x + \Delta x)g(x + \Delta x) - f(x)g(x) &= f(x + \Delta x)g(x + \Delta x) + \underbrace{(-f(x)g(x + \Delta x) +f(x)g(x + \Delta x))}_{=0} - f(x)g(x) \\ &= g(x+\Delta x)(f(x+\Delta x) - f(x)) + f(x)(g(x+\Delta x) - g(x)) \\ &= \Delta f\cdot g(x + \Delta x) + f(x) \cdot \Delta g\end{align}

And this is exactly the equality that Pugh gives.
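
To see that the trick is something one can actually reuse, here is a variant that is not in Pugh's text: add and subtract the other cross term, f(x + \Delta x)g(x), instead. This is only an illustrative sketch using the same definitions of \Delta f and \Delta g as above.

```latex
\begin{align}
f(x + \Delta x)g(x + \Delta x) - f(x)g(x)
  &= f(x + \Delta x)g(x + \Delta x) + \underbrace{(-f(x + \Delta x)g(x) + f(x + \Delta x)g(x))}_{=0} - f(x)g(x) \\
  &= f(x + \Delta x)(g(x + \Delta x) - g(x)) + (f(x + \Delta x) - f(x))g(x) \\
  &= f(x + \Delta x) \cdot \Delta g + \Delta f \cdot g(x)
\end{align}
```

With this decomposition the limit argument would rely on continuity of f at x rather than of g, and it leads to the same product rule.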

I think Pugh intended the reader to write out the expression on the right and simplify it; they would then see the terms cancel and discover the trick that way. You might even complain that my "discovery" of the trick is just this process written in reverse order. That is literally true, but I think explicitly calling attention to the trick and talking about it adds it to the student's "tricks repository", so that they are more likely to be able to use it in their own proofs. Read in the opposite direction, it is just a routine cancellation that you would forget about after the proof.