==Using Newton's approximation==

===Main idea===

The main idea of using Newton's approximation to prove the chain rule is that since f is differentiable at <math>x_0</math> we have the approximation <math>f(x) \approx f(x_0) + f'(x_0)(x - x_0)</math> when <math>x</math> is near <math>x_0</math>. Similarly since g is differentiable at <math>y_0 = f(x_0)</math> we have the approximation <math>g(y) \approx g(y_0) + g'(y_0)(y - y_0)</math> when <math>y</math> is near <math>y_0</math>. Since f is differentiable at <math>x_0</math>, it is continuous there also, so we know that <math>f(x)</math> is near <math>y_0 = f(x_0)</math> whenever <math>x</math> is near <math>x_0</math>. This allows us to substitute <math>y = f(x)</math> into <math>g(y) \approx g(y_0) + g'(y_0)(y - y_0)</math> whenever <math>x</math> is near <math>x_0</math>. So we get

<math display="block">g(f(x)) \approx g(y_0) + g'(y_0)(f(x) - y_0) \approx g(f(x_0)) + g'(f(x_0))f'(x_0)(x - x_0)</math>

Thus we get <math>(g \circ f)'(x_0) = g'(f(x_0))f'(x_0)</math>, which is what the chain rule says.
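As a quick numerical sanity check of this main idea (a minimal sketch; the concrete choices <math>f(x) = x^2</math>, <math>g(y) = \sin y</math>, <math>x_0 = 1</math> are mine, not from the source), a finite-difference quotient of <math>g \circ f</math> should land near <math>g'(f(x_0))f'(x_0)</math>:

<syntaxhighlight lang="python">
import math

def f(x):
    return x * x          # illustrative choice: f(x) = x^2

def g(y):
    return math.sin(y)    # illustrative choice: g(y) = sin(y)

x0 = 1.0
h = 1e-6

# Finite-difference derivative of the composite g(f(x)) at x0.
composite_fd = (g(f(x0 + h)) - g(f(x0))) / h

# What the chain rule predicts: g'(f(x0)) * f'(x0) = cos(x0^2) * 2*x0.
chain_rule = math.cos(x0 * x0) * (2 * x0)

print(composite_fd)  # ~ 1.08060...
print(chain_rule)    # 1.0806046117...
</syntaxhighlight>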
===Proof===

We want to show <math>g \circ f</math> is differentiable at <math>x_0</math> with derivative <math>g'(f(x_0))f'(x_0)</math>. By Newton's approximation, this is equivalent to showing that for every <math>\epsilon > 0</math> there exists <math>\delta > 0</math> such that

<math display="block">\left|g(f(x)) - \big(g(f(x_0)) + g'(f(x_0))f'(x_0)(x - x_0)\big)\right| \leq \epsilon|x - x_0|</math>

whenever <math>|x - x_0| \leq \delta</math>. So let <math>\epsilon > 0</math>.

Now we do some algebraic manipulation. Write

<math display="block">f(x) = f(x_0) + f'(x_0)(x - x_0) + E_f(x)</math>

where <math>E_f(x) = f(x) - (f(x_0) + f'(x_0)(x - x_0))</math>. This holds for every <math>x \in X</math>. Since f is differentiable at <math>x_0</math>, by Newton's approximation we thus have

<math display="block">|E_f(x)| \leq \epsilon_1|x - x_0| \quad \text{whenever } |x - x_0| \leq \delta_1</math>

Similarly write

<math display="block">g(y) = g(f(x_0)) + g'(f(x_0))(y - f(x_0)) + E_g(y)</math>

where <math>E_g(y) = g(y) - \big(g(f(x_0)) + g'(f(x_0))(y - f(x_0))\big)</math>.

Substituting the expression for <math>f(x)</math> in the expression for <math>g(y)</math> (with <math>y = f(x)</math>) we get

<math display="block">g(f(x)) = g(f(x_0)) + g'(f(x_0))\big(f'(x_0)(x - x_0) + E_f(x)\big) + E_g(f(x))</math>

we can rewrite this as

<math display="block">g(f(x)) - \big(g(f(x_0)) + g'(f(x_0))f'(x_0)(x - x_0)\big) = g'(f(x_0))E_f(x) + E_g(f(x))</math>

Thus our goal now is to show <math>|g'(f(x_0))E_f(x) + E_g(f(x))| \leq \epsilon|x - x_0|</math>. But by the triangle inequality it suffices to show <math>|g'(f(x_0))E_f(x)| \leq \epsilon'|x - x_0|</math> and <math>|E_g(f(x))| \leq \epsilon''|x - x_0|</math>, where we are free to choose <math>\epsilon', \epsilon'' > 0</math> as long as <math>\epsilon' + \epsilon'' \leq \epsilon</math>.

The first bound follows by taking <math>\epsilon_1</math> small enough above. To get the bound for <math>E_g(f(x))</math> (using Newton's approximation), we need to make sure <math>|f(x) - f(x_0)|</math> is small. But by continuity of f at <math>x_0</math> we can do this, and then

<math display="block">|E_g(f(x))| \leq \epsilon_2|f(x) - f(x_0)|</math>

where again we are free to choose <math>\epsilon_2 > 0</math>. Since <math>|f(x) - f(x_0)| \leq (|f'(x_0)| + \epsilon_1)|x - x_0|</math> by the triangle inequality and the bound on <math>E_f</math>, choosing <math>\epsilon_2</math> small enough gives <math>|E_g(f(x))| \leq \epsilon''|x - x_0|</math>, as required.
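To double-check the algebra (a minimal numerical sketch; the concrete choices <math>f(x) = x^2</math>, <math>g(y) = \sin y</math>, <math>x_0 = 1</math> are mine, not from the source): the rewritten identity holds exactly, and the combined error term shrinks faster than <math>|x - x_0|</math>:

<syntaxhighlight lang="python">
import math

x0 = 1.0
f = lambda x: x * x       # illustrative f
g = math.sin              # illustrative g
fp = 2 * x0               # f'(x0) for f(x) = x^2
gp = math.cos(f(x0))      # g'(f(x0)) for g = sin

def E_f(x):
    # E_f(x) = f(x) - (f(x0) + f'(x0)(x - x0))
    return f(x) - (f(x0) + fp * (x - x0))

def E_g(y):
    # E_g(y) = g(y) - (g(f(x0)) + g'(f(x0))(y - f(x0)))
    return g(y) - (g(f(x0)) + gp * (y - f(x0)))

for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    x = x0 + h
    lhs = g(f(x)) - (g(f(x0)) + gp * fp * (x - x0))
    rhs = gp * E_f(x) + E_g(f(x))
    # lhs and rhs agree up to rounding (the identity is exact algebra),
    # and abs(lhs)/h -> 0, i.e. the error is o(|x - x0|).
    print(h, lhs, rhs, abs(lhs) / h)
</syntaxhighlight>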
TODO: can we do this same proof but without using the error term notation?
TODO: somehow Folland does this without explicitly using continuity of f; I need to understand if he's using it implicitly somehow or he's actually proving it when bounding <math>|f(x) - f(x_0)|</math> using the bound on <math>E_f</math>
===old proof===

Since <math>g</math> is differentiable at <math>y_0 = f(x_0)</math>, we know <math>g'(y_0)</math> is a real number, and we can write

<math display="block">g(y) = g(y_0) + g'(y_0)(y - y_0) + [g(y) - (g(y_0) + g'(y_0)(y - y_0))]</math>

(there is no magic: the terms just cancel out)

If we define <math>E_g(y, y_0) = g(y) - (g(y_0) + g'(y_0)(y - y_0))</math> we can write

<math display="block">g(y) = g(y_0) + g'(y_0)(y - y_0) + E_g(y, y_0)</math>

Newton's approximation says that <math>|E_g(y, y_0)| \leq \epsilon|y - y_0|</math> as long as <math>|y - y_0| \leq \delta</math>.

Since <math>f</math> is differentiable at <math>x_0</math>, we know that it must be continuous at <math>x_0</math>. This means we can keep <math>|f(x) - y_0| \leq \delta</math> as long as we keep <math>|x - x_0|</math> small enough.

Since <math>f(x) \in Y</math> and <math>y_0 = f(x_0)</math>, this means we can substitute <math>y = f(x)</math> and get

<math display="block">g(f(x)) = g(y_0) + g'(y_0)(f(x) - y_0) + E_g(f(x), y_0)</math>

Now we use the differentiability of <math>f</math> at <math>x_0</math>. We can write

<math display="block">f(x) = f(x_0) + f'(x_0)(x - x_0) + [f(x) - (f(x_0) + f'(x_0)(x - x_0))]</math>

Again, we can define <math>E_f(x, x_0) = f(x) - (f(x_0) + f'(x_0)(x - x_0))</math> and write this as

<math display="block">f(x) = f(x_0) + f'(x_0)(x - x_0) + E_f(x, x_0)</math>

Now we can substitute this into the expression for <math>g(f(x))</math> to get

<math display="block">g(f(x)) = g(y_0) + g'(y_0)\big(f'(x_0)(x - x_0) + E_f(x, x_0)\big) + E_g(f(x), f(x_0))</math>

where we have canceled out two terms using <math>y_0 = f(x_0)</math>.

Thus we have

<math display="block">g(f(x)) = g(y_0) + g'(f(x_0))f'(x_0)(x - x_0) + [g'(f(x_0))E_f(x, x_0) + E_g(f(x), f(x_0))]</math>

We can write this as

<math display="block">(g \circ f)(x) - ((g \circ f)(x_0) + L(x - x_0)) = [g'(f(x_0))E_f(x, x_0) + E_g(f(x), f(x_0))]</math>

where <math>L = g'(f(x_0))f'(x_0)</math>. Now the left hand side looks like the expression in Newton's approximation. This means to show <math>g \circ f</math> is differentiable at <math>x_0</math>, we just need to show that <math>|g'(f(x_0))E_f(x, x_0) + E_g(f(x), f(x_0))| \leq \epsilon|x - x_0|</math> for <math>x</math> sufficiently close to <math>x_0</math>.

The stuff in square brackets is our "error term" for <math>g \circ f</math>. Now we just need to make sure it is small, even after dividing by <math>x - x_0</math>.

But f is differentiable at <math>x_0</math>, so by Newton's approximation,

<math display="block">|E_f(x, x_0)| \leq \epsilon_1|x - x_0| \quad \text{whenever } |x - x_0| \leq \delta_1</math>

we also have

<math display="block">|E_g(f(x), f(x_0))| \leq \epsilon_2|f(x) - f(x_0)| \quad \text{whenever } |f(x) - f(x_0)| \leq \delta_2</math>

We can bound this from above using the triangle inequality, together with <math>|f(x) - f(x_0)| \leq (|f'(x_0)| + \epsilon_1)|x - x_0|</math> (which also follows from the triangle inequality and the bound on <math>E_f</math>):

<math display="block">|g'(f(x_0))E_f(x, x_0) + E_g(f(x), f(x_0))| \leq \big(|g'(f(x_0))|\epsilon_1 + \epsilon_2(|f'(x_0)| + \epsilon_1)\big)|x - x_0|</math>

Now we can just choose <math>\epsilon_1</math> and <math>\epsilon_2</math> small enough.
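For concreteness, one workable choice (my own bookkeeping, not from the source) is

<math display="block">\epsilon_1 = \min\left(1, \frac{\epsilon}{2(|g'(f(x_0))| + 1)}\right), \qquad \epsilon_2 = \frac{\epsilon}{2(|f'(x_0)| + 1)}</math>

since then <math>|g'(f(x_0))|\epsilon_1 \leq \epsilon/2</math> and <math>\epsilon_2(|f'(x_0)| + \epsilon_1) \leq \epsilon_2(|f'(x_0)| + 1) = \epsilon/2</math>, so the error term is at most <math>\epsilon|x - x_0|</math>.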
==Limits of sequences==

===Main idea===

Let <math>(x_n)_{n=1}^\infty</math> be a sequence taking values in <math>X \setminus \{x_0\}</math> that converges to <math>x_0</math>. Then we want to write

<math display="block">\frac{g(f(x_n)) - g(f(x_0))}{x_n - x_0} = \frac{g(f(x_n)) - g(f(x_0))}{f(x_n) - f(x_0)} \cdot \frac{f(x_n) - f(x_0)}{x_n - x_0}</math>

Now use the limit laws to conclude that the limit is <math>g'(f(x_0))f'(x_0)</math>. The problem is that <math>f(x_n) - f(x_0)</math> can be zero even when <math>x_n \ne x_0</math>.
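For example (a standard illustration, not spelled out in the source), if <math>f</math> is a constant function <math>f(x) = c</math>, then <math>f(x_n) - f(x_0) = 0</math> for every <math>n</math>, so the middle factor is

<math display="block">\frac{g(f(x_n)) - g(f(x_0))}{f(x_n) - f(x_0)} = \frac{g(c) - g(c)}{c - c} = \frac{0}{0}</math>

even though the chain rule conclusion <math>(g \circ f)'(x_0) = g'(c) \cdot 0 = 0</math> is unproblematic.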
===Proof===

Let <math>(x_n)_{n=1}^\infty</math> be a sequence taking values in <math>X \setminus \{x_0\}</math> that converges to <math>x_0</math>.

Define a function <math>\phi : Y \to \mathbf R</math> by

<math display="block">\phi(y) = \begin{cases} \dfrac{g(y) - g(f(x_0))}{y - f(x_0)} & \text{if } y \ne f(x_0) \\ g'(f(x_0)) & \text{if } y = f(x_0) \end{cases}</math>

The idea is that we want to say <math>\frac{g(y) - g(f(x_0))}{y - f(x_0)}</math> is going to <math>g'(f(x_0))</math>, so we just define it at the undefined points to already be at that limit.

Now we have

<math display="block">\frac{g(f(x_n)) - g(f(x_0))}{x_n - x_0} = \phi(f(x_n)) \cdot \frac{f(x_n) - f(x_0)}{x_n - x_0}</math>

for all <math>x_n</math>. (Why? Consider the cases <math>f(x_n) = f(x_0)</math> and <math>f(x_n) \ne f(x_0)</math> separately.)
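Spelling out that parenthetical (a worked version of the check, for reference): if <math>f(x_n) \ne f(x_0)</math>, then <math>\phi(f(x_n)) = \frac{g(f(x_n)) - g(f(x_0))}{f(x_n) - f(x_0)}</math> and the factor <math>f(x_n) - f(x_0)</math> cancels; if <math>f(x_n) = f(x_0)</math>, then both sides vanish:

<math display="block">\frac{g(f(x_n)) - g(f(x_0))}{x_n - x_0} = \frac{0}{x_n - x_0} = 0 = \phi(f(x_n)) \cdot \frac{f(x_n) - f(x_0)}{x_n - x_0}</math>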
Differentiability of <math>g</math> at <math>f(x_0)</math> says that if <math>(y_n)_{n=1}^\infty</math> is a sequence taking values in <math>Y \setminus \{f(x_0)\}</math> that converges to <math>f(x_0)</math>, then <math>\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} \to g'(f(x_0))</math> as <math>n \to \infty</math>. What if <math>(y_n)_{n=1}^\infty</math> is instead a sequence taking values in <math>Y</math>? Then we can still say <math>\phi(y_n) \to g'(f(x_0))</math> as <math>n\to\infty</math>. To show this, let <math>\epsilon > 0</math>. Now we can find <math>N \geq 1</math> such that <math>\left\vert\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} - g'(f(x_0))\right\vert \leq \epsilon</math> for all <math>n \geq N</math> with <math>y_n \ne f(x_0)</math>. But this means if <math>n \geq N</math>, then we have two cases: either <math>y_n \in Y</math> and <math>y_n \ne f(x_0)</math>, in which case <math>|\phi(y_n) - g'(f(x_0))| \leq \epsilon</math> as above, or else <math>y_n = f(x_0)</math>, in which case <math>\phi(y_n) = g'(f(x_0))</math> so <math>|\phi(y_n) - g'(f(x_0))| = 0 \leq \epsilon</math>.
Differentiability of <math>f</math> at <math>x_0</math> implies continuity of <math>f</math> at <math>x_0</math>, so this means that <math>f(x_n) \to f(x_0)</math> as <math>n \to \infty</math>. Since <math>f(x_n) \in Y</math> for each <math>n \geq 1</math>, we can use <math>(f(x_n))_{n=1}^\infty</math> as our sequence in <math>Y</math> to conclude that as <math>n \to \infty</math> we have <math>\phi(f(x_n)) \to g'(f(x_0))</math>.
Now by the limit laws

<math display="block">\lim_{n\to\infty} \frac{g(f(x_n)) - g(f(x_0))}{x_n - x_0} = \lim_{n\to\infty} \phi(f(x_n)) \cdot \lim_{n\to\infty} \frac{f(x_n) - f(x_0)}{x_n - x_0} = g'(f(x_0))f'(x_0)</math>

Since the sequence <math>(x_n)_{n=1}^\infty</math> was arbitrary, we can conclude that <math>(g \circ f)'(x_0) = g'(f(x_0))f'(x_0)</math>.
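As a sketch of why <math>\phi</math> is needed at all (the constant <math>f</math> and the choice <math>g(y) = y^2</math> below are mine, not from the source): with <math>f</math> constant we have <math>f(x_n) = f(x_0)</math> for every <math>n</math>, so the naive quotient is <math>0/0</math>, but the <math>\phi</math> form still computes the right answer:

<syntaxhighlight lang="python">
# f is constant, so f(xn) = f(x0) for every n and the naive quotient
# (g(f(xn)) - g(f(x0))) / (f(xn) - f(x0)) would be 0/0.
f = lambda x: 3.0            # illustrative constant function
g = lambda y: y * y          # illustrative g, with g'(y) = 2y
x0 = 1.0
g_prime_at_fx0 = 2 * f(x0)   # g'(f(x0)) = 6

def phi(y):
    if y != f(x0):
        return (g(y) - g(f(x0))) / (y - f(x0))
    return g_prime_at_fx0    # patched to the limiting value

for n in range(1, 6):
    xn = x0 + 10 ** (-n)     # a sequence xn -> x0 with xn != x0
    value = phi(f(xn)) * (f(xn) - f(x0)) / (xn - x0)
    print(xn, value)         # prints 0.0: matches (g o f)'(x0) = g'(3) * 0
</syntaxhighlight>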
TODO: Tao says that division by zero occurs when <math>f'(x_0) = 0</math>, which seems strange to me.