User:IssaRice/Chain rule proofs: Difference between revisions

Revision as of 02:21, 28 November 2018

Using Newton's approximation

Main idea

The main idea of using Newton's approximation to prove the chain rule is that since $f$ is differentiable at $x_{0}$ we have the approximation $f (x) \approx f (x_{0}) + f^{'} (x_{0}) (x - x_{0})$ when $x$ is near $x_{0}$ . Similarly, since $g$ is differentiable at $f (x_{0})$ we have the approximation $g (y) \approx g (f (x_{0})) + g^{'} (f (x_{0})) (y - f (x_{0}))$ when $y$ is near $f (x_{0})$ . Since $f$ is differentiable at $x_{0}$ , it is also continuous there, so $f (x)$ is near $f (x_{0})$ whenever $x$ is near $x_{0}$ . This allows us to substitute $f (x)$ for $y$ whenever $x$ is near $x_{0}$ . So we get

$\begin{aligned} g (f (x)) & \approx g (f (x_{0})) + g' (f (x_{0})) (f (x) - f (x_{0})) \\ & \approx g (f (x_{0})) + g' (f (x_{0})) (f' (x_{0}) (x - x_{0})) \end{aligned}$

Thus we get $(g \circ f) (x) \approx (g \circ f) (x_{0}) + g^{'} (f (x_{0})) f^{'} (x_{0}) (x - x_{0})$ , which is what the chain rule says: the derivative of $g \circ f$ at $x_{0}$ is $g^{'} (f (x_{0})) f^{'} (x_{0})$ .
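As a rough numerical illustration of this approximation, the sketch below compares $g (f (x))$ with the linearization $g (f (x_{0})) + g^{'} (f (x_{0})) f^{'} (x_{0}) (x - x_{0})$ . The choices $f = \sin$ , $g = \exp$ and $x_{0} = 0.5$ are assumptions made only for this example, as are the helper names. If the chain rule holds, the error divided by $| x - x_{0} |$ should shrink as $x$ approaches $x_{0}$ .

```python
import math

# Example functions (assumptions for illustration only).
def f(x):
    return math.sin(x)

def f_prime(x):
    return math.cos(x)

def g(y):
    return math.exp(y)

def g_prime(y):
    return math.exp(y)

def chain_rule_linearization(x, x0):
    """Return g(f(x0)) + g'(f(x0)) * f'(x0) * (x - x0)."""
    return g(f(x0)) + g_prime(f(x0)) * f_prime(x0) * (x - x0)

def error_ratio(x, x0):
    """Return |g(f(x)) - linearization| / |x - x0|."""
    return abs(g(f(x)) - chain_rule_linearization(x, x0)) / abs(x - x0)

x0 = 0.5
steps = (1e-1, 1e-2, 1e-3)
ratios = [error_ratio(x0 + h, x0) for h in steps]

for h, r in zip(steps, ratios):
    print(f"h = {h:g}: error / |x - x0| = {r:.3e}")
```

The printed ratios decreasing toward zero is exactly the behaviour that the proof below establishes in general.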

Proof

Write $y_{0} : = f (x_{0})$ . Since $g$ is differentiable at $y_{0}$ , we know $g^{'} (y_{0})$ is a real number, and we can write

$g (y) = g (y_{0}) + g^{'} (y_{0}) (y - y_{0}) + [g (y) - (g (y_{0}) + g^{'} (y_{0}) (y - y_{0}))]$

(there is no magic: the terms just cancel out)

If we define $E_{g} (y, y_{0}) : = g (y) - (g (y_{0}) + g^{'} (y_{0}) (y - y_{0}))$ we can write

$g (y) = g (y_{0}) + g^{'} (f (x_{0})) (y - y_{0}) + E_{g} (y, y_{0})$

Newton's approximation says that for every $ϵ > 0$ there exists $δ > 0$ such that $| E_{g} (y, y_{0}) | \leq ϵ | y - y_{0} |$ whenever $| y - y_{0} | \leq δ$ .

Since $f$ is differentiable at $x_{0}$ , we know that it must be continuous at $x_{0}$ . This means there exists $δ^{'} > 0$ such that $| f (x) - y_{0} | \leq δ$ whenever $| x - x_{0} | \leq δ^{'}$ .

Since $f (x) \in Y$ and $| f (x) - y_{0} | \leq δ$ , this means we can substitute $y = f (x)$ and get

$g (f (x)) = g (y_{0}) + g^{'} (f (x_{0})) (f (x) - y_{0}) + E_{g} (f (x), y_{0})$

Now we use the differentiability of $f$ . We can write

$f (x) = f (x_{0}) + f^{'} (x_{0}) (x - x_{0}) + [f (x) - (f (x_{0}) + f^{'} (x_{0}) (x - x_{0}))]$

Again, we can define $E_{f} (x, x_{0}) : = f (x) - (f (x_{0}) + f^{'} (x_{0}) (x - x_{0}))$ and write this as

$f (x) = f (x_{0}) + f^{'} (x_{0}) (x - x_{0}) + E_{f} (x, x_{0})$

Now we can substitute this into the expression for $g (f (x))$ to get

$g (f (x)) = g (y_{0}) + g^{'} (f (x_{0})) (f^{'} (x_{0}) (x - x_{0}) + E_{f} (x, x_{0})) + E_{g} (f (x), f (x_{0}))$

where we have canceled out two terms using $f (x_{0}) = y_{0}$ .

Thus we have

$g (f (x)) = g (y_{0}) + g^{'} (f (x_{0})) f^{'} (x_{0}) (x - x_{0}) + [g^{'} (f (x_{0})) E_{f} (x, x_{0}) + E_{g} (f (x), f (x_{0}))]$

We can write this as

$(g \circ f) (x) - ((g \circ f) (x_{0}) + L (x - x_{0})) = [g^{'} (f (x_{0})) E_{f} (x, x_{0}) + E_{g} (f (x), f (x_{0}))]$

where $L : = g^{'} (f (x_{0})) f^{'} (x_{0})$ . Now the left hand side looks like the expression in Newton's approximation. This means that to show $g \circ f$ is differentiable at $x_{0}$ with derivative $L$ , we just need to show that for every $ϵ > 0$ we have $| g^{'} (f (x_{0})) E_{f} (x, x_{0}) + E_{g} (f (x), f (x_{0})) | \leq ϵ | x - x_{0} |$ whenever $x$ is sufficiently close to $x_{0}$ .

The stuff in square brackets is our "error term" for $g \circ f$ . Now we just need to make sure it is small, even after dividing by $| x - x_{0} |$ .
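The decomposition of the error term can be sanity-checked numerically. The sketch below assumes $f = \sin$ and $g = \exp$ purely for illustration (the function names are likewise hypothetical) and compares the left hand side $(g \circ f) (x) - ((g \circ f) (x_{0}) + L (x - x_{0}))$ with the bracketed expression $g^{'} (f (x_{0})) E_{f} (x, x_{0}) + E_{g} (f (x), f (x_{0}))$ .

```python
import math

# Example functions (assumptions for illustration only).
f, f_prime = math.sin, math.cos
g, g_prime = math.exp, math.exp

def E_f(x, x0):
    """Error of the linear approximation of f at x0."""
    return f(x) - (f(x0) + f_prime(x0) * (x - x0))

def E_g(y, y0):
    """Error of the linear approximation of g at y0."""
    return g(y) - (g(y0) + g_prime(y0) * (y - y0))

def composite_error(x, x0):
    """Left hand side: (g o f)(x) - ((g o f)(x0) + L (x - x0))."""
    L = g_prime(f(x0)) * f_prime(x0)
    return g(f(x)) - (g(f(x0)) + L * (x - x0))

def decomposed_error(x, x0):
    """Right hand side: g'(f(x0)) E_f(x, x0) + E_g(f(x), f(x0))."""
    return g_prime(f(x0)) * E_f(x, x0) + E_g(f(x), f(x0))

x0 = 0.5
for x in (0.4, 0.51, 0.9):
    lhs = composite_error(x, x0)
    rhs = decomposed_error(x, x0)
    print(f"x = {x}: lhs = {lhs:.12e}, rhs = {rhs:.12e}")
```

The two sides should agree up to floating-point rounding, since the identity is purely algebraic and does not depend on $x$ being close to $x_{0}$ .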

But $f$ is differentiable at $x_{0}$ , so by Newton's approximation, for any $ϵ_{1} > 0$ we have, for $x$ sufficiently close to $x_{0}$ ,

$| g^{'} (f (x_{0})) E_{f} (x, x_{0}) | \leq | g^{'} (f (x_{0})) | ϵ_{1} | x - x_{0} |$

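Similarly, for any $ϵ_{2} > 0$ , since $f (x)$ stays close to $f (x_{0})$ when $x$ is close to $x_{0}$ , we also have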

$| E_{g} (f (x), f (x_{0})) | \leq ϵ_{2} | f (x) - f (x_{0}) | = ϵ_{2} | f^{'} (x_{0}) (x - x_{0}) + E_{f} (x, x_{0}) |$

We can bound this from above using the triangle inequality:

$\begin{aligned} | E_{g} (f (x), f (x_{0})) | & \leq ϵ_{2} | f' (x_{0}) (x - x_{0}) | + ϵ_{2} | E_{f} (x, x_{0}) | \\ & \leq ϵ_{2} | f' (x_{0}) | | x - x_{0} | + ϵ_{2} ϵ_{1} | x - x_{0} | \end{aligned}$

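Adding the two bounds, the error term for $g \circ f$ is at most $(| g^{'} (f (x_{0})) | ϵ_{1} + ϵ_{2} | f^{'} (x_{0}) | + ϵ_{2} ϵ_{1}) | x - x_{0} |$ . So given $ϵ > 0$ , we can just choose $ϵ_{1}, ϵ_{2}$ small enough that the coefficient of $| x - x_{0} |$ is at most $ϵ$ .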
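To make "small enough" concrete: combining the bounds above, the error term is at most $(| g^{'} (f (x_{0})) | ϵ_{1} + ϵ_{2} | f^{'} (x_{0}) | + ϵ_{2} ϵ_{1}) | x - x_{0} |$ , so we want this coefficient to be at most a given $ϵ$ . The sketch below shows one possible choice, not the only one: take $ϵ_{1} \leq 1$ so that $ϵ_{2} ϵ_{1} \leq ϵ_{2}$ , then give each half of the bound a budget of $ϵ / 2$ . The function names are hypothetical and exist only for this example.

```python
def choose_epsilons(eps, f_prime_x0, g_prime_y0):
    """Pick (eps1, eps2) so that the combined coefficient is at most eps.

    One possible choice (an assumption of this sketch):
      eps1 = min(1, eps / (2 (|g'(y0)| + 1)))  gives |g'(y0)| eps1 <= eps / 2
      eps2 = eps / (2 (|f'(x0)| + 1))          gives eps2 (|f'(x0)| + 1) <= eps / 2
    """
    eps1 = min(1.0, eps / (2.0 * (abs(g_prime_y0) + 1.0)))
    eps2 = eps / (2.0 * (abs(f_prime_x0) + 1.0))
    return eps1, eps2

def combined_coefficient(eps1, eps2, f_prime_x0, g_prime_y0):
    """Return |g'(y0)| eps1 + eps2 |f'(x0)| + eps2 eps1."""
    return abs(g_prime_y0) * eps1 + eps2 * abs(f_prime_x0) + eps2 * eps1

# Example values for f'(x0) and g'(y0), chosen arbitrarily.
eps = 0.1
eps1, eps2 = choose_epsilons(eps, f_prime_x0=-2.0, g_prime_y0=3.0)
print(eps1, eps2, combined_coefficient(eps1, eps2, -2.0, 3.0))
```

The extra $+ 1$ in each denominator avoids division by zero when a derivative vanishes, and the cap $ϵ_{1} \leq 1$ is what lets the cross term $ϵ_{2} ϵ_{1}$ be absorbed into the budget for $ϵ_{2}$ .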

@@ Line 17: / Line 17: @@
 (there is no magic: the terms just cancel out)
-If we define <math>E_g(\Delta y) := g(y) - (g(y_0) + g'(y_0)(y-y_0))</math> we can write
+If we define <math>E_g(y,y_0) := g(y) - (g(y_0) + g'(y_0)(y-y_0))</math> we can write
-<math display="block">g(y) = g(y_0) + g'(f(x_0))(y - y_0) + E_g(\Delta y)</math>
+<math display="block">g(y) = g(y_0) + g'(f(x_0))(y - y_0) + E_g(y,y_0)</math>
-Newton's approximation says that <math>|E_g(\Delta y)| \leq \epsilon|y-y_0|</math> as long as <math>|y-y_0|\leq \delta</math>.
+Newton's approximation says that <math>|E_g(y,y_0)| \leq \epsilon|y-y_0|</math> as long as <math>|y-y_0|\leq \delta</math>.
 Since <math>f</math> is differentiable at <math>x_0</math>, we know that it must be continuous at <math>x_0</math>. This means we can keep <math>|f(x)-y_0|\leq \delta</math> as long as we keep <math>|x-x_0|\leq \delta'</math>.
@@ Line 27: / Line 27: @@
 Since <math>f(x) \in Y</math> and <math>|f(x)-y_0|\leq \delta</math>, this means we can substitute <math>y = f(x)</math> and get
-<math display="block">g(f(x)) = g(y_0) + g'(f(x_0))(f(x) - y_0) + E_g(\Delta f)</math>
+<math display="block">g(f(x)) = g(y_0) + g'(f(x_0))(f(x) - y_0) + E_g(f(x),y_0)</math>
 Now we use the differentiability of <math>f</math>. We can write
@@ Line 33: / Line 33: @@
 <math display="block">f(x) = f(x_0) + f'(x_0)(x - x_0) + [f(x) - (f(x_0) + f'(x_0)(x-x_0))]</math>
-Again, we can define <math>E_f(\Delta x) := f(x) - (f(x_0) + f'(x_0)(x-x_0))</math> and write this as
+Again, we can define <math>E_f(x,x_0) := f(x) - (f(x_0) + f'(x_0)(x-x_0))</math> and write this as
-<math display="block">f(x) = f(x_0) + f'(x_0)(x - x_0) + E_f(\Delta x)</math>
+<math display="block">f(x) = f(x_0) + f'(x_0)(x - x_0) + E_f(x,x_0)</math>
 Now we can substitute this into the expression for <math>g(f(x))</math> to get
-<math display="block">g(f(x)) = g(y_0) + g'(f(x_0))(f'(x_0)(x - x_0) + E_f(\Delta x)) + E_g(\Delta f)</math>
+<math display="block">g(f(x)) = g(y_0) + g'(f(x_0))(f'(x_0)(x - x_0) + E_f(x,x_0)) + E_g(f(x),f(x_0))</math>
 where we have canceled out two terms using <math>f(x_0) = y_0</math>.
@@ Line 45: / Line 45: @@
 Thus we have
-<math display="block">g(f(x)) = g(y_0) + g'(f(x_0))f'(x_0)(x - x_0) + [g'(f(x_0))E_f(\Delta x) + E_g(\Delta f)]</math>
+<math display="block">g(f(x)) = g(y_0) + g'(f(x_0))f'(x_0)(x - x_0) + [g'(f(x_0))E_f(x,x_0) + E_g(f(x), f(x_0))]</math>
 We can write this as
-<math display="block">(g\circ f)(x) - ((g\circ f)(x_0) + L(x - x_0)) = [g'(f(x_0))E_f(\Delta x) + E_g(\Delta f)]</math>
+<math display="block">(g\circ f)(x) - ((g\circ f)(x_0) + L(x - x_0)) = [g'(f(x_0))E_f(x,x_0) + E_g(f(x), f(x_0))]</math>
-where <math>L := g'(f(x_0))f'(x_0)</math>. Now the left hand side looks like the expression in Newton's approximation. This means to show <math>g\circ f</math> is differentiable at <math>x_0</math>, we just need to show that <math>|g'(f(x_0))E_f(\Delta x) + E_g(\Delta f)| \leq \epsilon|x - x_0|</math>.
+where <math>L := g'(f(x_0))f'(x_0)</math>. Now the left hand side looks like the expression in Newton's approximation. This means to show <math>g\circ f</math> is differentiable at <math>x_0</math>, we just need to show that <math>|g'(f(x_0))E_f(x,x_0) + E_g(f(x), f(x_0))| \leq \epsilon|x - x_0|</math>.
 The stuff in square brackets is our "error term" for <math>g\circ f</math>. Now we just need to make sure it is small, even after dividing by <math>|x-x_0|</math>.
@@ Line 57: / Line 57: @@
 But f is differentiable at <math>x_0</math>, so by Newton's approximation,
-<math display="block">|g'(f(x_0))E_f(\Delta x)| \leq |g'(f(x_0))| \epsilon_1 |x-x_0|</math>
+<math display="block">|g'(f(x_0))E_f(x,x_0)| \leq |g'(f(x_0))| \epsilon_1 |x-x_0|</math>
 we also have
-<math display="block">|E_g(\Delta f)| \leq \epsilon_2 |f(x)-f(x_0)| = \epsilon_2 |f'(x_0)(x-x_0) + E_f(\Delta x)|</math>
+<math display="block">|E_g(f(x), f(x_0))| \leq \epsilon_2 |f(x)-f(x_0)| = \epsilon_2 |f'(x_0)(x-x_0) + E_f(x,x_0)|</math>
 We can bound this from above using the triangle inequality:
-<math display="block">\begin{align}|E_g(\Delta f)| &\leq \epsilon_2 |f'(x_0)(x-x_0)| + \epsilon_2 |E_f(\Delta x)| \\ &\leq \epsilon_2 |f'(x_0)| |x-x_0| + \epsilon_2 \epsilon_1 |x-x_0|\end{align}</math>
+<math display="block">\begin{align}|E_g(f(x), f(x_0))| &\leq \epsilon_2 |f'(x_0)(x-x_0)| + \epsilon_2 |E_f(x,x_0)| \\ &\leq \epsilon_2 |f'(x_0)| |x-x_0| + \epsilon_2 \epsilon_1 |x-x_0|\end{align}</math>
 Now we can just choose <math>\epsilon_1, \epsilon_2</math> small enough.
 ==Limits of sequences==