User:IssaRice/Chain rule proofs: Difference between revisions

From Machinelearning

Revision as of 01:31, 28 November 2018

Using Newton's approximation

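Since <math>g</math> is differentiable at <math>y_0</math>, we know <math>g'(y_0)</math> is a real number, and we can write

<math>g(y) = g(y_0) + g'(y_0)(y - y_0) + [g(y) - (g(y_0) + g'(y_0)(y - y_0))]</math>

If we define <math>E_g(\Delta y) := g(y) - (g(y_0) + g'(y_0)(y - y_0))</math> we can write

<math>g(y) = g(y_0) + g'(f(x_0))(y - y_0) + E_g(\Delta y)</math>

Newton's approximation says that <math>|E_g(\Delta y)| \leq \epsilon|y - y_0|</math> as long as <math>|y - y_0| \leq \delta</math>.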
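The bound in Newton's approximation can be checked numerically on a concrete case. The following is a minimal sketch, assuming the illustrative choice g(y) = y^2 at y0 = 1, which does not come from the proof itself; the names `g`, `g_prime`, `newton_error` and `bound_holds` are hypothetical helpers. For this g the error term works out to (y - y0)^2, so delta = epsilon should suffice and a larger delta should fail.

```python
def g(y):
    # Illustrative function only (an assumption, not part of the proof).
    return y * y


def g_prime(y):
    # Derivative of the illustrative g.
    return 2 * y


def newton_error(y, y0):
    """Error term E_g = g(y) - (g(y0) + g'(y0) * (y - y0))."""
    return g(y) - (g(y0) + g_prime(y0) * (y - y0))


def bound_holds(y0, eps, delta, samples=1000, tol=1e-12):
    """Check |E_g| <= eps * |y - y0| on a grid of points with |y - y0| <= delta.

    tol absorbs floating-point rounding; it is not part of the mathematics.
    """
    for k in range(-samples, samples + 1):
        y = y0 + delta * k / samples
        if abs(newton_error(y, y0)) > eps * abs(y - y0) + tol:
            return False
    return True


if __name__ == "__main__":
    print(bound_holds(1.0, eps=0.1, delta=0.1))
    print(bound_holds(1.0, eps=0.1, delta=0.5))
```

A grid check like this can only falsify a candidate delta; it does not prove the bound, which is what the differentiability of g provides.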

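Since <math>f</math> is differentiable at <math>x_0</math>, we know that it must be continuous at <math>x_0</math>. This means we can keep <math>|f(x) - y_0| \leq \delta</math> as long as we keep <math>|x - x_0| \leq \delta'</math> for a suitable <math>\delta' > 0</math>.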

Since <math>f(x) \in Y</math> and <math>|f(x) - y_0| \leq \delta</math>, this means we can substitute <math>y = f(x)</math> and get

<math>g(f(x)) = g(y_0) + g'(f(x_0))(f(x) - y_0) + E_g(\Delta f)</math>

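Now we use the differentiability of <math>f</math>. We can write

<math>f(x) = f(x_0) + f'(x_0)(x - x_0) + [f(x) - (f(x_0) + f'(x_0)(x - x_0))]</math>

Again, we can define <math>E_f(\Delta x) := f(x) - (f(x_0) + f'(x_0)(x - x_0))</math> and write this as

<math>f(x) = f(x_0) + f'(x_0)(x - x_0) + E_f(\Delta x)</math>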

Now we can substitute this into the expression for <math>g(f(x))</math> to get

<math>g(f(x)) = g(y_0) + g'(f(x_0))(f'(x_0)(x - x_0) + E_f(\Delta x)) + E_g(\Delta f)</math>

where we have canceled out two terms using <math>f(x_0) = y_0</math>.
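Because both error terms are defined as exact remainders, the substituted expression is an identity, and this can be sanity-checked numerically. The sketch below assumes f = sin and g = exp purely for illustration (neither is specified in the proof), and `decomposition` is a hypothetical helper name.

```python
import math


def f(x):
    # Illustrative inner function (an assumption, not from the proof).
    return math.sin(x)


def f_prime(x):
    return math.cos(x)


def g(y):
    # Illustrative outer function (an assumption, not from the proof).
    return math.exp(y)


def g_prime(y):
    return math.exp(y)


def decomposition(x, x0):
    """Return (g(f(x)), right-hand side of the substituted expression)."""
    y0 = f(x0)
    # E_f: remainder of the linear approximation of f at x0.
    e_f = f(x) - (f(x0) + f_prime(x0) * (x - x0))
    # E_g: remainder of the linear approximation of g at y0, evaluated at y = f(x).
    e_g = g(f(x)) - (g(y0) + g_prime(y0) * (f(x) - y0))
    rhs = g(y0) + g_prime(y0) * (f_prime(x0) * (x - x0) + e_f) + e_g
    return g(f(x)), rhs


if __name__ == "__main__":
    lhs, rhs = decomposition(0.7, 0.5)
    print(math.isclose(lhs, rhs, rel_tol=1e-12, abs_tol=1e-12))
```

The two values agree up to floating-point rounding for any x, since the cancellation uses only f(x0) = y0 and the definitions of the two error terms.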