<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://machinelearning.subwiki.org/w/index.php?action=history&amp;feed=atom&amp;title=User%3AIssaRice%2FChain_rule_proofs</id>
	<title>User:IssaRice/Chain rule proofs - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://machinelearning.subwiki.org/w/index.php?action=history&amp;feed=atom&amp;title=User%3AIssaRice%2FChain_rule_proofs"/>
	<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;action=history"/>
	<updated>2026-04-18T06:56:18Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.2</generator>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=3121&amp;oldid=prev</id>
		<title>IssaRice at 01:01, 31 May 2020</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=3121&amp;oldid=prev"/>
		<updated>2020-05-31T01:01:41Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 01:01, 31 May 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;a differentiable function looks locally like a linear transformation. if you compose two differentiable functions, then it seems pretty obvious that locally, that composed map also looks like a linear transformation. but which linear transformation? well, you first apply the linear transformation that locally approximates the inner function. then you land in some target space, and you find what the outer function locally looks like at the place you land. and that&amp;#039;s all the chain rule says.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;a differentiable function looks locally like a linear transformation. if you compose two differentiable functions, then it seems pretty obvious that locally, that composed map also looks like a linear transformation. but which linear transformation? well, you first apply the linear transformation that locally approximates the inner function. then you land in some target space, and you find what the outer function locally looks like at the place you land. and that&amp;#039;s all the chain rule says.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;So Folland actually says this on page 109: &quot;Since the product of two matrices gives the composition of the linear transformations defined by those matrices, the chain rule just says that &#039;&#039;the linear &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;approximations &lt;/del&gt;of a composition is the composition of the linear approximations&#039;&#039;.&quot; But... putting this short paragraph away after a proof, and only after you&#039;ve already discussed two versions of the chain rule in previous chapters, is in my opinion not trying hard enough to communicate the central message. This should be in bold flashing letters at the TOP of the FIRST discussion of the chain rule.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;So Folland actually says this on page 109: &quot;Since the product of two matrices gives the composition of the linear transformations defined by those matrices, the chain rule just says that &#039;&#039;the linear &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;approximation &lt;/ins&gt;of a composition is the composition of the linear approximations&#039;&#039;.&quot; But... putting this short paragraph away after a proof, and only after you&#039;ve already discussed two versions of the chain rule in previous chapters, is in my opinion not trying hard enough to communicate the central message. This should be in bold flashing letters at the TOP of the FIRST discussion of the chain rule.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Using Newton&amp;#039;s approximation==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Using Newton&amp;#039;s approximation==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=3120&amp;oldid=prev</id>
		<title>IssaRice at 01:00, 31 May 2020</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=3120&amp;oldid=prev"/>
		<updated>2020-05-31T01:00:58Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 01:00, 31 May 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;a differentiable function looks locally like a linear transformation. if you compose two differentiable functions, then it seems pretty obvious that locally, that composed map also looks like a linear transformation. but which linear transformation? well, you first apply the linear transformation that locally approximates the inner function. then you land in some target space, and you find what the outer function locally looks like at the place you land. and that&amp;#039;s all the chain rule says.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;a differentiable function looks locally like a linear transformation. if you compose two differentiable functions, then it seems pretty obvious that locally, that composed map also looks like a linear transformation. but which linear transformation? well, you first apply the linear transformation that locally approximates the inner function. then you land in some target space, and you find what the outer function locally looks like at the place you land. and that&amp;#039;s all the chain rule says.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;So Folland actually says this on page 109: &quot;Since the product of two matrices gives the composition of the linear transformations defined by those matrices, the chain rule just says that &#039;&#039;the linear approximations of a composition is the composition of the linear approximations&#039;&#039;.&quot; But... putting this short paragraph away after a proof, and only after you&#039;ve already discussed two versions of the chain rule in previous chapters, is in my opinion not trying hard enough to communicate the central message. This should be in bold flashing letters at the TOP of the FIRST discussion of the chain rule.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Using Newton&amp;#039;s approximation==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Using Newton&amp;#039;s approximation==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=2729&amp;oldid=prev</id>
		<title>IssaRice at 03:50, 18 January 2020</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=2729&amp;oldid=prev"/>
		<updated>2020-01-18T03:50:47Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 03:50, 18 January 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;a differentiable function looks locally like a linear transformation. if you compose two differentiable functions, then it seems pretty obvious that locally, that composed map also looks like a linear transformation. but which linear transformation? well, you first apply the linear transformation that locally approximates the inner function. then you land in some target space, and you find what the outer function locally looks like at the place you land. and that&#039;s all the chain rule says.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Using Newton&amp;#039;s approximation==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Using Newton&amp;#039;s approximation==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=1022&amp;oldid=prev</id>
		<title>IssaRice: /* Limits of sequences */</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=1022&amp;oldid=prev"/>
		<updated>2018-12-01T06:54:47Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Limits of sequences&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 06:54, 1 December 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l155&quot;&gt;Line 155:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 155:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Differentiability of &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt; says that if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is a sequence taking values in &amp;lt;math&amp;gt;Y \setminus \{y_0\}&amp;lt;/math&amp;gt; that converges to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. What if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is instead a sequence taking values in &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;? Then we can say &amp;lt;math&amp;gt;\phi(y_n) \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n\to\infty&amp;lt;/math&amp;gt;. To show this, let &amp;lt;math&amp;gt;\epsilon &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Differentiability of &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt; says that if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is a sequence taking values in &amp;lt;math&amp;gt;Y \setminus \{y_0\}&amp;lt;/math&amp;gt; that converges to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. What if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is instead a sequence taking values in &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;? Then we can say &amp;lt;math&amp;gt;\phi(y_n) \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n\to\infty&amp;lt;/math&amp;gt;. To show this, let &amp;lt;math&amp;gt;\epsilon &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now we can find &amp;lt;math&amp;gt;N \geq 1&amp;lt;/math&amp;gt; such that for all &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, if &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;|\phi(y_n) - g&#039;(f(x_0))| = \left\vert\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} - g&#039;(f(x_0))\right\vert \leq \epsilon&amp;lt;/math&amp;gt;. (TODO: I think here we need to break off into two cases: one where there&#039;s an infinite number of &amp;lt;math&amp;gt;y_n&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, and one where there&#039;s only a finite number so that eventually the sequence is all just &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;. Only in the former case can we find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;, by considering the subsequence that isn&#039;t equal to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, but this is not a problem because in the latter case the sequence&#039;s tail is already at the place where we need it to be, so we don&#039;t even need to find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. The question is, is there some more elegant way to do this that doesn&#039;t break off into &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;case&lt;/del&gt;?)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now we can find &amp;lt;math&amp;gt;N \geq 1&amp;lt;/math&amp;gt; such that for all &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, if &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;|\phi(y_n) - g&#039;(f(x_0))| = \left\vert\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} - g&#039;(f(x_0))\right\vert \leq \epsilon&amp;lt;/math&amp;gt;. (TODO: I think here we need to break off into two cases: one where there&#039;s an infinite number of &amp;lt;math&amp;gt;y_n&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, and one where there&#039;s only a finite number so that eventually the sequence is all just &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;. Only in the former case can we find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;, by considering the subsequence that isn&#039;t equal to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, but this is not a problem because in the latter case the sequence&#039;s tail is already at the place where we need it to be, so we don&#039;t even need to find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. The question is, is there some more elegant way to do this that doesn&#039;t break off into &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;cases&lt;/ins&gt;?)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;But this means if &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, then we have two cases: either &amp;lt;math&amp;gt;y_n \in Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| \leq \epsilon&amp;lt;/math&amp;gt; as above, or else &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;\phi(y_n) = g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; so &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| = 0 \leq \epsilon&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;But this means if &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, then we have two cases: either &amp;lt;math&amp;gt;y_n \in Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| \leq \epsilon&amp;lt;/math&amp;gt; as above, or else &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;\phi(y_n) = g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; so &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| = 0 \leq \epsilon&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=1021&amp;oldid=prev</id>
		<title>IssaRice: /* Limits of sequences */</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=1021&amp;oldid=prev"/>
		<updated>2018-12-01T06:54:13Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Limits of sequences&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 06:54, 1 December 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l155&quot;&gt;Line 155:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 155:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Differentiability of &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt; says that if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is a sequence taking values in &amp;lt;math&amp;gt;Y \setminus \{y_0\}&amp;lt;/math&amp;gt; that converges to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. What if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is instead a sequence taking values in &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;? Then we can say &amp;lt;math&amp;gt;\phi(y_n) \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n\to\infty&amp;lt;/math&amp;gt;. To show this, let &amp;lt;math&amp;gt;\epsilon &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Differentiability of &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt; says that if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is a sequence taking values in &amp;lt;math&amp;gt;Y \setminus \{y_0\}&amp;lt;/math&amp;gt; that converges to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. What if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is instead a sequence taking values in &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;? Then we can say &amp;lt;math&amp;gt;\phi(y_n) \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n\to\infty&amp;lt;/math&amp;gt;. To show this, let &amp;lt;math&amp;gt;\epsilon &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now we can find &amp;lt;math&amp;gt;N \geq 1&amp;lt;/math&amp;gt; such that for all &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, if &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;|\phi(y_n) - g&#039;(f(x_0))| = \left\vert\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} - g&#039;(f(x_0))\right\vert \leq \epsilon&amp;lt;/math&amp;gt;. (TODO: I think here we need to break off into two cases: one where there&#039;s an infinite number of &amp;lt;math&amp;gt;y_n&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, and one where there&#039;s only a finite number so that eventually the sequence is all just &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;. Only in the former case can we find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;, but this is not a problem because in the latter case the sequence&#039;s tail is already at the place where we need it to be, so we don&#039;t even need to find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. The question is, is there some more elegant way to do this that doesn&#039;t break off into cases?)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now we can find &amp;lt;math&amp;gt;N \geq 1&amp;lt;/math&amp;gt; such that for all &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, if &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;|\phi(y_n) - g&#039;(f(x_0))| = \left\vert\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} - g&#039;(f(x_0))\right\vert \leq \epsilon&amp;lt;/math&amp;gt;. (TODO: I think here we need to break off into two cases: one where there&#039;s an infinite number of &amp;lt;math&amp;gt;y_n&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, and one where there&#039;s only a finite number so that eventually the sequence is all just &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;. Only in the former case can we find &amp;lt;math&amp;gt;N&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/math&amp;gt;, by considering the subsequence that isn&#039;t equal to &amp;lt;math&amp;gt;f(x_0)&lt;/ins&gt;&amp;lt;/math&amp;gt;, but this is not a problem because in the latter case the sequence&#039;s tail is already at the place where we need it to be, so we don&#039;t even need to find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. The question is, is there some more elegant way to do this that doesn&#039;t break off into cases?)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;But this means if &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, then we have two cases: either &amp;lt;math&amp;gt;y_n \in Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| \leq \epsilon&amp;lt;/math&amp;gt; as above, or else &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;\phi(y_n) = g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; so &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| = 0 \leq \epsilon&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;But this means if &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, then we have two cases: either &amp;lt;math&amp;gt;y_n \in Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| \leq \epsilon&amp;lt;/math&amp;gt; as above, or else &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;\phi(y_n) = g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; so &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| = 0 \leq \epsilon&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=1020&amp;oldid=prev</id>
		<title>IssaRice: /* Limits of sequences */</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=1020&amp;oldid=prev"/>
		<updated>2018-12-01T06:53:09Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Limits of sequences&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 06:53, 1 December 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l155&quot;&gt;Line 155:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 155:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Differentiability of &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt; says that if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is a sequence taking values in &amp;lt;math&amp;gt;Y \setminus \{y_0\}&amp;lt;/math&amp;gt; that converges to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. What if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is instead a sequence taking values in &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;? Then we can say &amp;lt;math&amp;gt;\phi(y_n) \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n\to\infty&amp;lt;/math&amp;gt;. To show this, let &amp;lt;math&amp;gt;\epsilon &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Differentiability of &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt; says that if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is a sequence taking values in &amp;lt;math&amp;gt;Y \setminus \{y_0\}&amp;lt;/math&amp;gt; that converges to &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. What if &amp;lt;math&amp;gt;(y_n)_{n=1}^\infty&amp;lt;/math&amp;gt; is instead a sequence taking values in &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;? Then we can say &amp;lt;math&amp;gt;\phi(y_n) \to g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n\to\infty&amp;lt;/math&amp;gt;. To show this, let &amp;lt;math&amp;gt;\epsilon &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now we can find &amp;lt;math&amp;gt;N \geq 1&amp;lt;/math&amp;gt; such that for all &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, if &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;|\phi(y_n) - g&#039;(f(x_0))| = \left\vert\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} - g&#039;(f(x_0))\right\vert \leq \epsilon&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now we can find &amp;lt;math&amp;gt;N \geq 1&amp;lt;/math&amp;gt; such that for all &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, if &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;|\phi(y_n) - g&#039;(f(x_0))| = \left\vert\frac{g(y_n) - g(f(x_0))}{y_n - f(x_0)} - g&#039;(f(x_0))\right\vert \leq \epsilon&amp;lt;/math&amp;gt;. &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;(TODO: I think here we need to break off into two cases: one where there&#039;s an infinite number of &amp;lt;math&amp;gt;y_n&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, and one where there&#039;s only a finite number so that eventually the sequence is all just &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;. Only in the former case can we find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;, but this is not a problem because in the latter case the sequence&#039;s tail is already at the place where we need it to be, so we don&#039;t even need to find &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. The question is, is there some more elegant way to do this that doesn&#039;t break off into cases?)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;But this means if &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, then we have two cases: either &amp;lt;math&amp;gt;y_n \in Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| \leq \epsilon&amp;lt;/math&amp;gt; as above, or else &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;\phi(y_n) = g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; so &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| = 0 \leq \epsilon&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;But this means if &amp;lt;math&amp;gt;n \geq N&amp;lt;/math&amp;gt;, then we have two cases: either &amp;lt;math&amp;gt;y_n \in Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y_n \ne f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| \leq \epsilon&amp;lt;/math&amp;gt; as above, or else &amp;lt;math&amp;gt;y_n = f(x_0)&amp;lt;/math&amp;gt;, in which case &amp;lt;math&amp;gt;\phi(y_n) = g&amp;#039;(f(x_0))&amp;lt;/math&amp;gt; so &amp;lt;math&amp;gt;|\phi(y_n) - g&amp;#039;(f(x_0))| = 0 \leq \epsilon&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=960&amp;oldid=prev</id>
		<title>IssaRice: /* Using Newton&#039;s approximation */</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=960&amp;oldid=prev"/>
		<updated>2018-11-29T16:18:34Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Using Newton&amp;#039;s approximation&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 16:18, 29 November 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l21&quot;&gt;Line 21:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 21:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt;\begin{align}g(f(x)) &amp;amp;= g(f(x_0)) + g&amp;#039;(f(x_0))(f&amp;#039;(x_0)(x-x_0) + o(x-x_0)) + o(f(x)-f(x_0)) \\ &amp;amp;= g(f(x_0)) + g&amp;#039;(f(x_0))f&amp;#039;(x_0)(x-x_0) + g&amp;#039;(f(x_0))o(x-x_0) + o(f(x)-f(x_0))\end{align}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt;\begin{align}g(f(x)) &amp;amp;= g(f(x_0)) + g&amp;#039;(f(x_0))(f&amp;#039;(x_0)(x-x_0) + o(x-x_0)) + o(f(x)-f(x_0)) \\ &amp;amp;= g(f(x_0)) + g&amp;#039;(f(x_0))f&amp;#039;(x_0)(x-x_0) + g&amp;#039;(f(x_0))o(x-x_0) + o(f(x)-f(x_0))\end{align}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;to complete the proof, we just need to show that the error &amp;lt;math&amp;gt;g&#039;(f(x_0))o(x-x_0) + o(f(x)-f(x_0))&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;o(x-x_0)&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;to complete the proof, we just need to show that the error &amp;lt;math&amp;gt;g&#039;(f(x_0))o(x-x_0) + o(f(x)-f(x_0))&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;o(x-x_0)&amp;lt;/math&amp;gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;. The former term clearly is. The latter term is &amp;lt;math&amp;gt;o(f&#039;(x_0)(x-x_0) + o(x-x_0))&amp;lt;/math&amp;gt; so it is as well&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Proof===&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Proof===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=959&amp;oldid=prev</id>
		<title>IssaRice: /* Main idea */</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=959&amp;oldid=prev"/>
		<updated>2018-11-29T16:15:59Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Main idea&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 16:15, 29 November 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l19&quot;&gt;Line 19:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 19:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Thus if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is near &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Thus if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is near &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;math display=&quot;block&quot;&amp;gt;\begin{align}g(f(x)) &amp;amp;= g(f(x_0)) + g&#039;(f(x_0))(f&#039;(x_0)(x-x_0) + o(x-x_0)) + o(&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;y&lt;/del&gt;-f(x_0)) \\ &amp;amp;= g(f(x_0)) + g&#039;(f(x_0))f&#039;(x_0)(x-x_0) + g&#039;(f(x_0))o(x-x_0) + o(&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;y&lt;/del&gt;-f(x_0))\end{align}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;math display=&quot;block&quot;&amp;gt;\begin{align}g(f(x)) &amp;amp;= g(f(x_0)) + g&#039;(f(x_0))(f&#039;(x_0)(x-x_0) + o(x-x_0)) + o(&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;f(x)&lt;/ins&gt;-f(x_0)) \\ &amp;amp;= g(f(x_0)) + g&#039;(f(x_0))f&#039;(x_0)(x-x_0) + g&#039;(f(x_0))o(x-x_0) + o(&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;f(x)&lt;/ins&gt;-f(x_0))\end{align}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;to complete the proof, we just need to show that the error &amp;lt;math&amp;gt;g&#039;(f(x_0))o(x-x_0) + o(f(x)-f(x_0))&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;o(x-x_0)&amp;lt;/math&amp;gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Proof===&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Proof===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=958&amp;oldid=prev</id>
		<title>IssaRice: /* Using Newton&#039;s approximation */</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=958&amp;oldid=prev"/>
		<updated>2018-11-29T16:14:00Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Using Newton&amp;#039;s approximation&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 16:14, 29 November 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l8&quot;&gt;Line 8:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 8:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Thus we get &amp;lt;math&amp;gt;g\circ f(x) \approx g\circ f(x_0) + g&amp;#039;(f(x_0))f&amp;#039;(x_0)(x-x_0)&amp;lt;/math&amp;gt;, which is what the chain rule says.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Thus we get &amp;lt;math&amp;gt;g\circ f(x) \approx g\circ f(x_0) + g&amp;#039;(f(x_0))f&amp;#039;(x_0)(x-x_0)&amp;lt;/math&amp;gt;, which is what the chain rule says.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Slightly more formally:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;f(x) = f(x_0) + f&#039;(x_0)(x-x_0) + o(x-x_0)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is near &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;g(y) = g(f(x_0)) + g&#039;(f(x_0))(y - f(x_0)) + o(y-f(x_0))&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; is near &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is continuous at &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt; so &amp;lt;math&amp;gt;f(x)&amp;lt;/math&amp;gt; is near &amp;lt;math&amp;gt;f(x_0)&amp;lt;/math&amp;gt; whenever &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is near &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Thus, if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is near &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt;, taking &amp;lt;math&amp;gt;y = f(x)&amp;lt;/math&amp;gt; gives&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math display=&quot;block&quot;&amp;gt;\begin{align}g(f(x)) &amp;amp;= g(f(x_0)) + g&#039;(f(x_0))(f&#039;(x_0)(x-x_0) + o(x-x_0)) + o(f(x)-f(x_0)) \\ &amp;amp;= g(f(x_0)) + g&#039;(f(x_0))f&#039;(x_0)(x-x_0) + g&#039;(f(x_0))o(x-x_0) + o(f(x)-f(x_0))\end{align}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
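&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;A sketch of the remaining step, assuming the remainder of &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; is evaluated at &amp;lt;math&amp;gt;y = f(x)&amp;lt;/math&amp;gt; and each little-o term is read as an error bounded by &amp;lt;math&amp;gt;\epsilon&amp;lt;/math&amp;gt; times its argument for any &amp;lt;math&amp;gt;\epsilon &amp;gt; 0&amp;lt;/math&amp;gt; once &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is close enough to &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt;: the expansion of &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; bounds &amp;lt;math&amp;gt;|f(x)-f(x_0)|&amp;lt;/math&amp;gt; by a multiple of &amp;lt;math&amp;gt;|x-x_0|&amp;lt;/math&amp;gt;, so both remainder terms in the last line should collapse into a single &amp;lt;math&amp;gt;o(x-x_0)&amp;lt;/math&amp;gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math display=&quot;block&quot;&amp;gt;\begin{align}|f(x)-f(x_0)| &amp;amp;\leq (|f&#039;(x_0)|+1)|x-x_0| \quad \text{for } x \text{ near } x_0, \\ g&#039;(f(x_0))\,o(x-x_0) + o(f(x)-f(x_0)) &amp;amp;= o(x-x_0), \\ g(f(x)) &amp;amp;= g(f(x_0)) + g&#039;(f(x_0))f&#039;(x_0)(x-x_0) + o(x-x_0)\end{align}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Read this way, the last line says that &amp;lt;math&amp;gt;g\circ f&amp;lt;/math&amp;gt; is differentiable at &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt; with derivative &amp;lt;math&amp;gt;g&#039;(f(x_0))f&#039;(x_0)&amp;lt;/math&amp;gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;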
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Proof===&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Proof===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
	<entry>
		<id>https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=957&amp;oldid=prev</id>
		<title>IssaRice: /* Using Newton&#039;s approximation */</title>
		<link rel="alternate" type="text/html" href="https://machinelearning.subwiki.org/w/index.php?title=User:IssaRice/Chain_rule_proofs&amp;diff=957&amp;oldid=prev"/>
		<updated>2018-11-29T05:30:40Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Using Newton&amp;#039;s approximation&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 05:30, 29 November 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l45&quot;&gt;Line 45:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 45:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;To get the bound for &amp;lt;math&amp;gt;|E_g(f(x),f(x_0))|&amp;lt;/math&amp;gt; (using Newton&amp;#039;s approximation), we need to make sure &amp;lt;math&amp;gt;|f(x)-f(x_0)|&amp;lt;/math&amp;gt; is small. But by continuity of &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt; we can do this.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;To get the bound for &amp;lt;math&amp;gt;|E_g(f(x),f(x_0))|&amp;lt;/math&amp;gt; (using Newton&amp;#039;s approximation), we need to make sure &amp;lt;math&amp;gt;|f(x)-f(x_0)|&amp;lt;/math&amp;gt; is small. But by continuity of &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; at &amp;lt;math&amp;gt;x_0&amp;lt;/math&amp;gt; we can do this.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;math display=&quot;block&quot;&amp;gt;\begin{align}|E_g(f(x),f(x_0))| &amp;amp;\leq \epsilon_2 |f(x) - f(x_0)| \\ &amp;amp;= \epsilon_2 |f&#039;(x_0)(x - x_0) + E_f(x,x_0)| \\ &amp;amp;\leq \epsilon_2|f&#039;(x_0)||x-x_0| + &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;\epsilon_2&lt;/del&gt;|x-x_0| \\ &amp;amp;= \epsilon_2(|f&#039;(x_0)| + 1)|x-x_0|\end{align}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;math display=&quot;block&quot;&amp;gt;\begin{align}|E_g(f(x),f(x_0))| &amp;amp;\leq \epsilon_2 |f(x) - f(x_0)| \\ &amp;amp;= \epsilon_2 |f&#039;(x_0)(x - x_0) + E_f(x,x_0)| \\ &amp;amp;\leq \epsilon_2&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;(&lt;/ins&gt;|f&#039;(x_0)||x-x_0| + |x-x_0|&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;) &lt;/ins&gt;\\ &amp;amp;= \epsilon_2(|f&#039;(x_0)| + 1)|x-x_0|\end{align}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;where again we are free to choose &amp;lt;math&amp;gt;\epsilon_2&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;, \epsilon_3&lt;/del&gt;&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;where again we are free to choose &amp;lt;math&amp;gt;\epsilon_2&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;TODO: can we do this same proof but without using the error term notation?&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;TODO: can we do this same proof but without using the error term notation?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>IssaRice</name></author>
	</entry>
</feed>