User:IssaRice/Extreme value theorem

From Machinelearning

Revision as of 23:39, 1 June 2019

Working through the proof in Pugh's book by filling in the parts he doesn't talk about.

For <math>x \in [a,b]</math>, define <math>V_x = f([a,x]) = \{f(t) : a \leq t \leq x\}</math> to be the values that <math>f</math> takes on as the input ranges from <math>a</math> to <math>x</math> (inclusive).

Let <math>M = \sup \{f(x) : a \leq x \leq b\} = \sup V_b</math> (this number exists by the boundedness theorem) and <math>X = \{x \in [a,b] : \sup V_x < M\}</math>.[note 1]
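The definitions of <math>V_x</math>, <math>M</math> and <math>X</math> can be made concrete on a finite grid. The following is only a numerical sketch, not part of the proof: the function <code>f</code>, the interval <math>[0,1]</math>, the grid size and the helper <code>running_sup</code> are all hypothetical choices made for illustration, and a grid maximum only stands in for a true supremum.

```python
from fractions import Fraction

def running_sup(f, grid):
    """For each grid point x, return the max of f over grid points <= x
    (a finite stand-in for sup V_x)."""
    sups, best = [], None
    for x in grid:
        y = f(x)
        if best is None or y > best:
            best = y
        sups.append(best)
    return sups

def f(x):
    # Hypothetical example: continuous on [0, 1], maximum value 0 at x = 3/10.
    return -(x - Fraction(3, 10)) ** 2

a, b, n = Fraction(0), Fraction(1), 100
grid = [a + (b - a) * Fraction(k, n) for k in range(n + 1)]

sup_V = running_sup(f, grid)
M = sup_V[-1]                                   # grid analogue of M = sup V_b
X = [x for x, s in zip(grid, sup_V) if s < M]   # grid analogue of X (strict <)

c_below = max(X)        # largest grid point still in X
c_grid = grid[len(X)]   # first grid point outside X; X is an initial segment
                        # of the grid because sup_V never decreases

# Variant from note 1: with "<=" in place of "<", every point qualifies.
X_weak = [x for x, s in zip(grid, sup_V) if s <= M]

print(M, c_below, c_grid, f(c_grid) == M, max(X_weak))
```

In this example the grid version of <math>X</math> consists of the points below <math>3/10</math>, and the first point outside it is <math>3/10</math>, where <code>f</code> attains <code>M</code>. With the non-strict inequality the set is the whole grid, so its largest element is <code>b</code> and says nothing about where the maximum occurs.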

Our goal now is to find some <math>x</math> such that <math>f(x) = M</math>. If <math>f(a) = M</math> this is immediate.

So now suppose <math>f(a) < M</math>. Then <math>a \in X</math>. We already know that <math>X</math> is bounded above, for instance by the number <math>b</math>. We can thus take the least upper bound of <math>X</math>, say <math>c = \sup X</math>. We already know <math>f(c) \leq M</math>, so if we can just eliminate the possibility that <math>f(c) < M</math>, we will be done.

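So suppose <math>f(c) < M</math>. We want to find <math>M' < M</math> such that <math>f(t) \leq M'</math> for all <math>t \in [a,c]</math>. That would mean that <math>\sup V_c \leq M' < M</math>. To do this, we split the interval into two parts. Choose <math>\epsilon > 0</math> with <math>\epsilon < M - f(c)</math>.[note 2] By continuity at <math>c</math>, there exists a <math>\delta > 0</math> such that <math>|t-c| < \delta</math> implies <math>|f(t)-f(c)| < \epsilon</math>. So now pick a point like <math>c - \delta/2</math>, and split the interval into <math>[a,c-\delta/2]</math> and <math>[c-\delta/2,c]</math>.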

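  • Since <math>c-\delta/2 < c</math>, there exists <math>x \in X</math> such that <math>c-\delta/2 < x</math> (otherwise <math>c-\delta/2</math> would be a smaller upper bound for <math>X</math>). So <math>\sup V_{c-\delta/2} \leq \sup V_x < M</math>. This means that <math>f(t) \leq \sup V_{c-\delta/2} < M</math> for all <math>t \in [a,c-\delta/2]</math>.
  • But now if <math>t \in [c-\delta/2,c]</math>, then <math>|t-c| < \delta</math>, so <math>|f(t)-f(c)| < \epsilon</math>. This means <math>f(t) < f(c)+\epsilon < M</math>.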

Now we can choose <math>M' = \max\{\sup V_{c-\delta/2}, f(c)+\epsilon\}</math>. Then whatever <math>t \in [a,c]</math> happens to be, we can say <math>f(t) \leq M'</math>.

If <math>c < b</math> then by continuity we can find points <math>t</math> to the right of <math>c</math> where <math>\sup V_t < M</math>, which contradicts the fact that <math>c</math> is an upper bound of such points.
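The case <math>c < b</math> is stated briefly. One way to fill it in, reusing the same <math>\epsilon</math>, <math>\delta</math> and the bound <math>M' < M</math> on <math>[a,c]</math> from the previous step, might go as follows (a sketch of the missing details, not necessarily the argument Pugh has in mind):

```latex
\begin{align*}
&\text{Pick } t \text{ with } c < t \leq b \text{ and } t - c < \delta. \\
&\text{For } s \in [a, c]: \quad f(s) \leq M'. \\
&\text{For } s \in [c, t]: \quad |s - c| < \delta
   \;\Longrightarrow\; f(s) < f(c) + \epsilon \leq M'. \\
&\text{Hence } \sup V_t \leq M' < M, \text{ so } t \in X
   \text{ while } t > c = \sup X.
\end{align*}
```

The last line is the contradiction: <math>t</math> belongs to <math>X</math> but exceeds its least upper bound.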

Therefore, <math>c = b</math>, which implies that <math>M = \sup V_b = \sup V_c < M</math>, a contradiction. So the assumption that <math>f(c) < M</math> was false, and we conclude <math>f(c) = M</math>.

Notes

  1. If we had used "<math>\leq</math>" in the definition of <math>X</math>, then when we take the supremum we would just end up with <math>b</math>, regardless of where <math>f</math> achieves the maximum.
  2. It is important here that <math>\epsilon</math> does not equal <math>M - f(c)</math>; choosing this <math>\epsilon</math> would be too weak and we would not be able to conclude <math>\sup V_c < M</math>, rather only that <math>\sup V_c \leq M</math>.