Derivative of a quadratic form

From Machinelearning

Revision as of 23:04, 13 July 2018

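Let <math>A \in M_{n,n}(\mathbf R)</math> be an <math>n</math> by <math>n</math> real-valued matrix, and let <math>f \colon \mathbf R^n \to \mathbf R</math> be defined by <math>f(x) = x^{\mathrm T}Ax</math>. On this page, we calculate the derivative of <math>f</math>.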
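As a concrete illustration of the definition, here is a minimal sketch that evaluates the quadratic form for a small matrix. The helper name `quadratic_form` and the 2 by 2 matrix are assumptions chosen for illustration; they are not part of the original page.

```python
def quadratic_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    # First compute the matrix-vector product A x ...
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    # ... then take the dot product of x with A x.
    return sum(x_i * y_i for x_i, y_i in zip(x, Ax))

# Arbitrary example matrix, deliberately not symmetric.
A = [[1.0, 2.0], [3.0, 4.0]]

print(quadratic_form(A, [1.0, 1.0]))   # 1 + 2 + 3 + 4 = 10.0
print(quadratic_form(A, [1.0, -1.0]))  # 1 - 2 - 3 + 4 = 0.0
```

Note that the result is always a single real number, which is what makes the transpose argument later in the derivation work.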

Understanding the problem

Straightforward method

Using the definition of the derivative

The derivative of <math>f</math> at a point <math>x_0</math> is the linear transformation <math>L \colon \mathbf R^n \to \mathbf R</math> such that:

:<math>\lim_{x \to x_0;\, x \ne x_0} \frac{\|f(x) - (f(x_0) + L(x - x_0))\|}{\|x - x_0\|} = 0</math>

Using our function, this is:

:<math>\lim_{x \to x_0;\, x \ne x_0} \frac{\|x^{\mathrm T}Ax - x_0^{\mathrm T}Ax_0 - L(x - x_0)\|}{\|x - x_0\|} = 0</math>

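Defining <math>h = x - x_0</math>, we have <math>x = x_0 + h</math> and the fraction becomes

:<math>\frac{\|(x_0 + h)^{\mathrm T}A(x_0 + h) - x_0^{\mathrm T}Ax_0 - L(h)\|}{\|h\|}</math>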

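Focusing on the subexpression <math>(x_0 + h)^{\mathrm T}A(x_0 + h)</math>, since <math>A</math> is a matrix, it is a linear transformation, so we obtain <math>(x_0 + h)^{\mathrm T}(Ax_0 + Ah)</math>. Since the transpose of a sum is the sum of the transposes, we have <math>(x_0^{\mathrm T} + h^{\mathrm T})(Ax_0 + Ah)</math>. Now using linearity we have <math>x_0^{\mathrm T}Ax_0 + h^{\mathrm T}Ax_0 + x_0^{\mathrm T}Ah + h^{\mathrm T}Ah</math>.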
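The four-term expansion can be checked numerically. The following is a minimal sketch, assuming an arbitrary 2 by 2 matrix and arbitrary vectors; the helper names `dot` and `matvec` and all numeric values are illustrative assumptions, not taken from the original page.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    """Matrix-vector product, with A given as a list of rows."""
    return [dot(row, v) for row in A]

# Illustrative values only; A is deliberately not symmetric.
A = [[1.0, 2.0], [3.0, 4.0]]
x0 = [1.0, 2.0]
h = [0.5, -1.0]

# Left-hand side: (x0 + h)^T A (x0 + h)
x = [a + b for a, b in zip(x0, h)]
lhs = dot(x, matvec(A, x))

# Right-hand side: x0^T A x0 + h^T A x0 + x0^T A h + h^T A h
rhs = (dot(x0, matvec(A, x0)) + dot(h, matvec(A, x0))
       + dot(x0, matvec(A, h)) + dot(h, matvec(A, h)))

print(lhs, rhs)  # 13.75 13.75
```

A single numeric agreement is not a proof, but it is a quick way to catch a dropped or mis-signed term when expanding by hand.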

Now the fraction is

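:<math>\frac{\|x_0^{\mathrm T}Ax_0 + h^{\mathrm T} Ax_0 + x_0^{\mathrm T} Ah + h^{\mathrm T}Ah - x_0^{\mathrm T}Ax_0 - L(h)\|}{\|h\|} = \frac{\|h^{\mathrm T} Ax_0 + x_0^{\mathrm T} Ah + h^{\mathrm T}Ah - L(h)\|}{\|h\|}</math>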

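Focusing on <math>h^{\mathrm T} Ax_0</math>, it is a real number so taking the transpose leaves it unchanged: <math>h^{\mathrm T} Ax_0 = (h^{\mathrm T} Ax_0)^{\mathrm T} = x_0^{\mathrm T}A^{\mathrm T}h</math>.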
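The transpose identity is easy to confirm on an example. This sketch uses the same kind of illustrative setup (a non-symmetric 2 by 2 matrix and two arbitrary vectors, all assumed for the example) and also shows that the identity needs the transposed matrix: replacing the transpose of A by A itself gives a different number unless A is symmetric.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    """Matrix-vector product, with A given as a list of rows."""
    return [dot(row, v) for row in A]

def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

# Illustrative values only; A is deliberately not symmetric.
A = [[1.0, 2.0], [3.0, 4.0]]
x0 = [1.0, 2.0]
h = [0.5, -1.0]

h_A_x0 = dot(h, matvec(A, x0))              # h^T A x0
x0_At_h = dot(x0, matvec(transpose(A), h))  # x0^T A^T h
x0_A_h = dot(x0, matvec(A, h))              # x0^T A h (no transpose on A)

print(h_A_x0, x0_At_h)  # -8.5 -8.5
print(x0_A_h)           # -6.5
```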

Now the fraction is

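:<math>\frac{\|x_0^{\mathrm T}A^{\mathrm T}h + x_0^{\mathrm T} Ah + h^{\mathrm T}Ah - L(h)\|}{\|h\|} = \frac{\|x_0^{\mathrm T}(A^{\mathrm T} + A)h + h^{\mathrm T}Ah - L(h)\|}{\|h\|}</math>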
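The last expression suggests trying the candidate L(h) = x_0^T (A^T + A) h, which cancels the term that is linear in h and leaves only h^T A h in the numerator. The page does not carry out this final step in the text above, so the following is only a numerical sanity check under that assumption, not a proof. The function names `f` and `L_candidate`, the matrix, and the vectors are all illustrative assumptions.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    """Matrix-vector product, with A given as a list of rows."""
    return [dot(row, v) for row in A]

# Illustrative values only; A is deliberately not symmetric.
A = [[1.0, 2.0], [3.0, 4.0]]
x0 = [1.0, 2.0]
direction = [0.5, -1.0]

def f(x):
    """The quadratic form f(x) = x^T A x."""
    return dot(x, matvec(A, x))

def L_candidate(h):
    """Assumed candidate derivative: L(h) = x0^T (A^T + A) h."""
    n = len(A)
    S = [[A[j][i] + A[i][j] for j in range(n)] for i in range(n)]  # A^T + A
    return dot(x0, matvec(S, h))

ratios = []
for t in [1.0, 0.1, 0.01, 0.001, 0.0001]:
    h = [t * d for d in direction]          # shrink h toward zero
    x = [a + b for a, b in zip(x0, h)]
    numerator = abs(f(x) - f(x0) - L_candidate(h))
    ratios.append(numerator / math.sqrt(dot(h, h)))

# With this candidate the numerator is |h^T A h|, so each ratio should
# shrink roughly in proportion to t.
print(ratios)
```

Shrinking h by a factor of ten should shrink the ratio by about the same factor, which is consistent with the limit in the definition of the derivative being zero for this choice of L.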

Using the chain rule