Derivative of a quadratic form: Difference between revisions

From Machinelearning
Line 14: Line 14:


==Using the definition of the derivative==
==Using the definition of the derivative==
This is an expanded version of the answer at [https://math.stackexchange.com/a/189436/35525].


The derivative is the linear transformation <math>L</math> such that:
The derivative is the linear transformation <math>L</math> such that:

Revision as of 23:30, 13 July 2018

Let AMn,n(R) be an n by n real-valued matrix, and let f:RnR be defined by f(x)=xTAx. On this page, we calculate the derivative of f.

Understanding the problem

Straightforward method

Let A=(aki) and x=(x1,,xn).

We expand

xTAx=xT(i=1na1ixii=1nanixi)=k=1nxki=1nakixi

Now we find the partial derivative of the above with respect to xj.

Using the definition of the derivative

This is an expanded version of the answer at [1].

The derivative is the linear transformation L such that:

limxx0;xx0|f(x)(f(x0)+L(xx0))||xx0|=0

Using our function, this is:

limxx0;xx0|xTAxx0TAx0L(xx0)||xx0|=0

Defining h=xx0, we have x=x0+h and

|(x0+h)TA(x0+h)x0TAx0L(h)||h|

Focusing on the subexpression (x0+h)TA(x0+h), since A is a matrix, it is a linear transformation, so we obtain (x0+h)T(Ax0+Ah). Since the transpose of a sum is the sum of the transposes, we have (x0T+hT)(Ax0+Ah). Now using linearity we have x0TAx0+hTAx0+x0TAh+hTAh.

Now the fraction is

|x0TAx0+hTAx0+x0TAh+hTAhx0TAx0L(h)||h|=|hTAx0+x0TAh+hTAhL(h)||h|

Focusing on hTAx0, it is a real number so taking the transpose leaves it unchanged: hTAx0=(hTAx0)T=x0TATh.

Now the fraction is

|x0TATh+x0TAh+hTAhL(h)||h|=|x0T(AT+A)h+hTAhL(h)||h|

In the numerator, hTAh is a higher order term that will disappear when taking the limit, so the linear transformation we are looking for must be L(h)=x0T(AT+A)h. Since A is symmetric, we have AT+A=2A and L(h)=2x0TAh.

Using the chain rule