(no subject)
Dec. 25th, 2018 10:11 pmLet ∑ be a symmetric, positive definite matrix. I have in mind a covariance matrix.
Diagonalize it:
∑ = P D P⁻¹
Since it’s symmetric, P⁻¹ = P’ (by which I mean, P transpose):
∑ = P D P’
Since it’s positive definite, all the elements in the diagonal matrix D are positive, so you can express D as the square of another diagonal matrix:
D = Q²
∑ = P Q² P’
Define
A = P Q P’
Note that A is symmetric.
Then
A’ A = P Q P’ P Q P’ = P Q² P’ = ∑
This means that quadratic forms of ∑ can be expressed as the lengths of vectors after being transformed by A:
|| A v ||² = v’ A’ A v = v’ ∑ v
I was thinking about this because of the optimization problem
Maximize u’ ∑ u subject to ||u|| = 1
In two dimensions, I can visualize this. The set allowed by the constraint is a circle. The image of that circle under A is an ellipse, since I interpret P and P’ as rotation matrices, and Q as scaling the x and y axes. Take a vector pointing to a point on the circle, rotate it clockwise, scale the axes, rotate it back counterclockwise.
Since u’ ∑ u = || A u ||², our goal is to get the point on the ellipse that is as far as possible from the center. So we want a vector Au that points along the major axis of the ellipse. And our maximum possible value will be the squared length of this vector.
It turns out this comes from a vector u which points along the major axis--this vector is an eigenvector, and it points along the major axis before and after the transformation. I came to that conclusion thinking about what A = P Q P’ does in three parts: rotate clockwise, scale the axes, rotate counterclockwise.