impossiblewizardry | (no subject) (Reply)

Let ∑ be a symmetric, positive definite matrix. I have in mind a covariance matrix.

Diagonalize it:

∑ = P D P⁻¹

Since it’s symmetric, P⁻¹ = P’ (by which I mean, P transpose):

∑ = P D P’

Since it’s positive definite, all the elements in the diagonal matrix D are positive, so you can express D as the square of another diagonal matrix:

D = Q²

∑ = P Q² P’

Define

A = P Q P’

Note that A is symmetric.

Then

A’ A = P Q P’ P Q P’ = P Q² P’ = ∑

This means that quadratic forms of ∑ can be expressed as the lengths of vectors after being transformed by A:

|| A v ||² = v’ A’ A v = v’ ∑ v

I was thinking about this because of the optimization problem

Maximize u’ ∑ u subject to ||u|| = 1

In two dimensions, I can visualize this. The set allowed by the constraint is a circle. The image of that circle under A is an ellipse, since I interpret P and P’ as rotation matrices, and Q as scaling the x and y axes. Take a vector pointing to a point on the circle, rotate it clockwise, scale the axes, rotate it back counterclockwise.

Since u’ ∑ u = || A u ||², our goal is to get the point on the ellipse that is as far as possible from the center. So we want a vector Au that points along the major axis of the ellipse. And our maximum possible value will be the squared length of this vector.

It turns out this comes from a vector u which points along the major axis--this vector is an eigenvector, and it points along the major axis before and after the transformation. I came to that conclusion thinking about what A = P Q P’ does in three parts: rotate clockwise, scale the axes, rotate counterclockwise.

(no subject)

Post a comment in response: