Recall the singular value decomposition (SVD) of a matrix $A \in \mathbb{R}^{n \times d}$: $A = U\Sigma V^T = \sum_{i=1}^{r} \sigma_i u_i v_i^T$, where $r = \operatorname{rank}(A)$ and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$.
We showed that $\operatorname{span}(v_1, \dots, v_k)$ is the best-fit $k$-dimensional subspace for the rows of $A$, so it minimizes the sum of squared distances from the rows of $A$ over all $k$-dimensional subspaces.
We have that $A v_i v_i^T = \sigma_i u_i v_i^T$ is the projection of (the rows of) $A$ onto the line spanned by $v_i$, and so $\sum_{i=1}^{k} \sigma_i u_i v_i^T = A \sum_{i=1}^{k} v_i v_i^T$ is the projection of $A$ onto $\operatorname{span}(v_1, \dots, v_k)$.
Let $A_k$ be defined as $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$.
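For illustration, a minimal NumPy sketch of the projection identity above, using an arbitrary random test matrix (our own choice) and comparing $\sum_{i \le k} \sigma_i u_i v_i^T$ with $A V_k V_k^T$:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))           # arbitrary test matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]      # sum_{i<=k} sigma_i u_i v_i^T
proj = A @ Vt[:k, :].T @ Vt[:k, :]        # project each row of A onto span(v_1,...,v_k)

print(np.allclose(A_k, proj))             # True: A_k is exactly that projection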
Theorem 1. For every matrix $B$ of rank at most $k$, (1) $\|A - A_k\|_F \le \|A - B\|_F$ and (2) $\|A - A_k\|_2 \le \|A - B\|_2$.
Proof.
We proved part (1) last lecture. We will prove part (2) here. Note that $A - A_k = \sum_{i=k+1}^{r} \sigma_i u_i v_i^T$ and $\|A - A_k\|_2 = \sigma_{k+1}$.
We need to show that for any $B = CD^T$, where $C$ and $D$ have $k$ columns and hence $\operatorname{rank}(B) \le k$, we have $\|A - B\|_2 \ge \sigma_{k+1}$.
Since we know that $D$ has only $k$ columns, the null space of $D^T$ has dimension at least $d - k$ and therefore intersects the $(k+1)$-dimensional space $\operatorname{span}(v_1, \dots, v_{k+1})$; so there is a unit vector $z \in \operatorname{span}(v_1, \dots, v_{k+1})$ such that $D^T z = 0$, hence $Bz = 0$, and $\|A - B\|_2^2 \ge \|(A - B)z\|^2 = \|Az\|^2$.
Finally, $\|Az\|^2 = \sum_{i=1}^{k+1} \sigma_i^2 (v_i^T z)^2 \ge \sigma_{k+1}^2 \sum_{i=1}^{k+1} (v_i^T z)^2 = \sigma_{k+1}^2$, so $\|A - B\|_2 \ge \sigma_{k+1} = \|A - A_k\|_2$.
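As a numerical sketch of part (2), assuming a random test matrix of our own choosing: the error of $A_k$ in spectral norm equals $\sigma_{k+1}$, and an arbitrary rank-$k$ matrix $B = CD^T$ does no better.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 6))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 3
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))   # ||A - A_k||_2 = sigma_{k+1}

# a random rank-k matrix B = C D^T: its spectral error is at least sigma_{k+1}
C = rng.standard_normal((10, k))
D = rng.standard_normal((6, k))
B = C @ D.T
print(np.linalg.norm(A - B, 2) >= s[k] - 1e-12)        # True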
We’ve seen that the SVD is useful in more than one way. Now we will look at how to compute it, a central question in numerical analysis. A simple method, power iteration, is presented here:
Set $x_0 = e_i$ for some axis vector $e_i$, and let $B = A^T A$.
Repeat $k$ times: {
  $x_{t+1} \leftarrow \dfrac{B x_t}{\|B x_t\|}$
}
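A direct NumPy sketch of this loop (the helper name power_iteration and the random test matrix are illustrative choices, not fixed by the notes):

import numpy as np

def power_iteration(A, k, x0=None):
    """Approximate the top right singular vector v_1 of A with k iterations on B = A^T A."""
    n, d = A.shape
    B = A.T @ A
    x = np.eye(d)[0] if x0 is None else x0 / np.linalg.norm(x0)
    for _ in range(k):
        x = B @ x
        x = x / np.linalg.norm(x)        # keep x a unit vector
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 10))
v1 = np.linalg.svd(A)[2][0]              # true top right singular vector
x = power_iteration(A, k=50)
print(abs(v1 @ x))                        # close to 1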
Theorem 2.
Assume $\sigma_1 > \sigma_2$ and $\delta = |v_1^T x_0| > 0$. After $k \ge \frac{\ln(1/(\varepsilon\delta^2))}{4\ln(\sigma_1/\sigma_2)}$ iterations, $x_k$ satisfies $(v_1^T x_k)^2 \ge 1 - \varepsilon$.
Proof.
Write $x_0 = \sum_i c_i v_i$ with $c_1 = v_1^T x_0$, so $|c_1| \ge \delta$. Since $B = A^T A = \sum_i \sigma_i^2 v_i v_i^T$, the weight on the directions with higher singular values is increasing with each multiplication by $B$:
$$B^k x_0 = \sum_i \sigma_i^{2k} c_i v_i.$$
Therefore, after $k$ iterations, the squared component of $x_k = B^k x_0 / \|B^k x_0\|$ orthogonal to $v_1$ is at most
$$\frac{\sum_{i \ge 2} \sigma_i^{4k} c_i^2}{\sigma_1^{4k} c_1^2} \le \frac{1}{\delta^2}\left(\frac{\sigma_2}{\sigma_1}\right)^{4k}.$$
Setting $k \ge \frac{\ln(1/(\varepsilon\delta^2))}{4\ln(\sigma_1/\sigma_2)}$ makes this at most $\varepsilon$, which gives $(v_1^T x_k)^2 \ge 1 - \varepsilon$.
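The geometric decay in the proof can be seen numerically; the sketch below (again on a random test matrix of our own choosing) prints the squared component orthogonal to $v_1$ next to the $(\sigma_2/\sigma_1)^{4t}$ rate.

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((50, 12))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
v1 = Vt[0]

B = A.T @ A
x = np.eye(12)[0]                         # start from the axis vector e_1
for t in range(1, 21):
    x = B @ x
    x = x / np.linalg.norm(x)
    perp2 = 1 - (v1 @ x) ** 2             # squared component orthogonal to v_1
    print(t, perp2, (s[1] / s[0]) ** (4 * t))   # decays at roughly the same rate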
We note that the choice of the initial vector $x_0$ in the algorithm, one of the axis vectors $e_1, \dots, e_d$, must satisfy the assumption that $v_1^T x_0 \ne 0$; since $\sum_{i=1}^{d} (v_1^T e_i)^2 = \|v_1\|^2 = 1$, some axis vector has $|v_1^T e_i| \ge 1/\sqrt{d}$. Instead of trying all $d$ basis vectors, we could also start with a random unit vector.
Claim. If $x$ is a uniformly random unit vector in $\mathbb{R}^d$, then $\mathbb{E}[(v_1^T x)^2] = 1/d$.
Proof.
The distribution of $x$ is rotationally symmetric, so $\mathbb{E}[(v_1^T x)^2] = \mathbb{E}[x_1^2]$, and by symmetry $\mathbb{E}[x_1^2] = \cdots = \mathbb{E}[x_d^2]$. So, $d \cdot \mathbb{E}[x_1^2] = \mathbb{E}[\|x\|^2] = 1$. Hence $\mathbb{E}[(v_1^T x)^2] = 1/d$.
Moreover, $|v_1^T x| \ge \frac{1}{2\sqrt{d}}$ with some constant probability. So a few runs of the algorithm will suffice with high probability.
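A small Monte Carlo sketch of the claim, drawing random unit vectors by normalizing Gaussians (the dimension and trial count are arbitrary choices for illustration):

import numpy as np

rng = np.random.default_rng(4)
d, trials = 25, 200_000
X = rng.standard_normal((trials, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)      # uniform random unit vectors

c = X[:, 0]                                        # by rotational symmetry, take v_1 = e_1
print(np.mean(c ** 2), 1 / d)                      # close to 1/d
print(np.mean(np.abs(c) >= 1 / (2 * np.sqrt(d))))  # a constant fraction of trials succeed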