In the previous post, we looked at the randomized SVD. In this post, we continue this discussion by looking at the analysis of the randomized SVD. Our approach is adapted from a new analysis of the randomized SVD by Joel A. Tropp and Robert J. Webber.
There are many types of analysis one can do for the randomized SVD. For instance, letting $A$ be a matrix and $\hat{A}$ the rank-$k$ output of the randomized SVD, natural questions include:
- What is the expected error $\mathbb{E}\,\|A - \hat{A}\|$, measured in some norm $\|\cdot\|$? What about the expected squared error $\mathbb{E}\,\|A - \hat{A}\|^2$? How do these answers change with different norms?
- How large can the randomized SVD error $\|A - \hat{A}\|$ get, except with some small failure probability?
- How close is the randomized SVD truncated to rank $r$ compared to the best rank-$r$ approximation to $A$?
- How close are the singular values and vectors of the randomized SVD approximation $\hat{A}$ compared to those of $A$?
Implicitly and explicitly, the analysis of Tropp and Webber provides satisfactory answers to a number of these questions. In the interest of simplifying the presentation, we shall focus on just one of these questions, proving the following result: if the test matrix is standard Gaussian with $k$ columns, then for every comparison rank $r \le k - 2$,
$$\mathbb{E}\,\|A - \hat{A}\|_{\rm F}^2 \le \left(1 + \frac{r}{k-r-1}\right) \|A - \llbracket A \rrbracket_r\|_{\rm F}^2.$$
Here, $\|\cdot\|_{\rm F}$ is the Frobenius norm and $\llbracket A \rrbracket_r$ denotes a best rank-$r$ approximation to $A$. We encourage the interested reader to check out Tropp and Webber’s paper to see the methodology we summarize here used to prove numerous facts about the randomized SVD and its extensions using subspace iteration and block Krylov iteration.
Projection Formula
Let us recall the randomized SVD as we presented it in the last post:
- Collect information. Form $Y = A\Omega$, where $\Omega$ is an appropriately chosen random matrix with $k$ columns.
- Orthogonalize. Compute an orthonormal basis $Q$ for the column space of $Y$ by, e.g., thin QR factorization.
- Project. Form $B = Q^*A$, where $^*$ denotes the conjugate transpose.
The randomized SVD is the approximation
$$\hat{A} = QB = QQ^*A.$$
It is easy to upgrade $\hat{A}$ into a compact SVD form $\hat{A} = \hat{U}\hat{\Sigma}\hat{V}^*$, as we did in steps 4 and 5 in the previous post.
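To make these steps concrete, here is a minimal NumPy sketch of the procedure; the function name and the synthetic test matrix are illustrative choices, not code from the original posts.

```python
import numpy as np

def randomized_svd(A, k, seed=None):
    """Minimal sketch of the randomized SVD: returns Q (orthonormal) and B = Q* A."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k))  # random test matrix with k columns
    Y = A @ Omega                                 # 1. collect information
    Q, _ = np.linalg.qr(Y)                        # 2. orthogonalize via thin QR
    B = Q.conj().T @ A                            # 3. project
    return Q, B                                   # the approximation is A_hat = Q @ B

# Tiny usage example on a matrix with decaying singular values
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((100, 100)))
V, _ = np.linalg.qr(rng.standard_normal((80, 80)))
A = U[:, :80] @ np.diag(0.8 ** np.arange(80)) @ V.T
Q, B = randomized_svd(A, k=15, seed=1)
print(np.linalg.norm(A - Q @ B, "fro") / np.linalg.norm(A, "fro"))
```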
For the analysis of the randomized SVD, it is helpful to notice that the approximation takes the form
$$\hat{A} = QQ^*A = \Pi_{A\Omega}\, A.$$
Here, $\Pi_{A\Omega}$ denotes the orthogonal projection onto the column space of the matrix $A\Omega$; indeed, since the columns of $Q$ form an orthonormal basis for the column space of $Y = A\Omega$, we have $QQ^* = \Pi_{A\Omega}$. We call $\hat{A} = \Pi_{A\Omega}A$ the projection formula for the randomized SVD. This formula bears a good deal of similarity to the projection formula for the Nyström approximation. This is not a coincidence.
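As a quick numerical sanity check (again an illustrative sketch, not the post's code), we can confirm that $QQ^*A$ agrees with the orthogonal projection of $A$ onto the column space of $A\Omega$ computed via the pseudoinverse:

```python
import numpy as np

# Check the projection formula: Q Q* A equals Pi_{A Omega} A, where
# Pi_{A Omega} = (A Omega)(A Omega)^+ is the orthoprojector onto col(A Omega).
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 40))
Omega = rng.standard_normal((40, 10))
Y = A @ Omega
Q, _ = np.linalg.qr(Y)
A_hat = Q @ (Q.conj().T @ A)          # randomized SVD approximation
Pi = Y @ np.linalg.pinv(Y)            # orthogonal projector onto the column space of A Omega
print(np.allclose(A_hat, Pi @ A))     # expect True up to roundoff
```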
Unitarily Invariant Norms
To state our error bounds in the most general language, we can adopt the language of quadratic unitarily invariant norms. Recall that a square matrix $U$ is unitary if $U^*U = UU^* = \mathrm{I}$.
You may recall from my post on Nyström approximation that a norm $\|\cdot\|$ on matrices is unitarily invariant if
$$\|UBV\| = \|B\| \quad \text{for every matrix } B \text{ and all unitary matrices } U \text{ and } V.$$
A unitarily invariant norm $\|\cdot\|$ is said to be quadratic if there exists another unitarily invariant norm $\|\cdot\|'$ such that
$$\|B\|^2 = \|B^*B\|' \quad \text{for every matrix } B. \tag{1}$$
Many examples of quadratic unitarily invariant norms are found among the Schatten $p$-norms, defined as
$$\|B\|_{S,p} = \Big( \sum_i \sigma_i(B)^p \Big)^{1/p}.$$
Here, $\sigma_1(B) \ge \sigma_2(B) \ge \cdots \ge 0$ denote the decreasingly ordered singular values of $B$. The Schatten $2$-norm is the Frobenius norm of a matrix. The spectral norm (i.e., operator 2-norm) is the Schatten $\infty$-norm, defined to be
$$\|B\|_{S,\infty} = \max_i \sigma_i(B).$$
The Schatten $p$-norms are unitarily invariant norms for every $1 \le p \le \infty$. However, the Schatten $p$-norms are quadratic unitarily invariant norms only for $2 \le p \le \infty$, since
$$\|B\|_{S,p}^2 = \|B^*B\|_{S,p/2}$$
and $\|\cdot\|_{S,p/2}$ is a norm only when $p/2 \ge 1$.
For the remainder of this post, we let $\|\cdot\|$ and $\|\cdot\|'$ be a quadratic unitarily invariant norm pair satisfying (1).
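Here is a small numerical illustration of the quadratic relation (1) for Schatten norms; the helper function `schatten_norm` is a hypothetical name introduced just for this check.

```python
import numpy as np

def schatten_norm(B, p):
    """Schatten p-norm: the l^p norm of the singular values (p = inf gives the spectral norm)."""
    s = np.linalg.svd(B, compute_uv=False)
    return s.max() if np.isinf(p) else (s ** p).sum() ** (1.0 / p)

# Check ||B||_{S,p}^2 = ||B* B||_{S,p/2} for several values of p >= 2.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 5))
for p in [2, 4, 10]:
    lhs = schatten_norm(B, p) ** 2
    rhs = schatten_norm(B.conj().T @ B, p / 2)
    print(p, np.isclose(lhs, rhs))   # expect True for each p
```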
Error Decomposition
The starting point of our analysis is the following decomposition of the error of the randomized SVD:
$$\|A - \hat{A}\|^2 \le \|A - \llbracket A \rrbracket_r\|^2 + \|(\mathrm{I} - \Pi_{A\Omega})\llbracket A \rrbracket_r\|^2. \tag{2}$$
Recall that we have defined $\llbracket A \rrbracket_r$ to be an optimal rank-$r$ approximation to $A$:
$$\|A - \llbracket A \rrbracket_r\| = \min_{\operatorname{rank} M \le r} \|A - M\|.$$
Thus, the error decomposition says that the excess (squared) error is bounded by $\|(\mathrm{I} - \Pi_{A\Omega})\llbracket A \rrbracket_r\|^2$.
Using the projection formula, we can prove the error decomposition (2) directly:
$$\begin{aligned}
\|A - \hat{A}\|^2 &= \|(\mathrm{I} - \Pi_{A\Omega})A\|^2 \\
&= \|(\mathrm{I} - \Pi_{A\Omega})AA^*(\mathrm{I} - \Pi_{A\Omega})\|' \\
&= \bigl\|(\mathrm{I} - \Pi_{A\Omega})\bigl(\llbracket A \rrbracket_r\llbracket A \rrbracket_r^* + (A - \llbracket A \rrbracket_r)(A - \llbracket A \rrbracket_r)^*\bigr)(\mathrm{I} - \Pi_{A\Omega})\bigr\|' \\
&\le \|(\mathrm{I} - \Pi_{A\Omega})\llbracket A \rrbracket_r\llbracket A \rrbracket_r^*(\mathrm{I} - \Pi_{A\Omega})\|' + \|(\mathrm{I} - \Pi_{A\Omega})(A - \llbracket A \rrbracket_r)(A - \llbracket A \rrbracket_r)^*(\mathrm{I} - \Pi_{A\Omega})\|' \\
&\le \|(\mathrm{I} - \Pi_{A\Omega})\llbracket A \rrbracket_r\llbracket A \rrbracket_r^*(\mathrm{I} - \Pi_{A\Omega})\|' + \|(A - \llbracket A \rrbracket_r)(A - \llbracket A \rrbracket_r)^*\|' \\
&= \|(\mathrm{I} - \Pi_{A\Omega})\llbracket A \rrbracket_r\|^2 + \|A - \llbracket A \rrbracket_r\|^2.
\end{aligned}$$
The first line is the projection formula. The second line uses the relation (1) between $\|\cdot\|$ and $\|\cdot\|'$, the idempotence of $\mathrm{I} - \Pi_{A\Omega}$, and the commutation rule $\|B^*B\|' = \|BB^*\|'$ (valid because $B^*B$ and $BB^*$ have the same nonzero eigenvalues). The third line uses the relation $AA^* = \llbracket A \rrbracket_r\llbracket A \rrbracket_r^* + (A - \llbracket A \rrbracket_r)(A - \llbracket A \rrbracket_r)^*$ between $A$ and $\llbracket A \rrbracket_r$, which holds because $\llbracket A \rrbracket_r(A - \llbracket A \rrbracket_r)^* = 0$. The fourth line is the triangle inequality. For the fifth line, we use the fact that $\mathrm{I} - \Pi_{A\Omega}$ is an orthoprojector and thus has unit spectral norm, together with the fact that $\|XMX^*\|' \le \|X\|_{S,\infty}^2\|M\|'$. For the final equality, we used the commutation rule and the relation (1) once more.
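The decomposition (2) can be spot-checked numerically in the Frobenius norm, for which it holds with room to spare; the matrices below are arbitrary illustrative choices.

```python
import numpy as np

# Numerical check of the error decomposition (2) in the Frobenius norm:
# ||A - A_hat||^2 <= ||A - [[A]]_r||^2 + ||(I - Pi_{A Omega}) [[A]]_r||^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 40)) @ np.diag(0.7 ** np.arange(40)) @ rng.standard_normal((40, 40))
r, k = 5, 12
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]           # best rank-r approximation
Omega = rng.standard_normal((40, k))
Q, _ = np.linalg.qr(A @ Omega)
A_hat = Q @ (Q.conj().T @ A)                        # randomized SVD approximation
Pi = Q @ Q.conj().T                                 # projector onto col(A Omega)
lhs = np.linalg.norm(A - A_hat, "fro") ** 2
rhs = np.linalg.norm(A - A_r, "fro") ** 2 + np.linalg.norm(A_r - Pi @ A_r, "fro") ** 2
print(lhs <= rhs + 1e-10)                           # expect True
```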
Bounding the Error
In this section, we shall continue to refine the error decomposition (2). Consider a thin SVD of $A$, partitioned as
$$A = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix} \begin{bmatrix} V_1 & V_2 \end{bmatrix}^*,$$
where $U_1, U_2$ and $V_1, V_2$ all have orthonormal columns, $\Sigma_1$ and $\Sigma_2$ are square diagonal matrices with nonnegative entries arranged in decreasing order, $U_1$ and $V_1$ have $r$ columns, and $\Sigma_1$ is an $r \times r$ matrix.
Under this notation, the best rank-$r$ approximation is $\llbracket A \rrbracket_r = U_1\Sigma_1V_1^*$. Define
$$\Omega_1 := V_1^*\Omega, \qquad \Omega_2 := V_2^*\Omega.$$
We assume throughout that $\Omega_1$ is full-rank (i.e., $\operatorname{rank}(\Omega_1) = r$).
Applying the error decomposition (2), we get
$$\begin{aligned}
\|A - \hat{A}\|^2 &\le \|A - \llbracket A \rrbracket_r\|^2 + \|(\mathrm{I} - \Pi_{A\Omega})U_1\Sigma_1V_1^*\|^2 \\
&= \|A - \llbracket A \rrbracket_r\|^2 + \|(\mathrm{I} - \Pi_{A\Omega})U_1\Sigma_1\|^2.
\end{aligned} \tag{3}$$
For the second line, we used the unitary invariance of $\|\cdot\|$. Observe that the column space of $A\Omega\Omega_1^\dagger$ is a subset of the column space of $A\Omega$, so
$$\|(\mathrm{I} - \Pi_{A\Omega})U_1\Sigma_1\|^2 \le \|(\mathrm{I} - \Pi_{A\Omega\Omega_1^\dagger})U_1\Sigma_1\|^2. \tag{4}$$
Let’s work on simplifying the expression $\|(\mathrm{I} - \Pi_{A\Omega\Omega_1^\dagger})U_1\Sigma_1\|^2$. First, observe that
$$A\Omega\Omega_1^\dagger = U_1\Sigma_1\Omega_1\Omega_1^\dagger + U_2\Sigma_2\Omega_2\Omega_1^\dagger = U_1\Sigma_1 + U_2\Sigma_2\Omega_2\Omega_1^\dagger.$$
Thus, the projector takes the form
$$\Pi_{A\Omega\Omega_1^\dagger} = \bigl(U_1\Sigma_1 + U_2\Sigma_2\Omega_2\Omega_1^\dagger\bigr)\bigl(\Sigma_1^2 + (\Sigma_2\Omega_2\Omega_1^\dagger)^*(\Sigma_2\Omega_2\Omega_1^\dagger)\bigr)^\dagger\bigl(U_1\Sigma_1 + U_2\Sigma_2\Omega_2\Omega_1^\dagger\bigr)^*.$$
Here, $^\dagger$ denotes the Moore–Penrose pseudoinverse, which reduces to the ordinary matrix inverse for a square nonsingular matrix. Thus,
$$\Sigma_1 U_1^*(\mathrm{I} - \Pi_{A\Omega\Omega_1^\dagger})U_1\Sigma_1 = \Sigma_1^2 - \Sigma_1^2\bigl(\Sigma_1^2 + (\Sigma_2\Omega_2\Omega_1^\dagger)^*(\Sigma_2\Omega_2\Omega_1^\dagger)\bigr)^\dagger\Sigma_1^2. \tag{5}$$
Remarkably, this seemingly convoluted combination of matrices actually is a well-studied operation in matrix theory, the parallel sum. The parallel sum of positive semidefinite matrices $H_1$ and $H_2$ is defined as
$$H_1 : H_2 := H_1(H_1 + H_2)^\dagger H_2. \tag{6}$$
We will have much more to say about the parallel sum soon. For now, we use this notation (together with the quadratic relation (1)) to rewrite (5) as
$$\|(\mathrm{I} - \Pi_{A\Omega\Omega_1^\dagger})U_1\Sigma_1\|^2 = \|\Sigma_1 U_1^*(\mathrm{I} - \Pi_{A\Omega\Omega_1^\dagger})U_1\Sigma_1\|' = \bigl\|\Sigma_1^2 : \bigl((\Sigma_2\Omega_2\Omega_1^\dagger)^*(\Sigma_2\Omega_2\Omega_1^\dagger)\bigr)\bigr\|'.$$
Plugging this back into (4) and then (3), we obtain the error bound
$$\|A - \hat{A}\|^2 \le \|A - \llbracket A \rrbracket_r\|^2 + \bigl\|\Sigma_1^2 : \bigl((\Sigma_2\Omega_2\Omega_1^\dagger)^*(\Sigma_2\Omega_2\Omega_1^\dagger)\bigr)\bigr\|'. \tag{7}$$
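Before moving on, here is a numerical check of the identity (5) in a small example (all names below are illustrative); it confirms that the convoluted matrix combination really is the parallel sum $\Sigma_1^2 : \bigl((\Sigma_2\Omega_2\Omega_1^\dagger)^*(\Sigma_2\Omega_2\Omega_1^\dagger)\bigr)$.

```python
import numpy as np

# Check: with Z = A Omega Omega1^+, the matrix S1 U1* (I - Pi_Z) U1 S1 equals the
# parallel sum S1^2 : (F* F), where F = S2 Omega2 Omega1^+.
rng = np.random.default_rng(0)
m, n, r, k = 50, 30, 4, 10
A = rng.standard_normal((m, n)) @ np.diag(0.6 ** np.arange(n)) @ rng.standard_normal((n, n))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U1, S1, S2 = U[:, :r], np.diag(s[:r]), np.diag(s[r:])
V1t, V2t = Vt[:r], Vt[r:]
Omega = rng.standard_normal((n, k))
Omega1, Omega2 = V1t @ Omega, V2t @ Omega
F = S2 @ Omega2 @ np.linalg.pinv(Omega1)
Z = A @ Omega @ np.linalg.pinv(Omega1)
Pi_Z = Z @ np.linalg.pinv(Z)
lhs = S1 @ U1.conj().T @ (np.eye(m) - Pi_Z) @ U1 @ S1
G, H = S1 @ S1, F.conj().T @ F
rhs = G @ np.linalg.pinv(G + H) @ H                 # the parallel sum G : H
print(np.allclose(lhs, rhs))                        # expect True
```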
Simplifying this expression further will require more knowledge about parallel sums, which we shall discuss now.
Parallel Sums
Aside: Let us put aside the randomized SVD for a moment and digress to the seemingly unrelated topic of electrical circuits. Suppose we have a battery of voltage $v$ and we connect its ends using a wire of resistance $r_1$. The current $i_1$ is given by Ohm’s law
$$i_1 = \frac{v}{r_1}.$$
Similarly, if the wire is replaced by a different wire with resistance $r_2$, the current $i_2$ is then
$$i_2 = \frac{v}{r_2}.$$
Now comes the interesting question. What if we connect the battery by both wires (resistances $r_1$ and $r_2$) simultaneously in parallel, so that current can flow through either wire? Since the wires of resistances $r_1$ and $r_2$ still have voltage $v$ across them, Ohm’s law still applies and thus the total current is
$$i = i_1 + i_2 = \frac{v}{r_1} + \frac{v}{r_2} = \frac{v}{\bigl(r_1^{-1} + r_2^{-1}\bigr)^{-1}}.$$
The net effect of connecting wires of resistances $r_1$ and $r_2$ in parallel is the same as a single wire of resistance
$$r_1 : r_2 := \bigl(r_1^{-1} + r_2^{-1}\bigr)^{-1} = \frac{r_1 r_2}{r_1 + r_2}. \tag{8}$$
We call the operation $:$ the parallel sum of $r_1$ and $r_2$ because it describes how resistances combine when connected in parallel.

The parallel sum operation can be extended to all nonnegative numbers $r_1, r_2 \ge 0$ by continuity:
$$r_1 : r_2 := \lim_{\varepsilon \downarrow 0}\,(r_1 + \varepsilon) : (r_2 + \varepsilon).$$
Electrically, this definition states that one obtains a short circuit (resistance $0$) if either of the wires carries zero resistance.
Parallel sums of matrices were introduced by William N. Anderson, Jr. and Richard J. Duffin as a generalization of the parallel sum operation from electrical circuits. There are several equivalent definitions. The natural definition (8) applies for positive definite matrices:
$$H_1 : H_2 := \bigl(H_1^{-1} + H_2^{-1}\bigr)^{-1}.$$
This definition can be extended to positive semidefinite matrices by continuity. It can be useful to have an explicit definition which applies even to positive semidefinite matrices. To do so, observe that the (scalar) parallel sum satisfies the identity
$$h_1 : h_2 = \frac{h_1 h_2}{h_1 + h_2} = h_1\,(h_1 + h_2)^{-1}\,h_2.$$
To extend to matrices, we capitalize the letters and use the Moore–Penrose pseudoinverse in place of the inverse:
$$H_1 : H_2 = H_1\,(H_1 + H_2)^\dagger\, H_2.$$
This was the definition (6) of the parallel sum we gave above, which we found naturally in our randomized SVD error bounds.
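In code, the pseudoinverse definition (6) is one line; the sketch below (illustrative, not from the original post) also checks it against the inverse-based formula for positive definite inputs.

```python
import numpy as np

def parallel_sum(H1, H2):
    """Parallel sum H1 : H2 = H1 (H1 + H2)^+ H2 of two positive semidefinite matrices."""
    return H1 @ np.linalg.pinv(H1 + H2) @ H2

# For positive definite inputs, this agrees with (H1^{-1} + H2^{-1})^{-1}.
rng = np.random.default_rng(0)
X, Y = rng.standard_normal((5, 5)), rng.standard_normal((5, 5))
H1, H2 = X @ X.T + np.eye(5), Y @ Y.T + np.eye(5)   # two positive definite matrices
print(np.allclose(parallel_sum(H1, H2),
                  np.linalg.inv(np.linalg.inv(H1) + np.linalg.inv(H2))))
```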
The parallel sum enjoys a number of beautiful properties, some of which we list here:
1. Symmetry. $H_1 : H_2 = H_2 : H_1$.
2. Simplified formula. $H_1 : H_2 = (H_1^{-1} + H_2^{-1})^{-1}$ for positive definite matrices $H_1$ and $H_2$.
3. Bounds. $H_1 : H_2 \preceq H_1$ and $H_1 : H_2 \preceq H_2$.
4. Monotone. $(H_1, H_2) \mapsto H_1 : H_2$ is monotone in the Loewner order: if $H_1 \preceq H_1'$ and $H_2 \preceq H_2'$, then $H_1 : H_2 \preceq H_1' : H_2'$.
5. Concavity. The map $(H_1, H_2) \mapsto H_1 : H_2$ is (jointly) concave (with respect to the Loewner order).
6. Conjugation. For any square matrix $X$, $X^*(H_1 : H_2)X \preceq (X^*H_1X) : (X^*H_2X)$.
7. Trace. $\operatorname{tr}(H_1 : H_2) \le \operatorname{tr}(H_1) : \operatorname{tr}(H_2)$. (In fact, this property holds with the trace replaced by any positive linear map. That is, if $\Phi$ is a linear map from square matrices to square matrices satisfying $\Phi(M) \succeq 0$ whenever $M \succeq 0$, then $\Phi(H_1 : H_2) \preceq \Phi(H_1) : \Phi(H_2)$.)
8. Unitarily invariant norms. $\|H_1 : H_2\| \le \|H_1\| : \|H_2\|$ for every unitarily invariant norm $\|\cdot\|$.
For completionists, we note that the last property can be deduced from the conjugation and trace properties; we shall take it as given here.
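The trace and norm properties can be spot-checked numerically on random positive semidefinite matrices (here with the spectral norm standing in for a general unitarily invariant norm); the helper names are illustrative.

```python
import numpy as np

def parallel_sum(H1, H2):
    return H1 @ np.linalg.pinv(H1 + H2) @ H2

def scalar_parallel(a, b):
    return a * b / (a + b) if a + b > 0 else 0.0

# Spot-check properties 7 and 8: tr(H1:H2) <= tr(H1):tr(H2) and
# ||H1:H2|| <= ||H1|| : ||H2|| for the spectral norm.
rng = np.random.default_rng(0)
for _ in range(5):
    X, Y = rng.standard_normal((6, 3)), rng.standard_normal((6, 4))
    H1, H2 = X @ X.T, Y @ Y.T                        # random PSD (rank-deficient) matrices
    P = parallel_sum(H1, H2)
    ok_trace = np.trace(P) <= scalar_parallel(np.trace(H1), np.trace(H2)) + 1e-10
    ok_norm = np.linalg.norm(P, 2) <= scalar_parallel(np.linalg.norm(H1, 2), np.linalg.norm(H2, 2)) + 1e-10
    print(ok_trace, ok_norm)                         # expect True, True each time
```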
Nearing the Finish: Enter the Randomness
Equipped with knowledge about parallel sums, we are now prepared to return to bounding the error of the randomized SVD. Apply property 8 of the parallel sum to the randomized SVD bound (7) to obtain
$$\|A - \hat{A}\|^2 \le \|A - \llbracket A \rrbracket_r\|^2 + \|\Sigma_1\|^2 : \|\Sigma_2\Omega_2\Omega_1^\dagger\|^2.$$
This bound is totally deterministic: it holds for any choice of test matrix $\Omega$, random or otherwise. We can use this bound to analyze the randomized SVD in many different ways for different kinds of random (or non-random) matrices $\Omega$. Tropp and Webber provide a number of examples instantiating this general bound in different ways.
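As a sanity check, the following sketch verifies the deterministic bound in the Frobenius norm on a random example (the matrix sizes and decay profile are arbitrary choices).

```python
import numpy as np

# Check: ||A - A_hat||_F^2 <= ||A - [[A]]_r||_F^2 + ( ||S1||_F^2 : ||S2 Omega2 Omega1^+||_F^2 ).
def scalar_parallel(a, b):
    return a * b / (a + b) if a + b > 0 else 0.0

rng = np.random.default_rng(1)
m, n, r, k = 80, 50, 6, 14
A = rng.standard_normal((m, n)) @ np.diag(0.75 ** np.arange(n)) @ rng.standard_normal((n, n))
_, s, Vt = np.linalg.svd(A, full_matrices=False)
Omega = rng.standard_normal((n, k))
Q, _ = np.linalg.qr(A @ Omega)
err2 = np.linalg.norm(A - Q @ (Q.T @ A), "fro") ** 2
opt2 = (s[r:] ** 2).sum()                            # ||A - [[A]]_r||_F^2
Omega1, Omega2 = Vt[:r] @ Omega, Vt[r:] @ Omega
F = np.diag(s[r:]) @ Omega2 @ np.linalg.pinv(Omega1)
bound = opt2 + scalar_parallel((s[:r] ** 2).sum(), np.linalg.norm(F, "fro") ** 2)
print(err2 <= bound + 1e-10)                         # expect True
```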
We shall present only the simplest error bound for the randomized SVD. We make the following assumptions:
- The test matrix $\Omega$ is standard Gaussian, by which we mean the entries of $\Omega$ are independent standard normal random variables.
- The norm $\|\cdot\|$ is the Frobenius norm $\|\cdot\|_{\rm F}$.
- The randomized SVD rank $k$ is at least $r + 2$.
Under these assumptions and using the bound $h_1 : h_2 \le h_2$ on the (scalar) parallel sum, we obtain the following bound for the expected squared error of the randomized SVD:
$$\mathbb{E}\,\|A - \hat{A}\|_{\rm F}^2 \le \|A - \llbracket A \rrbracket_r\|_{\rm F}^2 + \mathbb{E}\,\|\Sigma_2\Omega_2\Omega_1^\dagger\|_{\rm F}^2. \tag{9}$$
All that is left is to compute the expectation of $\|\Sigma_2\Omega_2\Omega_1^\dagger\|_{\rm F}^2$. Remarkably, this can be done in closed form.
Aside: Expectation of Gaussian–Inverse Gaussian Matrix Product
The final ingredient in our randomized SVD analysis is the following computation involving Gaussian random matrices. Let $G_1$ and $G_2$ be independent standard Gaussian random matrices with $G_1$ of size $r \times k$ and $G_2$ of size $s \times k$, where $k \ge r + 2$. Then, for any fixed matrix $S$ with $s$ columns,
$$\mathbb{E}\,\bigl\|S\,G_2\,G_1^\dagger\bigr\|_{\rm F}^2 = \frac{r}{k - r - 1}\,\|S\|_{\rm F}^2.$$
For the probability lovers, this can be verified by a direct calculation with the Gaussian distribution; for those who are less probabilistically inclined, the result can be treated as a black box.
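A quick Monte Carlo experiment (an illustrative sketch, not part of the original analysis) can be used to check this expectation formula; the sizes below are arbitrary.

```python
import numpy as np

# Monte Carlo check of E ||S G2 G1^+||_F^2 = r/(k-r-1) * ||S||_F^2.
rng = np.random.default_rng(0)
r, k, s_dim = 5, 12, 20
S = rng.standard_normal((8, s_dim))                  # an arbitrary fixed matrix S
vals = []
for _ in range(10000):
    G1 = rng.standard_normal((r, k))
    G2 = rng.standard_normal((s_dim, k))
    vals.append(np.linalg.norm(S @ G2 @ np.linalg.pinv(G1), "fro") ** 2)
# The two printed numbers should roughly agree (up to Monte Carlo error).
print(np.mean(vals), r / (k - r - 1) * np.linalg.norm(S, "fro") ** 2)
```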
Wrapping it All Up
We’re now within striking distance. All that is left is to assemble the pieces we have been working with to obtain a final bound for the randomized SVD.
Recall we have chosen the random matrix $\Omega$ to be standard Gaussian. By the rotational invariance of the Gaussian distribution, $\Omega_1 = V_1^*\Omega$ and $\Omega_2 = V_2^*\Omega$ are independent and standard Gaussian as well. Plugging the matrix expectation formula into (9) then completes the analysis:
$$\mathbb{E}\,\|A - \hat{A}\|_{\rm F}^2 \le \|A - \llbracket A \rrbracket_r\|_{\rm F}^2 + \frac{r}{k - r - 1}\,\|\Sigma_2\|_{\rm F}^2 = \left(1 + \frac{r}{k - r - 1}\right)\|A - \llbracket A \rrbracket_r\|_{\rm F}^2,$$
where the last equality uses $\|\Sigma_2\|_{\rm F} = \|A - \llbracket A \rrbracket_r\|_{\rm F}$.
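Finally, here is a small Monte Carlo experiment (with illustrative choices of sizes and spectrum) comparing the average squared Frobenius error of the randomized SVD with the bound we just derived.

```python
import numpy as np

# Average the squared Frobenius error of the randomized SVD over many Gaussian
# test matrices and compare with (1 + r/(k - r - 1)) * ||A - [[A]]_r||_F^2.
rng = np.random.default_rng(0)
n, r, k = 60, 5, 12
A = rng.standard_normal((n, n)) @ np.diag(0.8 ** np.arange(n)) @ rng.standard_normal((n, n))
s = np.linalg.svd(A, compute_uv=False)
opt2 = (s[r:] ** 2).sum()
errs = []
for _ in range(2000):
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)
    errs.append(np.linalg.norm(A - Q @ (Q.T @ A), "fro") ** 2)
# The first number should come out no larger than the second (the bound is not tight,
# so there is plenty of slack beyond Monte Carlo error).
print(np.mean(errs), (1 + r / (k - r - 1)) * opt2)
```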