The Better Way to Convert an SVD into a Symmetric Eigenvalue Problem

A singular value decomposition of an m\times n matrix B is a factorization of the form B = U\Sigma V^\top, where U and V are square, orthogonal matrices and \Sigma is a diagonal matrix with (i,i)th entry \sigma_i \ge 0. (Everything carries over essentially unchanged for complex-valued matrices B, with U and V being unitary matrices and every (\cdot)^\top replaced by (\cdot)^*, where (\cdot)^* denotes the Hermitian transpose.) The diagonal entries of \Sigma are referred to as the singular values of B and are conventionally ordered \sigma_{\rm max} = \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min(m,n)} = \sigma_{\rm min}. The columns of the matrices U and V are referred to as the left and right singular vectors of B, respectively, and satisfy the relations Bv_i = \sigma_i u_i and B^\top u_i = \sigma_i v_i.

One can obtain the singular values and the left and right singular vectors of B from the eigenvalues and eigenvectors of B^\top B and BB^\top. This follows from the calculations B^\top B = V\Sigma^2 V^\top and BB^\top = U\Sigma^2 U^\top. In other words, the nonzero singular values of B are the square roots of the nonzero eigenvalues of B^\top B and BB^\top. If one solves just one of these eigenvalue problems, computing \Sigma along with U or V, one can obtain the other matrix via U = BV \Sigma^{-1} or V = B^\top U \Sigma^{-1}. (These formulas are valid for invertible square matrices B, but similar formulas hold for singular or rectangular B to compute the singular vectors with nonzero singular values.)
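
For concreteness, here is a minimal numpy sketch of this route (an illustration of my own, not from any particular library, assuming B has full column rank so that \Sigma is invertible):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((100, 30))

# Eigendecomposition of B^T B: eigenvalues are the squared singular values.
# np.linalg.eigh returns eigenvalues in ascending order, so flip to match
# the descending convention for singular values.
evals, V = np.linalg.eigh(B.T @ B)
sigma = np.sqrt(np.maximum(evals[::-1], 0.0))
V = V[:, ::-1]

# Recover the left singular vectors via U = B V Sigma^{-1}
# (valid here since all sigma > 0).
U = B @ V / sigma

print(np.allclose(B, U * sigma @ V.T))  # reconstructs B
```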

This approach is often undesirable for several reasons. Here are a few I’m aware of:

  1. Accuracy: Roughly speaking, in double-precision arithmetic, accurate and stable numerical methods can resolve differences on the order of 16 orders of magnitude. This means an accurately computed SVD of B can resolve roughly 16 orders of magnitude of decaying singular values, with singular values smaller than that difficult to compute accurately. By computing B^\top B, we square all of our singular values, so resolving 16 orders of magnitude of the eigenvalues of B^\top B means we only resolve 8 orders of magnitude of the singular values of B. (Relatedly, the two-norm condition number \kappa(B) := \sigma_{\rm max}(B) / \sigma_{\rm min}(B) of B^\top B is the square of that of B.) The dynamic range of our numerical computations has been cut in half! (See the numerical sketch after this list.)
  2. Loss of orthogonality: While U = BV \Sigma^{-1} and V = B^\top U \Sigma^{-1} are valid formulas in exact arithmetic, they fare poorly when implemented numerically. Specifically, the numerically computed values U_{\rm numerical} and V_{\rm numerical} may not be orthogonal matrices with, for example, U_{\rm numerical}^\top U_{\rm numerical} not even close to the identity matrix. One can, of course, orthogonalize the computed U or V, but this doesn’t fix the underlying problem that U or V have not been computed accurately.
  3. Loss of structure: If B possesses additional structure (e.g. sparsity), this structure may be lost or reduced by computing the product B^\top B.
  4. Nonlinearity: Even if we’re not actually computing the SVD numerically but doing analysis with pencil and paper, finding the SVD of B from B^\top B has the disadvantage of performing a nonlinear transformation on B. This prevents us from utilizing additive perturbation theorems for sums of symmetric matrices in our analysis. (For instance, one cannot prove Weyl’s perturbation theorem for singular values by considering B^\top B and applying Weyl’s perturbation theorem for symmetric eigenvalues.)
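
To see the accuracy issue (point 1) in action, here is a minimal numpy sketch of my own: we synthesize a matrix whose singular values span twelve orders of magnitude and compare a direct SVD against the B^\top B route.

```python
import numpy as np

# Synthesize a matrix with singular values spanning 12 orders of magnitude.
sigma_true = np.logspace(0, -12, 30)
rng = np.random.default_rng(1)
U0, _ = np.linalg.qr(rng.standard_normal((30, 30)))
V0, _ = np.linalg.qr(rng.standard_normal((30, 30)))
B = U0 * sigma_true @ V0.T

# A direct SVD resolves all 12 orders of magnitude...
sigma_svd = np.linalg.svd(B, compute_uv=False)

# ...but the B^T B route squares the singular values: eigenvalues smaller
# than ~1e-16 relative to the largest are swamped by roundoff, so singular
# values below ~1e-8 come out inaccurate.
evals = np.linalg.eigh(B.T @ B)[0]
sigma_btb = np.sqrt(np.maximum(evals[::-1], 0.0))

print(np.abs(sigma_svd - sigma_true).max())  # tiny: near machine precision
print(np.abs(sigma_btb - sigma_true).max())  # much larger: on the order of 1e-8
```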

There are times when these problems are insignificant and this approach is sensible; we shall return to this point in a bit. However, these problems should disqualify this approach from being the de facto way we reduce SVD computation to a symmetric eigenvalue problem. This is especially true since we have a better way.

The better way is by constructing the so-called Hermitian dilation of B. (As Stewart and Sun detail in Section 4 of Chapter 1 of their monograph Matrix Perturbation Theory, the connections between the Hermitian dilation and the SVD go back to the discovery of the SVD itself: the matrix is used in Jordan’s construction of the SVD in 1874. The SVD was also independently discovered by Beltrami the previous year. Stewart and Sun refer to this matrix as the Jordan-Wielandt matrix associated with B, as they attribute the widespread use of the matrix today to the work of Wielandt. We shall stick to the term Hermitian dilation to refer to this matrix.) The Hermitian dilation is defined to be the matrix

(1)   \begin{equation*} \mathcal{H}(B) = \begin{bmatrix} 0 & B \\ B^\top & 0 \end{bmatrix}. \end{equation*}

One can show that the nonzero eigenvalues of \mathcal{H}(B) are precisely plus-or-minus the singular values of B. More specifically, we have

(2)   \begin{equation*} \mathcal{H}(B) \begin{bmatrix} u_i \\ \pm v_i \end{bmatrix} = \pm \sigma_i \begin{bmatrix} u_i \\ \pm v_i \end{bmatrix}. \end{equation*}

All of the remaining eigenvalues of \mathcal{H}(B) not of this form are zero. (This follows by noting \operatorname{rank}(\mathcal{H}(B)) = 2\operatorname{rank}(B), so that \pm \sigma_i for i = 1,2,\ldots,\operatorname{rank}(B) account for all the nonzero eigenvalues of \mathcal{H}(B).) Thus, the singular value decomposition of B is entirely encoded in the eigenvalue decomposition of \mathcal{H}(B).
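
For concreteness, here is a minimal numpy sketch (my own illustration) that forms \mathcal{H}(B) explicitly and reads off the SVD from its eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 7, 4
B = rng.standard_normal((m, n))

# Form the Hermitian dilation H(B) = [[0, B], [B^T, 0]] explicitly.
H = np.block([[np.zeros((m, m)), B],
              [B.T, np.zeros((n, n))]])

evals, evecs = np.linalg.eigh(H)

# The top n eigenvalues of H(B) are the singular values of B; per
# equation (2), the corresponding eigenvectors are (u_i; v_i) / sqrt(2).
sigma = evals[::-1][:n]
print(np.allclose(sigma, np.linalg.svd(B, compute_uv=False)))

# Recover unit-norm singular vectors from the eigenvectors.
w = evecs[:, ::-1][:, :n]
U, V = w[:m] * np.sqrt(2), w[m:] * np.sqrt(2)
print(np.allclose(B @ V, U * sigma))  # checks B v_i = sigma_i u_i
```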

This approach of using the Hermitian dilation \mathcal{H}(B) to compute the SVD of B fixes all the issues identified with the “B^\top B” approach. We are able to accurately resolve a full 16 orders of magnitude of singular values. The computed singular vectors are accurate and numerically orthogonal, provided we use an accurate method for the symmetric eigenvalue problem. The Hermitian dilation \mathcal{H}(B) preserves important structural characteristics of B, like sparsity. And for purposes of theoretical analysis, the mapping B \mapsto \mathcal{H}(B) is linear. (The linearity of the Hermitian dilation gives direct extensions of most results about symmetric eigenvalues to singular values; see Exercise 22.)

Often one can work with the Hermitian dilation only implicitly: the matrix \mathcal{H}(B) need not actually be stored in memory with all its extra zeros. The programmer designs and implements an algorithm with \mathcal{H}(B) in mind, but works with the matrix B directly in their computations. In a pinch, however, forming \mathcal{H}(B) explicitly in software and applying an off-the-shelf symmetric eigenvalue routine is often not much less efficient than a dedicated SVD routine and can cut down on programmer effort significantly.
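
As a sketch of the implicit approach (my own illustration, assuming scipy is available; the helper name H_matvec is mine), one can wrap the matrix-vector product with \mathcal{H}(B) in a scipy LinearOperator and hand it to an iterative eigensolver, never forming \mathcal{H}(B):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

rng = np.random.default_rng(3)
m, n = 2000, 500
B = rng.standard_normal((m, n))  # in practice, perhaps a large sparse matrix

# Matrix-vector product with H(B) = [[0, B], [B^T, 0]] applied to a vector
# partitioned as x = (x_top; x_bottom), without ever forming H(B).
def H_matvec(x):
    return np.concatenate([B @ x[m:], B.T @ x[:m]])

H = LinearOperator((m + n, m + n), matvec=H_matvec, dtype=B.dtype)

# Largest (algebraic) eigenvalues of H(B) = largest singular values of B.
top = eigsh(H, k=5, which='LA', return_eigenvectors=False)
print(np.sort(top)[::-1])
print(np.linalg.svd(B, compute_uv=False)[:5])  # should agree
```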

As with all things in life, there’s no free lunch here. There are a couple of downsides to the Hermitian dilation approach. First, \mathcal{H}(B) is, except for the trivial case B = 0, an indefinite symmetric matrix. By contrast, B^\top B and BB^\top are positive semidefinite, which can be helpful in some contexts. (This is relevant if, say, we want to find the small singular values by inverse iteration; positive definite linear systems are easier to solve by either direct or iterative methods.) Further, if n\ll m (respectively, m \ll n), then B^\top B (respectively, BB^\top) is tiny compared to \mathcal{H}(B), so it might be considerably cheaper to compute an eigenvalue decomposition of B^\top B (or BB^\top) than of \mathcal{H}(B).

Despite the somewhat provocative title of this article, the B^\top B and Hermitian dilation approaches both have their role, and the purpose of this article is not to say the B^\top B approach should be thrown in the dustbin. However, in my experience, I frequently hear the B^\top B approach stated as the definitive way of converting an SVD into an eigenvalue problem, with the Hermitian dilation approach not even mentioned. This, in my opinion, is backwards. For accuracy reasons alone, the Hermitian dilation should be the go-to tool for turning SVDs into symmetric eigenvalue problems, with the B^\top B approach reserved for when the singular values are known to span only a few orders of magnitude, or when B is tall and skinny and the computational cost savings of the B^\top B approach are vital.

6 thoughts on “The Better Way to Convert an SVD into a Symmetric Eigenvalue Problem”

    1. This is an excellent point and probably something I should have explored in the original post. Being able to move to single or even half precision, which, as you mentioned, can be significantly faster and require less data movement, is an important reason to get as much accuracy as you can out of your floating-point number system. Sixteen orders of magnitude of dynamic range in double precision may seem unnecessary, but the difference between two and four orders of magnitude of range in half precision is the difference between half precision being useless and totally fine in many applications.

  1. Even in the extremely skinny (or fat) case the Hermitian dilation can be made just as efficient by first doing a QR (or LQ) factorization and then doing the Hermitian dilation on R (or L).
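
(Editorial aside: for concreteness, here is a minimal numpy sketch of the trick described in this comment, for the tall case m \ge n; this illustration is my own.)

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 10000, 50
B = rng.standard_normal((m, n))  # extremely tall and skinny

# Reduced QR factorization: B = Q R with Q (m x n), R (n x n).
Q, R = np.linalg.qr(B)

# Hermitian dilation of the small factor R: only 2n x 2n.
H = np.block([[np.zeros((n, n)), R],
              [R.T, np.zeros((n, n))]])
evals, evecs = np.linalg.eigh(H)

sigma = evals[::-1][:n]  # singular values of R equal those of B
V = evecs[n:, ::-1][:, :n] * np.sqrt(2)        # right singular vectors
U = Q @ (evecs[:n, ::-1][:, :n] * np.sqrt(2))  # left vectors, lifted through Q

print(np.allclose(sigma, np.linalg.svd(B, compute_uv=False)))
```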

  2. Nice post! One question for the very beginning: Is there a quick way to see that B^T B and BB^T have the same eigenvalues (I mean the ones which are nonzero)? Using the SVD it’s obvious, but without it?

    1. This is a consequence of the more general fact that BC and CB always have the same nonzero eigenvalues (even if B, C are rectangular). There are many, many proofs of this fact (Fuzhen Zhang’s book contains 4!). One of my favorites (specific to the case when B, C are square and nonsingular): BC = B(CB)B⁻¹, so BC and CB are similar and possess the same eigenvalues. (In fact, the nonsingular square case implies the general singular or rectangular case by continuity of eigenvalues and padding rectangular matrices by zeros until they are square.)
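
(Editorial aside: a quick numerical sanity check of the rectangular case, my own addition. For random B and C, the nonzero eigenvalues of BC and CB typically agree to rounding error, with BC carrying extra numerically tiny zeros.)

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((5, 2))
C = rng.standard_normal((2, 5))

ev_BC = np.linalg.eigvals(B @ C)  # 5 eigenvalues: 2 nonzero plus 3 (near) zeros
ev_CB = np.linalg.eigvals(C @ B)  # 2 eigenvalues, generically nonzero

# Discard the numerically tiny eigenvalues of BC and compare the rest.
nonzero_BC = ev_BC[np.abs(ev_BC) > 1e-10]
print(np.sort_complex(nonzero_BC))
print(np.sort_complex(ev_CB))  # should match the line above
```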
