I am excited to share that my paper Does block size matter in randomized block Krylov low-rank approximation? has recently been released on arXiv. In that paper, we study the randomized block Krylov iteration (RBKI) algorithm for low-rank approximation. Existing results show that RBKI is efficient at producing rank-$k$ approximations with a large block size of $b = k$ or a small block size of $b = 1$, but these results give poor results for intermediate block sizes $1 < b < k$. But often these intermediate block sizes are the most efficient in practice. In our paper, we close this theoretical gap, showing RBKI is efficient for any block size $1 \le b \le k$. Check out the paper for details!
In our paper, the core technical challenge is understanding the condition number of a random block Krylov matrix of the form

$$K = \begin{bmatrix} \Omega & A\Omega & A^2\Omega & \cdots & A^{q-1}\Omega \end{bmatrix},$$

where $A$ is the matrix being approximated and $\Omega$ is a random test matrix with $b$ columns. Analyzing this condition number brings us to a classical topic in numerical analysis, the conditioning of Vandermonde matrices, which is the subject of this post.
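As a concrete illustration (a sketch for this post, not code from the paper; the symmetric test matrix, block size $b$, and depth $q$ are arbitrary choices), here is how one might assemble such a block Krylov matrix in NumPy and inspect its condition number:

```python
import numpy as np

# Sketch: build the block Krylov matrix K = [Omega, A @ Omega, ..., A^(q-1) @ Omega]
# for a random Gaussian test matrix Omega, then inspect its condition number.
rng = np.random.default_rng(0)
n, b, q = 100, 4, 5                     # illustrative dimension, block size, and depth

A = rng.standard_normal((n, n))
A = (A + A.T) / 2                       # an arbitrary symmetric test matrix
Omega = rng.standard_normal((n, b))     # random test matrix with b columns

blocks = [Omega]
for _ in range(q - 1):
    blocks.append(A @ blocks[-1])       # next block: A times the previous block
K = np.hstack(blocks)                   # n x (b*q) block Krylov matrix

print(np.linalg.cond(K))                # 2-norm condition number of K
```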
Evaluating a Polynomial and Vandermonde Matrices
Let us begin with one of the humblest but most important characters in mathematics, the univariate polynomial

$$p(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n.$$

Here, $c_0, c_1, \ldots, c_n$ are the coefficients of $p$, which we collect into a vector $c = (c_0, c_1, \ldots, c_n)$.
Given a polynomial, we often wish to evaluate it at a set of inputs. Specifically, let $x_1, x_2, \ldots, x_m$ be $m$ (distinct) input locations. If we evaluate $p$ at each number, we obtain a list of $m$ (output) values, which we denote by $v = (v_1, v_2, \ldots, v_m)$, each of which is given by the formula

$$v_i = p(x_i) = c_0 + c_1 x_i + c_2 x_i^2 + \cdots + c_n x_i^n \quad \text{for } i = 1, 2, \ldots, m.$$

Observe that the values $v$ depend linearly on the coefficients $c$.
Every linear transformation between vectors can be realized as a matrix–vector product, and the matrix for the coefficients-to-values map is called a Vandermonde matrix. It is given by the formula

$$V = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^n \\ 1 & x_2 & x_2^2 & \cdots & x_2^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_m & x_m^2 & \cdots & x_m^n \end{bmatrix}, \quad \text{so that} \quad v = Vc.$$
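As a quick sanity check (a minimal sketch with arbitrary illustrative locations and coefficients), we can build a small Vandermonde matrix in NumPy and confirm that it realizes the coefficients-to-values map:

```python
import numpy as np

# The Vandermonde matrix realizes the coefficients-to-values map v = V c.
x = np.array([0.0, 0.5, 1.0, 1.5])            # distinct locations x_1, ..., x_m
c = np.array([2.0, -1.0, 3.0])                # coefficients of p(x) = 2 - x + 3x^2

V = np.vander(x, N=len(c), increasing=True)   # columns 1, x, x^2
v = V @ c                                     # values v_i = p(x_i)

# Check against direct evaluation; np.polyval expects highest-degree coefficients first.
assert np.allclose(v, np.polyval(c[::-1], x))
```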
Interpolating by a Polynomial and Inverting a Vandermonde Matrix
Going forward, let us set $m = n + 1$ so the number of locations $m$ equals the number of coefficients $n + 1$. With this choice, the Vandermonde matrix is square, and since the locations are distinct, it is invertible. The Vandermonde matrix maps the vector of coefficients $c$ to the vector of values $v$. Its inverse $V^{-1}$ maps a set of values $v$ to a set of coefficients $c$ defining a polynomial $p$ which interpolates the values $v$:

$$p(x_i) = v_i \quad \text{for } i = 1, 2, \ldots, n+1.$$
To solve the polynomial interpolation problem with Vandermonde matrices, we can do the following. Given values $v_1, \ldots, v_{n+1}$, we first solve the linear system of equations $Vc = v$, obtaining a vector of coefficients $c = V^{-1} v$. Then, define the interpolating polynomial

$$p(x) = c_0 + c_1 x + \cdots + c_n x^n \quad \text{where } c = V^{-1} v. \tag{1}$$
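In code, this recipe is two lines: form $V$ and solve $Vc = v$. Here is a minimal sketch with arbitrary illustrative data:

```python
import numpy as np

# Polynomial interpolation by solving the Vandermonde system V c = v, as in (1).
x = np.array([0.0, 1.0, 2.0, 3.0])     # n+1 distinct locations
v = np.array([1.0, 2.0, 0.0, 5.0])     # prescribed values

V = np.vander(x, increasing=True)      # square (n+1) x (n+1) Vandermonde matrix
c = np.linalg.solve(V, v)              # coefficients of the interpolating polynomial

# The recovered polynomial passes through every (x_i, v_i).
assert np.allclose(np.polyval(c[::-1], x), v)
```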
Lagrange Interpolation
Inverting the Vandermonde matrix is one way to solve the polynomial interpolation problem, but the problem can also be solved directly. To do so, first notice that we can construct a special polynomial, $(x - x_2)(x - x_3) \cdots (x - x_{n+1})$, that is zero at the locations $x_2, \ldots, x_{n+1}$ but nonzero at the first location $x_1$. (Remember that we have assumed that $x_1, \ldots, x_{n+1}$ are distinct.) Further, by rescaling this polynomial to equal one at $x_1$, we obtain the first Lagrange polynomial

$$\ell_1(x) = \frac{(x - x_2)(x - x_3) \cdots (x - x_{n+1})}{(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_{n+1})}. \tag{2}$$

There is nothing special about the first location: For each $i$, we may define the $i$th Lagrange polynomial

$$\ell_i(x) = \prod_{j \ne i} \frac{x - x_j}{x_i - x_j},$$

which equals one at $x_i$ and zero at every other location $x_j$ for $j \ne i$.
With the Lagrange polynomials in hand, the polynomial interpolation problem is easy. To obtain a polynomial whose values are $v_1, \ldots, v_{n+1}$, simply multiply each Lagrange polynomial $\ell_i$ by the value $v_i$ and sum up, obtaining

$$p(x) = \sum_{i=1}^{n+1} v_i \, \ell_i(x). \tag{3}$$
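Formula (3) is also easy to implement directly, without ever forming a Vandermonde matrix. The helper `lagrange_eval` below is a hypothetical function written for this post (not from any library):

```python
import numpy as np

def lagrange_eval(x_nodes, v, t):
    """Evaluate the Lagrange interpolant sum_i v_i * ell_i(t) at the point(s) t."""
    t = np.atleast_1d(t).astype(float)
    p = np.zeros_like(t)
    for i, xi in enumerate(x_nodes):
        # ell_i(t) = prod_{j != i} (t - x_j) / (x_i - x_j)
        others = np.delete(x_nodes, i)
        ell_i = np.prod((t[:, None] - others) / (xi - others), axis=1)
        p += v[i] * ell_i
    return p

x = np.array([0.0, 1.0, 2.0, 3.0])
v = np.array([1.0, 2.0, 0.0, 5.0])
assert np.allclose(lagrange_eval(x, v, x), v)   # interpolates every (x_i, v_i)
```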
From Lagrange to Vandermonde via Elementary Symmetric Polynomials
We now have two ways of solving the polynomial interpolation problem, the Vandermonde way (1) and the Lagrange way (3). Ultimately, the difference between these formulas is one of basis: The Vandermonde formula (1) expresses the interpolating polynomial $p$ as a linear combination of monomials $1, x, x^2, \ldots, x^n$, and the Lagrange formula (3) expresses $p$ as a linear combination of the Lagrange polynomials $\ell_1, \ldots, \ell_{n+1}$. To convert between these formulas, we just need to express the Lagrange polynomial basis in the monomial basis.
To do so, let us examine the Lagrange polynomials more closely. Consider first the case $n + 1 = 4$, and consider the fourth unnormalized Lagrange polynomial

$$(x - x_1)(x - x_2)(x - x_3) = x^3 - (x_1 + x_2 + x_3)\, x^2 + (x_1 x_2 + x_1 x_3 + x_2 x_3)\, x - x_1 x_2 x_3.$$

The coefficients on the right-hand side are sums of all products of one, two, or three of the locations.
Indeed, these expressions are special. Up to a plus-or-minus sign, they are called the elementary symmetric polynomials of the locations $x_1, x_2, x_3$. Specifically, given numbers $y_1, \ldots, y_k$, the $j$th elementary symmetric polynomial $e_j(y_1, \ldots, y_k)$ is defined as the sum of all products $y_{i_1} y_{i_2} \cdots y_{i_j}$ of $j$ distinct values, i.e.,

$$e_j(y_1, \ldots, y_k) = \sum_{1 \le i_1 < i_2 < \cdots < i_j \le k} y_{i_1} y_{i_2} \cdots y_{i_j}, \quad \text{with the convention } e_0(y_1, \ldots, y_k) = 1.$$
The elementary symmetric polynomials appear all the time in mathematics. In particular, they are the coefficients of the characteristic polynomial of a matrix and feature heavily in the theory of determinantal point processes. For our purposes, the key observation will be that the elementary symmetric polynomials appear whenever one expands out an expression like $(x - y_1)(x - y_2) \cdots (x - y_k)$:

Lemma 1 (Expanding a product of linear functions). It holds that

$$\prod_{i=1}^{k} (x - y_i) = \sum_{j=0}^{k} (-1)^j \, e_j(y_1, \ldots, y_k) \, x^{k-j}.$$
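Lemma 1 is easy to verify numerically. The sketch below (with arbitrary numbers $y_i$) expands $\prod_i (x - y_i)$ using NumPy and compares the coefficients against the signed elementary symmetric polynomials:

```python
import numpy as np
from itertools import combinations

# Check of Lemma 1: the coefficient of x^(k-j) in prod_i (x - y_i)
# is (-1)^j e_j(y_1, ..., y_k).
y = np.array([1.0, 2.0, 4.0, 7.0])
k = len(y)

def e(j, ys):
    """j-th elementary symmetric polynomial of the numbers ys (e_0 = 1)."""
    return float(sum(np.prod(c) for c in combinations(ys, j)))

expanded = np.poly(y)                  # monic polynomial with roots y, highest power first
predicted = np.array([(-1.0) ** j * e(j, y) for j in range(k + 1)])
assert np.allclose(expanded, predicted)
```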
Using this fact, we obtain an expression for the Lagrange polynomials in the monomial basis. Let $x_{-i} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n+1})$ denote the list of locations without $x_i$. Then the $i$th Lagrange polynomial is given by

$$\ell_i(x) = \frac{\prod_{j \ne i} (x - x_j)}{\prod_{j \ne i} (x_i - x_j)} = \sum_{j=0}^{n} \frac{(-1)^{n-j} \, e_{n-j}(x_{-i})}{\prod_{j' \ne i} (x_i - x_{j'})} \, x^j.$$

Indeed, we can write the interpolating polynomial as

$$p(x) = \sum_{j=0}^{n} \left( \sum_{i=1}^{n+1} \frac{(-1)^{n-j} \, e_{n-j}(x_{-i})}{\prod_{j' \ne i} (x_i - x_{j'})} \, v_i \right) x^j. \tag{4}$$

Since the coefficient of $x^j$ in (4) is the $j$th entry of $c = V^{-1} v$, formula (4) gives us the entries of the inverse Vandermonde matrix $V^{-1}$.
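As a check on this formula, the following sketch (arbitrary locations) assembles $V^{-1}$ entry by entry from (4) and compares it against a direct numerical inverse:

```python
import numpy as np
from itertools import combinations

# Check of (4): the (j, i) entry of V^{-1} equals
# (-1)^(n-j) e_{n-j}(x_{-i}) / prod_{j' != i} (x_i - x_{j'}).
x = np.array([0.0, 1.0, 2.0, 4.0])
n = len(x) - 1

def e(j, ys):
    """j-th elementary symmetric polynomial of the numbers ys (e_0 = 1)."""
    return float(sum(np.prod(c) for c in combinations(ys, j)))

V = np.vander(x, increasing=True)
Vinv = np.zeros_like(V)
for i in range(n + 1):
    x_minus_i = np.delete(x, i)                  # the list x_{-i}
    denom = np.prod(x[i] - x_minus_i)
    for j in range(n + 1):
        Vinv[j, i] = (-1.0) ** (n - j) * e(n - j, x_minus_i) / denom

assert np.allclose(Vinv, np.linalg.inv(V))
```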
Vandermonde Matrices are Merely Exponentially Ill-Conditioned
Vandermonde matrices are notoriously ill-conditioned, meaning that small changes to the values $v$ can cause large changes to the coefficients $c = V^{-1} v$. On its face, this might seem like the problem of polynomial interpolation itself is ill-conditioned, but this is too hasty a conclusion. After all, it is only the mapping from values $v$ to coefficients $c$ in the monomial basis that is ill-conditioned. Fortunately, there are much better, more numerically stable bases for representing a polynomial, like the Chebyshev polynomials.
But these more stable methods of polynomial interpolation and approximation are not the subject of this post: Here, our task will be to characterize just how ill-conditioned the computation of $c = V^{-1} v$ is. To characterize this ill-conditioning, we will utilize the condition number of the matrix $V$. Given a norm $\|\cdot\|$, the condition number of $V$ is defined to be

$$\kappa(V) = \|V\| \cdot \|V^{-1}\|.$$

Throughout, we will work with the $1$-norm, for which the norm of a matrix is its largest absolute column sum:

$$\|A\|_1 = \max_{j} \sum_{i} |a_{ij}|. \tag{5}$$
Bounding $\|V\|_1$ is straightforward. Indeed, setting $M := \max_i \max(1, |x_i|)$ and using (5), we compute

$$\|V\|_1 = \max_{0 \le j \le n} \sum_{i=1}^{n+1} |x_i|^j \le (n+1) \, M^n.$$
The harder task is bounding $\|V^{-1}\|_1$. Fortunately, we have already done most of the hard work needed to bound this quantity. Using our expression (4) for the entries of $V^{-1}$ and using the formula (5) for the $1$-norm, we have

$$\|V^{-1}\|_1 = \max_{1 \le i \le n+1} \sum_{j=0}^{n} \frac{|e_{n-j}(x_{-i})|}{\prod_{j' \ne i} |x_i - x_{j'}|} \le \max_{1 \le i \le n+1} \frac{\prod_{j \ne i} (1 + |x_j|)}{\prod_{j \ne i} |x_i - x_j|}. \tag{6}$$

Here, we used the triangle inequality $|e_{n-j}(x_{-i})| \le e_{n-j}(|x_{-i}|)$ together with the identity $\sum_{s=0}^{n} e_s(|x_{-i}|) = \prod_{j \ne i} (1 + |x_j|)$, which follows from Lemma 1. The result (6) is Gautschi’s bound; in fact, it holds with equality when all of the locations are nonnegative.
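The bound (6) is simple to check numerically. In the sketch below (arbitrary nonnegative locations, for which (6) holds with equality), the two sides agree up to roundoff:

```python
import numpy as np

# Check of Gautschi's bound (6) on ||V^{-1}||_1.
x = np.array([0.3, 1.1, 2.0, 3.7, 5.2])
n = len(x) - 1

V = np.vander(x, increasing=True)
lhs = np.linalg.norm(np.linalg.inv(V), 1)       # ||V^{-1}||_1, the max column sum

bound = max(
    np.prod(1 + np.abs(np.delete(x, i))) / np.prod(np.abs(x[i] - np.delete(x, i)))
    for i in range(n + 1)
)
print(lhs, bound)                               # equal up to roundoff for nonnegative locations
assert lhs <= bound * (1 + 1e-10)
```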
Often, it is helpful to simplify Gautschi’s bound a bit. Setting $M := \max_i \max(1, |x_i|)$ as above, the numerator in (6) is bounded as $\prod_{j \ne i} (1 + |x_j|) \le (2M)^n$. To bound the denominator, let $\delta := \min_{i \ne j} |x_i - x_j|$ be the smallest distance between two locations. Using $1 + |x_j| \le 2M$ and $|x_i - x_j| \ge \delta$, we can weaken the bound (6) to obtain

Theorem 2 (Gautschi’s bound, simplified). Introduce $M := \max_i \max(1, |x_i|)$ and $\delta := \min_{i \ne j} |x_i - x_j|$. Then

$$\|V\|_1 \le (n+1) \, M^n \quad \text{and} \quad \|V^{-1}\|_1 \le \left( \frac{2M}{\delta} \right)^n,$$

and consequently

$$\kappa(V) = \|V\|_1 \, \|V^{-1}\|_1 \le (n+1) \left( \frac{2M^2}{\delta} \right)^n.$$
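Here is a small numerical comparison (again with arbitrary locations) of the true condition number against the simplified bound of Theorem 2. The bound is loose, but, as promised, only exponentially large:

```python
import numpy as np

# Compare kappa_1(V) with the simplified Gautschi bound (n+1) (2 M^2 / delta)^n.
x = np.array([0.3, 1.1, 2.0, 3.7, 5.2])
n = len(x) - 1

V = np.vander(x, increasing=True)
kappa = np.linalg.cond(V, 1)                    # 1-norm condition number

M = max(1.0, np.max(np.abs(x)))
off_diagonal = ~np.eye(len(x), dtype=bool)
delta = np.min(np.abs(x[:, None] - x[None, :])[off_diagonal])
bound = (n + 1) * (2 * M**2 / delta) ** n

print(f"kappa_1(V) = {kappa:.3e}, bound = {bound:.3e}")
assert kappa <= bound
```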
Gautschi’s bound suggests that Vandermonde matrices can be very ill-conditioned, which is disappointing. But Gautschi’s bound also shows that Vandermonde matrices are merely exponentially ill-conditioned; that is, they are no worse than exponentially ill-conditioned. The fact that Vandermonde matrices are only exponentially ill-conditioned plays a crucial role in our analysis of randomized block Krylov iteration.
Gautschi’s Bound as a Robust Version of the Fundamental Theorem of Algebra
The fundamental theorem of algebra states that a degree-$n$ polynomial has precisely $n$ roots, counted with multiplicity. Consequently, at $n + 1$ distinct locations, it must be nonzero at at least one of them. But how nonzero must the polynomial be at that one location? How large must it be? On this subject, the fundamental theorem of algebra is silent. However, Gautschi’s bound provides an answer.
To answer this question, we ask: What is the minimum possible size of the values $v_1, \ldots, v_{n+1}$? Well, if we set all the coefficients $c_0, \ldots, c_n$ to zero, then the values are all zero as well. So to avoid this trivial case, we should enforce a normalization condition on the coefficient vector $c$, say $\|c\|_1 = |c_0| + |c_1| + \cdots + |c_n| = 1$. With this setting, we are ready to compute. Begin by observing that $v = V c$, so a lower bound on the size of $v$ follows from the following general fact:
Proposition 3 (Minimum stretch). For a vector norm $\|\cdot\|$ and its induced operator norm, it holds for any square invertible matrix $A$ and any vector $z$ that

$$\|A z\| \ge \frac{\|z\|}{\|A^{-1}\|}.$$

Indeed, $\|z\| = \|A^{-1} (A z)\| \le \|A^{-1}\| \, \|A z\|$, and rearranging gives the claim.
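For the skeptical, here is a quick numerical check of Proposition 3 in the 1-norm, using an arbitrary random matrix and vector:

```python
import numpy as np

# Check of Proposition 3: ||A z||_1 >= ||z||_1 / ||A^{-1}||_1 for invertible A.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))                 # random matrices are invertible almost surely
z = rng.standard_normal(5)

lhs = np.linalg.norm(A @ z, 1)
rhs = np.linalg.norm(z, 1) / np.linalg.norm(np.linalg.inv(A), 1)
assert lhs >= rhs * (1 - 1e-12)
```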
Using this result, we obtain the lower bound $\|v\|_1 \ge \|c\|_1 / \|V^{-1}\|_1 = 1 / \|V^{-1}\|_1$ on the values $v$ of a polynomial with coefficients normalized so that $\|c\|_1 = 1$. Combining with Gautschi’s bound gives the following robust version of the fundamental theorem of algebra:

Theorem 4 (Robust fundamental theorem of algebra). Fix a polynomial $p(x) = c_0 + c_1 x + \cdots + c_n x^n$ normalized so that $|c_0| + |c_1| + \cdots + |c_n| = 1$, and fix distinct locations $x_1, \ldots, x_{n+1}$. Define $M := \max_i \max(1, |x_i|)$ and $\delta := \min_{i \ne j} |x_i - x_j|$. Then

$$\sum_{i=1}^{n+1} |p(x_i)| \ge \left( \frac{\delta}{2M} \right)^n.$$
Thus, at $n + 1$ distinct locations, a degree-$n$ polynomial must be nonzero at at least one point. In fact, the sum of the absolute values of the polynomial at these $n + 1$ locations can be no worse than exponentially small in $n$.