In the comments to my post Does Sketching Work?, I was asked a very interesting question, which I will paraphrase as follows:
Is the sketch-and-solve least-squares solution an unbiased estimate of the true least squares solution of ?
In this post, I will answer this question for the special case in which is a Gaussian sketch, and I will also compute the expected least-squares residual . Throughout this post, I will assume knowledge of sketching; see my previous two posts for a refresher if needed. For this post, will have dimensions and will have dimensions , and these parameters are related .
I thought I knew the answer to this question—that sketch-and-solve suffers from inversion bias. Indeed, the fact I remembered is that
(1)
for the sketching matrices in common use. The issue of the inversion bias (1) is discussed in this paper.However, the inversion bias (1) does not imply that sketch-and-solve is biased. In fact, for a Gaussian sketch1Note that, typically, we use a sketch where the variance of the entries is . The sketch-and-solve solution does not change under a scaling of the matrix . Thus, we are free to take the entries to have variance one.
(2)
sketch-and-solve is unbiased.Theorem (Gaussian sketch-and-solve is unbiased). Let be a Gaussian sketch (2) and let be a full column-rank matrix. Then
(3)
That is, the sketch-and-solve solution is an unbiased estimate for the least-squares solution .
Here, is the Moore–Penrose pseudoinverse, which effectuates the least-squares solution:
In particular, by the normal equations, we have the relation
(4)
valid for any matrix with full column-rank.Let us prove this theorem. The Gaussian sketch (1) is orthogonally invariant: that is, has the same distribution as for any orthogonal matrices and . Thus, we are free to reparametrize. Let
be a (full) singular value decomposition of . Make the changes of variables
After this change of variables, still has the same distribution (2) and
Partition the reparametrized and conformally with :
Under this change of variables, the least-squares solution is
(5)
Now, we are ready to prove the claim (3). Observe that . Begin by using the normal equations (4) to write out the sketch-and-solve solution
Observe that . Thus,
Simplify to obtain
Using the normal equations (4) and the solution formula (5), we obtain
(6)
By the definition (2) of , and are independent and . Thus, taking expectations, we obtainThe desired claim (3) is proven.
Note that the solution formula (6) holds for any sketching matrix . We only use Gaussianity at the last step, where we invoke three properties (1) is mean zero; (2) is orthogonally invariant; and (3) conditional on the first columns of , the last columns of are mean-zero.
The Least-Squares Residual for Gaussian Sketch-and-Solve
The solution formula (6) can be used to understand other properties of the Gaussian sketch-and-solve solution. In this section, we use (6) to understand the expected squared residual norm
where is the sketch-and-solve solution. We write the expected residual norm as
Now, use the Gaussian–inverse Gaussian matrix product formula to compute the expectation, obtaining
Finally, we recognize is the optimal least-squares residual . Thus, we have shown
Note that this is an exact formula for the expected squared residual for the sketch-and-solve solution! Thus, the sketch-and-solve solution has an expected squared residual that factor larger than the optimal value. In particular,
Remark (Existing literature): When I originally wrote this post, I had not seen these results anywhere in the literature. Thanks to Michał Dereziński, who provided me with the following references: the following lecture notes of Mert Pilanci, equation 1 in this recent paper of Michał’s, and this recent survey by Michał and Michael Mahoney. I find it striking that these basic and beautiful results appear to only have been explicitly recorded very recently. Very big thanks also to Rob Webber for providing help in preparing this post.
Update (11/21/24): After this post was initially released, Mert Pilanci provided me with an earlier reference (2020); see also this paper for extensions. He also mentions an even earlier paper (2017) that does similar calculations in a statistical setting, but does not obtain exactly these formulas. See also this paper for exact and asymptomatically exact formulas for sketch-and-solve in the statistical setting.