{"id":1734,"date":"2024-01-28T02:53:09","date_gmt":"2024-01-28T02:53:09","guid":{"rendered":"https:\/\/www.ethanepperly.com\/?p=1734"},"modified":"2024-01-28T02:56:09","modified_gmt":"2024-01-28T02:56:09","slug":"dont-use-gaussians-in-stochastic-trace-estimation","status":"publish","type":"post","link":"https:\/\/www.ethanepperly.com\/index.php\/2024\/01\/28\/dont-use-gaussians-in-stochastic-trace-estimation\/","title":{"rendered":"Don&#8217;t Use Gaussians in Stochastic Trace Estimation"},"content":{"rendered":"\n<p>Suppose we are interested in estimating the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Trace_(linear_algebra)\">trace<\/a> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-ace410778fb72abb4014978d27f86b56_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#114;&#40;&#65;&#41;&#32;&#61;&#32;&#92;&#115;&#117;&#109;&#95;&#123;&#105;&#61;&#49;&#125;&#94;&#110;&#32;&#65;&#95;&#123;&#105;&#105;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"132\" style=\"vertical-align: -5px;\"\/> of an <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-41c0efe7f3a82ce93dc2542200956ad7_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"9\" width=\"43\" style=\"vertical-align: 0px;\"\/> matrix <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/> that can be only accessed through matrix\u2013vector products <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-a32e1874620708e26607220d3c24baba_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;&#120;&#95;&#49;&#44;&#92;&#108;&#100;&#111;&#116;&#115;&#44;&#65;&#120;&#95;&#109;\" title=\"Rendered by QuickLaTeX.com\" height=\"17\" width=\"106\" style=\"vertical-align: -4px;\"\/>. The classical method for this purpose is the <a href=\"https:\/\/doi.org\/10.1007\/BF01395775\">Girard<\/a>\u2013<a href=\"https:\/\/doi.org\/10.1080\/03610918908812806\">Hutchinson<\/a> estimator <p class=\"ql-center-displayed-equation\" style=\"line-height: 36px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-f27a0c05da428b96172a0a8ab23ca54b_l3.png\" height=\"36\" width=\"265\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#109;&#125;&#32;&#92;&#108;&#101;&#102;&#116;&#40;&#32;&#120;&#95;&#49;&#94;&#92;&#116;&#111;&#112;&#32;&#65;&#120;&#95;&#49;&#32;&#43;&#32;&#92;&#99;&#100;&#111;&#116;&#115;&#32;&#43;&#32;&#120;&#95;&#109;&#94;&#92;&#116;&#111;&#112;&#32;&#65;&#120;&#95;&#109;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#41;&#44;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>where the vectors <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-01ba24a8050dda982891c42710ce82c4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#49;&#44;&#92;&#108;&#100;&#111;&#116;&#115;&#44;&#120;&#95;&#109;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"79\" style=\"vertical-align: -4px;\"\/> are <a href=\"https:\/\/en.wikipedia.org\/wiki\/Independence_(probability_theory)\">independent<\/a>, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Independent_and_identically_distributed_random_variables\">identically distributed<\/a> (iid) random vectors satisfying the <a href=\"https:\/\/math.stackexchange.com\/questions\/2233936\/how-can-i-tell-when-a-random-a-vector-is-isotropic\">isotropy condition <\/a><p class=\"ql-center-displayed-equation\" style=\"line-height: 22px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-b5ac53011350edc256d9b8c66fdbe7c8_l3.png\" height=\"22\" width=\"96\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#92;&#101;&#120;&#112;&#101;&#99;&#116;&#091;&#120;&#95;&#105;&#120;&#95;&#105;&#94;&#92;&#116;&#111;&#112;&#093;&#32;&#61;&#32;&#73;&#46;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>Examples of vectors satisfying this condition include<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong><a href=\"https:\/\/en.wikipedia.org\/wiki\/Multivariate_normal_distribution#Standard_normal_random_vector\">Gaussian<\/a>:<\/strong> The entries of each <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-44092f757392a8bd31d874dabe451fd3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/> are iid <a href=\"https:\/\/en.wikipedia.org\/wiki\/Normal_distribution#Standard_normal_distribution\">standard Gaussian random variables<\/a>.<\/li>\n\n\n\n<li><strong>Random signs:<\/strong> The entries of each <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-44092f757392a8bd31d874dabe451fd3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/> are iid random numbers taking the values <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-7b7086fe465164ee1c46f7c8b3f43eaf_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#112;&#109;&#32;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"22\" style=\"vertical-align: 0px;\"\/> with equal probabilities (i.e., <a href=\"https:\/\/en.wikipedia.org\/wiki\/Rademacher_distribution\">Rademacher random variables<\/a>).<\/li>\n\n\n\n<li><strong><a href=\"https:\/\/en.wikipedia.org\/wiki\/N-sphere\">Sphere<\/a>:<\/strong> Each <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-44092f757392a8bd31d874dabe451fd3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/> is a uniformly random point on the (Euclidean) sphere of radius <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-0bbb71c94391d90ce47d108084081941_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#113;&#114;&#116;&#123;&#110;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"25\" style=\"vertical-align: -4px;\"\/>, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-dd818958ecf7d433c3c06f16c43fedfc_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#105;&#32;&#92;&#115;&#105;&#109;&#32;&#92;&#111;&#112;&#101;&#114;&#97;&#116;&#111;&#114;&#110;&#97;&#109;&#101;&#123;&#85;&#110;&#105;&#102;&#125;&#32;&#92;&#123;&#32;&#121;&#32;&#92;&#105;&#110;&#32;&#92;&#109;&#97;&#116;&#104;&#98;&#98;&#123;&#82;&#125;&#94;&#110;&#32;&#58;&#32;&#121;&#94;&#92;&#116;&#111;&#112;&#32;&#121;&#32;&#61;&#32;&#110;&#32;&#92;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"224\" style=\"vertical-align: -5px;\"\/>.<\/li>\n<\/ul>\n\n\n\n<p>Stochastic trace estimation has a number of applications:&nbsp;<a href=\"https:\/\/doi.org\/10.1007\/s00211-017-0880-z\">log-determinant computations in machine learning<\/a>,&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/2301.07825\">partition function calculations in statistical physics<\/a>,&nbsp;<a href=\"https:\/\/doi.org\/10.1007\/BF01395775\">generalized cross validation for smoothing splines<\/a>, and <a href=\"https:\/\/www.ethanepperly.com\/index.php\/2022\/08\/02\/big-ideas-in-applied-math-concentration-inequalities\/#triangle-counting\">triangle counting in large networks<\/a>. Several improvements to the basic Girard\u2013Hutchinson estimator <a href=\"https:\/\/arxiv.org\/abs\/2010.09649\">have<\/a> <a href=\"https:\/\/arxiv.org\/abs\/2109.10659\">been<\/a> <a href=\"https:\/\/arxiv.org\/abs\/2205.01736\">developed<\/a> <a href=\"https:\/\/arxiv.org\/abs\/2311.07035\">recently<\/a>. I am partial to <a href=\"https:\/\/arxiv.org\/abs\/2301.07825\">XTrace<\/a>, an improved trace estimator that I developed with <a href=\"https:\/\/tropp.caltech.edu\">my<\/a> <a href=\"https:\/\/rwebber.people.caltech.edu\">collaborators<\/a>.<\/p>\n\n\n\n<p>This post is addressed at the question:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Which distribution should be used for the test vectors <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-44092f757392a8bd31d874dabe451fd3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/> for stochastic trace estimation?<\/p>\n<\/blockquote>\n\n\n\n<p>Since the Girard\u2013Hutchinson estimator is <a href=\"https:\/\/en.wikipedia.org\/wiki\/Bias_of_an_estimator\">unbiased<\/a> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-ad9e9d417b59555f2a6e78e78b6bcbe9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#101;&#120;&#112;&#101;&#99;&#116;&#091;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#093;&#32;&#61;&#32;&#92;&#116;&#114;&#40;&#65;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"99\" style=\"vertical-align: -5px;\"\/>, the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Variance\">variance<\/a> of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-73419300cbdf7c3bc535ccb1e5b66722_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"17\" width=\"14\" style=\"vertical-align: 0px;\"\/> is equal to the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Mean_squared_error\">mean-square error<\/a>. Thus, the lowest variance trace estimate is the most accurate. In <a href=\"https:\/\/www.ethanepperly.com\/index.php\/2023\/01\/26\/stochastic-trace-estimation\/\">my previous post on trace estimation<\/a>, I discussed formulas for the variance <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-2eaa501794fbc23c7055c81c8d921e68_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"54\" style=\"vertical-align: -5px;\"\/> of the Girard\u2013Hutchinson estimator with different choices of test vectors. In that post, I stated the formulas for different choices of test vectors (Gaussian, random signs, sphere) and showed how those formulas could be proven.<\/p>\n\n\n\n<p>In this post, I will take the opportunity to editorialize on which distribution to pick. The thesis of this post is as follows:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>The sphere distribution is essentially always preferable to the Gaussian distribution for trace estimation.<\/p>\n<\/blockquote>\n\n\n\n<p>To explain why, let&#8217;s focus on the case when <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/> is real and symmetric.<sup class=\"modern-footnotes-footnote \" data-mfn=\"1\" data-mfn-post-scope=\"00000000000005840000000000000000_1734\"><a href=\"javascript:void(0)\"  role=\"button\" aria-pressed=\"false\" aria-describedby=\"mfn-content-00000000000005840000000000000000_1734-1\">1<\/a><\/sup><span id=\"mfn-content-00000000000005840000000000000000_1734-1\" role=\"tooltip\" class=\"modern-footnotes-footnote__note\" tabindex=\"0\" data-mfn=\"1\">The same principles hold in the general case, but the variance formulas are more delicate to state. See <a href=\"https:\/\/www.ethanepperly.com\/index.php\/2023\/01\/26\/stochastic-trace-estimation\/\">my previous post<\/a> for the formulas.<\/span> Let <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-1f56841840850ad8399f6deaa7722d74_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#95;&#49;&#44;&#92;&#108;&#100;&#111;&#116;&#115;&#44;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#95;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"75\" style=\"vertical-align: -4px;\"\/> be the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Spectral_theorem#Finite-dimensional_case\">eigenvalues<\/a> of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/> and define the eigenvalue mean <p class=\"ql-center-displayed-equation\" style=\"line-height: 36px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-3f9e9cf25f6cf20463de5be7e8d40f88_l3.png\" height=\"36\" width=\"143\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#95;&#49;&#32;&#43;&#32;&#92;&#99;&#100;&#111;&#116;&#115;&#32;&#43;&#32;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#95;&#110;&#125;&#123;&#110;&#125;&#46;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>Then the variance of the Girard\u2013Hutchinson estimator with Gaussian vectors <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-44092f757392a8bd31d874dabe451fd3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/> is <p class=\"ql-center-displayed-equation\" style=\"line-height: 49px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-f8dc441dc68294b0d6ff6af42d3ecef5_l3.png\" height=\"49\" width=\"229\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#71;&#97;&#117;&#115;&#115;&#105;&#97;&#110;&#125;&#41;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#109;&#125;&#32;&#92;&#99;&#100;&#111;&#116;&#32;&#50;&#32;&#92;&#115;&#117;&#109;&#95;&#123;&#105;&#61;&#49;&#125;&#94;&#110;&#32;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#95;&#105;&#94;&#50;&#46;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>For vectors <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-44092f757392a8bd31d874dabe451fd3_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/> drawn from the sphere, we have <p class=\"ql-center-displayed-equation\" style=\"line-height: 49px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-6a275f8ba8a47be085e22f679b60b497_l3.png\" height=\"49\" width=\"319\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#112;&#104;&#101;&#114;&#101;&#125;&#41;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#109;&#125;&#32;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#110;&#125;&#123;&#110;&#43;&#50;&#125;&#32;&#92;&#99;&#100;&#111;&#116;&#32;&#50;&#92;&#115;&#117;&#109;&#95;&#123;&#105;&#61;&#49;&#125;&#94;&#110;&#32;&#40;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#95;&#105;&#32;&#45;&#32;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#125;&#41;&#94;&#50;&#46;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<p>The sphere distribution improves on the Gaussian distribution in two ways. First, the variance of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-2797e5cf0b76458b3e9bfad1064d0abf_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#112;&#104;&#101;&#114;&#101;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"93\" style=\"vertical-align: -6px;\"\/> is smaller than <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-ad1f85dd411f50bb415b59255a41d9db_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#71;&#97;&#117;&#115;&#115;&#105;&#97;&#110;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"110\" style=\"vertical-align: -5px;\"\/>by a factor of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-01d8695fbf5c391cf78523c85fa6317c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;&#47;&#40;&#110;&#43;&#50;&#41;&#32;&#60;&#32;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"106\" style=\"vertical-align: -5px;\"\/>. This improvement is quite minor. Second, and more importantly, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-ad1f85dd411f50bb415b59255a41d9db_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#71;&#97;&#117;&#115;&#115;&#105;&#97;&#110;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"110\" style=\"vertical-align: -5px;\"\/> is proportional to the sum of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/>&#8216;s squared eigenvalues whereas <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-2797e5cf0b76458b3e9bfad1064d0abf_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#112;&#104;&#101;&#114;&#101;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"93\" style=\"vertical-align: -6px;\"\/> is proportional to the sum of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/>&#8216;s squared eigenvalues <em>after having been shifted to be mean-zero<\/em>!<\/p>\n\n\n\n<p>The difference between Gaussian and sphere test vectors can be large. To see this, consider a <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-6dbf2012bdd1602aa8d50e2e4e28a775_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#49;&#48;&#48;&#48;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#49;&#48;&#48;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"92\" style=\"vertical-align: 0px;\"\/> matrix <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/> with eigenvalues uniformly distributed between <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-7476d37b0d659713eb80723f8fbee627_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#48;&#46;&#57;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"23\" style=\"vertical-align: 0px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-df7e1231dfc65a575d45f74260d1469b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#49;&#46;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"21\" style=\"vertical-align: 0px;\"\/> with a (<a href=\"https:\/\/nhigham.com\/2020\/04\/22\/what-is-a-random-orthogonal-matrix\/\">Haar orthgonal<\/a>) random matrix of eigenvectors. For simplicity, since the variance of all Girard\u2013Hutchinson estimates is proportional to <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-5ffaeb59142daea267b37aa54360080a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#49;&#47;&#109;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"32\" style=\"vertical-align: -5px;\"\/>, we take <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-3d87f6d26309c921d4d6fa6518ca1bca_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#109;&#61;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"47\" style=\"vertical-align: 0px;\"\/>. Below show the variance of Girard\u2013Hutchinson estimator for different distributions for the test vector. We see that <strong>the sphere distribution leads to a trace estimate which has a variance 300\u00d7 smaller than the Gaussian distribution<\/strong>. For this example, the sphere and random sign distributions are similar.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>Distribution<\/strong><\/td><td><strong>Variance (divided by <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-787b54d93fe33d7fa694583d8d806cdf_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#114;&#40;&#65;&#41;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"48\" style=\"vertical-align: -5px;\"\/>)<\/strong><\/td><\/tr><tr><td>Gaussian<\/td><td><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-870b7234d4d51d6b080aa05d89b542dd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#50;&#46;&#48;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#49;&#48;&#94;&#123;&#45;&#51;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"80\" style=\"vertical-align: 0px;\"\/><\/td><\/tr><tr><td>Sphere<\/td><td><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-8ad0dfecaeaee686412ed9d8879bd70e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#54;&#46;&#55;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#49;&#48;&#94;&#123;&#45;&#54;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"80\" style=\"vertical-align: 0px;\"\/><\/td><\/tr><tr><td>Random signs<\/td><td><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-8ad0dfecaeaee686412ed9d8879bd70e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#54;&#46;&#55;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#49;&#48;&#94;&#123;&#45;&#54;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"80\" style=\"vertical-align: 0px;\"\/><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Which Distribution <em>Should<\/em> You Use: Signs vs. Sphere<\/h2>\n\n\n\n<p>The main point of this post is to argue <em>against<\/em> using the Gaussian distribution. But which distribution <em>should<\/em> you use: Random signs? The sphere distribution? The answer, for most applications, is one of those two, but exactly which depends on the properties of the matrix <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/>.<\/p>\n\n\n\n<p>The variance of the Girard\u2013Hutchinson estimator with the random signs estimator is<p class=\"ql-center-displayed-equation\" style=\"line-height: 42px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-3a6665f8e55563640aedd5ebb0945409_l3.png\" height=\"42\" width=\"179\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#105;&#103;&#110;&#115;&#125;&#41;&#32;&#61;&#32;&#50;&#32;&#92;&#115;&#117;&#109;&#95;&#123;&#105;&#92;&#110;&#101;&#32;&#106;&#125;&#32;&#65;&#95;&#123;&#105;&#106;&#125;&#94;&#50;&#46;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>Thus, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-71e773e56981f2f7e7346c56b7eaf12c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#105;&#103;&#110;&#115;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"84\" style=\"vertical-align: -6px;\"\/> depends on the size of the off-diagonal entries of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/>; <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-71e773e56981f2f7e7346c56b7eaf12c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#105;&#103;&#110;&#115;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"84\" style=\"vertical-align: -6px;\"\/> <em>does not depend on the diagonal of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/> at all!<\/em> For matrices with small off-diagonal entries (such as <a href=\"https:\/\/nhigham.com\/2021\/04\/08\/what-is-a-diagonally-dominant-matrix\/\">diagonally dominant matrices<\/a>), the random signs distribution is often the best.<\/p>\n\n\n\n<p>However, for other problems, the sphere distribution is preferable to random signs. The sphere distribution is rotation-invariant, so <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-2797e5cf0b76458b3e9bfad1064d0abf_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#112;&#104;&#101;&#114;&#101;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"93\" style=\"vertical-align: -6px;\"\/> is independent of the eigenvectors of the (symmetric) matrix <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/>, depending only on <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/>&#8216;s eigenvalues. By contrast, the variance of the Girard\u2013Hutchinson estimator with the random signs distribution can significantly depend on the eigenvectors of the matrix <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-11f4f587954b361e7d78940f65b8d70d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#65;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: 0px;\"\/>. For a given set of eigenvalues and the worst-case choice of eigenvectors, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-2797e5cf0b76458b3e9bfad1064d0abf_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#112;&#104;&#101;&#114;&#101;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"93\" style=\"vertical-align: -6px;\"\/> will always be smaller than <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-71e773e56981f2f7e7346c56b7eaf12c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#105;&#103;&#110;&#115;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"84\" style=\"vertical-align: -6px;\"\/>. In fact, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.ethanepperly.com\/wp-content\/ql-cache\/quicklatex.com-2797e5cf0b76458b3e9bfad1064d0abf_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#86;&#97;&#114;&#40;&#92;&#104;&#97;&#116;&#123;&#92;&#116;&#114;&#125;&#95;&#123;&#92;&#114;&#109;&#32;&#115;&#112;&#104;&#101;&#114;&#101;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"93\" style=\"vertical-align: -6px;\"\/> is the minimum variance distribution for Girard\u2013Hutchinson trace estimation for a matrix with fixed eigenvalues and worst-case eigenvectors; see <a href=\"https:\/\/www.ethanepperly.com\/index.php\/2023\/01\/26\/stochastic-trace-estimation\/#optimality-properties\">this section of my previous post<\/a> for details.<\/p>\n\n\n\n<p>In my experience, random signs and the sphere distribution are both perfectly adequate for trace estimation and either is a sensible default if you&#8217;re developing software. The Gaussian distribution on the other hand&#8230; don&#8217;t use it unless you have a good reason to.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Suppose we are interested in estimating the trace of an matrix that can be only accessed through matrix\u2013vector products . The classical method for this purpose is the Girard\u2013Hutchinson estimator &nbsp; &nbsp; where the vectors are independent, identically distributed (iid) random vectors satisfying the isotropy condition &nbsp; &nbsp; Examples of vectors satisfying this condition include<a class=\"more-link\" href=\"https:\/\/www.ethanepperly.com\/index.php\/2024\/01\/28\/dont-use-gaussians-in-stochastic-trace-estimation\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,13,14],"tags":[],"class_list":["post-1734","post","type-post","status-publish","format-standard","hentry","category-expository","category-numerical-linear-algebra-dos-and-donts","category-trace-estimation"],"_links":{"self":[{"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/posts\/1734","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/comments?post=1734"}],"version-history":[{"count":7,"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/posts\/1734\/revisions"}],"predecessor-version":[{"id":1741,"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/posts\/1734\/revisions\/1741"}],"wp:attachment":[{"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/media?parent=1734"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/categories?post=1734"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ethanepperly.com\/index.php\/wp-json\/wp\/v2\/tags?post=1734"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}