In the previous posts, we’ve been using eigenvalues to understand the mixing of reversible Markov chains. Our main convergence result was as follows: if $\rho_t$ denotes the distribution of the chain after $t$ steps and $1 = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge -1$ denote the eigenvalues of the transition matrix, then the chain converges to its stationary distribution geometrically fast, at a rate governed by $\max(\lambda_2, -\lambda_n)$.

Bounding the rate of convergence requires an upper bound on $\lambda_2$ and a lower bound on $\lambda_n$. In this post, we will talk about techniques for bounding $\lambda_2$. For more on the smallest eigenvalue $\lambda_n$, see the previous post.
Setting
Let’s begin by establishing some notation, mostly the same as in previous posts in this series. We work with a reversible Markov chain on the state space $\{1,\ldots,n\}$ with transition matrix $P$ and stationary distribution $\pi$.
As in previous posts, we identify vectors $f \in \mathbb{R}^n$ and functions $f : \{1,\ldots,n\} \to \mathbb{R}$, treating them as one and the same.
For a vector/function $f$, $\mathbb{E}[f]$ and $\operatorname{Var}(f)$ denote the mean and variance with respect to the stationary distribution $\pi$:

$$\mathbb{E}[f] = \mathbb{E}_{x \sim \pi}[f(x)] = \sum_{x} \pi(x) f(x), \qquad \operatorname{Var}(f) = \mathbb{E}_{x \sim \pi}\big[(f(x) - \mathbb{E}[f])^2\big].$$

We shall also use expressions such as

$$\mathbb{E}_{x \sim \sigma,\, y \sim \tau}[f(x,y)],$$

which denotes the expectation of $f(x,y)$ when $x$ and $y$ are drawn (independently, unless stated otherwise) from the distributions $\sigma$ and $\tau$.
We denote the eigenvalues of the transition matrix $P$ by $1 = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge -1$. The associated eigenvectors (eigenfunctions) $\varphi_1, \ldots, \varphi_n$ are orthonormal in the $\pi$-inner product

$$\langle f, g \rangle_\pi = \mathbb{E}_{x \sim \pi}[f(x) g(x)] = \sum_x \pi(x) f(x) g(x).$$
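To make this setup concrete, here is a minimal NumPy sketch on a small reversible chain chosen purely for illustration (a random walk on a weighted graph; the specific weights and variable names are mine, not from earlier posts). It recovers $\pi$, checks detailed balance, and verifies that the eigenfunctions of $P$ are orthonormal in the $\pi$-inner product.

```python
import numpy as np

# Random walk on a small weighted graph: P[i, j] = w[i, j] / sum_k w[i, k].
# Any symmetric weight matrix gives a reversible chain with pi[i] proportional
# to the weighted degree d[i].
w = np.array([[1., 2., 0., 0.],
              [2., 1., 1., 0.],
              [0., 1., 1., 3.],
              [0., 0., 3., 1.]])
d = w.sum(axis=1)
P = w / d[:, None]
pi = d / d.sum()
assert np.allclose(pi[:, None] * P, (pi[:, None] * P).T)   # detailed balance

# Reversibility means D^{1/2} P D^{-1/2} is symmetric (D = diag(pi)), so its
# eigenvectors are Euclidean-orthonormal. Mapping back by D^{-1/2} gives
# eigenvectors of P orthonormal in <f, g>_pi = sum_x pi(x) f(x) g(x).
D_half = np.diag(np.sqrt(pi))
D_half_inv = np.diag(1 / np.sqrt(pi))
lam, U = np.linalg.eigh(D_half @ P @ D_half_inv)
lam, U = lam[::-1], U[:, ::-1]            # sort so 1 = lam_1 >= lam_2 >= ...
Phi = D_half_inv @ U                      # columns are eigenfunctions phi_1, ..., phi_n

print("eigenvalues:", np.round(lam, 4))
print("pi-Gram matrix:\n", np.round(Phi.T @ np.diag(pi) @ Phi, 8))  # identity
assert np.allclose(P @ Phi, Phi * lam)    # P phi_i = lam_i phi_i
```

The same symmetrization trick is reused in the later sketches whenever eigenvalues or eigenfunctions of $P$ are needed.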
Variance and Local Variance
To discover methods for bounding $\lambda_2$, we begin by investigating a seemingly simple question:
How variable is the output of a function $f$?
There are two natural quantities which provide answers to this question: the variance and the local variance. Poincaré inequalities—the main subject of this post—establish a relation between these two numbers. As a consequence, Poincaré inequalities will provide a bound on $\lambda_2$.
Variance
We begin with the first of our two main characters, the variance $\operatorname{Var}(f)$. The variance is a very familiar measure of variation, as it is defined for any random variable. It measures the average squared deviation of $f(x)$ from its mean, where $x$ is drawn from the stationary distribution $\pi$.
Another helpful formula for the variance is the exchangeable pairs formula:

$$\operatorname{Var}(f) = \frac{1}{2} \mathbb{E}_{x \sim \pi,\, y \sim \pi}\big[(f(x) - f(y))^2\big],$$

where $x$ and $y$ are drawn independently from $\pi$.
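As a quick sanity check of the exchangeable pairs formula, here is a short NumPy sketch; the probability vector and function are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.1, 0.2, 0.3, 0.4])       # an arbitrary stationary distribution
f = rng.standard_normal(4)                # an arbitrary function on 4 states

# Var(f): mean squared deviation from the mean under pi
mean = pi @ f
var_direct = pi @ (f - mean) ** 2

# Exchangeable pairs formula: Var(f) = (1/2) E_{x, y ~ pi iid}[(f(x) - f(y))^2]
diff_sq = (f[:, None] - f[None, :]) ** 2
var_pairs = 0.5 * pi @ diff_sq @ pi

print(var_direct, var_pairs)              # the two values agree
assert np.isclose(var_direct, var_pairs)
```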





Local Variance
The exchangeable pairs formula shows that the variance is a measure of the global variability of the function: It measures the amount $f$ varies across locations $x$ and $y$ sampled randomly from the entire set of possible states.
The local variance measures how much $f$ varies between points $x$ and $y$ which are separated by just one step of the Markov chain, thus providing a more local measure of variability. Let $x$ be sampled from the stationary distribution $\pi$, and let $y$ denote one step of the Markov chain after $x$, i.e., $y \sim P(x, \cdot)$. The local variance is

$$\mathcal{E}(f) = \frac{1}{2} \mathbb{E}_{x \sim \pi,\, y \sim P(x,\cdot)}\big[(f(x) - f(y))^2\big].$$

An important note: The variance of a function depends only on the stationary distribution $\pi$. By contrast, the local variance depends on the Markov transition matrix $P$.
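The following sketch illustrates this note on two toy chains chosen purely for illustration (assuming NumPy): both have the uniform stationary distribution, so any function has the same variance under either chain, yet their local variances differ because the local variance depends on $P$.

```python
import numpy as np

n = 4
pi = np.full(n, 1 / n)                    # both chains below have uniform stationary distribution
f = np.array([0., 1., 2., 3.])            # an arbitrary test function

# Chain A: jump to a uniformly random state at every step.
P_A = np.full((n, n), 1 / n)
# Chain B: random walk on a cycle, stepping left or right with probability 1/2.
P_B = np.zeros((n, n))
for i in range(n):
    P_B[i, (i - 1) % n] = P_B[i, (i + 1) % n] = 0.5

def local_variance(P, pi, f):
    """E(f) = 1/2 * sum_x pi(x) * sum_y P(x, y) * (f(x) - f(y))^2."""
    return 0.5 * pi @ (P * (f[:, None] - f[None, :]) ** 2).sum(axis=1)

variance = pi @ (f - pi @ f) ** 2         # depends only on pi: same for both chains
print("Var(f)            =", variance)
print("local variance, A =", local_variance(P_A, pi, f))
print("local variance, B =", local_variance(P_B, pi, f))   # differs: depends on P
```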
Poincaré Inequalities
If $f$ does not vary much over a single step of the Markov chain, then it seems reasonable to expect that it doesn’t vary much globally. This intuition is made quantitative using Poincaré inequalities.
Definition (Poincaré inequality). A Markov chain is said to satisfy a Poincaré inequality with constant $C$ if

(1)   $$\operatorname{Var}(f) \le C \cdot \mathcal{E}(f) \quad \text{for every function } f.$$
Poincaré Inequalities and Mixing
Poincaré inequalities are intimately related with the speed of mixing for a Markov chain.
To see why, consider a function with small local variance. Because
has small local variance,
is close to
,
is close to
, etc.; the function
does not change much over a single step of the Markov chain. Does this mean that the (global) variance of
will also be small? Not necessarily. If the Markov chain takes a long time to mix, the small local variance can accumulate to a large global variance over many steps of the Markov chain. Thus, a slowly mixing chain has a large Poincaré constant
. Conversely, if the chain mixes rapidly, the Poincaré constant
is small.
This relation between mixing and Poincaré inequalities is quantified by the following theorem:
Theorem (Poincaré inequalities from eigenvalues). The Markov chain satisfies a Poincaré inequality with constant

$$C = \frac{1}{1 - \lambda_2}.$$

This is the smallest possible Poincaré constant for the Markov chain.
One way to interpret this result is that the eigenvalue $\lambda_2$ gives you a Poincaré inequality (1). But we can flip this result around: Poincaré inequalities (1) establish bounds on the eigenvalue $\lambda_2$.
Corollary (Eigenvalue bounds from Poincaré inequalities). If the Markov chain satisfies a Poincaré inequality (1) with a certain constant $C$, then

$$\lambda_2 \le 1 - \frac{1}{C}.$$
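Here is a small numerical sketch of the theorem and corollary (NumPy, reusing the illustrative weighted-graph chain from the setting section; it is not code from the series). It computes $\lambda_2$, forms $C = 1/(1-\lambda_2)$, checks the Poincaré inequality on random functions, and checks that the eigenfunction $\varphi_2$ attains it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reversible chain: random walk on a weighted graph (same toy example as above).
w = np.array([[1., 2., 0., 0.],
              [2., 1., 1., 0.],
              [0., 1., 1., 3.],
              [0., 0., 3., 1.]])
d = w.sum(axis=1)
P, pi = w / d[:, None], d / d.sum()

def variance(f):
    return pi @ (f - pi @ f) ** 2

def local_variance(f):
    return 0.5 * pi @ (P * (f[:, None] - f[None, :]) ** 2).sum(axis=1)

# lambda_2 via the symmetrized matrix D^{1/2} P D^{-1/2}
lam, U = np.linalg.eigh(np.diag(np.sqrt(pi)) @ P @ np.diag(1 / np.sqrt(pi)))
lam_2 = np.sort(lam)[-2]
C = 1 / (1 - lam_2)                       # Poincaré constant from the theorem
print("lambda_2 =", lam_2, "   C =", C)

# The Poincaré inequality Var(f) <= C * E(f) holds for every f ...
for _ in range(1000):
    f = rng.standard_normal(len(pi))
    assert variance(f) <= C * local_variance(f) + 1e-12

# ... and is attained (up to roundoff) by the eigenfunction phi_2.
phi_2 = (np.diag(1 / np.sqrt(pi)) @ U)[:, np.argsort(lam)[-2]]
print("ratio Var/E at phi_2:", variance(phi_2) / local_variance(phi_2), "vs C =", C)
```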
A View to the Continuous Setting
For a particularly vivid example of a Poincaré inequality, it will be helpful to take a brief detour to the world of continuous Markov processes. This series has—to this point—exclusively focused on Markov chains that have finitely many possible states and are indexed by discrete times $t = 0, 1, 2, \ldots$. We can generalize Markov chains by lifting both of these restrictions, considering Markov processes $(X_t)_{t \ge 0}$ which take values in a continuous space (such as the real line $\mathbb{R}$) and are indexed by continuous times $t \ge 0$.

The mathematical details for Markov processes are a lot more complicated than for their Markov chain siblings, so we will keep it light on details.
For this example, our Markov process will be the Ornstein–Uhlenbeck process. This process has the somewhat mysterious form

$$\mathrm{d}X_t = -X_t \, \mathrm{d}t + \sqrt{2} \, \mathrm{d}B_t,$$

where $B_t$ denotes a standard Brownian motion. Conditional on its starting value $X_0 = x_0$, the Ornstein–Uhlenbeck process $X_t$ has a Gaussian distribution with mean $e^{-t} x_0$ and variance $1 - e^{-2t}$.
From this observation, it appears that the stationary distribution of the Ornstein–Uhlenbeck process is the standard Gaussian distribution. Indeed, this is the case, and the Ornstein–Uhlenbeck process converges to stationarity exponentially fast.
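Because the conditional law is Gaussian with known mean and variance, the Ornstein–Uhlenbeck process can be simulated without discretization error. The sketch below (NumPy; the starting point, horizon, step count, and sample size are arbitrary choices for illustration) uses this exact transition rule and compares the empirical mean and variance with the formulas $e^{-t} x_0$ and $1 - e^{-2t}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Exact simulation of dX_t = -X_t dt + sqrt(2) dB_t: conditional on X_t, the value
# X_{t+h} is Gaussian with mean exp(-h) X_t and variance 1 - exp(-2h), so we can
# step the process exactly, with no discretization error.
def ou_step(x, h, rng):
    return np.exp(-h) * x + np.sqrt(1 - np.exp(-2 * h)) * rng.standard_normal(x.shape)

x0 = 5.0                                  # start far from stationarity
t, n_steps, n_paths = 3.0, 30, 200_000
h = t / n_steps
X = np.full(n_paths, x0)
for _ in range(n_steps):
    X = ou_step(X, h, rng)

# Compare with the closed-form conditional law: mean exp(-t) x0, variance 1 - exp(-2t).
print("empirical mean/var:", X.mean(), X.var())
print("predicted mean/var:", np.exp(-t) * x0, 1 - np.exp(-2 * t))
# As t grows these approach 0 and 1: the standard Gaussian stationary distribution.
```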
Since we have exponential convergence to stationarity (and, as can be checked, the Ornstein–Uhlenbeck process is reversible, in the appropriate sense), there’s a Poincaré inequality lurking in the background, known as the Gaussian Poincaré inequality. Letting $Z$ denote a standard Gaussian random variable, the Gaussian Poincaré inequality states that

(2)   $$\operatorname{Var}(f(Z)) \le \mathbb{E}\big[f'(Z)^2\big].$$
The Gaussian Poincaré inequality presents a very clear demonstration of what a Poincaré inequality is: The global variance of the function $f$ is controlled by its local variability, here quantified by the expected squared derivative $\mathbb{E}[f'(Z)^2]$.

Our main interest in Poincaré inequalities in this post is instrumental: we seek to use Poincaré inequalities to understand the mixing properties of Markov chains. But the Gaussian Poincaré inequality demonstrates that Poincaré inequalities are also interesting on their own terms. The inequality (2) is a useful inequality for bounding the variance of a function of a Gaussian random variable. As an immediate example, observe that the function $\tanh$ has derivative bounded by $1$: $|\tanh'(z)| = 1 - \tanh^2(z) \le 1$. Thus,

$$\operatorname{Var}(\tanh(Z)) \le \mathbb{E}\big[(\tanh'(Z))^2\big] \le 1 = \operatorname{Var}(Z).$$
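A quick Monte Carlo sketch of this example (NumPy; the sample size is an arbitrary choice): it estimates both sides of (2) for $f = \tanh$ and confirms that the variance sits below the expected squared derivative, which in turn is at most $1$.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.standard_normal(1_000_000)

# Gaussian Poincaré inequality (2): Var(f(Z)) <= E[f'(Z)^2] for Z standard normal.
# Here f = tanh, whose derivative 1 - tanh(z)^2 is bounded by 1.
lhs = np.tanh(Z).var()
rhs = np.mean((1 - np.tanh(Z) ** 2) ** 2)

print("Var(tanh(Z))    ≈", lhs)           # roughly 0.40
print("E[(tanh'(Z))^2] ≈", rhs)           # roughly 0.46, and at most 1 = Var(Z)
```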


Poincaré Inequalities and Eigenvalues
For the remainder of this post, we will develop the connection between Poincaré inequalities and eigenvalues, leading to a proof of our main theorem:
Theorem (Poincaré inequalities from eigenvalues). The Markov chain satisfies a Poincaré inequality with constant $C = 1/(1-\lambda_2)$. That is,

(3)   $$\operatorname{Var}(f) \le \frac{\mathcal{E}(f)}{1 - \lambda_2} \quad \text{for every function } f.$$

There exists a function $f$ for which equality is attained.
We begin by showing that it suffices to consider mean-zero functions to prove (3). Next, we derive formulas for $\operatorname{Var}(f)$ and $\mathcal{E}(f)$ using the $\pi$-inner product $\langle \cdot, \cdot \rangle_\pi$. We conclude by expanding $f$ in eigenvectors of $P$ and deriving the Poincaré inequality (3).
Shift to Mean-Zero
To prove the Poincaré inequality (3), we are free to assume that $f$ has mean zero, $\mathbb{E}[f] = 0$. Indeed, both the variance $\operatorname{Var}(f)$ and the local variance $\mathcal{E}(f)$ don’t change if we shift $f$ by a constant $c$. That is, letting $f + c$ denote the function $x \mapsto f(x) + c$,

$$\operatorname{Var}(f + c) = \operatorname{Var}(f) \quad \text{and} \quad \mathcal{E}(f + c) = \mathcal{E}(f).$$

In particular, we may replace $f$ by $f - \mathbb{E}[f]$ without changing either side of (3).
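This shift invariance is easy to confirm numerically; here is a short check reusing the illustrative weighted-graph chain from the earlier sketches (the constant $c$ is arbitrary).

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([[1., 2., 0., 0.],
              [2., 1., 1., 0.],
              [0., 1., 1., 3.],
              [0., 0., 3., 1.]])
d = w.sum(axis=1)
P, pi = w / d[:, None], d / d.sum()

variance  = lambda f: pi @ (f - pi @ f) ** 2
local_var = lambda f: 0.5 * pi @ (P * (f[:, None] - f[None, :]) ** 2).sum(axis=1)

f = rng.standard_normal(4)
c = 7.3                                   # an arbitrary constant shift
# Both quantities are unchanged by the shift, so we may assume E[f] = 0.
assert np.isclose(variance(f), variance(f + c))
assert np.isclose(local_var(f), local_var(f + c))
print(variance(f), local_var(f))
```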



Variance
Our strategy for proving the main theorem will be to develop more linear-algebraic formulas for the variance and local variance. Let’s begin with the variance.
Assume $\mathbb{E}[f] = 0$. Then the variance is

$$\operatorname{Var}(f) = \mathbb{E}_{x \sim \pi}\big[f(x)^2\big] = \langle f, f \rangle_\pi.$$
Local Variance
Now we derive a formula for the local variance. Expanding the square,

$$\mathcal{E}(f) = \frac{1}{2} \mathbb{E}_{x \sim \pi,\, y \sim P(x,\cdot)}\big[(f(x) - f(y))^2\big] = \frac{1}{2} \mathbb{E}\big[f(x)^2\big] - \mathbb{E}\big[f(x) f(y)\big] + \frac{1}{2} \mathbb{E}\big[f(y)^2\big].$$

Let’s take each of these terms one-by-one. For $\tfrac{1}{2}\mathbb{E}[f(x)^2]$ and $\tfrac{1}{2}\mathbb{E}[f(y)^2]$, recognize that $y$ is also distributed according to $\pi$, since $\pi$ is the stationary distribution. Thus,

$$\frac{1}{2} \mathbb{E}\big[f(x)^2\big] + \frac{1}{2} \mathbb{E}\big[f(y)^2\big] = \mathbb{E}_{x \sim \pi}\big[f(x)^2\big] = \langle f, f \rangle_\pi.$$

For the cross term, condition on $x$ and average over the step $y \sim P(x,\cdot)$:

$$\mathbb{E}\big[f(x) f(y)\big] = \mathbb{E}_{x \sim \pi}\Big[f(x) \sum_y P(x,y) f(y)\Big] = \mathbb{E}_{x \sim \pi}\big[f(x)\, (Pf)(x)\big] = \langle f, Pf \rangle_\pi.$$

Combining these formulas, we obtain

$$\mathcal{E}(f) = \langle f, f \rangle_\pi - \langle f, Pf \rangle_\pi = \langle f, (I - P) f \rangle_\pi.$$
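Both linear-algebraic formulas are easy to verify numerically; here is a short sketch on the same illustrative chain as before (assuming NumPy).

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([[1., 2., 0., 0.],
              [2., 1., 1., 0.],
              [0., 1., 1., 3.],
              [0., 0., 3., 1.]])
d = w.sum(axis=1)
P, pi = w / d[:, None], d / d.sum()

f = rng.standard_normal(4)
f = f - pi @ f                            # make f mean-zero under pi

ip = lambda g, h: pi @ (g * h)            # the pi-inner product <g, h>_pi

var_def = pi @ f ** 2                     # Var(f) for mean-zero f
loc_def = 0.5 * pi @ (P * (f[:, None] - f[None, :]) ** 2).sum(axis=1)

# The linear-algebraic formulas derived above:
#   Var(f) = <f, f>_pi   and   E(f) = <f, (I - P) f>_pi
assert np.isclose(var_def, ip(f, f))
assert np.isclose(loc_def, ip(f, f - P @ f))
print(var_def, loc_def)
```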








Conclusion
The Poincaré inequality (3) now follows by combining these two formulas with an eigenvector expansion.
Consider a decomposition of $f$ as a linear combination of $P$’s eigenvectors:

$$f = \sum_{i=2}^n \alpha_i \varphi_i.$$

(The coefficient of $\varphi_1$ vanishes because $\mathbb{E}[f] = 0$ and $\varphi_1$ is constant.)

Using the orthonormality of $\varphi_1, \ldots, \varphi_n$ under the $\pi$-inner product and the eigenvalue relation $P \varphi_i = \lambda_i \varphi_i$, we have that

$$\operatorname{Var}(f) = \langle f, f \rangle_\pi = \sum_{i=2}^n \alpha_i^2 \quad \text{and} \quad \mathcal{E}(f) = \langle f, (I - P) f \rangle_\pi = \sum_{i=2}^n (1 - \lambda_i) \alpha_i^2.$$

Thus,

(4)   $$\mathcal{E}(f) = \sum_{i=2}^n (1 - \lambda_i) \alpha_i^2 \ge (1 - \lambda_2) \sum_{i=2}^n \alpha_i^2 = (1 - \lambda_2) \operatorname{Var}(f),$$

where we have used the fact that $\lambda_i \le \lambda_2$ for every $i \ge 2$. Dividing through by $1 - \lambda_2$ gives the Poincaré inequality (3). Equality is attained for $f = \varphi_2$, since then $\operatorname{Var}(f) = 1$ and $\mathcal{E}(f) = 1 - \lambda_2$.
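To close the loop, here is a numerical sketch of the whole argument on the illustrative chain used throughout (NumPy): it expands a mean-zero $f$ in the $\pi$-orthonormal eigenfunctions, checks the two sums above, and confirms the resulting inequality (4).

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([[1., 2., 0., 0.],
              [2., 1., 1., 0.],
              [0., 1., 1., 3.],
              [0., 0., 3., 1.]])
d = w.sum(axis=1)
P, pi = w / d[:, None], d / d.sum()

# pi-orthonormal eigendecomposition via the symmetrized matrix D^{1/2} P D^{-1/2}
lam, U = np.linalg.eigh(np.diag(np.sqrt(pi)) @ P @ np.diag(1 / np.sqrt(pi)))
lam, Phi = lam[::-1], (np.diag(1 / np.sqrt(pi)) @ U)[:, ::-1]   # descending order

f = rng.standard_normal(4)
f = f - pi @ f                            # mean-zero, so alpha_1 = 0
alpha = Phi.T @ np.diag(pi) @ f           # expansion coefficients <f, phi_i>_pi

var = pi @ f ** 2
loc = pi @ (f * (f - P @ f))              # E(f) = <f, (I - P) f>_pi

# Var(f) = sum_{i>=2} alpha_i^2,  E(f) = sum_{i>=2} (1 - lam_i) alpha_i^2,
# and therefore E(f) >= (1 - lam_2) Var(f), i.e. the Poincaré inequality (3).
assert np.isclose(var, np.sum(alpha[1:] ** 2))
assert np.isclose(loc, np.sum((1 - lam[1:]) * alpha[1:] ** 2))
assert loc >= (1 - lam[1]) * var - 1e-12
print("Var(f) =", var, "  E(f) =", loc, "  (1 - lam_2) Var(f) =", (1 - lam[1]) * var)
```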






