If $X$ is an $n$-dimensional standard Gaussian, is there an analytic expression for the differential entropy of the $\ell^p$ norm of $X$?

For the case $p=2$, the $\ell^2$ norm follows a chi distribution with $n$ degrees of freedom, whose entropy is known. For the case $p=1$, the method here can be used to derive an analytic expression for the pdf, though I'm not sure whether an analytic expression for the entropy exists.

I am particularly interested in $p=1$ but general $p$ would be nice too.

Alex Tan
  • When $p=1$ the entropy is an (easily computable) constant plus that of the standard Normal in $n$ dimensions, as will be the case of any distribution that is symmetric in all its marginals. – whuber Sep 28 '23 at 13:23
  • Can you elaborate on this a bit more? I can see why the constant would be -1 for the case n=1, but not for the more general case. – Alex Tan Sep 29 '23 at 02:38

1 Answer

A general result

The following requires only the most elementary calculation (a little arithmetic) and keeping track of some standard mathematical definitions. It generalizes the concept of a "half-Normal distribution" in which the standard Normal distribution is restricted to non-negative values, implying the density is doubled on $[0,\infty)$ and set to zero on $(-\infty, 0)$. The group in that setting has two elements $\{e,g\}$ where $e$ is the identity, $x^e=x$ for all numbers $x;$ and $g$ negates values, $x^g = -x.$ (Throughout this post, superscript notation denotes the action of the group, not exponentiation.)


Suppose $G$ is any finite group of order $|G|$ acting (measurably with respect to, say, Lebesgue measure $\mathrm d\lambda$) on $\mathbb R^n$ and $\mathrm dF$ is any absolutely continuous probability distribution on $\mathbb R^n$ invariant under that group action: that is, $\mathrm dF = f\,\mathrm d\lambda$ is a symmetric distribution in the sense of https://stats.stackexchange.com/a/29010/919. Specifically, for any event $\mathcal E\subset \mathbb R^n$ and any group element $g\in G,$

$$\mathrm dF(\mathcal E) = \mathrm dF(\mathcal E^g).$$

A fundamental domain for $G$ is a region $\Lambda \subset \mathbb R^n$ which fills out the entire space under the action of $G,$

$$\mathbb R^n = \bigcup_{g\in G} \Lambda^g,$$

without appreciable overlap in the sense that when $g\ne h\in G,$ the intersection $\Lambda^g \cap \Lambda^h$ has Lebesgue measure zero.

The differential entropy of $\mathrm dF$ is the expectation of $-\log f:$

$$H(\mathrm dF) = \int_{\mathbb R^n} (-\log f(x))\,f(x)\,\mathrm d\lambda.$$

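Provided the action also preserves Lebesgue measure (as reflections and other isometries do), so that $f(x^g) = f(x)$ for almost all $x,$ these assumptions imply $H$ can be computed separately over each region $\Lambda^g$ for $g\in G$ and that its value is the same in each region: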

$$H(\mathrm dF) = \sum_{g\in G} \int_{\Lambda^g} (-\log f(x))\, f(x)\, \mathrm d\lambda = |G| \int_{\Lambda} (-\log f(x))\, f(x)\, \mathrm d\lambda.$$

Define a new distribution $\mathrm d F_{G;\Lambda}$ to be proportional to $\mathrm dF$ on $\Lambda$ and zero elsewhere. The same decomposition applied to $f$ itself shows that $\mathrm dF$ assigns probability $1/|G|$ to $\Lambda,$ so the constant of proportionality must be $|G|,$ whence the density function for this restricted distribution is

$$f_{G;\Lambda}(x) = |G|\, f(x)\, \mathcal{I}(x\in\Lambda)$$

($\mathcal I$ is the indicator function). By definition, then, its entropy is

$$\begin{aligned}
H(\mathrm dF_{G;\Lambda}) &= \int_{\mathbb R^n} (-\log f_{G;\Lambda}(x))\,f_{G;\Lambda}(x)\,\mathrm d\lambda\\
&= \int_{\Lambda} (-\log[|G| f(x)])\,|G|\,f(x)\,\mathrm d\lambda\\
&= \int_{\Lambda} -(\log |G| + \log f(x))\,|G|\,f(x)\,\mathrm d\lambda\\
&=\int_{\Lambda} (-\log |G|)\,|G|\,f(x)\,\mathrm d\lambda + \int_{\Lambda} (-\log f(x))\,|G|\,f(x)\,\mathrm d\lambda\\
&=(-\log |G|)\int_{\mathbb R^n} f(x)\,\mathrm d\lambda + \int_{\mathbb R^n} (-\log f(x))\,f(x)\,\mathrm d\lambda\\
&=-\log|G| + H(\mathrm dF).
\end{aligned}$$

The only new observation in this otherwise trivial string of identities is that because $\mathrm dF$ is a probability distribution, $f$ integrates to unity.
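As a sanity check on this identity, here is a minimal numerical sketch for the half-Normal case mentioned at the outset, where $|G| = 2$ and $\Lambda = [0,\infty).$ The function names, the truncation of the integrals at $\pm 12,$ and the midpoint rule are all choices made for this illustration, not part of the argument above.

```python
import math

def normal_pdf(x):
    """Density of the standard Normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def half_normal_pdf(x):
    """Standard Normal restricted to [0, inf): doubled there, zero elsewhere."""
    return 2.0 * normal_pdf(x) if x >= 0.0 else 0.0

def entropy(pdf, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of -f log f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        f = pdf(lo + (i + 0.5) * h)
        if f > 0.0:  # the integrand is taken to be 0 where the density vanishes
            total -= f * math.log(f) * h
    return total

# Tails beyond |x| = 12 are negligible for this purpose (an assumption).
h_normal = entropy(normal_pdf, -12.0, 12.0)
h_half = entropy(half_normal_pdf, 0.0, 12.0)

print("H(dF)           =", h_normal)
print("H(dF_{G;Lambda}) =", h_half)
print("difference      =", h_half - h_normal)
print("-log|G|         =", -math.log(2.0))
```

The last two printed values should agree up to the integration error, which is what the identity predicts for $|G| = 2.$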


Application to the problem

The $\ell^1$ norm of $X=(X_1,X_2,\ldots, X_n)$ is the sum of absolute values of its components,

$$||X||_1 = |X_1| + |X_2| + \cdots + |X_n|.$$

When all the components are non-negative (that is, they lie in the first orthant $\Lambda = \{x\in\mathbb R^n\mid x_1 \ge 0, x_2 \ge 0, \ldots, x_n \ge 0\}$), this is just the sum of the components.

The standard Normal distribution $\mathrm dF$ is invariant under the group $G$ generated by all reflections in the coordinates, because each component's distribution is unchanged upon negating its values. Because these reflections all commute, this group has order $|G| = 2^n.$ The first orthant is a fundamental domain for this group. Consequently the distribution of $||X||_1$ is $\mathrm dF_{G;\Lambda}$ and its entropy is

$$H(\mathrm dF_{||X||_1}) = -\log(|G|) + H(\mathrm dF) = -n\log 2 + H(\mathrm dF).$$
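To make the right-hand side concrete, here is a short worked evaluation. It relies on the standard value $H(\mathrm dF) = \frac n2\log(2\pi e)$ for the entropy of the $n$-dimensional standard Normal distribution, which is not stated above and is brought in here as an assumption:

$$-n\log 2 + \frac n2\log(2\pi e) = \frac n2\log\frac{2\pi e}{4} = \frac n2\log\frac{\pi e}{2}.$$

For $n=1$ this is $\frac12\log(\pi e/2)\approx 0.726,$ the entropy of the half-Normal distribution.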

whuber
  • Perhaps I misunderstand, but it seems like you're saying that the entropy of the normal density restricted to the first orthant is $-n\log 2+H(dF)$, which I agree with but is not the question I asked. For example, the pdf of $|X_1|+|X_2|$ is $2\sqrt{ 2 }[ 2\Phi( \frac{x}{\sqrt{2}} )-1]\phi( \frac{x}{\sqrt{ 2 }} )$ with support on $[0,\infty)$ (see this post) and Mathematica tells me the entropy of this is $1+\frac 1\pi+\frac 12\log\frac\pi 4$ which doesn't agree with what you have. – Alex Tan Oct 01 '23 at 12:19
  • You are correct, Alex: I misinterpreted the question. The answer actually is much simpler, because since each $|X_i|$ has a Gamma$(1/2)$ distribution, their sum has a Gamma$(n/2)$ distribution. Its entropy is readily calculated and reported at https://en.wikipedia.org/wiki/Gamma_distribution. – whuber Oct 01 '23 at 14:01
  • $|X_i|$ has pdf proportional to $\exp(-x^2/2)$ on $[0,\infty)$ and so can't be Gamma. In any case the pdf of $|X_1|+\cdots+|X_n|$ has a closed form of $2\sqrt{n}(2\Phi(x/\sqrt{n})-1)^{n-1}\phi(x/\sqrt{n})$, but I would like to know if there is a closed form for the entropy of this (even if asymptotically). – Alex Tan Oct 01 '23 at 23:07
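The $n=2$ figures quoted in the comments can be checked numerically. The sketch below assumes the density $2\sqrt 2\,[2\Phi(x/\sqrt 2)-1]\,\phi(x/\sqrt 2)$ given above for $|X_1|+|X_2|$ and compares its entropy with the quoted closed form $1+\frac1\pi+\frac12\log\frac\pi4$ and with $-2\log 2 + \log(2\pi e),$ the value the answer's formula gives when $H(\mathrm dF)$ is taken to be the bivariate standard Normal entropy. The function names, the truncation at $x=16,$ and the midpoint rule are choices made for this illustration.

```python
import math

def pdf_sum_abs_2(x):
    """Density of |X1| + |X2| as quoted in the comments (n = 2), for x >= 0."""
    z = x / math.sqrt(2.0)
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard Normal CDF at z
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard Normal pdf at z
    return 2.0 * math.sqrt(2.0) * (2.0 * Phi - 1.0) * phi

def integrate(g, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h

def neg_f_log_f(x):
    """Entropy integrand -f log f, taken to be 0 where f vanishes."""
    f = pdf_sum_abs_2(x)
    return -f * math.log(f) if f > 0.0 else 0.0

# The tail beyond x = 16 is negligible for this purpose (an assumption).
mass = integrate(pdf_sum_abs_2, 0.0, 16.0)
h_numeric = integrate(neg_f_log_f, 0.0, 16.0)

h_closed = 1.0 + 1.0 / math.pi + 0.5 * math.log(math.pi / 4.0)         # value quoted in the comments
h_orthant = -2.0 * math.log(2.0) + math.log(2.0 * math.pi * math.e)    # answer's formula with n = 2

print("total mass        =", mass)
print("numerical entropy =", h_numeric)
print("quoted closed form =", h_closed)
print("orthant formula   =", h_orthant)
```

The numerical entropy should match the quoted closed form and differ visibly from the orthant formula, consistent with the point that the latter describes the restricted two-dimensional vector rather than the scalar sum. The same `integrate` helper could be pointed at other candidate densities for larger $n,$ with the truncation point widened accordingly.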