4

I asked the following question on MSE but have not received an answer yet, and I thought this would be a better place for it.

In a statistical manifold $S=\{p_\theta\}$, $\theta=(\theta_1,\dots,\theta_n)$, the Riemannian metric usually used is the Fisher information metric $$g_{ij}=g(\partial_i,\partial_j)=\int \partial_i(\log p_\theta)\, \partial_j(\log p_\theta)~p_\theta~dx$$

The associated connection coefficients (with the index $k$ lowered by the metric) are defined by $$\Gamma_{ij,k}=\int \partial_i\partial_j(\log p_\theta)\,\partial_k (\log p_\theta)~ p_{\theta}~dx$$

where $\partial_i=\frac{\partial}{\partial\theta_i}$.
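As a sanity check on the definition of the metric, here is a minimal Python sketch that evaluates the integral numerically for the normal family with $\theta=(\mu,\sigma)$. The choice of family, the function names, and the integration grid are assumptions made for illustration only. The result can be compared with the known closed form $g=\mathrm{diag}(1/\sigma^2,\,2/\sigma^2)$.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def score(x, mu, sigma):
    """Partial derivatives of log p_theta(x) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3
    return (d_mu, d_sigma)

def fisher_metric(mu, sigma, half_width=12.0, n=20001):
    """Riemann-sum approximation of g_ij = E[d_i log p * d_j log p].

    The grid (mu +/- half_width * sigma, n points) is an illustrative choice.
    """
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / (n - 1)
    g = [[0.0, 0.0], [0.0, 0.0]]
    for s in range(n):
        x = lo + s * dx
        weight = normal_pdf(x, mu, sigma) * dx
        sc = score(x, mu, sigma)
        for i in range(2):
            for j in range(2):
                g[i][j] += sc[i] * sc[j] * weight
    return g

g = fisher_metric(0.5, 2.0)
print(g)  # compare with [[1/sigma^2, 0], [0, 2/sigma^2]]
```

The off-diagonal entries should vanish up to numerical error, which reflects the fact that $\mu$ and $\sigma$ are orthogonal coordinates for this metric.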

My question is: what is the intuition behind these definitions? Is there a way to prove, using the above metric and connection, that the linear family of probability distributions $$L=\{p:\int f_i(x)p(x)~dx=m_i, i=1,\dots,k\}$$ intersects "orthogonally" the associated exponential family $$\mathcal{E}=\{p:p(x)=c(\theta)q(x)\exp(-\sum_{i=1}^k\theta_i f_i(x))\}$$ in the sense that $L\cap\mathcal{E}=\{p^*\}$, where $p^*$ satisfies $$D(p\|q)=D(p\|p^*)+D(p^*\|q)$$ for every $p\in L, q\in \mathcal{E}$?
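To make the Pythagorean identity concrete, here is a small numerical sketch on a finite sample space. Everything specific in it is an assumption for illustration: the sample space $\{0,1,2,3\}$, a single constraint function $f(x)=x$, a uniform base measure, and the bisection routine used to locate $p^*$. It checks the identity for one $p\in L$ and one $q\in\mathcal{E}$; it is not a proof.

```python
import math

XS = [0, 1, 2, 3]      # sample space (illustrative assumption)
BASE = [0.25] * 4      # base measure in the exponential family, here uniform

def exp_family(theta):
    """p(x) = c(theta) * base(x) * exp(-theta * f(x)) with f(x) = x."""
    weights = [b * math.exp(-theta * x) for b, x in zip(BASE, XS)]
    total = sum(weights)
    return [w / total for w in weights]

def mean(p):
    return sum(x * px for x, px in zip(XS, p))

def kl(p, q):
    """Relative entropy D(p || q) for distributions on XS."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def project(m, lo=-30.0, hi=30.0):
    """Member of the exponential family with mean m, found by bisection.

    The mean of exp_family(theta) is decreasing in theta.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(exp_family(mid)) > m:
            lo = mid
        else:
            hi = mid
    return exp_family(0.5 * (lo + hi))

p = [0.4, 0.2, 0.2, 0.2]      # a point of L with m = mean(p)
p_star = project(mean(p))     # the intersection of L and the exponential family
q = exp_family(-0.7)          # an arbitrary member of the exponential family

lhs = kl(p, q)
rhs = kl(p, p_star) + kl(p_star, q)
gap = lhs - rhs
print(lhs, rhs, gap)
```

The gap is zero up to floating-point error because $\log(p^*/q)$ is affine in $f$, and $p$ and $p^*$ assign the same expectation to $f$.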

I recently came to know about the connection between the Fisher information metric and the relative entropy: $$D( p(\cdot , a+da) \| p(\cdot,a) )\approx\frac{1}{2} g_{ij}\, da^{i} da^{j}$$ Would this be the backbone for establishing the above result?
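The local approximation can be checked numerically. The sketch below assumes the normal family with coordinates $a=(\mu,\sigma)$, for which both the relative entropy and the Fisher metric $g=\mathrm{diag}(1/\sigma^2,\,2/\sigma^2)$ have closed forms; the function names and step sizes are illustrative choices.

```python
import math

def kl_normal(mu1, s1, mu0, s0):
    """Closed-form D( N(mu1, s1^2) || N(mu0, s0^2) )."""
    return math.log(s0 / s1) + (s1**2 + (mu1 - mu0) ** 2) / (2.0 * s0**2) - 0.5

def quadratic_form(sigma, dmu, dsigma):
    """(1/2) g_ij da^i da^j with g = diag(1/sigma^2, 2/sigma^2)."""
    return 0.5 * (dmu**2 / sigma**2 + 2.0 * dsigma**2 / sigma**2)

def compare(mu, sigma, dmu, dsigma):
    """Return (exact relative entropy, quadratic approximation)."""
    exact = kl_normal(mu + dmu, sigma + dsigma, mu, sigma)
    approx = quadratic_form(sigma, dmu, dsigma)
    return exact, approx

for scale in (1.0, 0.1):
    exact, approx = compare(0.0, 1.0, 0.01 * scale, 0.02 * scale)
    print(scale, exact, approx, abs(exact - approx) / approx)
```

The relative error shrinks as the displacement shrinks, as expected for a second-order approximation.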

Kumara
  • 617

1 Answer

2

Very short "answer": we've discussed in your other question the roots of Jeffreys' idea of geometrizing statistical models, how they become Riemannian manifolds with metric $g_{ij}$, and so on. Much later, classical statisticians such as Efron and Amari became interested in the curvature of these manifolds, because it is linked to the second-order efficiency of classical estimators. It turns out that the Jeffreys metric (also known as the Fisher-Rao information metric) is kind of boring from the curvature standpoint. For example, for the normal model, if you take the connection compatible with $g_{ij}$ and compute the corresponding Christoffel symbols $\Gamma^i_{jk}$ and the Riemann tensor $R^i_{jkl}$, you will find that this is a space of constant scalar curvature $R=g^{ij} R^k_{ijk}$ (this is a good exercise: do it!). To "enrich" the geometry, Amari considered a family of $\alpha$-connections, which are not, in general, compatible with $g_{ij}$, and studied the curvature induced by this family of connections. He succeeded in establishing links with second-order efficiency, and did a lot of interesting things with his $\alpha$-connections. You can find all this and much more in his book.

http://www.amazon.com/Information-Translations-Mathematical-Monographs-Tanslations/dp/0821843028
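For readers who want to check their answer to the normal-model exercise numerically, here is a rough Python sketch. It assumes coordinates $(\mu,\sigma)$ with $g=\mathrm{diag}(1/\sigma^2,\,2/\sigma^2)$, approximates derivatives by central differences (the step sizes are arbitrary choices), and uses the sign convention $R^a{}_{bcd}=\partial_c\Gamma^a_{db}-\partial_d\Gamma^a_{cb}+\Gamma^a_{ce}\Gamma^e_{db}-\Gamma^a_{de}\Gamma^e_{cb}$ with Ricci tensor $R_{bd}=R^a{}_{bad}$. Other conventions flip the sign of the result.

```python
def metric(x):
    """Fisher metric of the normal model in coordinates x = (mu, sigma)."""
    _, sigma = x
    return [[1.0 / sigma**2, 0.0], [0.0, 2.0 / sigma**2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def shifted(x, axis, step):
    y = list(x)
    y[axis] += step
    return y

def christoffel(x, h=1e-4):
    """gamma[k][i][j] = Gamma^k_{ij} of the Levi-Civita connection."""
    g_inv = inverse_2x2(metric(x))
    dg = []  # dg[a][b][c] = d g_bc / d x^a
    for a in range(2):
        gp, gm = metric(shifted(x, a, h)), metric(shifted(x, a, -h))
        dg.append([[(gp[b][c] - gm[b][c]) / (2 * h) for c in range(2)]
                   for b in range(2)])
    return [[[0.5 * sum(g_inv[k][l] * (dg[i][l][j] + dg[j][l][i] - dg[l][i][j])
                        for l in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]

def scalar_curvature(x, h=1e-3):
    gamma = christoffel(x)
    d_gamma = []  # d_gamma[c][a][d][b] = d Gamma^a_{db} / d x^c
    for c in range(2):
        gp, gm = christoffel(shifted(x, c, h)), christoffel(shifted(x, c, -h))
        d_gamma.append([[[(gp[a][d][b] - gm[a][d][b]) / (2 * h)
                          for b in range(2)] for d in range(2)]
                        for a in range(2)])

    def riemann(a, b, c, d):
        quad = sum(gamma[a][c][e] * gamma[e][d][b] - gamma[a][d][e] * gamma[e][c][b]
                   for e in range(2))
        return d_gamma[c][a][d][b] - d_gamma[d][a][c][b] + quad

    g_inv = inverse_2x2(metric(x))
    # Contract: R = g^{bd} R^a_{bad}
    return sum(g_inv[b][d] * riemann(a, b, a, d)
               for a in range(2) for b in range(2) for d in range(2))

for point in ([0.3, 1.5], [-2.0, 0.7]):
    print(point, scalar_curvature(point))
```

The printed values should agree across points up to finite-difference error, which is the constancy claimed above.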

Zen
  • 24,121