
My notes introduce the concept of minimal sufficient statistics as follows:

Definition

A sufficient statistic $T(\mathbf{Y})$ is called a minimal sufficient statistic if it is a function of any other sufficient statistic.

Remark

Except in a few very special cases, a minimal sufficient statistic always exists.
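
For concreteness, here is my own example (not from the notes): if $Y_1, \dots, Y_n \overset{\text{iid}}{\sim} N(\mu, 1)$, then both the full sample $S(\mathbf{Y}) = (Y_1, \dots, Y_n)$ and the sum $T(\mathbf{Y}) = \sum_{i=1}^n Y_i$ are sufficient for $\mu$, but only $T(\mathbf{Y})$ is minimal: it is a function of the full sample (and, as it turns out, of any other sufficient statistic), whereas the full sample cannot be recovered from the sum.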

Assume the existence of a minimal sufficient statistic and consider partitioning the sample space $\Omega$, where $\mathbf{y}_1, \mathbf{y}_2 \in \Omega$ are assigned to the same equivalence class iff the likelihood ratio $L(\theta; \mathbf{y}_1)/L(\theta; \mathbf{y}_2)$ does not depend on $\theta$.

Define a statistic $T(\mathbf{Y})$ in such a way that $T(\mathbf{y}_1) = T(\mathbf{y}_2)$ if $\mathbf{y}_1$ and $\mathbf{y}_2$ belong to the same equivalence class and $T(\mathbf{y}_1) \not= T(\mathbf{y}_2)$ otherwise.
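
As a sketch of this construction in a simple case (again my own example, not from the notes), let $Y_1, \dots, Y_n \overset{\text{iid}}{\sim} \text{Bernoulli}(\theta)$. Then $$\dfrac{L(\theta; \mathbf{y}_1)}{L(\theta; \mathbf{y}_2)} = \dfrac{\theta^{\sum_i y_{1i}} (1-\theta)^{n - \sum_i y_{1i}}}{\theta^{\sum_i y_{2i}} (1-\theta)^{n - \sum_i y_{2i}}} = \left( \dfrac{\theta}{1-\theta} \right)^{\sum_i y_{1i} - \sum_i y_{2i}},$$ which is free of $\theta$ iff $\sum_i y_{1i} = \sum_i y_{2i}$, so the equivalence classes are indexed by the sample sum and the construction yields $T(\mathbf{Y}) = \sum_{i=1}^n Y_i$.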

Theorem 2

The statistic $T(\mathbf{Y})$ defined above is the minimal sufficient statistic for $\theta$.

Proof of theorem 2 for the discrete case

First we show that $T(\mathbf{Y})$ is sufficient.

$$\begin{align} P_\theta (\mathbf{Y} = \mathbf{y} \vert T(\mathbf{Y}) = t) &= \dfrac{P_\theta(\mathbf{Y} = \mathbf{y}, T(\mathbf{Y}) = t)}{P_\theta(T(\mathbf{Y}) = t)} \\ &= \dfrac{P_\theta(\mathbf{Y} = \mathbf{y})}{\sum_{\mathbf{y}_i : T(\mathbf{y}_i) = t} P_\theta(\mathbf{Y} = \mathbf{y}_i)} \\ &= \dfrac{L(\theta; \mathbf{y})}{\sum_{\mathbf{y}_i : T(\mathbf{y}_i) = t} L(\theta; \mathbf{y}_i)} \\ &= \dfrac{1}{\sum_{\mathbf{y}_i : T(\mathbf{y}_i) = t} \dfrac{L(\theta; \mathbf{y}_i)}{L(\theta; \mathbf{y})}} \end{align}$$

Since $T(\mathbf{y}) = T(\mathbf{y}_i) = t$, all $\mathbf{y}_i$ and $\mathbf{y}$ belong to the same equivalence class induced by $T(\mathbf{Y})$ and, therefore, the likelihood ratios $L(\theta; \mathbf{y}_i)/L(\theta; \mathbf{y})$ do not depend on $\theta$. Hence the conditional probability above does not depend on $\theta$, and $T(\mathbf{Y})$ is sufficient.
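
Continuing my Bernoulli sketch from above (my own illustration): all $\binom{n}{t}$ sequences with sum $t$ have the same likelihood $\theta^t (1-\theta)^{n-t}$, so every ratio in the denominator equals $1$ and $$P_\theta(\mathbf{Y} = \mathbf{y} \vert T(\mathbf{Y}) = t) = \dfrac{1}{\binom{n}{t}},$$ which indeed does not depend on $\theta$.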

To prove the minimality of $T(\mathbf{Y})$, consider any other sufficient statistic $S(\mathbf{Y})$ and the corresponding partitioning of $\Omega$. Let $\mathbf{y}_1, \mathbf{y}_2 \in \Omega$ belong to the same equivalence class of that partition. According to the factorisation theorem, Theorem 1, $$\dfrac{L(\theta; \mathbf{y}_1)}{L(\theta; \mathbf{y}_2)} = \dfrac{g(S(\mathbf{y}_1), \theta)h(\mathbf{y}_1)}{g(S(\mathbf{y}_2), \theta)h(\mathbf{y}_2)} = \dfrac{h(\mathbf{y}_1)}{h(\mathbf{y}_2)},$$ which does not depend on $\theta$. By definition, $\mathbf{y}_1, \mathbf{y}_2$ then fall within the same equivalence class induced by $T(\mathbf{Y})$ as well and, therefore, $T(\mathbf{Y})$ is a function of $S(\mathbf{Y})$.
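
To make the minimality step concrete (my own illustration), take the Bernoulli case again and the statistic $S(\mathbf{Y}) = \left( Y_1, \sum_{i=2}^n Y_i \right)$; it is sufficient by Theorem 1 with $g(S(\mathbf{y}), \theta) = \theta^{s_1 + s_2} (1-\theta)^{n - s_1 - s_2}$ and $h(\mathbf{y}) = 1$. Two samples with the same value of $S$ have the same sum, hence lie in the same equivalence class induced by $T(\mathbf{Y}) = \sum_{i=1}^n Y_i$, and indeed $T(\mathbf{Y}) = S_1 + S_2$ is a function of $S(\mathbf{Y})$ (while $S(\mathbf{Y})$ is not a function of $T(\mathbf{Y})$).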

Theorem 1 is as follows:

Theorem 1 [Fisher-Neyman Factorisation Theorem] A statistic $T(\mathbf{Y})$ is sufficient for $\theta$ if, and only if, for all $\theta \in \Theta$, $$L(\theta; \mathbf{y}) = g(T(\mathbf{y}), \theta) \times h(\mathbf{y}),$$ where the function $g(\cdot)$ depends on $\theta$ and on $\mathbf{y}$ only through the statistic $T(\mathbf{y})$, while the function $h(\cdot)$ does not depend on $\theta$.
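
As a quick check of how the factorisation works (my own example, not from the notes): for $Y_1, \dots, Y_n \overset{\text{iid}}{\sim} \text{Poisson}(\theta)$, $$L(\theta; \mathbf{y}) = \prod_{i=1}^n \dfrac{e^{-\theta} \theta^{y_i}}{y_i!} = \underbrace{e^{-n\theta} \theta^{\sum_i y_i}}_{g(T(\mathbf{y}), \theta)} \times \underbrace{\prod_{i=1}^n \dfrac{1}{y_i!}}_{h(\mathbf{y})},$$ so $T(\mathbf{Y}) = \sum_{i=1}^n Y_i$ is sufficient for $\theta$.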

How does one conclude that $\dfrac{L(\theta; \mathbf{y}_1)}{L(\theta; \mathbf{y}_2)} = \dfrac{g(S(\mathbf{y}_1), \theta)h(\mathbf{y}_1)}{g(S(\mathbf{y}_2), \theta)h(\mathbf{y}_2)} = \dfrac{h(\mathbf{y}_1)}{h(\mathbf{y}_2)}$? Specifically, how do $g(S(\mathbf{y}_1), \theta)$ and $g(S(\mathbf{y}_2), \theta)$ cancel?

The Pointer
    "Belong to the same equivalence class" means $S(y_1)=S(y_2).$ Go on from there. – whuber Apr 15 '21 at 16:00
  • @whuber Ahh, ok, that makes sense. And so, because the $\theta$ are the same, we must have $g(S(\mathbf{y}_1), \theta) = g(S(\mathbf{y}_2), \theta)$? – The Pointer Apr 15 '21 at 18:33
