I want to check whether the Brier score is a strictly proper scoring rule, based on a definition I found here. Since the paper is behind a paywall, I reproduce the definition:

A scoring rule assigns a numerical score $S(F, y)$ to each pair $(F, y)$, where $F \in \mathcal{F}$ is a probabilistic forecast and $y \in \mathbb{R}$ is the realized value. We write $S(F, G) = \mathbb{E}_G[S(F, Y)]$ for the expected score under $G$ when the probabilistic forecast is $F$. The scoring rule is proper relative to the class $\mathcal{F}$ if $S(G, G) \leq S(F, G)$ for all $F, G \in \mathcal{F}$. It is strictly proper if equality holds only if $F = G$.

A similar definition can also be found here (no paywall).

My attempt:

I am only trying to convince myself that the claim is true and that I have understood the definition, so I simplify the problem.

Let $G \sim \text{Bernoulli}(p_1)$, $F \sim \text{Bernoulli}(p_2)$ and let $S$ be the Brier score.

\begin{align*} S(F, G) &= \mathbb{E}_G[S(F, Y)]\\ &= \sum_{x}p_G(x)\left(p_F(x) - y(x)\right)^2\\ &= p_1(p_2 - y(0))^2 + (1 - p_1)((1 - p_2) - y(1))^2 \end{align*}

\begin{align*} S(G, G) &= p_1(p_1 - y(0))^2 + (1 - p_1)((1 - p_1) - y(1))^2 \end{align*}

If $p_1 = 1$, then $S(G, G) = (1 - y(0))^2 \leq (p_2 - y(0))^2 = S(F, G)$. Equality can hold only if $p_2 = 1$, and in that case $F = G$. Hence, it is a proper scoring rule.

Update:

I just set $y(0) = 1$ and $y(1) = 0$ to see what happens ("ground truth").

$$S(G, G) = p_1(p_1 - 1)^2 + (1 - p_1)^2 \leq p_1(p_2 - 1)^2 + (1 - p_1)(1 - p_2) = S(F, G)$$

When $p_1 = 0.3$, the left side is $0.637$ and the right side is $1 - 1.3 p_2 + 0.3 p_2^2$. If I set $p_2 = 0.9$, the inequality no longer holds, because the right side is $0.073$. I am not sure what I am missing.

1 Answer

I now know why I got the wrong results: I used an incorrect definition of the Brier score and did not know what to do with $Y$. Here $y$ is the index of the realized outcome, i.e. the event $Y = y$.

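Let $S(G, y) = \sum_{i=1}^n (\delta_{iy} - p_G(i))^2$ be the Brier score, where $\delta_{ij} = \begin{cases}1 & \text{if } i = j,\\ 0 & \text{if } i \neq j\end{cases}$ is the Kronecker delta. I assume again that $G$ and $F$ are both Bernoulli distributed, so $n = 2$ with $p_G(1) = p_1$, $p_G(2) = 1 - p_1$, $p_F(1) = p_2$ and $p_F(2) = 1 - p_2$. Then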

\begin{align*} S(G, G) &= \mathbb{E}_G[S(G, Y)]\\ &= \sum_{x} p_G(x)\left(\sum_{i=1}^n (\delta_{ix} - p_G(i))^2\right)\\ &= p_1((\delta_{11} - p_1)^2 + (\delta_{21} - (1 - p_1))^2) + (1 - p_1)((\delta_{12} - p_1)^2 + (\delta_{22} - (1 - p_1))^2)\\ &= p_1((1 - p_1)^2 + (-(1 - p_1))^2) + (1 - p_1)((-p_1)^2 + (1 - (1 - p_1))^2)\\ &= 2p_1 - 2p_1^2 \end{align*}

\begin{align*} S(F, G) &= \mathbb{E}_G[S(F, Y)]\\ &= \sum_{x} p_G(x)\left(\sum_{i=1}^n (\delta_{ix} - p_F(i))^2\right)\\ &= p_1((1 - p_2)^2 + (-(1 - p_2))^2) + (1 - p_1)((-p_2)^2 + (1 - (1 - p_2))^2)\\ &= 2 p_2^2 - 4 p_1 p_2 + 2 p_1 \end{align*}
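The two closed forms can be sanity-checked numerically. The following is only a minimal sketch: the helper names `brier_score` and `expected_score` are my own, outcomes are indexed from 0 instead of 1, and the values $p_1 = 0.3$, $p_2 = 0.9$ are the ones tried in the question.

```python
def brier_score(probs, y):
    """Brier score sum_i (delta_{iy} - probs[i])^2 for realized outcome index y (0-based)."""
    return sum(((1.0 if i == y else 0.0) - p) ** 2 for i, p in enumerate(probs))


def expected_score(forecast, truth):
    """S(F, G) = E_G[S(F, Y)]: score of `forecast` averaged over outcomes drawn from `truth`."""
    return sum(g * brier_score(forecast, y) for y, g in enumerate(truth))


p1, p2 = 0.3, 0.9          # example values, taken from the question
G = [p1, 1 - p1]           # Bernoulli(p1) as a probability vector
F = [p2, 1 - p2]           # Bernoulli(p2) as a probability vector

s_gg = expected_score(G, G)
s_fg = expected_score(F, G)

# Compare against the closed forms derived above.
closed_gg = 2 * p1 - 2 * p1 ** 2
closed_fg = 2 * p2 ** 2 - 4 * p1 * p2 + 2 * p1
print(abs(s_gg - closed_gg) < 1e-12, abs(s_fg - closed_fg) < 1e-12)
```

With these values $S(G, G) = 0.42$ and $S(F, G) = 1.14$, so the forecast $F$ is penalized as expected.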

Then the inequality is

\begin{align*} S(G, G) = 2p_1 - 2p_1^2 &\leq 2 p_2^2 - 4 p_1 p_2 + 2 p_1 = S(F, G)\\ \iff (p_1 - p_2)^2 &\geq 0 \end{align*}
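For completeness, the equivalence follows by subtracting $S(G, G)$ from $S(F, G)$ and completing the square:

\begin{align*} S(F, G) - S(G, G) &= (2 p_2^2 - 4 p_1 p_2 + 2 p_1) - (2 p_1 - 2 p_1^2)\\ &= 2 p_1^2 - 4 p_1 p_2 + 2 p_2^2\\ &= 2 (p_1 - p_2)^2 \geq 0 \end{align*}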

Equality holds if and only if $p_1 = p_2$, i.e. $F = G$. Hence, the Brier score is a strictly proper scoring rule, at least within the class of Bernoulli distributions. One could generalize the result, but for me the Bernoulli case is good enough.
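As an extra check of the conclusion, one can scan a grid of $(p_1, p_2)$ pairs using the closed forms from the derivation. This is just a sketch, not a proof: the grid resolution of $1/20$ and the function names `s_gg` and `s_fg` are arbitrary choices of mine, and exact fractions are used to avoid floating-point ties.

```python
from fractions import Fraction


def s_gg(p1):
    """Closed form of S(G, G) for G = Bernoulli(p1)."""
    return 2 * p1 - 2 * p1 ** 2


def s_fg(p1, p2):
    """Closed form of S(F, G) for F = Bernoulli(p2), G = Bernoulli(p1)."""
    return 2 * p2 ** 2 - 4 * p1 * p2 + 2 * p1


# Exact grid 0, 1/20, ..., 1 for both parameters.
grid = [Fraction(k, 20) for k in range(21)]

# Pairs where the forecast F would beat the truth G (should be none).
violations = [(p1, p2) for p1 in grid for p2 in grid if s_fg(p1, p2) < s_gg(p1)]

# Pairs where equality holds (should be exactly the diagonal p1 == p2).
ties = [(p1, p2) for p1 in grid for p2 in grid if s_fg(p1, p2) == s_gg(p1)]

# The pair from the question: p1 = 0.3, p2 = 0.9.
gap_example = s_fg(Fraction(3, 10), Fraction(9, 10)) - s_gg(Fraction(3, 10))

print(len(violations), all(p1 == p2 for p1, p2 in ties), gap_example)
# prints: 0 True 18/25
```

The gap of $18/25 = 2(0.3 - 0.9)^2$ for the pair from the question shows that, with the correct definition, the inequality holds there as well.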