6

Let $a \sim N(\mu_a,1/\tau)$ and $s = a + \epsilon$, where $\epsilon \sim N(0,1/\eta)$. I know that because both $a$ and $\epsilon$ are normally distributed, $s$ must also be normally distributed, with $s \sim N\left(\mu_a,\dfrac{\tau +\eta}{\tau\eta}\right)$. Here $s$ is interpreted as a noisy signal about $a$, which is itself unobserved. Then the conditional expectation of $a$ given $s$ is given by:

\begin{align*} \mathbb{E}[a \mid s] & = \mu_a + \dfrac{cov(a,s)}{var(s)}(s-\mu_a)\\ & = \mu_a + \dfrac{\dfrac{1}{\tau}}{\dfrac{\tau + \eta}{\tau \eta}}(s-\mu_a) \\ & = \dfrac{\tau \mu_a + \eta s}{\tau + \eta} \end{align*}
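
As a quick sanity check of this formula, here is a minimal Monte Carlo sketch in Python (numpy assumed; the parameter values and the observed $s$ are arbitrary):

```python
# A minimal Monte Carlo sanity check of E[a | s] = (tau*mu_a + eta*s)/(tau + eta).
# All parameter values and the observed s0 are illustrative, not from the problem.
import numpy as np

rng = np.random.default_rng(0)
mu_a, tau, eta = 1.0, 2.0, 3.0    # prior mean, prior precision, noise precision
s0 = 1.5                          # a hypothetical observed value of s

n = 2_000_000
a = rng.normal(mu_a, 1 / np.sqrt(tau), n)    # a ~ N(mu_a, 1/tau)
s = a + rng.normal(0, 1 / np.sqrt(eta), n)   # s = a + eps, eps ~ N(0, 1/eta)

# Approximate conditioning on s = s0 by keeping draws with s in a narrow window.
keep = np.abs(s - s0) < 0.01
print("simulated E[a | s ~= s0]:", a[keep].mean())
print("formula (tau*mu_a + eta*s0)/(tau + eta):", (tau * mu_a + eta * s0) / (tau + eta))
```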

Now consider another signal $\tilde{s} = a + \tilde{\epsilon}$, where $\tilde{\epsilon} \sim N(0,1/\tilde{\eta})$ and $\tilde{\epsilon}$ is independent of $\epsilon$. We observe $s$ first and update the belief, and then observe $\tilde{s}$. I would like to compute the expected value of $a$ given $s$, conditional on $\tilde{s}$.

That is, let $z = a \mid s$ denote the conditional distribution of $a$ given $s$. I would like to compute $\mathbb{E}[z \mid \tilde{s}]$ using the same formula as above, but I am unsure what $cov(z,\tilde{s})$ is.

I know $cov(z,\tilde{s}) = cov(z,a + \tilde{\epsilon}) = cov(z,a)$. How can I move forward from here?

EDIT: I have learned that the order of the signals does not matter for Bayesian updating. So what I am really after is:

\begin{align*} \mathbb{E}[z \mid \tilde{s}] = \mathbb{E}[a \mid s, \tilde{s}] & = \mu_a + \dfrac{cov(a,s)}{var(s)}(s-\mu_a) + \dfrac{cov(a,\tilde{s})}{var(\tilde{s})}(\tilde{s}-\mu_{a})\\ \end{align*}

Is this the correct approach? I don't feel confident, because $s$ and $\tilde{s}$ are correlated and the expression above does not include any information about that.
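
One way to probe this numerically is a brute-force Monte Carlo sketch that conditions on both signals directly (numpy assumed; every numeric value below is made up):

```python
# Brute-force Monte Carlo comparison of the two-term expression above with a
# simulated E[a | s, s~]; all parameter and signal values are made up.
import numpy as np

rng = np.random.default_rng(1)
mu_a, tau, eta, eta_t = 0.0, 1.0, 2.0, 3.0
s0, st0 = 0.8, 1.2                # hypothetical observed values of s and s~

n = 4_000_000
a = rng.normal(mu_a, 1 / np.sqrt(tau), n)
s = a + rng.normal(0, 1 / np.sqrt(eta), n)
st = a + rng.normal(0, 1 / np.sqrt(eta_t), n)

# Approximate conditioning on (s, s~) = (s0, st0) with a small window.
keep = (np.abs(s - s0) < 0.05) & (np.abs(st - st0) < 0.05)
print("simulated E[a | s, s~]:", a[keep].mean())

two_term = (mu_a + eta / (tau + eta) * (s0 - mu_a)
            + eta_t / (tau + eta_t) * (st0 - mu_a))
print("two-term expression:", two_term)
```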

EDIT2: Based on ChrisL's solution, this is what I understand so far:

\begin{align*} \mathbb{E}[z \mid \tilde{s}] & = \mathbb{E}[a \mid s, \tilde{s}] \\ & = \mathbb{E}[a \mid s'] \text{ where $s' = s + \tilde{s}$} \\ & = \mu_a + \dfrac{cov(a,s')}{var(s')}(s'-\mu_{s'}) \\ & = \mu_a + \dfrac{cov(a,s+\tilde{s})}{var(s+\tilde{s})}(s+\tilde{s}-2 \mu_{a}) \\ & = \mu_a + \dfrac{2var(a)}{var(s) + var(\tilde{s}) + 2cov(s,\tilde{s})}(s+\tilde{s}-2 \mu_{a}) \\ & = \mu_a + \dfrac{2\eta\tilde{\eta}}{\eta \tau + \tilde{\eta} \tau + 4 \eta \tilde{\eta}}(s+\tilde{s}-2 \mu_{a}) \\ \end{align*}

(The second equality treats $s + \tilde{s}$ as a sufficient statistic, which, per the discussion under ChrisL's answer, requires $\eta = \tilde{\eta}$.)

Hosea
  • How did you get $E[a|s]=\frac{\tau\mu+\eta s}{\tau+\eta}$, and where is this formula from? I'm getting $\frac{\eta^2s+\tau^2\mu}{\eta^2+\tau^2}$ – Spätzle Nov 01 '23 at 07:51
  • All three r.v. $a, s, \tilde{s}$ have the same mean. You can simplify the problem by subtracting this mean. Other than that, I think the correlation of $s$ and $\tilde{s}$ becomes irrelevant once you condition on both. The uncertainty of $a$ then is caused only by $\epsilon, \tilde{\epsilon}$. – ChrisL Nov 01 '23 at 11:57
  • @Spätzle I have included one more intermediate step. Regarding the formula, I have seen it in various sources, but here is a similar one (https://stats.stackexchange.com/questions/30588/deriving-the-conditional-distributions-of-a-multivariate-normal-distribution), though that one is multidimensional. – Hosea Nov 01 '23 at 19:51
  • The random variable $z = a \mid s$ does not exist. – Xi'an Nov 02 '23 at 08:23
  • @Xi'an I'm not sure I follow your comment. While I'm not confident the notation is right, there exists a conditional distribution of $a$ given $s$, which is itself normal with the mean stated in the question, and its variance is also easy to compute. So $z$ is a random variable that follows a normal distribution. – Hosea Nov 02 '23 at 13:00
  • There is a random variable $a$ whose conditional distribution can be derived but this does not turn it into another random variable. – Xi'an Nov 02 '23 at 13:14

3 Answers

7

Is this the correct approach? I don't feel confident, because $s$ and $\tilde{s}$ are correlated and the expression above does not include any information about that.

The $s$ and $\tilde{s}$ are not correlated when you condition on $a$. They are independently distributed according to

$$s|a \sim N(a,1/\eta) \\ \tilde{s}|a \sim N(a,1/\tilde\eta)$$

or, if you take both together with inverse-variance weighting,

$$\frac{\eta s+ \tilde{\eta}\tilde{s}}{\eta+\tilde{\eta}}|a \sim N\left(a,\frac{1}{\eta+\tilde{\eta}}\right)$$

In these three equations, you can regard the parameter $a$ as following a prior distribution

$$a \sim N(\mu_a,1/\tau)$$

and you are finding the posterior distribution after observing $\tilde{s}$ and/or $s$.

$$\begin{array}{lcrcl} a|s &\sim & N(\mu_{a|s},&\sigma_{a|s})\\ a|\tilde{s} &\sim & N(\mu_{a|\tilde{s}},&\sigma_{a|\tilde{s}})\\ a|s,\tilde{s} &\sim & N(\mu_{a|s,\tilde{s}},&\sigma_{a|s,\tilde{s}}) \end{array}$$

That posterior can be found with the updating rules that are derived here:

Bayesian updating with new data

Also very useful is this section about Bayesian inference on the Wikipedia page about the normal distribution.

It's a bit of work to write it down, but two update steps with the independent $s$ and $\tilde{s}$ should give the same result as a single update step with the weighted mean.

You don't need to worry here about correlations between $s$ and $\tilde{s}$. You just have the process of updating the distribution for $a$ based on the distributions in the first three equations. What changes with the sequential updating is that the posterior of the first step is the prior for the second step.
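
As a minimal illustration of this equivalence, here is a Python sketch (the `update` helper and all numeric values are my own, purely illustrative):

```python
# Sketch: two sequential normal-normal updates versus one update with the
# inverse-variance-weighted signal; `update` is a hypothetical helper and
# all numeric values are illustrative.
mu_a, tau = 0.0, 1.0      # prior mean and precision of a
eta, eta_t = 2.0, 3.0     # noise precisions of s and s~
s, st = 0.8, 1.2          # hypothetical observed signals

def update(mu, prec, obs, obs_prec):
    """Posterior (mean, precision) of a after observing obs ~ N(a, 1/obs_prec)."""
    new_prec = prec + obs_prec
    return (prec * mu + obs_prec * obs) / new_prec, new_prec

# Sequential: the posterior after s becomes the prior for s~.
mu1, prec1 = update(mu_a, tau, s, eta)
mu2, prec2 = update(mu1, prec1, st, eta_t)

# Single step with the weighted mean, which has precision eta + eta_t.
w = (eta * s + eta_t * st) / (eta + eta_t)
mu_w, prec_w = update(mu_a, tau, w, eta + eta_t)

print(mu2, prec2)    # both print the same posterior:
print(mu_w, prec_w)  # mean (tau*mu_a + eta*s + eta_t*st)/(tau + eta + eta_t)
```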

  • Thank you for your answer! I wasn't aware of the idea of a conjugate prior, and I learned something new today. I would like a closed-form solution if possible for my application, and I think I have one using the approach given by Chris. – Hosea Nov 01 '23 at 20:03
4

REMARK/EDIT: This answer does not contain a solution to the problem and was provided as a stepping stone towards one. It aimed to exploit the symmetry of the problem, and it led to this question, which finally helped to solve the problem.

Assume $ \eta = \tilde{\eta} $. Then, since $ \epsilon, \tilde{\epsilon} $ have mean zero, are symmetric, and have the same variance, you have $$a = s - \epsilon = \tilde{s} - \tilde{\epsilon}$$ So you can define $s' = \frac{1}{2}(s +\tilde{s})$ and $\epsilon' = -\frac{1}{2}(\epsilon +\tilde{\epsilon})$ such that $$a = \frac{1}{2}(s +\tilde{s}) - \frac{1}{2}(\epsilon +\tilde{\epsilon}) = s' + \epsilon' $$

Now you can calculate $$ \mathbb{E}[a \mid s'] = \mu + \frac{cov(a, s) + cov(a, \tilde{s})}{var(s) + var(\tilde{s}) + 2cov(s, \tilde{s})} (s + \tilde{s} - 2\mu) $$

Because of this post, if $\sigma_{\epsilon} = \sigma_a = \sigma_{\tilde{\epsilon}} $, then $$ \mathbb{E}[a \mid s'] = \mathbb{E}[a \mid s, \tilde{s}]$$

See the accepted answer for the general case where variances differ.
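
A small numerical sketch of the role of this assumption, comparing the sum-based formula above with the exact posterior mean $\frac{\tau\mu + \eta s + \tilde{\eta}\tilde{s}}{\tau + \eta + \tilde{\eta}}$ derived in the other answers (the helper names and numeric values are illustrative only):

```python
# Sketch comparing the sum-based formula E[a | s'] above with the exact
# E[a | s, s~] = (tau*mu + eta*s + eta_t*st)/(tau + eta + eta_t).
def via_sum(mu, tau, eta, eta_t, s, st):
    # E[a | s'] from the formula above: cov(a,s) = cov(a,s~) = cov(s,s~) = 1/tau.
    num = 2 / tau
    den = (1/tau + 1/eta) + (1/tau + 1/eta_t) + 2/tau
    return mu + num / den * (s + st - 2 * mu)

def exact(mu, tau, eta, eta_t, s, st):
    return (tau * mu + eta * s + eta_t * st) / (tau + eta + eta_t)

eq  = (0.0, 1.0, 2.0, 2.0, 0.8, 1.2)   # eta == eta~: the two agree
neq = (0.0, 1.0, 2.0, 3.0, 0.8, 1.2)   # eta != eta~: they generally differ
print(via_sum(*eq), exact(*eq))
print(via_sum(*neq), exact(*neq))
```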

ChrisL
  • The sum of $\tilde{s}$ and $s$ is a sufficient statistic. Knowing $\tilde{s}$ and $s$ versus knowing just the sum gives the same information about $a$. – Sextus Empiricus Nov 01 '23 at 13:46
  • @SextusEmpiricus Do you mean that this approach is valid? I have written an edit in the question to incorporate what I have learned from this answer. – Hosea Nov 01 '23 at 19:53
  • I simply applied the formula for $E(a|s)$ given in the question to $E(a|s')$. Note that $a$ is not being defined here; I was only defining $s'$ and $\epsilon'$. – ChrisL Nov 01 '23 at 20:46
  • @SextusEmpiricus If the sum of $\tilde{s}$ and $s$ is a sufficient statistic, isn't $E[a|s,\tilde{s}] = E[a|s']$? Could you comment on why the formula is not valid? – Hosea Nov 01 '23 at 23:45
  • @Hosea I think the formula is right. It looked a bit alien because of the many terms and because $\mu$ occurs twice. We can write $cov(a,s') = var(a)$ and $var(s') = var(a)+var(\epsilon)/2$, which can be used to simplify the formulas. – Sextus Empiricus Nov 02 '23 at 07:17
  • I am not convinced that knowing the sum gives as much information as knowing the parts. If we know only the sum $s + \tilde{s} = z$ then $a$ must be close to $z$ with uncertainty $var(\epsilon) + var(\tilde{\epsilon})$. But if we know $s = x$ and $\tilde{s} = y$ with uncertainty $var(\epsilon)$ and $var(\tilde{\epsilon})$ then we can look at the intersection of the intervals around $x$ and $y$ and get a better estimate of $a$ than from the sum. The estimate improves over the sum when $s$ and $\tilde{s}$ are less correlated. – ChrisL Nov 02 '23 at 08:22
  • If $var(\epsilon)$ and $var(\tilde{\epsilon})$ are different, then the mean of $s$ and $\tilde{s}$ is indeed not a sufficient statistic. But some other mean, a weighted mean, will instead be the sufficient statistic. I see now that $s$ and $\tilde{s}$ can have different variances, but that doesn't change the principle of the problem (it just adds more algebraic work). – Sextus Empiricus Nov 02 '23 at 10:45
  • That sounds plausible, and I think it deserves to appear as an answer. I think it is interesting enough to be stated as a separate question that can be used as a reference. – ChrisL Nov 02 '23 at 11:22
  • https://stats.stackexchange.com/questions/630269/let-s-x-u-t-x-v-for-normal-r-v-x-u-v-is-it-possible-to-find-a-b-re – ChrisL Nov 02 '23 at 11:45
  • You seem to ignore the assumptions about precision and implicitly assume $\eta = \tilde\eta.$ The result contradicts the other two answers and is likely to confuse readers. (-1) – whuber Nov 22 '23 at 15:54
  • @whuber Thank you for pointing that out. I have adapted my answer. However, I gave the answer in order to help before the accepted answer was deduced from my answer here and an answer to this question: stats.stackexchange.com/questions/630269/… . It is incomplete and rather confusing. Should I delete it? – ChrisL Nov 22 '23 at 16:12
  • Options include editing it to note the history; correcting it to reflect your current understanding; deleting it; or letting it stand, hoping readers will go through this comment thread for more information. (I have employed all these strategies with my own answers!) – whuber Nov 22 '23 at 16:15
  • Thank you for the advice. I decided to add a remark and keep the answer. – ChrisL Nov 22 '23 at 16:40
2

This answer is adapted from another question. To compute $E[a \mid s, \tilde{s}]$, I first need to figure out the joint distribution of $(a,s,\tilde{s})$. Note that, by definition, these are linear combinations of $(a,\varepsilon,\tilde{\varepsilon})$:

\begin{align*} \begin{bmatrix} a \\ s \\ \tilde{s} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ \varepsilon \\ \tilde{\varepsilon} \end{bmatrix} \end{align*}

First note that $s \sim N(\mu,\dfrac{\tau + \eta}{\tau \eta})$ and $\tilde{s} \sim N(\mu,\dfrac{\tau + \tilde{\eta}}{\tau \tilde{\eta}})$. Because $a$, $\varepsilon$, and $\tilde{\varepsilon}$ are mutually independent, $(a,s,\tilde{s})$ follows a multivariate normal distribution:

\begin{align*} \begin{bmatrix} a \\ s \\ \tilde{s} \end{bmatrix} \sim N \Bigg( \begin{bmatrix} \mu \\ \mu \\ \mu \end{bmatrix}, \begin{bmatrix} \frac{1}{\tau} & \frac{1}{\tau} & \frac{1}{\tau} \\ \frac{1}{\tau} & \frac{\tau + \eta}{\tau \eta} & \frac{1}{\tau} \\ \frac{1}{\tau} & \frac{1}{\tau} & \frac{\tau + \tilde{\eta}}{\tau \tilde{\eta}} \end{bmatrix} \Bigg) \end{align*}

Then the expectation of $a$ conditional on $s,\tilde{s}$ is given by:

\begin{align*} E[a \mid s, \tilde{s}] & = \mu + \begin{bmatrix} \frac{1}{\tau} & \frac{1}{\tau} \end{bmatrix} \begin{bmatrix} \frac{\tau + \eta}{\tau \eta} & \frac{1}{\tau} \\ \frac{1}{\tau} & \frac{\tau + \tilde{\eta}}{\tau \tilde{\eta}} \end{bmatrix}^{-1} \begin{bmatrix} s - \mu_{s} \\ \tilde{s} - \mu_{\tilde{s}} \end{bmatrix} \\ & = \mu + \dfrac{\eta(s-\mu) + \tilde{\eta}(\tilde{s}-\mu)}{\tau + \eta + \tilde{\eta}} \\ & = \dfrac{\tau \mu + \eta s + \tilde{\eta} \tilde{s}}{\tau + \eta + \tilde{\eta}} \end{align*}

This is the precision-weighted average of the prior mean and the two signals; setting $\tilde{\eta} = 0$ (an infinitely noisy second signal) recovers the single-signal formula from the question.
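
As a sanity check, here is a short numpy sketch (illustrative values) that evaluates this multivariate-normal conditional mean numerically and compares it with the closed form:

```python
# Evaluate the conditional mean mu + Sigma_12 Sigma_22^{-1}(x - mu) directly
# and compare it with the closed form; all numeric values are illustrative.
import numpy as np

mu, tau, eta, eta_t = 1.0, 2.0, 3.0, 4.0
s, st = 1.5, 0.5

Sigma12 = np.array([1/tau, 1/tau])               # cov(a, (s, s~))
Sigma22 = np.array([[1/tau + 1/eta, 1/tau],
                    [1/tau, 1/tau + 1/eta_t]])   # covariance of (s, s~)
cond = mu + Sigma12 @ np.linalg.solve(Sigma22, np.array([s - mu, st - mu]))

closed = (tau * mu + eta * s + eta_t * st) / (tau + eta + eta_t)
print(cond, closed)  # the two agree
```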

Hosea