28

Background and Empirical Example

I have two studies; I ran an experiment (Study 1) and then replicated it (Study 2). In Study 1, I found an interaction between two variables; in Study 2, this interaction was in the same direction but not significant. Here is the summary for Study 1's model:

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)              5.75882    0.26368  21.840  < 2e-16 ***
condSuppression         -1.69598    0.34549  -4.909 1.94e-06 ***
prej                    -0.01981    0.08474  -0.234  0.81542    
condSuppression:prej     0.36342    0.11513   3.157  0.00185 ** 

And Study 2's model:

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           5.24493    0.24459  21.444   <2e-16 ***
prej                  0.13817    0.07984   1.731   0.0851 .  
condSuppression      -0.59510    0.34168  -1.742   0.0831 .  
prej:condSuppression  0.13588    0.11889   1.143   0.2545  

Instead of saying, "I guess I don't have anything, because I 'failed to replicate,'" I combined the two data sets, created a dummy variable for which study the data came from, and ran the interaction again while controlling for that study dummy variable. The interaction was significant even with this control, and the two-way interaction between condition and dislike (prej) was not qualified by a three-way interaction with the study dummy variable.
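
A minimal sketch of that pooled analysis (the data frame names study1 and study2 and the outcome name dv are placeholders I am using here, not the actual variable names):

## Stack the two studies and flag which study each row came from
pooled <- rbind(
  transform(study1, study = "Study 1"),
  transform(study2, study = "Study 2")
)

## Two-way interaction, controlling for the study dummy
fit_pooled <- lm(dv ~ cond * prej + study, data = pooled)

## Add the study moderation terms, including the three-way interaction
fit_threeway <- lm(dv ~ cond * prej * study, data = pooled)

summary(fit_pooled)    # cond x prej interaction, controlling for study
summary(fit_threeway)  # check the cond x prej x study three-way term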

Introducing Bayesian Analysis

I had someone suggest that this is a great opportunity to use Bayesian analysis: in Study 2, I have information from Study 1 that I can use as prior information! In this way, Study 2 performs a Bayesian update of the frequentist, ordinary least squares (OLS) results from Study 1. So I went back and re-analyzed the Study 2 model, now using informative priors on the coefficients: each coefficient was given a normal prior whose mean was the Study 1 estimate and whose standard deviation was the Study 1 standard error.
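
Concretely, the fit looked roughly like this in rstanarm (a sketch with the same placeholder names as above; the prior locations and scales are the Study 1 OLS estimates and standard errors):

library(rstanarm)

## Priors from Study 1's OLS output: means = estimates, sds = standard errors.
## Order matches the model terms: condSuppression, prej, condSuppression:prej.
study1_est <- c(-1.69598, -0.01981, 0.36342)
study1_se  <- c( 0.34549,  0.08474, 0.11513)

fit_study2 <- stan_glm(
  dv ~ cond * prej,
  data            = study2,
  prior           = normal(location = study1_est, scale = study1_se, autoscale = FALSE),
  ## rstanarm places the intercept prior on the intercept after centering the
  ## predictors, so using Study 1's raw intercept here is only approximate
  prior_intercept = normal(location = 5.75882, scale = 0.26368, autoscale = FALSE)
)

summary(fit_study2)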

This is a summary of the result:

Estimates:
                       mean    sd      2.5%    25%     50%     75%     97.5%
(Intercept)             5.63    0.17    5.30    5.52    5.63    5.74    5.96
condSuppression        -1.20    0.20   -1.60   -1.34   -1.21   -1.07   -0.80
prej                    0.02    0.05   -0.08   -0.01    0.02    0.05    0.11
condSuppression:prej    0.34    0.06    0.21    0.30    0.34    0.38    0.46
sigma                   1.14    0.06    1.03    1.10    1.13    1.17    1.26
mean_PPD                5.49    0.11    5.27    5.41    5.49    5.56    5.72
log-posterior        -316.40    1.63 -320.25 -317.25 -316.03 -315.23 -314.29

It looks like we now have fairly solid evidence for the interaction from the Study 2 analysis. This agrees with what I found when I simply stacked the data sets and ran the model with a study dummy variable.

Counterfactual: What If I Ran Study 2 First?

That got me thinking: What if I had run Study 2 first and then used the data from Study 1 to update my beliefs about Study 2? I did the same thing as above, but in reverse: I re-analyzed the Study 1 data using the frequentist OLS coefficient estimates and standard errors from Study 2 as the prior means and standard deviations for my analysis of the Study 1 data. The summary results were:

Estimates:
                          mean    sd      2.5%    25%     50%     75%     97.5%
(Intercept)                5.35    0.17    5.01    5.23    5.35    5.46    5.69
condSuppression           -1.09    0.20   -1.47   -1.22   -1.09   -0.96   -0.69
prej                       0.11    0.05    0.01    0.08    0.11    0.14    0.21
condSuppression:prej       0.17    0.06    0.05    0.13    0.17    0.21    0.28
sigma                      1.10    0.06    0.99    1.06    1.09    1.13    1.21
mean_PPD                   5.33    0.11    5.11    5.25    5.33    5.40    5.54
log-posterior           -303.89    1.61 -307.96 -304.67 -303.53 -302.74 -301.83

Again, we see evidence for an interaction, but it did not have to turn out that way. Note that the interaction point estimates from the two Bayesian analyses do not even fall within each other's 95% credible intervals; the two credible intervals have more non-overlap than overlap.

What Is The Bayesian Justification For Time Precedence?

My question is thus: What justification do Bayesians have for respecting the chronology of how the data were collected and analyzed? I get results from Study 1 and use them as informative priors for Study 2, so that I use Study 2 to "update" my beliefs. But if we assume that the results I get are random draws from a distribution with a true population effect, then why do I privilege the results from Study 1? What is the justification for using Study 1 results as priors for Study 2 instead of taking Study 2 results as priors for Study 1? Does the order in which I collected the data and ran the analyses really matter? It does not seem to me like it should. What is the Bayesian justification for this? Why should I believe the point estimate is closer to 0.34 than to 0.17 just because I ran Study 1 first?


Responding to Kodiologist's Answer

Kodiologist remarked:

The second of these points to an important departure you have made from Bayesian convention. You didn't set a prior first and then fit both models in Bayesian fashion. You fit one model in a non-Bayesian fashion and then used that for priors for the other model. If you used the conventional approach, you wouldn't see the dependence on order that you saw here.

To address this, I fit the models for Study 1 and Study 2 where all regression coefficients had a prior of $\text{N}(0, 5)$. The cond variable was a dummy variable for experimental condition, coded 0 or 1; the prej variable, as well as the outcome, were both measured on 7-point scales ranging from 1 to 7. Thus, I think it is a fair choice of prior. Just by how the data are scaled, it would be very, very rare to see coefficients much larger than what that prior suggests.
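
Roughly, those fits look like this (same placeholder names as before; a sketch, not the exact call I ran):

library(rstanarm)

## N(0, 5) prior on each regression coefficient; the intercept and sigma keep
## rstanarm's defaults in this sketch.
fit1_weak <- stan_glm(dv ~ cond * prej, data = study1,
                      prior = normal(location = 0, scale = 5, autoscale = FALSE))
fit2_weak <- stan_glm(dv ~ cond * prej, data = study2,
                      prior = normal(location = 0, scale = 5, autoscale = FALSE))

summary(fit1_weak)
summary(fit2_weak)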

The posterior means and standard deviations of the estimates are about the same as in the OLS regressions. Study 1:

Estimates:
                       mean     sd       2.5%     25%      50%      75%      97.5% 
(Intercept)             5.756    0.270    5.236    5.573    5.751    5.940    6.289
condSuppression        -1.694    0.357   -2.403   -1.925   -1.688   -1.452   -0.986
prej                   -0.019    0.087   -0.191   -0.079   -0.017    0.040    0.150
condSuppression:prej    0.363    0.119    0.132    0.282    0.360    0.442    0.601
sigma                   1.091    0.057    0.987    1.054    1.088    1.126    1.213
mean_PPD                5.332    0.108    5.121    5.259    5.332    5.406    5.542
log-posterior        -304.764    1.589 -308.532 -305.551 -304.463 -303.595 -302.625

And Study 2:

Estimates:
                       mean     sd       2.5%     25%      50%      75%      97.5% 
(Intercept)             5.249    0.243    4.783    5.082    5.246    5.417    5.715
condSuppression        -0.599    0.342   -1.272   -0.823   -0.599   -0.374    0.098
prej                    0.137    0.079   -0.021    0.084    0.138    0.192    0.287
condSuppression:prej    0.135    0.120   -0.099    0.055    0.136    0.214    0.366
sigma                   1.132    0.056    1.034    1.092    1.128    1.169    1.253
mean_PPD                5.470    0.114    5.248    5.392    5.471    5.548    5.687
log-posterior        -316.699    1.583 -320.626 -317.454 -316.342 -315.561 -314.651

Since these means and standard deviations are more or less the same as the OLS estimates, the order effect above still occurs: if I plug the posterior summary statistics from Study 1 into the priors when analyzing Study 2, I get a different final posterior than if I analyze Study 2 first and then use its posterior summary statistics as priors for analyzing Study 1.

Even if I use the Bayesian means and standard deviations for the regression coefficients as priors instead of the frequentist estimates, I still observe the same order effect. So the question remains: What is the Bayesian justification for privileging the study that came first?

Mark White
  • 10,252
  • 2
    "I would still be in the same situation. So the question remains: What is the Bayesian justification for privileging the study that came first?" — Huh? In what sense are you still privileging Study 1? You can fit the two models as you described here or in the opposite order and your final estimate of e.g. the true population coefficient for prej should be the same either way, unless I'm misunderstanding your procedure. – Kodiologist Apr 28 '18 at 06:58
  • @Kodiologist I edited for clarity, including more about the procedure. – Mark White Apr 28 '18 at 13:26
  • 1
    What about the covariance matrix & the error? You've got to use the whole joint posterior as your new prior. – Scortchi - Reinstate Monica Apr 28 '18 at 13:26
  • @Scortchi bingo—that is the correct answer, I think, and it was what unutbu's answer led me to believe. What I did was a really crude version of updating: I took summary statistics, not the entire joint posterior. That implies the question: Is there a way to include the whole joint posterior as a prior in rstanarm or Stan? It seems like that question has been asked here before: https://stats.stackexchange.com/questions/241690/in-stan-is-there-a-way-to-use-parameter-posterior-from-old-analysis-as-prior-in – Mark White Apr 28 '18 at 13:29
  • 1
    If you're starting with Gaussian priors (& independence?) for the coefficients & an inverse-gamma for the variance, then you've got a normal inverse-gamma prior & it's conjugate. Look up the updating equations. – Scortchi - Reinstate Monica Apr 28 '18 at 13:33

3 Answers

25

Bayes' theorem says the posterior is equal to prior * likelihood after rescaling (so the probability sums to 1). Each observation has a likelihood which can be used to update the prior and create a new posterior:

posterior_1 = prior * likelihood_1
posterior_2 = posterior_1 * likelihood_2
...
posterior_n = posterior_{n-1} * likelihood_n

So that

posterior_n = prior * likelihood_1 * ... * likelihood_n

The commutativity of multiplication implies the updates can be done in any order. So if you start with a single prior, you can mix the observations from Study 1 and Study 2 in any order, apply Bayes' formula and arrive at the same final posterior.
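
A quick numeric sketch of that commutativity, using a toy conjugate-normal model with known data variance (my illustration, not the regression from the question):

## Posterior for the mean of a normal with known variance, given a normal prior
update_normal <- function(prior, y, sigma = 1) {
  n <- length(y)
  post_prec <- 1 / prior$sd^2 + n / sigma^2                        # precisions add
  post_mean <- (prior$mean / prior$sd^2 + sum(y) / sigma^2) / post_prec
  list(mean = post_mean, sd = sqrt(1 / post_prec))
}

set.seed(1)
batch1 <- rnorm(30, mean = 0.4)   # stand-in for Study 1 data
batch2 <- rnorm(50, mean = 0.4)   # stand-in for Study 2 data

prior <- list(mean = 0, sd = 5)
p12 <- update_normal(update_normal(prior, batch1), batch2)  # Study 1 first
p21 <- update_normal(update_normal(prior, batch2), batch1)  # Study 2 first

str(p12)
str(p21)   # identical posterior either way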

unutbu
  • 579
  • 1
    Makes perfect sense. So this points to a possible reason for the discrepancy: the way I did my analyses (plugging posterior summary statistics into the prior arguments for the next study) is not how updating works? That is, I need to carry forward the entire posterior, not just summary statistics from it, into the priors of subsequent analyses. Correct? – Mark White Apr 28 '18 at 13:11
  • 5
    @MarkWhite Correct. The posterior distributions from your first analysis should be your priors for the second. – Kodiologist Apr 28 '18 at 16:01
  • 4
    @Kodiologist and summary statistics about the posterior != the posterior – Mark White Apr 28 '18 at 16:02
  • @MarkWhite Right. – Kodiologist Apr 28 '18 at 16:03
24

First I should point out that:

  1. In your significance-testing approach, you followed up a negative result with a different model that gave you another chance to get a positive result. Such a strategy increases your project-wise type-I error rate. Significance-testing requires choosing your analytic strategy in advance for the $p$-values to be correct.
  2. You're putting a lot of faith in the results of Study 1 by translating your findings from that sample so directly into priors. Remember, a prior is not just a reflection of past findings. It needs to encode the entirety of your preexisting beliefs, including your beliefs before the earlier findings. If you admit that Study 1 involved sampling error as well as other kinds of less tractable uncertainty, such as model uncertainty, you should be using a more conservative prior.

The second of these points to an important departure you have made from Bayesian convention. You didn't set a prior first and then fit both models in Bayesian fashion. You fit one model in a non-Bayesian fashion and then used that for priors for the other model. If you used the conventional approach, you wouldn't see the dependence on order that you saw here.

Kodiologist
  • 20,116
  • How did I follow up a negative result with a different model? What do you mean by "negative result"? As far as the study-wide Type I error rate goes, these are two separate studies conducted weeks apart from one another. Either way, I believe in doing exploratory data analysis, so I don't think p-values in practice are ever "correct" or that we should expect them to be. If people only did the tests they thought of beforehand, we would miss out on lots of great findings that happened by accident, and we'd be wasting tons of data. – Mark White Apr 27 '18 at 17:08
  • I buy this point about not letting Study 1 dominate my priors; I need to incorporate more uncertainty. But as for fitting the model with OLS and then using those estimates as Bayesian priors: I don't think the point estimates and standard errors for Study 1 would have been much different between OLS and Bayesian regression, because I would have been using weakly informative priors in Study 1. – Mark White Apr 27 '18 at 17:09
  • 2
  • A negative result is one that is unexciting or disappointing, or, more specifically in the context of significance testing, one that fails to reject a null hypothesis. If you don't think that $p$-values are ever correct, of course, then significance-testing can't be of any value even in theory. There's nothing wrong with an exploratory philosophy, but significance-testing isn't suited for it. By "study-wise", I actually meant "project-wise", in the sense of the word "project" encompassing both studies; I've corrected that. – Kodiologist Apr 27 '18 at 18:28