
TL;DR: I am looking for the paper that proposed the following "inter-ocular trauma test for statistical significance".

[Figure: lineup plot for the inter-ocular trauma test]


Longer version

The idea of the proposed informal "test" is as follows. Assume you have observations and a null hypothesis, and assume that there is some quantity (or set of quantities) that can be derived either from your observations or by simulation under the null hypothesis. Assume further that this quantity can be graphed easily.

For instance, the quantity we are interested in could be a regression parameter estimate. Or in the example above, which is about examining uniformity of distributions, it could be histograms of the counts in the five fullest and the five emptiest bins.

Now, simulate the quantity of interest under the null hypothesis, say, 19 times. Arrange the graphical representations in a $4\times5$ grid, including the representation of the actual observations at a random spot.
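For the regression case, the step above can be sketched as follows. This is only an illustration under stated assumptions: the data are made up, and "the null hypothesis" is taken to mean "no association between `x` and `y`", simulated by permuting `y`. The names `panels` and `position` are hypothetical.

```r
set.seed(42)

# Hypothetical data: y depends weakly on x (for illustration only)
n <- 50
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

n_panels <- 20
position <- sample(n_panels, 1)  # random spot for the actual observations

# One response vector per panel: the actual y at `position`,
# a permuted y (assumed null: no association with x) everywhere else
panels <- lapply(seq_len(n_panels), function(i) {
  if (i == position) y else sample(y)
})

# Draw the 4 x 5 lineup: scatterplot plus fitted regression line per panel
opar <- par(mfrow = c(4, 5), mar = c(1, 1, 1, 1))
for (yy in panels) {
  plot(x, yy, xaxt = "n", yaxt = "n", pch = 16, cex = 0.5)
  abline(lm(yy ~ x), lwd = 2)
}
par(opar)
```

Keeping `position` hidden from whoever looks at the plot is the point of the exercise; print it only after the viewer has made a choice.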

Does the panel for your actual observation stand out sufficiently that it is obvious? (I.e., does it "hit you right between the eyes", which is sometimes called an "inter-ocular trauma test"?) If so, then there is something there.

For additional social ineptness, accost random strangers, show them the plot and ask them to identify which panel "doesn't fit". If 95% of your victims correctly identify the panel corresponding to the actual observations, we can informally say that $p=0.05$.
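One way to make this slightly more quantitative, sketched here under the assumption that viewers choose independently and, under the null hypothesis, pick one of the 20 panels uniformly at random: the number of correct picks is then binomial with success probability $1/20$, and its upper tail can serve as an informal p-value. The helper name `lineup_p_value` is hypothetical.

```r
n_panels <- 20
p_guess <- 1 / n_panels  # chance of picking the right panel by pure guessing

# Informal p-value: P(X >= n_correct) for X ~ Binomial(n_viewers, p),
# assuming independent viewers who guess uniformly at random under the null
lineup_p_value <- function(n_correct, n_viewers, p = p_guess) {
  pbinom(n_correct - 1, size = n_viewers, prob = p, lower.tail = FALSE)
}

lineup_p_value(1, 1)   # a single viewer who picks correctly: 0.05
lineup_p_value(3, 10)  # 3 correct picks out of 10 viewers
```

Under these assumptions, a single viewer spotting the right panel already corresponds to $p=0.05$, which is why 19 null panels is a convenient choice.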

I read about this proposal in a paper, which I believe dates from the 2000s, by a well-known statistician, on the order of Tibshirani or Breiman, but no matter how much I dig through my literature database, I can't locate the original paper. It may not even have been published (it doesn't seem to be among the papers I have read from the Journal of Computational and Graphical Statistics).

Can anyone identify the paper in which this was proposed?


R code for the graphic above

set.seed(1)
n_items <- 5000
n_bins <- 1000

actual_distribution <- factor(sample(1:n_bins,n_items,replace=TRUE,prob=0.996^(1:n_bins)),levels=1:n_bins)

y_max <- 30 # set through trial and error

n_plots <- 20
(where_to_insert <- sample(1:n_plots,1))

opar <- par(mfrow=c(4,5),las=2,mai=c(.1,.5,.1,.1))
for ( ii in 1:n_plots ) {
  if ( ii == where_to_insert ) {
    sim <- actual_distribution
  } else {
    sim <- factor(sample(1:n_bins,n_items,replace=TRUE),levels=1:n_bins)
  }
  barplot(c(sort(table(sim),decreasing=TRUE)[1:5], NA,NA, rev(sort(table(sim),decreasing=FALSE)[1:5])),
    xaxt="n",lwd=2,col="gray",ylim=c(0,y_max))
  text(7.2,1,"...",cex=2,font=2)
}
par(opar)

Stephan Kolassa
  • 123,354

1 Answer


Joe Berkson coined the phrase. To my knowledge, it first shows up in 1963 in Edwards, Lindman, and Savage's "Bayesian Statistical Inference for Psychological Research" in Psychological Review, 70(3):

The preceding paragraph illustrates a procedure that statisticians of all schools find important but elusive. It has been called the interocular traumatic test; you know what the data mean when the conclusion hits you between the eyes.

They cite a "personal communication" with Joe Berkson in July of 1958 for the idea. It may appear earlier in the literature, but this 1963 paper is the earliest of which I'm aware.

Mark White
  • 10,252
  • 2
    Hm. Much as I appreciate seeing Bayesian statistics as "a currently controversial viewpoint", I skimmed the paper and did not see anything related to the kind of lineup plot above (which, in contrast, was proposed by Buja et al.). I am not asking about the first use of an "inter-ocular trauma test" in statistical inference as such, but about using a "lineup" of null distributions and the actual data (my question may not have been optimally formulated), and that doesn't seem to be in that paper. Am I missing something? – Stephan Kolassa Apr 02 '20 at 12:37
  • I interpreted the question as asking when this first appeared in the literature. They mention it throughout the paper, talking about how the distributions shown are obviously different from one another. But no, they don't do the specific panel grid thing you are mentioning. I'd imagine forward-searching on that citation will get you what you're looking for. – Mark White Apr 02 '20 at 12:42