
I have been studying high energy physics (HEP) for the last few years, but I recently started working on a project in medical imaging. I have been a little surprised (though not entirely, as I was aware that 95% was commonly used) to find major studies reporting results at the 95% confidence level. In HEP the convention is not to worry too much until 3 s.d., if not 5. I appreciate that in the 'real world' such idealistic situations cannot always be created, but even so 95% doesn't seem such a high level of confidence. What is the rationale behind this? Is it simply a pragmatic one, in the interests of completing in a timely manner? I have already come across a trial or two which appear, to say the least, counterintuitive in their findings.
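To make the gap between the two conventions concrete, here is a minimal sketch that converts a number of standard deviations into a Gaussian tail probability. It assumes the test statistic is normally distributed; the function names are illustrative only. HEP significances are usually quoted one-sided, while a 95% confidence level corresponds to a two-sided tail at about 1.96 s.d.

```python
import math


def two_sided_p(n_sigma):
    """Probability that a standard normal falls outside +/- n_sigma."""
    return math.erfc(n_sigma / math.sqrt(2))


def one_sided_p(n_sigma):
    """Probability that a standard normal exceeds +n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


if __name__ == "__main__":
    # 1.96 s.d. is the usual 95% level; 3 and 5 s.d. are the HEP thresholds.
    for n_sigma in (1.96, 3.0, 5.0):
        print(
            f"{n_sigma:>4} s.d.: two-sided p = {two_sided_p(n_sigma):.3g}, "
            f"one-sided p = {one_sided_p(n_sigma):.3g}"
        )
```

Under that normality assumption, the 5 s.d. threshold is several orders of magnitude stricter than p = 0.05.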

Can anyone recommend a good book to jump into the mathematics of this? I have borrowed a fairly qualitative introduction, but things like power calculations and Kaplan-Meier curves are essentially totally new to me.

  • The reason for $1-\alpha$ = 95% CI's is largely the same as for $\alpha$ = 5% significance levels. – Glen_b Dec 16 '13 at 23:40
  • Very interesting. I'm a bit concerned by the idea that it is used because it is achievable. I can be pragmatic up to a point, but surely an experiment should aim for a minimum level of evidence, not just do what is necessary to reach 95% (p=0.05) within reason. I wouldn't go as far as "extraordinary claims require extraordinary evidence", but if one is going to claim to disprove a long-standing theory or discover a new particle/cure, a greater level of significance than for a fit to some calibration curve seems like a good idea. – user36288 Dec 17 '13 at 00:03
  • I'm not sure what you mean by 'it's used because it's achievable'. That's not the message I take away from the linked page. I don't think anybody is saying that 5% should be used for any particular case (are they?), and certainly not for the case where one is trying to "disprove a long standing theory". You seem to be arguing against something nobody (to my knowledge) is saying. The convention in HEP is just as much a convention as the one you see in, say, agricultural research. The distinction is that the stakes are different, so of course the suitable α (& β) is too. What did I miss? – Glen_b Dec 17 '13 at 01:18
  • If anything, there's a good argument that hypothesis testing should be used much more rarely than it is, and in many ways its use in HEP is bizarre... even more bizarre than in many other areas. Why aren't they using Bayesian statistics, one might reasonably ask. I am not sure. It seems at least as counterintuitive to use $5 \sigma$ in HEP as to use say $2 \sigma$ when choosing between old and new varieties of tomato. – Glen_b Dec 17 '13 at 01:29
  • @Glen_b I have wondered that myself about the Higgs boson experiments. This was a case in which they expected to find a "peak" signal within a certain range, so shouldn't the null hypothesis have been the prediction? I suppose maybe the theory did not predict well enough for a point null. – Flask Dec 17 '13 at 04:47
  • I'm not an expert, but I believe that the Standard Model does not predict the mass of the Higgs. I think its being the last unconstrained parameter was part of what the hype was about, so you are looking for a 'bump' in a continuous spectrum. Why 5 s.d., I don't know. I guess fundamentally you assume the number of events to be pretty much a Poisson process, so perhaps they think they understand their experiments that well. – Bowler Dec 17 '13 at 10:55

3 Answers


There are a few reasons:

1) If you decrease the chance of type I error you increase the chance of type II error. Sometimes one is more important, sometimes the other. Often, I think, 5% is too low because it increases type II error.

2) In some ways, type I errors never happen. That is, suppose your null hypothesis is that two means are equal. But, in a population, two means are never exactly equal. It's more a question of how far apart they are.

3) It's traditional and there's no really great reason to change it.

4) It's traditional and journal editors/dissertation committees/pointy haired bosses demand it.

5) It's traditional and it lets you avoid thinking about whether it makes sense in a particular situation.
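Point 1 can be illustrated numerically. The following is a minimal sketch, assuming a two-sided one-sample z-test with known variance; the effect size (0.3 standard deviations) and sample size (50) are made-up values chosen only for illustration, and `z_test_power` is a hypothetical helper name.

```python
from statistics import NormalDist


def z_test_power(alpha, effect_size, n):
    """Approximate power of a two-sided one-sample z-test.

    alpha       : type I error rate
    effect_size : true mean shift in units of the standard deviation
    n           : number of observations
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # rejection threshold for |Z|
    shift = effect_size * n ** 0.5      # mean of Z under the alternative
    # Probability that Z lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


if __name__ == "__main__":
    # Illustrative assumptions: effect of 0.3 s.d., 50 observations.
    for alpha in (0.05, 0.0027, 5.7e-7):  # roughly 2, 3 and 5 s.d., two-sided
        power = z_test_power(alpha, effect_size=0.3, n=50)
        print(f"alpha = {alpha:<8g} power = {power:.3f}  type II error = {1 - power:.3f}")
```

With the sample size held fixed, tightening alpha lowers the power, so the type II error rate rises. The only way to keep both small is to collect more data.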

Peter Flom
  • In medical research getting adequate numbers of controlled observations is always a big challenge, emphasizing the importance of point (1). – EdM Dec 17 '13 at 17:13