
Definition of significance level: The significance level is the probability of making a Type I error, i.e., rejecting the null hypothesis given that it is true. Hence, a small significance level is desirable.

However, when we perform a hypothesis test and can reject the null hypothesis, we say the test is significant.

Don't you think there is a contradiction there?

  • Welcome to Cross Validated! What contradiction do you see? Keep in mind that, when we have a “significant” result, we are hoping for it to be a correct rejection of a false null hypothesis, not one of those incorrect rejections of a true null hypothesis. – Dave Oct 28 '22 at 05:36
  • The theory of null hypothesis significance testing (NHST) wasn't developed with an eye on being intuitive and easy to understand. So why ask if the terminology is intuitive and easy to understand? Another example: the significance level α is also called the size of the test. But we don't say a test with α=0.01 is "smaller" than a test with α=0.05. Better focus on understanding NHST; see eg. 1, 2, 3. – dipetkov Oct 28 '22 at 06:18
  • @dipetkov I don't believe that the question is 'whether or not the terminology is intuitive and easy to understand?'. – Sextus Empiricus Oct 28 '22 at 07:05
  • @SextusEmpiricus We'll have to agree to disagree what a question that states "The accepted terminology doesn't make sense to me. Do you agree with me?" is about. My answer is "no", yours is "yes, kind of"? That's okay. PS: I only wrote a comment to explain why I vote to close this question as opinion-based. – dipetkov Oct 28 '22 at 07:17
  • @dipetkov The OP has the question "Isn't it contradictory ...?" but I read this as "*Why* isn't it contradictory ...?". The question is not about deciding whether or not it is contradictory, but instead about explaining how it works. At least, that's the approach I took in my answer. The OP is confused about something (and I agree that it can be confusing; you as well note that the terminology is not intuitive and easy to understand) and requires an explanation. – Sextus Empiricus Oct 28 '22 at 07:48
  • @SextusEmpiricus How significance testing works has been explained many times on CV. That's why I linked to a few CV threads. – dipetkov Oct 28 '22 at 08:00
  • @dipetkov the OP seems to have a specific problem, which is about some perceived contradiction in the sentence "when we make a hypothesis test and can reject the null hypothesis we say the test is significant." It is not clearly explained what the contradiction is, but I suspect it has to do with the idea that 'can reject the null' is something that should make it insignificant. Sure, we can direct the OP to some threads that explain the term significance, and maybe when they read ten of those they will get it. But we can also address the contradiction more directly. – Sextus Empiricus Oct 28 '22 at 08:07
  • I would agree to close this question for being unclear. I picked up a certain interpretation in that single sentence, but just that single sentence is indeed not very clear in explaining the perceived contradiction. – Sextus Empiricus Oct 28 '22 at 08:08
  • Part of the difficulty with this question is that it is predicated on an incorrect statement: the significance level is not the "probability of making a Type I error." It is the hypothetical chance of rejecting when the null hypothesis is true. That distinction between actual and hypothetical conditions is crucial. – whuber Oct 28 '22 at 12:45
  • @whuber I don't understand why the statement "$\alpha$ is the probability of making a type I error" is wrong. Type I error is the event {reject null hypothesis | null hypothesis is true}. – dipetkov Oct 29 '22 at 16:00
  • @dipetkov I wished to emphasize, as you do, the hypothetical, conditional nature of this probability. To see a contradiction is to misunderstand that. Additionally, $\alpha$ is the supremum over the null hypothesis of the chance of making a Type I error. With composite hypotheses (which includes standard one-tailed tests), there is usually no definite probability of a Type I error for the null hypothesis--only an upper bound. – whuber Oct 29 '22 at 16:57

1 Answer


"Rejecting the null hypothesis" is in practice* "accepting the alternative hypothesis".

This is where your confusion might come from.

The term 'reject' sounds like a negative, but it is actually positive from the point of view of the statistical accuracy of an experiment or observation. It means that we made a precise observation: the measurements were accurate enough to allow the rejection. If you have a noisy signal, you cannot reject the null hypothesis, because the noise makes it impossible to reliably measure an effect.

'Reject' also often has a positive connotation because the experiment is often not about 'rejecting' the null hypothesis but instead about 'confirming' or measuring some effect. Significant means that the effect could be measured with sufficient precision to differentiate it from the situation without the effect present (the null, 'empty', hypothesis). From a Popperian viewpoint, scientific knowledge advances by eliminating possibilities; the more we can reject, the better.

A related confusion is testing negative on a medical test: negative is good because it means you are not sick.


*Important note: the above is more subtle, and there is no literal 'accepting'.

- Expression of significance or 'acceptance' means that we observed an effect and consider it a 'significant' effect. There is no literal 'acceptance' of some theory/hypothesis here. There is just the consideration that the data shows some effect, and that it is significantly different from the case where there would be zero effect. Whether this means that the alternative theory should be accepted is not explicitly stated and should also not be assumed implicitly. The alternative hypothesis (related to the effect) works for the present data, but that is different from being accepted (it just has not been rejected yet).
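To make this concrete, here is a small simulation (an illustrative sketch in Python using numpy and scipy; the sample size, effect size, and noise levels are invented for the example). Under a true null, the rejection rate approximates the significance level α; with a real effect present, noisier data yield fewer 'significant' results, showing that rejection reflects how precisely the effect could be measured:

```python
import numpy as np
from scipy import stats

# Hypothetical settings, made up for illustration.
rng = np.random.default_rng(0)
alpha = 0.05
n_sims, n = 10_000, 30

# Case 1: the null hypothesis is true (zero effect).
# The fraction of rejections approximates the significance level alpha.
rejections_null = sum(
    stats.ttest_1samp(rng.normal(0.0, 1.0, size=n), 0.0).pvalue < alpha
    for _ in range(n_sims)
)
print(f"rejection rate under H0: {rejections_null / n_sims:.3f}")

# Case 2: the same true effect (0.5) measured at different noise levels.
# More noise -> less precision -> fewer 'significant' results.
rates = {}
for noise_sd in (1.0, 5.0):
    rejections = sum(
        stats.ttest_1samp(rng.normal(0.5, noise_sd, size=n), 0.0).pvalue < alpha
        for _ in range(n_sims)
    )
    rates[noise_sd] = rejections / n_sims
    print(f"rejection rate with noise sd={noise_sd}: {rates[noise_sd]:.3f}")
```

The first rate hovers near 0.05 regardless of sample size, which is exactly the definition of the significance level; the second pair of rates shows that 'significance' is a statement about measurement precision, not about the effect being large or important.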

  • The parts about "positive touch" don't seem to apply to equivalence tests very well. – dipetkov Oct 28 '22 at 06:38
  • @dipetkov Equivalence tests, like two one-sided t-tests, can be seen as adding a nuance to the dichotomy '$\text{effect}=0$' versus '$\text{effect}\neq 0$': instead one uses three levels '$\text{effect}=0$', '$\text{effect}<\text{some level}$' and '$\text{effect}\geq\text{some level}$'. Depending on the position and size of the confidence interval we can reject zero, one or two of the hypotheses. Indeed, rejecting $\text{effect}\geq\text{some level}$ might be unexpected and considered negative (if the researcher was hoping for a special discovery), but it is not null hypothesis testing anymore. – Sextus Empiricus Oct 28 '22 at 07:30
  • @dipetkov equivalence testing is testing hypotheses, but a null hypothesis should be 'absence of effect'. The second two sided t-tests in a TOST is not really testing a null hypothesis. – Sextus Empiricus Oct 28 '22 at 07:32
  • The difference in view and terminology (whether 'null' should strictly refer to absence of effect) might relate to the Fisher vs Neyman approaches to hypothesis testing. In any case, an equivalence test like TOST is disguised as two separate null hypothesis tests, but from a broader view it is actually a test of three hypotheses. – Sextus Empiricus Oct 28 '22 at 07:34
  • The problem here is that null hypothesis testing is a trick to make the mathematical computations tractable, but what an experimenter is actually interested in is the confidence for different parameter values or effect sizes. – Sextus Empiricus Oct 28 '22 at 07:39
  • Okay, I admit that although I think I understand the difference between the Fisher approach (compute p-value) and the Neyman-Pearson approach (report reject/don't reject), I tend to find explanations of why significance testing makes a lot of sense (even though it tends to confuse people) a bit uninteresting. As I said, I voted to close this question, and I add comments to explain my votes. I cannot close a question on my own, so it might very well stay open. – dipetkov Oct 28 '22 at 07:42
  • Really last comment. I acknowledge your effort to help the OP (+1). However, I also believe that common sense doesn't help with understanding significance testing. Accepting this is, at least for me, the first step to understanding the theory. – dipetkov Oct 28 '22 at 08:16
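The three-level reading of TOST discussed in the comments above can be sketched numerically (an illustrative sketch using scipy's `ttest_1samp` with its `alternative` parameter; the equivalence margin and the toy data are made up). A symmetric sample with mean near zero cannot reject 'effect = 0' with an ordinary test, yet TOST can reject '|effect| ≥ margin', so the data support equivalence:

```python
import numpy as np
from scipy import stats

alpha = 0.05
margin = 0.5                       # equivalence margin (made-up value)
x = np.linspace(-1.0, 1.0, 101)    # symmetric toy sample, mean ~ 0

# Ordinary NHST: test 'effect = 0'. For this sample we clearly cannot
# reject it (the p-value is large).
p_null = stats.ttest_1samp(x, popmean=0.0).pvalue

# TOST: reject '|effect| >= margin' only when BOTH one-sided tests
# reject, i.e. effect > -margin AND effect < +margin.
p_lower = stats.ttest_1samp(x, popmean=-margin, alternative='greater').pvalue
p_upper = stats.ttest_1samp(x, popmean=margin, alternative='less').pvalue
p_tost = max(p_lower, p_upper)

print(f"p (effect = 0): {p_null:.3f}")   # large: zero effect not rejected
print(f"p (TOST):       {p_tost:.2e}")   # small: large effects rejected
```

This is the three-hypothesis view in miniature: the same data fail to reject 'effect = 0' while rejecting 'effect ≥ margin' and 'effect ≤ −margin', which is a conclusion an ordinary null hypothesis test cannot express.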