
I wonder what the relationship between a confidence interval and random chance is. Let me elaborate a bit. Say I have the following linear relationship between two variables $X$ and $Y$:

\begin{align} Y&=\alpha+\beta_1 X_1+ \varepsilon. \end{align}

Now, say, I get a $\beta_1$ of $10$ which is statistically significant at $5\%$. This means there is a $95\%$ probability that the relationship is correct and $5\%$ that it might be incorrect. More specifically, in my case, it means that with every unit increase in $X$, my $Y$ increases by $10\%$.

Question: Can I interpret my result by saying that, given that the coefficient is $10\%$, my outcome is $5\%$ more than random chance? Does it even work like this?

So, say, if I take a significance level of $10\%$ into account and my coefficient is hypothetically still $10$, is my result then basically random and not strong enough to draw any meaningful conclusion from?

User1865345

2 Answers


This is incorrect: significance at the 5% level does not mean that you have a 95% chance of being correct. Hence, the rest of the reasoning does not hold.

I do not follow the rest of your question, but you cannot multiply coefficients by the significance level; that is comparing apples to oranges and is meaningless. The significance level is also not something that you pick based on your results to “make” them significant. It has to be fixed before you look at the data, otherwise the results are flawed.
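To make the distinction concrete, here is a minimal simulation sketch (an added illustration, not part of the original answer). It assumes a simple regression with standard normal noise and uses a normal approximation to the t critical value, which is reasonable for the sample size chosen; the function names `slope_and_se` and `rejection_rate` are made up for this example. When the true slope is exactly zero, a test at the 5% level should reject in roughly 5% of repeated samples. That long-run rate under the null is all the level describes; it is not the probability that any particular finding is correct.

```python
import random
from statistics import NormalDist


def slope_and_se(x, y):
    """OLS slope and its standard error for y = a + b*x + noise."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return b, se


def rejection_rate(true_beta, n=200, reps=2000, level=0.05, seed=0):
    """Share of simulated samples in which H0: beta = 0 is rejected."""
    rng = random.Random(seed)
    # Assumption: normal approximation to the t distribution (fine for n = 200).
    z_crit = NormalDist().inv_cdf(1 - level / 2)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [1.0 + true_beta * xi + rng.gauss(0, 1) for xi in x]
        b, se = slope_and_se(x, y)
        if abs(b / se) > z_crit:
            rejections += 1
    return rejections / reps


# The null hypothesis is true here, so every rejection is a false positive.
null_rate = rejection_rate(true_beta=0.0)
print(f"rejection rate when beta_1 = 0: {null_rate:.3f}")
```

Changing `level` to 0.10 in this sketch should raise the false-positive rate to roughly 10%, which is why the level cannot be chosen after seeing the results.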

Tim

No, a significant result at the 5% level does not mean that the result is correct with 95% probability.

In linear regression you get a confidence interval for each coefficient. Usually, a significant result at the 5% level means that the value zero, that is $\beta_1=0$, is not inside the 95% confidence interval of the coefficient.
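The correspondence between the test and the interval can be checked numerically. The following is a small sketch under stated assumptions: simulated data with standard normal noise, a normal approximation in place of the exact t distribution, and helper names (`fit_slope`, `ci_and_p_value`) invented for illustration. For each simulated dataset it compares "p-value below 0.05" with "zero lies outside the 95% interval"; the two statements should always agree because they are built from the same estimate and standard error.

```python
import random
from statistics import NormalDist


def fit_slope(x, y):
    """OLS slope and its standard error for y = a + b*x + noise."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return b, se


def ci_and_p_value(b, se, level=0.05):
    """Two-sided interval and p-value, both from the normal approximation."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - level / 2)
    ci = (b - z_crit * se, b + z_crit * se)
    p = 2 * (1 - norm.cdf(abs(b / se)))
    return ci, p


rng = random.Random(1)
results = []
for true_beta in (0.0, 0.1, 0.5):  # hypothetical true slopes
    x = [rng.gauss(0, 1) for _ in range(100)]
    y = [2.0 + true_beta * xi + rng.gauss(0, 1) for xi in x]
    b, se = fit_slope(x, y)
    (low, high), p = ci_and_p_value(b, se)
    significant = p < 0.05
    excludes_zero = not (low <= 0.0 <= high)
    results.append((true_beta, significant, excludes_zero))
    print(f"true={true_beta}: b={b:.3f}, CI=({low:.3f}, {high:.3f}), p={p:.4f}")
```

Each row of `results` should carry the same value in its second and third entries, whatever the true slope happens to be.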

The way to interpret it is that under the assumptions of the model, the chance of $\beta_1$ being zero given the data is less than 5%.

leviva
  • This interpretation of statistical significance is not correct, see the link in my answer in this thread. Significance is $P(X|H_0)$, not $P(H_0|X)$. – Tim Jul 31 '22 at 08:58
  • Confidence intervals are slightly different from hypothesis testing.

    While in hypothesis testing we assume a null hypothesis and check whether the sample is plausible, as you said, when calculating a confidence interval we calculate it given the sample: $P(v_1\leqslant\theta\leqslant v_2|X)\leqslant CL$, where $v_1$ and $v_2$ are the confidence interval limits, $X$ is the sample, $\theta$ is the true population value, and $CL$ is the confidence level. The interpretation is that, given the procedure, 95% of the confidence intervals calculated by it will contain the true population parameter.

    – leviva Jul 31 '22 at 10:34
  • But this is slightly different than what you said in the answer. – Tim Jul 31 '22 at 10:53
  • You're correct. How about the following edit: Under the assumptions of the model, the probability of the true $\beta_1$ being inside the confidence interval is 95% – leviva Jul 31 '22 at 13:15