-1

Here is the summary of my linear model:

Call:
lm(formula = weight ~ height, data = height and weight)

Residuals:
    Min      1Q  Median      3Q     Max 
-10.267  -4.267  -1.267   6.455  11.538 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)   
(Intercept)       20.4878     5.8335   3.512  0.00158 **
height             0.3195     0.1447   2.208  0.03596 * 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.703 on 27 degrees of freedom
Multiple R-squared:  0.1529,    Adjusted R-squared:  0.1215 
F-statistic: 4.873 on 1 and 27 DF,  p-value: 0.03596

And here is my confidence interval for $\beta_1$: $(0.02260012,\ 0.61639988)$ at the $5\%$ level.

As you can see, $0 \notin \text{CI}$ at the $5\%$ level. Also, the $p$-value for the test of $H_0: \beta_1=0$ turns out to be $0.03596$, which lies between $0.01$ and $0.05$.

How do I interpret this?

A $p$-value $< 0.05$ should mean that there is moderate evidence against the null hypothesis $H_0: \beta_1=0$; also, $0$ is not in the CI. So should I reject the hypothesis that $\beta_1=0$, or should I say the evidence is not strong enough?

Furthermore, what would happen if, in the same situation, I got $0 \in \text{CI}$ but still a $p$-value $< 0.05$?

What about a $p$-value $> 0.05$ but $0 \in \text{CI}$?

Euler_Salter
  • Could you explain how you obtained that confidence interval for $\beta_1$ (presumably it corresponds to height)? The values you quote are inconsistent with what's shown in the output you have reproduced. – whuber Apr 07 '17 at 15:22
  • @whuber I had the data, I used the lm function to create a linear model. Then I used summary to generate this summary, that's it. Why do you say it is inconsistent? I might have made some typos – Euler_Salter Apr 07 '17 at 15:24
  • In this case the model does not appear to be useful and the R-square measures are low. So I would reject the model. The use of 0.05 as a measure of significance need not be universal. – Michael R. Chernick Apr 07 '17 at 15:27
  • I assume the p-value > 0.05 and 0 in the CI is hypothetical. – Michael R. Chernick Apr 07 '17 at 15:29
  • I believe your entire question might rest on a mistaken calculation (or transcription) of the confidence interval. Please double-check that, because it's obviously not correct. – whuber Apr 07 '17 at 15:39
  • @MichaelChernick wait, I don't understand. Why are you assuming p-value > 0.05 etc.? I think the summary explicitly says that the p-value is 0.03596, doesn't it? – Euler_Salter Apr 07 '17 at 15:55
  • @whuber if you write qt(0.975,27) you get 2.051831. Then I simply calculated the CI as follows $$\hat{\beta}_1 \pm c\cdot se(\hat{\beta}_1)$$ where $c = t_{n-2,\, 1-\frac{\alpha}{2}}=t_{27,\, 0.975} = 2.051831$ and $se(\hat{\beta}_1) = 0.1447$ (the standard error again comes from the summary). Hence if you do this calculation, you get $(0.3195-2.051831 \times 0.1447,\ 0.3195+2.051831 \times 0.1447) = (0.0226000,\ 0.6163999)$, which is the CI I gave above – Euler_Salter Apr 07 '17 at 16:00
  • Thank you for showing the details. (I had misread the value of the LCL, incidentally, and seeing the calculation made that evident.) Now that you have done so, you have clearly shown what your CI is. What I still don't understand is why you perceive there to be any discrepancy. Perhaps the discussion at http://stats.stackexchange.com/questions/31/what-is-the-meaning-of-p-values-and-t-values-in-statistical-tests will settle any questions you might have about interpretation. – whuber Apr 07 '17 at 16:07
  • I assumed that you were talking about a hypothetical case that did not pertain to the explicit summary table that you gave in your question. – Michael R. Chernick Apr 07 '17 at 16:09
  • I think your title is incorrect. Should it not read "does not include"? – mdewey Apr 07 '17 at 16:10
  • @MichaelChernick I know there is some confusion. Basically I start with my specific question, but then I ask: in general, if I calculate the CI and the $p$-value (having set a level $\alpha$), how do I base my decision on the hypothesis test of $H_0: \beta_1 = 0$ against $H_A: \beta_1 \neq 0$? We have of course four cases: $0 \in \text{CI}$ and $p$-value $> 0.05$; $0 \in \text{CI}$ and $p$-value $< 0.05$; $0 \notin \text{CI}$ and $p$-value $> 0.05$; $0 \notin \text{CI}$ and $p$-value $< 0.05$. What should be my interpretation in these four cases? (Notice my case is one of them!) – Euler_Salter Apr 07 '17 at 16:17
  • @mdewey oh yes, I missed that, thank you for spotting the typo! – Euler_Salter Apr 07 '17 at 16:17
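
The interval arithmetic described in the comments can be checked with a few lines of code. This is only a sketch: the estimate and standard error are copied from the summary output, the critical value 2.051831 is the `qt(0.975, 27)` result quoted in the comments rather than something computed here, and the helper name `t_interval` is made up for illustration.

```python
# Values copied from the thread; nothing here is re-estimated from data.
estimate = 0.3195    # slope for height, from the summary output
std_error = 0.1447   # its standard error, from the summary output
t_crit = 2.051831    # qt(0.975, 27), as quoted in the comments


def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


lower, upper = t_interval(estimate, std_error, t_crit)
t_stat = estimate / std_error  # the "t value" column of the summary

# The two decision rules at the 5% level.
zero_outside_ci = not (lower <= 0.0 <= upper)
rejects_at_5pct = abs(t_stat) > t_crit

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}")
print(f"0 outside CI: {zero_outside_ci}, |t| > c: {rejects_at_5pct}")
```

Both rules give the same answer here because they are built from the same estimate, the same standard error and the same critical value. The endpoints differ from the ones quoted in the question only in the last digits, since the inputs are rounded.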

2 Answers

0

Cases where a) the 95% confidence interval doesn't include zero, and cases where b) p <= 0.05, don't always coincide precisely.

For some discussion on cases with the two-sample t test, see perhaps Cumming and Finch, 2005, Inference by Eye.

But also there are different ways the p value may be determined. For example, to assess effects in a model, Wald tests could be used, or likelihood ratio tests could be used, and their results may not be precisely equal.

Likewise, confidence intervals can be constructed in different ways. For example, determining confidence intervals by bootstrap is relatively common now, and there are actually several different bootstrap approaches that could be employed. Again, the different methods won't necessarily yield precisely identical results.

Usually different statistical approaches will yield similar results, but especially in borderline cases where you are using a hard cutoff (e.g. p <= 0.05 and p > 0.05 yield drastically different conclusions), different approaches may yield different results.
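
For the particular pairing in the question, though, the interval and the test are built from the same pieces, so the two criteria cannot disagree. A short sketch of why, using the question's notation and assuming the interval is $\hat{\beta}_1 \pm c\cdot se(\hat{\beta}_1)$ with $c = t_{n-2,\,1-\alpha/2}$ and the test statistic is $t = \hat{\beta}_1 / se(\hat{\beta}_1)$ with a two-sided $p$-value from the same $t_{n-2}$ distribution:

$$0 \notin \left[\hat{\beta}_1 - c\cdot se(\hat{\beta}_1),\ \hat{\beta}_1 + c\cdot se(\hat{\beta}_1)\right] \iff |\hat{\beta}_1| > c\cdot se(\hat{\beta}_1) \iff |t| > c \iff p < \alpha$$

With the numbers in the summary, $|t| = 2.208 > 2.052 = c$, which agrees with both $p = 0.03596 < 0.05$ and an interval that excludes $0$. The mixed cases ($0 \in \text{CI}$ with $p < 0.05$, or the reverse) can only arise when the interval and the test come from different methods.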

So what is one to do? A good practice is to decide ahead of time what statistical approach to use, and how you will interpret the results. If you want to use confidence intervals, use confidence intervals. If you want to use p values, use p values.

But in all cases, remember to look at the size of the effects you are observing (in this case, Beta1 or r-squared), and the practical implications of the results. For this model, is a Beta1 of 0.3 practically meaningful? Is an r-squared of 0.15 practically meaningful?

Sal Mangiafico
-3

First of all, it's a 95% not 5% confidence interval, and for the confidence interval (of the slope, since you use the CI $\hat{\beta}_1 \pm c\cdot se(\hat{\beta}_1)$) you would say: "I am 95% confident that the true slope is between 0.02260012 and 0.61639988, for the current sample." Performing a hypothesis test would yield rejecting the null, so, there is a difference. However, your r-squared indicates your model only accounts for 15% of the variation. You either need to throw out your model or look for the lurking variable that accounts for the remaining 85%. Additionally, the p-value for height indicates that it is statistically insignificant.

  • I was wondering about the downvote and found the reason in your last sentence: are you sure you typed it as you intended? If so, could you elaborate on what it is about the height's p-value that causes you to conclude it is "statistically insignificant"? – whuber May 01 '19 at 13:42