This is really basic, but I'm having trouble interpreting the summary output of a multiple regression that includes one or more interactions with a categorical variable, and I couldn't find a satisfactory explanation elsewhere. Consider an example based on the mpg data set:
easypackages::libraries("tidyverse", "nlme", "car")
mod1 <- lm(hwy ~ displ + drv + displ:drv, data = mpg)
Anova(mod1, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: hwy
#>             Sum Sq  Df F value    Pr(>F)
#> (Intercept) 7211.6   1 783.660 < 2.2e-16 ***
#> displ       1096.0   1 119.102 < 2.2e-16 ***
#> drv          204.4   2  11.105 2.499e-05 ***
#> displ:drv     86.0   2   4.673   0.01026 *
#> Residuals   2098.2 228
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
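As a cross-check on the displ:drv row, here is a minimal sketch that compares the model with and without the interaction using base R's anova(). Because displ:drv is the highest-order term in the model, I would expect this F test to agree with the Type III row above (F of about 4.67 on 2 and 228 degrees of freedom). The names mod0 and cmp are my own.

```r
# Sketch: test the displ:drv term as a whole via a nested model comparison.
mpg <- ggplot2::mpg                                    # data set used above

mod0 <- lm(hwy ~ displ + drv, data = mpg)              # common displ slope
mod1 <- lm(hwy ~ displ + drv + displ:drv, data = mpg)  # one slope per drv level

# F test with 2 numerator df: do the three drv groups share one displ slope?
cmp <- anova(mod0, mod1)
cmp
```

This is a single test for the whole term, unlike the per-coefficient t tests that summary() reports.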
This suggests a significant interaction between displ and drv, i.e. the effect of displ differs depending on the level of drv. Let's look at this in more detail:
summary(mod1)
#>
#> Call:
#> lm(formula = hwy ~ displ + drv + displ:drv, data = mpg)
#>
#> Residuals:
#>    Min     1Q Median     3Q    Max
#> -8.489 -1.895 -0.191  1.797 13.467
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  30.6831     1.0961  27.994  < 2e-16 ***
#> displ        -2.8785     0.2638 -10.913  < 2e-16 ***
#> drvf          6.6950     1.5670   4.272 2.84e-05 ***
#> drvr         -4.9034     4.1821  -1.172   0.2422
#> displ:drvf   -0.7243     0.4979  -1.455   0.1471
#> displ:drvr    1.9550     0.8148   2.400   0.0172 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 3.034 on 228 degrees of freedom
#> Multiple R-squared: 0.746, Adjusted R-squared: 0.7405
#> F-statistic: 134 on 5 and 228 DF, p-value: < 2.2e-16
This tells me that the displ:drv interaction is significant for drv level r, but not for drv level f. Do I understand correctly that the p-value associated with drv level 4, which is the reference level in this model, is given in the intercept row, i.e. that it is significant in this case? Let's plot the interaction:
mpg %>%
  ggplot(aes(x = displ, y = hwy, col = drv)) +
  geom_point(size = 3, alpha = 0.4) +
  geom_smooth(method = "lm", lwd = 1.5)
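To relate the three fitted lines to the coefficient table, here is a sketch that recovers each group's slope from mod1. It assumes R's default treatment contrasts, under which the displ coefficient is the slope for the reference level 4 and each displ:drv coefficient is a difference from that slope. The name slopes is my own.

```r
# Sketch: per-level displ slopes implied by mod1 (treatment contrasts assumed).
mpg  <- ggplot2::mpg
mod1 <- lm(hwy ~ displ + drv + displ:drv, data = mpg)
b    <- coef(mod1)

slopes <- c(
  "4" = unname(b["displ"]),                    # reference level: slope as is
  "f" = unname(b["displ"] + b["displ:drvf"]),  # reference slope + difference
  "r" = unname(b["displ"] + b["displ:drvr"])   # reference slope + difference
)
round(slopes, 4)
```

With the estimates printed above, this works out to roughly -2.88 for 4, -3.60 for f and -0.92 for r, which should match the slopes of the lines drawn by geom_smooth().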
Hmmm, judging by the regression lines and the associated scatter, I'm somewhat surprised that the result for drv level r is significant while the one for drv level f is not. What happens if we add another term plus its associated interaction?
mod2 <- lm(hwy ~ displ + drv + cyl + displ:drv + cyl:drv, data = mpg)
summary(mod2)
#>
#> Call:
#> lm(formula = hwy ~ displ + drv + cyl + displ:drv + cyl:drv, data = mpg)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -8.2265 -1.8421  0.0316  1.4962 13.3950
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  31.7350     1.2197  26.018  < 2e-16 ***
#> displ        -1.8893     0.6279  -3.009  0.00292 **
#> drvf          8.2000     1.8886   4.342 2.14e-05 ***
#> drvr          9.6730     6.3080   1.533  0.12657
#> cyl          -0.7720     0.4481  -1.723  0.08627 .
#> displ:drvf    0.3036     1.0626   0.286  0.77533
#> displ:drvr    3.3796     1.2243   2.761  0.00625 **
#> drvf:cyl     -0.8073     0.7415  -1.089  0.27743
#> drvr:cyl     -2.8897     1.2139  -2.381  0.01812 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.923 on 225 degrees of freedom
#> Multiple R-squared: 0.7674, Adjusted R-squared: 0.7591
#> F-statistic: 92.78 on 8 and 225 DF, p-value: < 2.2e-16
OK, so we now have significant interactions for some levels of both displ:drv and cyl:drv, but what happened to the results for the reference level of drv, i.e. 4? Does the p-value shown for the intercept give the significance of both the displ:drv and cyl:drv interactions when drv equals 4?
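For reference, here is a small sketch of what the (Intercept) row of mod2 estimates, as far as I can tell: the predicted hwy for drv level 4 when displ and cyl are both 0. The data frame new0 is a hypothetical input that I made up for this check, and pred0 is my own name.

```r
# Sketch: what the intercept of mod2 corresponds to (treatment contrasts assumed).
mpg  <- ggplot2::mpg
mod2 <- lm(hwy ~ displ + drv + cyl + displ:drv + cyl:drv, data = mpg)

# Hypothetical car: reference level drv = "4", both numeric predictors at 0
new0  <- data.frame(displ = 0, cyl = 0, drv = "4")
pred0 <- predict(mod2, newdata = new0)

# The intercept equals this prediction
all.equal(unname(pred0), unname(coef(mod2)["(Intercept)"]))

# The slopes for the reference level sit in the plain displ and cyl rows
coef(mod2)[c("displ", "cyl")]
```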
To sum this up:
- Is it correct to interpret p-values for the different levels of an interaction term with a categorical variable separately?
- Is the p-value for the reference level of the categorical variable (in this case drv level 4) the one shown in the intercept row of the summary output?
- Is that still the case in the presence of multiple interactions?
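One way I could probe these questions, sketched under the assumption of R's default treatment contrasts: refit the first model with each drv level in turn as the reference, and read off the displ row each time. As I understand it, that row tests whether the displ slope for the current reference level differs from 0, which is a different question from whether two levels' slopes differ from each other. The name slope_tests is my own.

```r
# Sketch: displ slope test for each drv level, obtained by changing the reference.
mpg <- ggplot2::mpg

slope_tests <- sapply(c("4", "f", "r"), function(ref) {
  d <- as.data.frame(mpg)
  d$drv <- relevel(factor(d$drv), ref = ref)   # make `ref` the reference level
  fit <- lm(hwy ~ displ + drv + displ:drv, data = d)
  coef(summary(fit))["displ", ]                # estimate, SE, t value, p-value
})

# One row per drv level, one column per statistic
round(t(slope_tests), 4)
```

The fitted values are the same in all three refits; only the parameterisation changes, and with it the hypothesis each row tests.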

The p-value for displ:drvf is large because the green and red lines have similar slopes. – Doctor Milt Apr 28 '23 at 12:21