All of my variables are continuous. There are no levels. Is it possible to even have interaction between the variables?
1 Answer
Yes, why not? The same consideration as for categorical variables applies here: the effect of $X_1$ on the outcome $Y$ is not the same depending on the value of $X_2$. To help visualize it, you can think of the values taken by $X_1$ when $X_2$ takes high or low values. In contrast to categorical variables, the interaction is represented simply by the product of $X_1$ and $X_2$. Of note, it's better to center your two variables first (so that the coefficient for, say, $X_1$ reads as the effect of $X_1$ when $X_2$ is at its sample mean).
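As an aside, centering is easy to do in R. Here is a minimal sketch with made-up data (the true coefficients 0.5, 0.3, and 0.2 are arbitrary), showing that after centering the coefficient for $X_1$ reads as its effect at the sample mean of $X_2$:

```r
set.seed(42)
x1 <- rnorm(100, mean = 10)   # hypothetical raw predictors, nonzero means
x2 <- rnorm(100, mean = 5)
y  <- 1 + 0.5*x1 + 0.3*x2 + 0.2*x1*x2 + rnorm(100)

x1c <- x1 - mean(x1)          # center so that 0 corresponds to the sample mean
x2c <- x2 - mean(x2)
fit <- lm(y ~ x1c * x2c)      # '*' expands to main effects + interaction
coef(fit)["x1c"]              # effect of x1 when x2 is at its sample mean
```

With the raw (uncentered) predictors, the coefficient for `x1` would instead be its effect at `x2 = 0`, which may fall outside the range of the observed data.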
As kindly suggested by @whuber, an easy way to see how $X_1$ varies with $Y$ as a function of $X_2$ when an interaction term is included, is to write down the model $\mathbb{E}(Y|X)=\beta_0+\beta_1X_1+\beta_2X_2+\beta_3X_1X_2$.
Then, it can be seen that the effect of a one-unit increase in $X_1$ when $X_2$ is held constant may be expressed as:
$$ \begin{eqnarray*} \mathbb{E}(Y|X_1+1,X_2)-\mathbb{E}(Y|X_1,X_2)&=&\beta_0+\beta_1(X_1+1)+\beta_2X_2+\beta_3(X_1+1)X_2\\ &&-\big(\beta_0+\beta_1X_1+\beta_2X_2+\beta_3X_1X_2\big)\\ &=& \beta_1+\beta_3X_2 \end{eqnarray*} $$
Likewise, the effect when $X_2$ is increased by one unit while holding $X_1$ constant is $\beta_2+\beta_3X_1$. This demonstrates why it is difficult to interpret $\beta_1$ and $\beta_2$ as the effects of $X_1$ and $X_2$ in isolation. Interpretation becomes even more complicated if the two predictors are highly correlated. It is also important to keep in mind the linearity assumption that is being made in such a model.
You can have a look at Multiple Regression: Testing and Interpreting Interactions, by Leona S. Aiken, Stephen G. West, and Raymond R. Reno (Sage Publications, 1996), for an overview of the different kinds of interaction effects in multiple regression. (This is probably not the best book, but it's available through Google.)
Here is a toy example in R:
library(mvtnorm)
set.seed(101)
n <- 300 # sample size
S <- matrix(c( 1, .2, .8,  0,
              .2,  1, .6,  0,
              .8, .6,  1, -.2,
               0,  0, -.2,  1),
            nrow=4, byrow=TRUE) # correlation matrix
X <- as.data.frame(rmvnorm(n, mean=rep(0, 4), sigma=S))
colnames(X) <- c("x1","x2","y","x1x2")
summary(lm(y~x1+x2+x1x2, data=X))
pairs(X)
where the output actually reads:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.01050 0.01860 -0.565 0.573
x1 0.71498 0.01999 35.758 <2e-16 ***
x2 0.43706 0.01969 22.201 <2e-16 ***
x1x2 -0.17626 0.01801 -9.789 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3206 on 296 degrees of freedom
Multiple R-squared: 0.8828, Adjusted R-squared: 0.8816
F-statistic: 743.2 on 3 and 296 DF, p-value: < 2.2e-16
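To connect this back to the derivation above, you can compute $\hat\beta_1+\hat\beta_3X_2$ directly from the fitted coefficients. A quick sketch, using freshly simulated data in which the interaction term really is the product $X_1X_2$ (the true coefficients 0.7, 0.4, and -0.2 are made up to mimic the estimates above):

```r
set.seed(101)
n  <- 300
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 0.7*x1 + 0.4*x2 - 0.2*x1*x2 + rnorm(n, sd = 0.3)
fit <- lm(y ~ x1 * x2)           # expands to x1 + x2 + x1:x2
b <- coef(fit)
# fitted effect of a one-unit increase in x1, at low/median/high values of x2
b["x1"] + b["x1:x2"] * quantile(x2, c(.1, .5, .9))
```

Because the interaction coefficient is negative, the fitted effect of $X_1$ shrinks as $X_2$ increases.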
And here is what the simulated data look like:

[scatterplot matrix of x1, x2, y, and x1x2, produced by pairs(X)]
To illustrate @whuber's second comment, you can always look at the variations of $Y$ as a function of $X_2$ at different values of $X_1$ (e.g., terciles or deciles); trellis displays are useful in this case. With the data above, we would proceed as follows:
library(Hmisc)
X$x1b <- cut2(X$x1, g=5) # split x1 into 5 quantile groups (60 obs. each)
coplot(y~x2|x1b, data=X, panel = panel.smooth)

[conditioning plot of y vs. x2 within quantile groups of x1, produced by coplot()]
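If Hmisc is not available, a similar display can be sketched in base R alone by overlaying fitted lines from predict() at a few fixed values of $X_1$ (the data are re-simulated here with made-up coefficients so the snippet stands on its own):

```r
set.seed(101)
n  <- 300
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 0.7*x1 + 0.4*x2 - 0.2*x1*x2 + rnorm(n, sd = 0.3)
fit <- lm(y ~ x1 * x2)

x2.grid  <- seq(min(x2), max(x2), length.out = 50)
x1.fixed <- quantile(x1, c(.1, .5, .9))   # low / median / high x1
plot(x2, y, col = "grey", pch = 20)
for (v in x1.fixed) {                     # one fitted line per value of x1
  lines(x2.grid, predict(fit, newdata = data.frame(x1 = v, x2 = x2.grid)))
}
```

The changing slopes of the three lines are exactly the interaction: the effect of $X_2$ on $Y$ depends on where $X_1$ sits.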