I am trying to estimate the treatment effect using the following two models: $$Y(Z)=\beta_0+\tau Z+\varepsilon$$ and $$Y(Z)=\tau Z+\varepsilon.$$ In my data, the intercept is not significant, and the estimate of $\tau$ differs substantially between the two models.
I understand that we typically need to include the intercept term. However, the treatment $Z$ and the intercept can be strongly correlated, so including the intercept can make the variance of the estimate of $\tau$ very large. Can we exclude the intercept if it is not significant?
I generated the following toy example to illustrate the problem.
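set.seed(1)
Z <- rbinom(100, 1, 0.8)               # binary treatment, P(Z = 1) = 0.8
X <- rnorm(100)                        # covariate
Y <- 0.5 * Z + X + rnorm(100, sd = 1)  # true effect of Z is 0.5, true intercept is 0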
Then I fit the model with and without the intercept.
> summary(lm(Y~Z+X))
Call:
lm(formula = Y ~ Z + X)
Residuals:
     Min       1Q   Median       3Q      Max
-2.84170 -0.66118 -0.03344  0.70559  2.40098
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.1953     0.2560   0.763    0.448
Z             0.2935     0.2812   1.044    0.299
X             1.0611     0.1129   9.395 2.72e-15 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.055 on 97 degrees of freedom
Multiple R-squared: 0.4771, Adjusted R-squared: 0.4664
F-statistic: 44.26 on 2 and 97 DF, p-value: 2.195e-14
> summary(lm(Y~Z+X-1))
Call:
lm(formula = Y ~ Z + X - 1)
Residuals:
     Min       1Q   Median       3Q      Max
-2.83963 -0.66514 -0.03021  0.70907  2.39416
Coefficients:
  Estimate Std. Error t value Pr(>|t|)
Z   0.4889     0.1156   4.229 5.29e-05 ***
X   1.0648     0.1126   9.457 1.82e-15 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.052 on 98 degrees of freedom
Multiple R-squared: 0.5155, Adjusted R-squared: 0.5056
F-statistic: 52.13 on 2 and 98 DF, p-value: 3.809e-16
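To check how strongly the intercept and $Z$ estimates are correlated in this toy example, here is a minimal sketch. It re-creates the data with the same seed so that it runs on its own; the object names `fit_with`, `fit_without`, `est_cor`, `se_with` and `se_without` are my own and are not part of the fits above.

```r
# Re-create the toy data (same seed and calls as in the example above).
set.seed(1)
Z <- rbinom(100, 1, 0.8)
X <- rnorm(100)
Y <- 0.5 * Z + X + rnorm(100, sd = 1)

fit_with    <- lm(Y ~ Z + X)      # model with intercept
fit_without <- lm(Y ~ Z + X - 1)  # model without intercept

# Correlation matrix of the coefficient estimates in the intercept model.
est_cor <- cov2cor(vcov(fit_with))
print(round(est_cor, 3))

# Standard error of the Z coefficient under each model.
se_with    <- sqrt(diag(vcov(fit_with)))["Z"]
se_without <- sqrt(diag(vcov(fit_without)))["Z"]
print(c(with_intercept = unname(se_with), without_intercept = unname(se_without)))
```

The off-diagonal entry for `(Intercept)` and `Z` in `est_cor` is the correlation I am worried about, and the two standard errors should correspond to the `Std. Error` column for `Z` in the two summaries.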
We can see that the model with the intercept does not estimate the effect of $Z$ well (the true value is 0.5) and does not detect the significance of $Z$, whereas the model without the intercept does. So, can we remove the intercept, or what else can we do to fix the problem?
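A single draw may be misleading, so one way to probe the question is to repeat the toy experiment many times and compare the two estimators of $\tau$. The sketch below is only an illustration under assumptions: `simulate_tau`, `beta0` (a hypothetical true intercept, which is 0 in my toy example) and `n_rep` are names I introduce here, and the design otherwise copies the example above.

```r
# Repeat the toy design and collect the Z coefficient from both models.
# `beta0` (true intercept) and `n_rep` are hypothetical knobs added for illustration.
simulate_tau <- function(beta0, n_rep = 500, n = 100, tau = 0.5) {
  est <- replicate(n_rep, {
    Z <- rbinom(n, 1, 0.8)
    X <- rnorm(n)
    Y <- beta0 + tau * Z + X + rnorm(n, sd = 1)
    c(with_intercept    = unname(coef(lm(Y ~ Z + X))["Z"]),
      without_intercept = unname(coef(lm(Y ~ Z + X - 1))["Z"]))
  })
  # Rows: mean and standard deviation of the estimates across replications.
  rbind(mean = rowMeans(est), sd = apply(est, 1, sd))
}

set.seed(1)
res_zero    <- simulate_tau(beta0 = 0)  # true intercept is zero, as in the toy example
res_nonzero <- simulate_tau(beta0 = 1)  # true intercept is nonzero
print(round(res_zero, 3))
print(round(res_nonzero, 3))
```

With `beta0 = 0` I would expect both estimators to be centered near 0.5, with a smaller spread for the model without the intercept. With a nonzero `beta0`, the model without the intercept has nowhere else to put that constant, so its $Z$ coefficient should move away from 0.5. This is what makes me unsure whether dropping a non-significant intercept is safe.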