Estimating treatment effect with/without intercept

Question

I am trying to estimate a treatment effect using one of the following two models: $$Y(Z)=\beta_0+\tau Z+\varepsilon$$ and $$Y(Z)=\tau Z+\varepsilon.$$ In my data, the intercept is not significant, and the estimate of $\tau$ differs substantially between the two models.

I understand that we typically include the intercept term. However, the treatment $Z$ and the intercept column can be strongly correlated (nearly collinear), so including the intercept can make the variance of the estimate of $\tau$ very large. Can we exclude the intercept if it is not significant?
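To make the variance concern concrete, here is a rough sketch of the standard OLS variance formulas for the two models. The assumptions are mine and not something I have checked on my data: homoskedastic errors with variance $\sigma^2$, conditioning on $Z$, a binary $Z$, sample size $n$, and treated fraction $\hat p=\bar Z$.

$$\operatorname{Var}(\hat\tau_{\text{int}})=\frac{\sigma^2}{\sum_i (Z_i-\bar Z)^2}=\frac{\sigma^2}{n\hat p(1-\hat p)},\qquad \operatorname{Var}(\hat\tau_{\text{no int}})=\frac{\sigma^2}{\sum_i Z_i^2}=\frac{\sigma^2}{n\hat p}.$$

So the ratio of the two variances is $1/(1-\hat p)$, which is 5 if, for example, 80% of units are treated. The smaller variance does not seem to be free, though: for binary $Z$ the no-intercept estimate is just the mean of $Y$ among treated units, so under the first model its expectation is $\beta_0+\tau$, and it is unbiased for $\tau$ only if $\beta_0=0$.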

I generated the following toy example to illustrate the problem.


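set.seed(1)                             # make the draw reproducible
Z <- rbinom(100, 1, 0.8)                # treatment indicator, about 80% treated
X <- rnorm(100)                         # covariate
Y <- 0.5 * Z + X + rnorm(100, sd = 1)   # true effect of Z is 0.5, true intercept is 0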

Then I fit the model with and without an intercept.

> summary(lm(Y~Z+X))

Call:
lm(formula = Y ~ Z + X)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.84170 -0.66118 -0.03344  0.70559  2.40098 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.1953     0.2560   0.763    0.448    
Z             0.2935     0.2812   1.044    0.299    
X             1.0611     0.1129   9.395 2.72e-15 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.055 on 97 degrees of freedom
Multiple R-squared:  0.4771,    Adjusted R-squared:  0.4664 
F-statistic: 44.26 on 2 and 97 DF,  p-value: 2.195e-14

> summary(lm(Y~Z+X-1))

Call:
lm(formula = Y ~ Z + X - 1)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.83963 -0.66514 -0.03021  0.70907  2.39416 

Coefficients:
  Estimate Std. Error t value Pr(>|t|)    
Z   0.4889     0.1156   4.229 5.29e-05 ***
X   1.0648     0.1126   9.457 1.82e-15 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.052 on 98 degrees of freedom
Multiple R-squared:  0.5155,    Adjusted R-squared:  0.5056 
F-statistic: 52.13 on 2 and 98 DF,  p-value: 3.809e-16

We can see that the model with the intercept does not estimate the effect of $Z$ well (0.2935, against a true value of 0.5) and does not detect the significance of $Z$ (p = 0.299), whereas the model without the intercept does (0.4889, p = 5.29e-05). So, can we remove the intercept, or what else can we do to fix the problem?
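Since a single simulated data set may not be representative, one possible check is to repeat the toy simulation many times and compare the two estimates of the $Z$ coefficient. This is only a sketch: the helper names `sim_once` and `summarise_est`, the number of replications `n_rep`, and the second scenario with a true intercept of 1 are my own illustrative choices.

```r
# Sketch: repeat the toy simulation to see how each estimator of the
# Z coefficient behaves across many draws (not just one seed).
set.seed(1)
n_rep <- 1000  # number of replications (arbitrary illustrative choice)

# Simulate one data set and return the Z coefficient from both fits.
# beta0 is the true intercept; the toy example above uses beta0 = 0.
sim_once <- function(n = 100, beta0 = 0) {
  Z <- rbinom(n, 1, 0.8)
  X <- rnorm(n)
  Y <- beta0 + 0.5 * Z + X + rnorm(n, sd = 1)
  c(with_int = unname(coef(lm(Y ~ Z + X))["Z"]),
    no_int   = unname(coef(lm(Y ~ Z + X - 1))["Z"]))
}

# Rows are replications, columns are the two estimators.
est0 <- t(replicate(n_rep, sim_once(beta0 = 0)))  # true intercept 0
est1 <- t(replicate(n_rep, sim_once(beta0 = 1)))  # hypothetical: true intercept 1

# Mean and standard deviation of each estimator across replications.
summarise_est <- function(est) {
  rbind(mean = colMeans(est), sd = apply(est, 2, sd))
}

print(summarise_est(est0))
print(summarise_est(est1))
```

Because the no-intercept fit forces the regression through the origin, I would expect both estimators to centre near 0.5 when the true intercept is 0, with the no-intercept version less variable, and the no-intercept estimate to shift by roughly the size of the intercept when it is 1.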
