I am trying to construct a model to predict how a treatment will affect new units in the presence of certain covariates. For purposes of explanation, let's suppose the units are bacterial colonies and the treatment is the application of heat. The dependent variable in my model is a continuous outcome measure, Y (e.g. density of a desired protein in the bacterial colonies). The predictor variables are:
- A coded Treatment variable, taking on the value of 0 for untreated units and 1 for treated units.
- A series of covariates X1, X2, ..., X_(N-1), which are populated for all units and which moderate the effect of the treatment.
- A variable X_N which is populated only for the treated units. For purposes of the example, let's suppose X_N is the proximity of the bacteria to the heat source (so it's sort of a treatment intensity variable, but not quite).
My goal is to generate a predictive model to ascertain, for a new unit, how large the effect of the test treatment will be. If I were only seeking to model on X1, X2, ...X_(N-1), I think I would just want to construct the model:
Y = B0 + B1*Treatment + B2*X1 + B3*X2 + ... + B_N*X_(N-1) + B_(N+1)*Treatment*X1 + B_(N+2)*Treatment*X2 + ... + B_(2N-1)*Treatment*X_(N-1)
I could then apply the model to the values for a new unit to make a prediction for its protein generation with and without the application of heat, and take their difference to predict the incremental benefit of the heat treatment.
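To make that prediction step concrete, here is a minimal sketch in Python with NumPy. Everything in it is assumed for illustration: the simulated data, the two covariates, the coefficient values used to generate Y, and the helper name `design` are all hypothetical. It fits the model by ordinary least squares and takes the treated-minus-control difference for one new unit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: 200 units, two covariates (X1, X2)
n = 200
X = rng.normal(size=(n, 2))
treatment = rng.integers(0, 2, size=n)

# Assumed coefficients, used only to generate example outcomes
y = (1.0 + 2.0 * treatment + X @ np.array([0.5, -0.3])
     + treatment * (X @ np.array([1.0, 0.4]))
     + rng.normal(scale=0.1, size=n))

def design(treatment, X):
    """Columns: intercept, Treatment, X1..Xk, Treatment*X1..Treatment*Xk."""
    t = np.asarray(treatment, dtype=float).reshape(-1, 1)
    X = np.atleast_2d(X)
    return np.hstack([np.ones_like(t), t, X, t * X])

# Ordinary least squares fit of the full interaction model
beta, *_ = np.linalg.lstsq(design(treatment, X), y, rcond=None)

# Predict the new unit's outcome with and without heat, then difference
x_new = np.array([[0.2, -1.0]])
y_treated = design([1], x_new) @ beta
y_control = design([0], x_new) @ beta
effect = (y_treated - y_control)[0]
print(effect)
```

The difference reduces to B1 plus the interaction coefficients times the new unit's covariates, since the intercept and the covariate main effects cancel.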
My question is how to incorporate the X_N variable (proximity to the heat source). Since this variable is only populated for treated units, I would think it makes sense to incorporate the variable as an interaction effect but NOT a main effect, i.e. construct my model as:
Y = B0 + B1*Treatment + B2*X1 + B3*X2 + ... + B_N*X_(N-1) + B_(N+1)*Treatment*X1 + B_(N+2)*Treatment*X2 + ... + B_(2N-1)*Treatment*X_(N-1) + B_(2N)*Treatment*X_N
If I built my model like this, I could make predictions for the incremental effect of applying heat at a specific distance from the new bacterial colony. However, having read a few posts on this forum, it seems like there is a lot of debate about whether it is ever acceptable to incorporate an interaction effect without incorporating a main effect. I don't want to add a main effect in this case because X_N isn't populated for my control units.
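For concreteness, here is a rough sketch of how the interaction-only term could be built when X_N is missing for controls. Because Treatment is 0 for every control unit, the product Treatment*X_N is 0 there whatever X_N would have been, so the column can be formed without imputing anything. The data, the single covariate X1, the coefficient values, and the function name `predicted_effect` are all assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical simulated data: 300 units, one always-populated covariate X1
n = 300
x1 = rng.normal(size=n)
treatment = rng.integers(0, 2, size=n)

# X_N (proximity) is recorded only for treated units; NaN marks "not populated"
x_n = np.where(treatment == 1, rng.uniform(0.0, 5.0, size=n), np.nan)

# Interaction column Treatment*X_N: equals X_N for treated units, 0 for controls
t_xn = np.where(treatment == 1, x_n, 0.0)

# Assumed data-generating coefficients (illustration only)
y = (1.0 + 2.0 * treatment + 0.5 * x1 + 1.0 * treatment * x1
     - 0.3 * t_xn + rng.normal(scale=0.1, size=n))

# Design matrix: intercept, Treatment, X1, Treatment*X1, Treatment*X_N
# (no main-effect column for X_N)
A = np.column_stack([np.ones(n), treatment, x1, treatment * x1, t_xn])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

def predicted_effect(x1_new, proximity):
    """Treated-minus-control prediction at a given proximity to the heat source."""
    return beta[1] + beta[3] * x1_new + beta[4] * proximity

effect = predicted_effect(x1_new=0.2, proximity=2.0)
print(effect)
```

In this sketch the coefficient on Treatment is the predicted effect for a unit with X1 = 0 and proximity 0, so its interpretation depends on how X_N is scaled.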
Does anyone know if it's acceptable to add an interaction effect, but not a main effect, in cases like these?
http://www.wiley.com/legacy/wileychi/eob/pdfs/eob2_isotonic.pdf
– Cliff AB Jun 09 '15 at 18:10