Is it acceptable to remove a main effect in an interaction model where the variable is only populated for a treatment group?

Question

I am trying to construct a model to predict how a treatment will affect new units in the presence of certain covariates. For purposes of explanation, let's suppose the units are bacterial colonies and the treatment is the application of heat. The dependent variable in my model is a continuous outcome measure, Y (e.g. density of a desired protein in the bacterial colonies). The predictor variables are:

A coded Treatment variable, taking on the value of 0 for untreated units and 1 for treated units.
A series of variables X1, X2, ...X_(N-1) which are populated for all units and which affect the impact of the test treatment.
A variable X_N which is populated only for the test units. For purposes of the example, let's suppose X_N is the proximity of the bacteria to the heat source (so it's sort of a treatment intensity variable, but not quite).

My goal is to generate a predictive model to ascertain, for a new unit, how large the effect of the test treatment will be. If I were only seeking to model on X1, X2, ...X_(N-1), I think I would just want to construct the model:

Y = B0 + B1*Treatment + B2*X1 + B3*X2 + ... B_N*X_(N-1) + B_(N+1)*Treatment*X1 + B_(N+2)*Treatment*X2 + ... B_(2N-1)*Treatment*X_(N-1)

I could then apply the model to the values for a new unit to make a prediction for its protein generation with and without the application of heat, and take their difference to predict the incremental benefit of the heat treatment.

My question is how to incorporate the X_N variable (proximity to the heat source). Since this variable is only populated for treatment units, I would think it makes sense to incorporate the variable as an interaction effect but NOT a main effect, i.e. construct my model as:

Y = B0 + B1*Treatment + B2*X1 + B3*X2 + ... B_N*X_(N-1) + B_(N+1)*Treatment*X1 + B_(N+2)*Treatment*X2 + ... B_(2N-1)*Treatment*X_(N-1) + B_(2N)*Treatment*X_(N)

If I built my model like this, I could make predictions for the incremental effect of applying heat at a specific distance from the new bacterial colony. However, judging from a few posts on this forum, there seems to be a lot of debate about whether it is ever acceptable to include an interaction effect without the corresponding main effect. I don't want to add a main effect in this case because X_N isn't populated for my control units.
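For concreteness, here is a minimal sketch of the interaction-only model with a single covariate X1, assuming NumPy and simulated data. The coefficient values, the random seed, and the helper names `design` and `incremental_effect` are invented for illustration. It shows that the Treatment*X_N column is 0 for every control unit, so the unpopulated X_N values never enter the design matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
treatment = rng.integers(0, 2, size=n).astype(float)
x1 = rng.normal(size=n)
# X_N (proximity) is only populated for treated units; NaN marks "not populated".
x_n = np.where(treatment == 1, rng.uniform(1.0, 5.0, size=n), np.nan)

# Treatment * X_N: the product is 0 for controls regardless of the missing X_N.
treat_x_n = np.where(treatment == 1, x_n, 0.0)

def design(treatment, x1, treat_x_n):
    # Columns: intercept, Treatment, X1, Treatment*X1, Treatment*X_N
    return np.column_stack(
        [np.ones_like(x1), treatment, x1, treatment * x1, treat_x_n]
    )

# Hypothetical "true" coefficients, used only to simulate Y.
beta_true = np.array([2.0, 1.5, 0.8, -0.4, 0.6])
X = design(treatment, x1, treat_x_n)
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Ordinary least squares fit.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

def incremental_effect(beta, x1_new, x_n_new):
    # Predicted Y with heat at proximity x_n_new minus predicted Y without heat.
    treated = design(np.array([1.0]), np.array([x1_new]), np.array([x_n_new]))
    control = design(np.array([0.0]), np.array([x1_new]), np.array([0.0]))
    return ((treated - control) @ beta).item()

effect = incremental_effect(beta_hat, x1_new=0.5, x_n_new=2.0)
```

In this parameterization the predicted effect of heat is B1 + B3*X1 + B4*X_N (using the sketch's column order), so B1 alone is the extrapolated effect at X_N = 0 and is only meaningful if that value makes sense on the proximity scale.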

Does anyone know if it's acceptable to add an interaction effect, but not a main effect, in cases like these?

score 2 · Answer 1 · answered Jun 09 '15 at 00:34

In my opinion, you don't need to worry about interaction versus main effects at all. What you have looks more like a dosage effect, where the "dosage" could be something like 1/distance to flame. In the untreated group, 1/distance to flame is very naturally set to 0. Under this view there is no interaction term to think about, and you get your answer directly.
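A minimal sketch of this dosage coding, using only the Python standard library. The records are made up for illustration, the other covariates are left out to keep it short, and 1/distance is just the dose scale suggested above, not the only reasonable choice.

```python
from statistics import fmean

# Hypothetical records: (treated, distance_to_heat, y); distance is None for controls.
records = [
    (0, None, 2.0),
    (0, None, 2.0),
    (1, 1.0, 5.0),
    (1, 2.0, 3.5),
    (1, 4.0, 2.75),
]

def dose(treated, distance):
    # Assumed dose scale: 1/distance for treated units, 0 for untreated units.
    return 1.0 / distance if treated else 0.0

doses = [dose(t, d) for t, d, _ in records]
ys = [y for _, _, y in records]

# Closed-form ordinary least squares for y = intercept + slope * dose.
d_bar, y_bar = fmean(doses), fmean(ys)
slope = (
    sum((d - d_bar) * (y - y_bar) for d, y in zip(doses, ys))
    / sum((d - d_bar) ** 2 for d in doses)
)
intercept = y_bar - slope * d_bar

def predicted_effect(distance):
    # Predicted change in y from applying heat at this distance versus no heat.
    return slope * dose(1, distance)
```

With a single dose column, a unit very far from the heat source is predicted to behave like an untreated unit, which is the behaviour the dose interpretation implies.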

Similarly, if you have a very large number of distinct distances with very few measurements at each, you might want to consider something like monotonic regression. It takes advantage of the fact that you know the effect should increase as distance decreases, without constraining the effect to be purely linear.
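To illustrate the idea, here is a small sketch of monotonic (isotonic) regression via the pool-adjacent-violators algorithm, in plain Python. The data and the helper name `pava` are hypothetical. Libraries such as scikit-learn also provide an `IsotonicRegression` estimator that could be used instead.

```python
def pava(values, weights=None):
    """Non-decreasing least-squares fit to `values` (pool adjacent violators)."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block is [mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Pool the last two blocks while they violate the non-decreasing order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

# Hypothetical outcomes for treated units at five distances.
distances = [1.0, 2.0, 3.0, 4.0, 5.0]
outcomes = [5.0, 3.0, 4.0, 2.5, 2.0]

# The effect should rise as distance falls, so order the points from the
# largest distance to the smallest and fit a non-decreasing sequence.
order = sorted(range(len(distances)), key=lambda i: -distances[i])
fit_sorted = pava([outcomes[i] for i in order])

# Map the fitted values back to the original order of the data.
fitted = [0.0] * len(distances)
for rank, i in enumerate(order):
    fitted[i] = fit_sorted[rank]
```

The two out-of-order observations (at distances 2 and 3) are pooled to their average, while the rest of the curve is left as observed; no linear shape is imposed.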

Thanks for the prompt response. Would it be possible to point me toward any overviews on dosage models or monotonic regression? I'm not familiar with either of these concepts. — JSMInspiredMe, Jun 09 '15 at 17:46
Based on a very quick review, this appeared the gentlest introduction to isotonic (i.e. monotonic) regression that I could find:
http://www.wiley.com/legacy/wileychi/eob/pdfs/eob2_isotonic.pdf — Cliff AB, Jun 09 '15 at 18:10
