
I want to remove from a linear regression formula every interaction between two categorical variables that does not actually occur in the data.

The dataset has the following columns:

store, tv_brand, tv_size, tv_resolution, and tv_price.

The problem is that some stores sell only certain brands, and some brands do not manufacture TVs in certain resolutions. As a result, when I fit a model that includes the interactions:

lm(log1p(tv_price) ~ tv_brand + tv_size + tv_resolution + 
         store * tv_brand + tv_brand * tv_resolution, 
         data = data)

The fitted model contains many NA coefficients for the missing interactions.
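A minimal sketch that reproduces the behaviour on made-up data. The `toy` data frame, its levels, and its prices are hypothetical and are not my real dataset; here store "B" never sells brand "Z".

```r
# Hypothetical toy data: store "B" never sells brand "Z".
toy <- data.frame(
  store    = rep(c("A", "A", "A", "B", "B"), each = 2),
  tv_brand = rep(c("X", "Y", "Z", "X", "Y"), each = 2)
)

# Made-up prices, only so that the model can be fitted.
set.seed(1)
toy$tv_price <- round(runif(nrow(toy), 300, 1500))

fit <- lm(log1p(tv_price) ~ store * tv_brand, data = toy)

# The coefficient for the unobserved store/brand combination is NA.
coef(fit)
na_terms <- names(coef(fit))[is.na(coef(fit))]
na_terms
```

In this sketch the only NA term should be the interaction for the combination that never appears in the data.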

Although these NA coefficients do not affect the fitted values, I would like to remove them because they could be confused with genuine aliases (perfectly collinear interactions), such as those caused by a store that sells only one brand.
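To illustrate the two kinds of NA, here is a small sketch on hypothetical data (the `toy` data frame is invented for this example). It assumes that a non-existing combination shows up as an all-zero column in the design matrix, whereas a genuine alias is a non-zero column that duplicates other columns. Store "C" sells only brand "Y", and nobody except store "A" sells brand "Z".

```r
# Hypothetical toy data: store "C" sells only brand "Y";
# stores "B" and "C" never sell brand "Z".
toy <- data.frame(
  store    = rep(c("A", "A", "A", "B", "B", "C"), each = 2),
  tv_brand = rep(c("X", "Y", "Z", "X", "Y", "Y"), each = 2)
)
set.seed(1)
toy$tv_price <- round(runif(nrow(toy), 300, 1500))  # made-up prices

fit <- lm(log1p(tv_price) ~ store * tv_brand, data = toy)

X        <- model.matrix(fit)
na_terms <- names(coef(fit))[is.na(coef(fit))]

# NA terms whose design-matrix column is entirely zero:
# the combination never occurs in the data.
empty <- na_terms[colSums(X[, na_terms, drop = FALSE] != 0) == 0]

# Remaining NA terms: non-zero columns that are linear
# combinations of earlier columns (genuine aliases).
aliased <- setdiff(na_terms, empty)

empty
aliased
```

`alias(fit)` also reports the linear dependencies among the terms, which can help when inspecting them by hand.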

How can I fit a linear regression in which the store * tv_brand interaction terms leave out the combinations that do not exist in the data?
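One direction I could imagine, sketched on hypothetical data and not verified as the recommended way: build the design matrix with `model.matrix()`, drop the all-zero columns by hand, and fit on what is left. The `toy` data frame and the names `X_obs` and `fit_obs` are invented for this example.

```r
# Hypothetical toy data: store "C" sells only brand "Y";
# stores "B" and "C" never sell brand "Z".
toy <- data.frame(
  store    = rep(c("A", "A", "A", "B", "B", "C"), each = 2),
  tv_brand = rep(c("X", "Y", "Z", "X", "Y", "Y"), each = 2)
)
set.seed(1)
toy$tv_price <- round(runif(nrow(toy), 300, 1500))  # made-up prices

# Full design matrix, including columns for unobserved combinations.
X <- model.matrix(~ store * tv_brand, data = toy)

# Keep only columns with at least one non-zero entry.
X_obs <- X[, colSums(X != 0) > 0, drop = FALSE]

# X_obs already contains the intercept column, so suppress lm()'s own.
fit_obs <- lm(log1p(toy$tv_price) ~ 0 + X_obs)

coef(fit_obs)
remaining_na <- names(coef(fit_obs))[is.na(coef(fit_obs))]
remaining_na
```

With this sketch the terms for non-existing combinations disappear, and any NA that remains should correspond to a genuine alias. The drawbacks are that the coefficient names gain an `X_obs` prefix and that the model is no longer described by a plain formula.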

neilfws
  • Hi, Welcome to SO! You will probably get a better response to your question if it were possible to include your data, or a subset of the data that reproduces the data structure adequately to address your problem. Here are some tips on how to [include data](https://stackoverflow.com/a/5963610/5456906) in your post. – xilliam Dec 15 '21 at 11:23
