I have a question about using dummy variables. From the textbook, we know that a dummy variable acts as a "switch" that lets a single regression model represent different groups. However, in data analysis, if I have a two-level factor as an explanatory variable and I am only interested in one level, can I just subset the data to that level?
I've played around with the data and found that I get the same parameter estimates with or without the dummy variable, but without the dummy (fitting only the subset) the estimated parameters for that level have smaller standard errors.
Is there any drawback to this procedure in data analysis? Is it OK to shrink the dataset this way?
I was not able to post the data until I had submitted the assignment; now that I have, we can discuss this a bit further.
Here is the regression of $sqft$ on $price$. The upper-left model is fitted to the traditional-style data only; the bottom-left model is fitted to the whole data set (traditional + nontraditional) with an indicator variable; and the right one is the model where traditional takes the value $1$. Comparing just the upper-left and right models, the estimates are the same, but the upper-left model has a lower $SE$, so I prefer to use it when all the questions are about the traditional style. Is it appropriate to shrink the data and use the upper-left model as I did? If not, why not?
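To make the comparison reproducible, here is a minimal sketch on simulated data (not the assignment data). It assumes that price is the response and sqft the predictor, and that the indicator model contains both the indicator and its interaction with sqft, since that is the case in which the two fits give identical estimates for the traditional group. The variable names, sample size, and noise levels are all made up for illustration.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, their standard errors, and the residual variance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, sigma2

# Simulated data (hypothetical): the nontraditional group is deliberately noisier.
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(800, 3500, size=n)
trad = rng.integers(0, 2, size=n)            # 1 = traditional, 0 = nontraditional
noise_sd = np.where(trad == 1, 20.0, 60.0)
price = 50 + 0.10 * sqft + trad * (10 + 0.02 * sqft) + rng.normal(0, noise_sd)

# Model A: traditional-style rows only.
mask = trad == 1
X_sub = np.column_stack([np.ones(mask.sum()), sqft[mask]])
beta_sub, se_sub, s2_sub = ols(X_sub, price[mask])

# Model B: all rows, with an indicator for nontraditional and its interaction
# with sqft, so the first two coefficients describe the traditional group.
nontrad = 1 - trad
X_full = np.column_stack([np.ones(n), sqft, nontrad, nontrad * sqft])
beta_full, se_full, s2_full = ols(X_full, price)

print("traditional intercept/slope, subset fit:", beta_sub)
print("traditional intercept/slope, full fit:  ", beta_full[:2])
print("standard errors, subset fit:", se_sub)
print("standard errors, full fit:  ", se_full[:2])
print("residual variance, subset vs pooled:", s2_sub, s2_full)
```

In this sketch the point estimates agree, and the standard errors differ only through the residual variance: the subset fit uses the variance of the traditional rows alone, while the full fit pools the residuals of both groups. Which one is smaller depends on which group is noisier, so a smaller $SE$ from the subset is not guaranteed in general.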
