I am trying to run a simple regression using a categorical predictor for the first time. I have the following model: lm(income ~ gender, data = df)
I have two questions:
1. What are the assumptions that need to be met? I know when the predictor is continuous there are assumptions of linearity, normality of residuals, homoscedasticity and independence of residual error terms. Some of these don't apply to categorical data, so what test should I be running? (I am using R)
2. What is an alternative in the event that these assumptions don't hold?
If gender has two levels, this model is equivalent to a t-test (Student's pooled-variance test, not Welch's). If the categorical variable has more than 2 levels, it's analogous to an ANOVA. The assumptions you listed all apply (except linearity). There are multiple alternatives; one example would be a permutation test. – COOLSerdash Aug 25 '22 at 07:39
… lm(income ~ gender). I never implied or said that there are no other options!! – dipetkov Aug 25 '22 at 08:10
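To make the points in the first comment concrete, here is a minimal R sketch. It assumes simulated data: the names `df`, `income`, and `gender` mirror the question, but the values, group sizes, and level names ("female", "male") are invented for illustration. It shows that the `lm()` fit reproduces the equal-variance t-test, runs two common residual checks, and then tries two possible alternatives (Welch's test and a simple permutation test). Treat it as one way to approach this, not the definitive procedure; formal tests such as Shapiro-Wilk are often less informative than residual plots.

```r
# Simulated data for illustration only; values and level names are assumptions.
set.seed(1)
df <- data.frame(
  gender = factor(rep(c("female", "male"), each = 50)),
  income = c(rnorm(50, mean = 50, sd = 10), rnorm(50, mean = 55, sd = 10))
)

fit <- lm(income ~ gender, data = df)

# 1. Equivalence with the pooled-variance (Student) t-test.
tt   <- t.test(income ~ gender, data = df, var.equal = TRUE)
lm_t <- coef(summary(fit))["gendermale", "t value"]
lm_p <- coef(summary(fit))["gendermale", "Pr(>|t|)"]
# Signs differ: t.test() uses female - male, the coefficient is male - female.
same_t <- isTRUE(all.equal(abs(lm_t), abs(unname(tt$statistic))))
same_p <- isTRUE(all.equal(lm_p, tt$p.value))

# 2. Assumption checks on the fitted model.
normality <- shapiro.test(residuals(fit))          # normality of residuals
equal_var <- var.test(income ~ gender, data = df)  # equal variances, two groups
# plot(fit, which = 1:2) gives the usual graphical diagnostics.

# 3a. Welch's t-test drops the equal-variance assumption.
welch <- t.test(income ~ gender, data = df)

# 3b. Permutation test for the difference in group means.
obs_diff <- diff(tapply(df$income, df$gender, mean))  # male - female
n_perm <- 5000
perm_diff <- replicate(n_perm, {
  shuffled <- sample(df$gender)                # break any income/gender link
  diff(tapply(df$income, shuffled, mean))
})
# Two-sided p-value; the +1 counts the observed labelling as one permutation.
perm_p <- (sum(abs(perm_diff) >= abs(obs_diff)) + 1) / (n_perm + 1)
```

With a predictor of more than two levels, `anova(fit)` would give the one-way ANOVA F-test, and the same permutation idea could be applied to the F statistic instead of the difference in means.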