I fit simple linear regression models, but they did not meet all the assumptions (e.g., normality of the residuals, homogeneity of variance). I know that both are important to meet if I want to use simple linear regression. I tried several transformations (e.g., square, log10, Box-Cox, and so on), but none of them worked. After searching and reading some of the literature, I decided to focus on addressing the heteroscedasticity in the linear model with robust standard errors.

My question is: after addressing the heteroscedasticity, is it OK to use the estimates, standard errors, and CIs from my model without paying any attention to normality? Thank you in advance!
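
For concreteness, here is a minimal sketch of the kind of robust standard errors mentioned above. It is written in Python with NumPy rather than R, and everything in it is an assumption made for illustration: the helper name `ols_robust`, the simulated data, the choice of the HC3 variant, and the 1.96 multiplier, which is a large-sample normal approximation. It is not the output of any particular package.

```python
import numpy as np

def ols_robust(x, y):
    """Fit y = b0 + b1*x by OLS; return (coefficients, HC3 robust SEs).

    Hypothetical helper for illustration only.
    """
    # Design matrix with an intercept column
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverages: diagonal of the hat matrix X (X'X)^-1 X'
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # HC3 weights: squared residuals inflated by (1 - h_i)^2
    w = (resid / (1.0 - h)) ** 2
    # Sandwich estimator: bread @ meat @ bread
    meat = X.T @ (X * w[:, None])
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated heteroscedastic data (assumed, not from the question):
# the error standard deviation grows with x.
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=500)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3 * x)

beta, se = ols_robust(x, y)
# Approximate 95% intervals using the normal quantile 1.96
ci_low, ci_high = beta - 1.96 * se, beta + 1.96 * se
print("estimates:", beta)
print("robust SEs:", se)
print("95% CI for slope:", (ci_low[1], ci_high[1]))
```

The point estimates are the ordinary least-squares ones; only the standard errors, and therefore the intervals, change.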

Anh
  • Welcome to Cross Validated! How are you determining that you fail to meet these assumptions? Plots? Formal testing? – Dave Aug 10 '22 at 19:42
  • Hi @Dave. Yes, so far I am using the functions check_normality and check_heteroscedasticity from the performance package instead of Q-Q plots and so on. – Anh Aug 10 '22 at 19:47
  • I don’t know that package. Is it doing hypothesis testing? – Dave Aug 10 '22 at 19:52
  • Yes, as far as I checked, check_normality is the same as shapiro_test... – Anh Aug 10 '22 at 19:54
  • According to the ref: check_normality() calls stats::shapiro.test and checks the standardized residuals (or Studentized residuals for mixed models) for normal distribution. Note that this formal test almost always yields significant results for the distribution of residuals and visual inspection (e.g. Q-Q plots) are preferable. – Anh Aug 10 '22 at 19:55
  • Normality testing is essentially useless, and ditto for formal testing of variance differences for about the same reason: you all-but-know the variances are at least a little bit different. – Dave Aug 10 '22 at 19:59
  • @Dave thanks Dave. – Anh Aug 10 '22 at 21:12
  • If your residuals are not normally distributed, it simply means that the linear model is too simplistic, or that other variables are influencing your data. What are your data and what kind of relationship do you expect between them? – Koen Van de Moortel Aug 12 '22 at 14:19
  • Welcome to Cross Validated! This claim is false. A relationship can be totally linear without having a normal error term. For instance, use a Laplace error term. – Dave Aug 12 '22 at 14:33
  • Well, maybe I ought to say: "it's an indication that the linear model might be inappropriate". Any pattern that can be seen in the residuals and that doesn't look random is an indication that something is wrong with the model. – Koen Van de Moortel Aug 12 '22 at 14:55
  • @Koen You seem to equate "non-Normality" with "doesn't look random," but that's incorrect. – whuber Aug 12 '22 at 15:24
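
The point raised in the comments, that a relationship can be exactly linear while the error term is not normal, can be illustrated with a small simulation. This is only a sketch under assumed values: the intercept, slope, sample size, and seed are made up, and the Laplace error term follows the example suggested in the comments.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.uniform(0, 10, size=n)

# Exactly linear mean function with Laplace (non-normal) errors
errors = rng.laplace(loc=0.0, scale=1.0, size=n)
y = 1.0 + 2.0 * x + errors

# np.polyfit returns coefficients from highest degree down: [slope, intercept]
slope, intercept = np.polyfit(x, y, deg=1)
resid = y - (intercept + slope * x)

# Sample excess kurtosis of the residuals (near 0 for normal data,
# clearly positive for heavy-tailed Laplace errors)
z = (resid - resid.mean()) / resid.std()
excess_kurtosis = np.mean(z ** 4) - 3.0

print("estimated slope:", slope)
print("residual excess kurtosis:", excess_kurtosis)
```

In this setup the fitted slope stays close to the true value of 2 even though the residuals are heavy-tailed, so non-normal residuals alone do not show that the linear form is wrong.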

0 Answers