I know that MICE can impute multiple variables simultaneously, and that the expectation-maximization (EM) approach can also be used to impute missing data. A rule of thumb I have seen is that imputation should only be used on variables with missing rates below 10%. Is there an imputation approach that remains robust when missing rates are much higher? Is EM suitable in that case? My scenario involves ~50 variables with potential collinearity.
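
To make concrete what is meant by EM imputation here, the following is a minimal sketch for the simplest case: two variables, `x` fully observed and `y` partly missing, assuming the pair is bivariate normal and the data are missing at random. The function name `em_impute_bivariate` and the toy data are hypothetical and purely illustrative; the ~50-variable case would need the full multivariate version of the same idea.

```python
def em_impute_bivariate(x, y, n_iter=200):
    """Fill None entries of y given fully observed x.

    Assumes (x, y) is bivariate normal, values are missing at random,
    and x is not constant (otherwise s_xx is zero).
    """
    n = len(x)
    observed = [float(v) for v in y if v is not None]

    # x is fully observed, so its mean and variance stay at their ML estimates.
    mu_x = sum(x) / n
    s_xx = sum((v - mu_x) ** 2 for v in x) / n

    # Crude starting values: observed-data moments of y and zero covariance.
    mu_y = sum(observed) / len(observed)
    s_yy = sum((v - mu_y) ** 2 for v in observed) / len(observed)
    s_xy = 0.0

    for _ in range(n_iter):
        beta = s_xy / s_xx                       # slope of E[y | x]
        cond_var = max(s_yy - beta * s_xy, 0.0)  # Var(y | x)

        # E-step: expected y and y^2 for each row under current parameters.
        e_y, e_y2 = [], []
        for xi, yi in zip(x, y):
            if yi is None:
                m = mu_y + beta * (xi - mu_x)
                e_y.append(m)
                e_y2.append(m * m + cond_var)
            else:
                e_y.append(float(yi))
                e_y2.append(float(yi) ** 2)

        # M-step: update the parameters from the expected sufficient statistics.
        mu_y = sum(e_y) / n
        s_xy = sum(xi * ey for xi, ey in zip(x, e_y)) / n - mu_x * mu_y
        s_yy = max(sum(e_y2) / n - mu_y ** 2, 0.0)

    # Impute each missing y with its conditional mean under the final estimates.
    beta = s_xy / s_xx
    return [
        float(yi) if yi is not None else mu_y + beta * (xi - mu_x)
        for xi, yi in zip(x, y)
    ]


# Toy data: the observed pairs lie exactly on y = 2x.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, None, None, 12]
filled = em_impute_bivariate(x, y)
```

Note that filling in conditional means like this produces a single completed data set, so it understates the uncertainty in the imputed values; that is part of why I am unsure how it behaves at high missing rates.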

StatsBio
  • 103
  • https://stats.stackexchange.com/questions/208845/problems-with-missing-values gives a good explanation, but not from the perspective of EM. – StatsBio Mar 16 '22 at 14:33
  • However, https://stats.stackexchange.com/questions/122015/what-are-the-pros-and-cons-of-using-median-imputation-to-handle-missing-value/136172#136172 implies that EM is prone to overfitting. – StatsBio Mar 16 '22 at 14:39

0 Answers