Doing multi-value imputation to maintain the same distribution

Asked Nov 18 '19 at 16:03

I am currently learning about missing-value imputation. The popular methods I've seen, such as mean, median, and arbitrary-value imputation, replace all missing values with a single calculated value. Each of these methods can potentially alter the distribution of the variable.

This has given me an idea: why not impute the missing values with several different values, chosen so that the variable retains its original distribution? For example, with mode imputation, suppose the two most frequent values in a categorical variable are E and F, occurring 10 and 5 times respectively. Then three missing values could be replaced by E, E, F, which keeps the 2:1 ratio. Obviously this would be scripted as a more generic solution involving the top n values.
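To make the idea concrete, here is a minimal sketch of how I imagine the generic version. Everything in it is my own assumption rather than a known method: the function name `proportional_fill`, the `top_n` parameter, the use of `None` as the missing marker, and the largest-remainder rounding used to split the missing slots among the top categories.

```python
from collections import Counter


def proportional_fill(values, top_n=2, missing=None):
    """Fill missing entries so the top_n categories keep their observed ratio.

    Hypothetical helper: `missing` is compared by identity, so it works for
    None but not for float NaN.
    """
    observed = [v for v in values if v is not missing]
    n_missing = len(values) - len(observed)
    top = Counter(observed).most_common(top_n)  # [(category, count), ...]
    if n_missing == 0 or not top:
        return list(values)

    total = sum(count for _, count in top)
    # Largest-remainder allocation with integer arithmetic:
    # each category first gets floor(n_missing * count / total) slots.
    shares = [(n_missing * count) // total for _, count in top]
    remainders = [(n_missing * count) % total for _, count in top]
    leftover = n_missing - sum(shares)
    # Hand the remaining slots to the categories with the largest remainders;
    # sorted() is stable, so ties go to the more frequent category.
    by_remainder = sorted(range(len(top)), key=lambda i: -remainders[i])
    for i in by_remainder[:leftover]:
        shares[i] += 1

    # Expand the shares into a flat list of fill values, e.g. ["E", "E", "F"].
    fills = [cat for (cat, _), k in zip(top, shares) for _ in range(k)]
    fill_iter = iter(fills)
    return [next(fill_iter) if v is missing else v for v in values]


data = ["E"] * 10 + ["F"] * 5 + [None] * 3
filled = proportional_fill(data, top_n=2)
print(filled[-3:])  # ['E', 'E', 'F']
```

Note that this sketch assigns the fill values in positional order, so the first missing rows always receive the most frequent category; shuffling `fills` before assigning would be one way to avoid that pattern.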

Would there be any disadvantages to this method? Is this a known standard method (I couldn't find one)?

Kshitiz Sharma
