I am currently working on a project involving banking stock price data. I have around 3,000 observations, and some columns have a lot of missing (null) values, ranging from 5% to 50% of the total observations. I am unsure of the proper order for handling missing values, handling outliers, and taking a log transformation of the data. Should we deal with outliers first, then impute the missing values, and then take the log transformation? Or should it be imputation -> outliers -> transformation, or transformation -> outliers -> imputation, etc.? Please help me with this problem. Thanks a lot. Steve Nguyen
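To make the alternatives concrete, here is a minimal sketch of just one of the orderings listed above (transformation -> outliers -> imputation). It is an illustration only, not a recommendation of that order. The helper names (`log_transform`, `flag_outliers`, `impute_median`), the Tukey-fence rule with `k=1.5`, the use of median imputation, and the toy price series are all assumptions made for the example and are not taken from the project data.

```python
import math
import statistics


def log_transform(values):
    """Natural log of each observed value; missing values (None) stay None."""
    return [math.log(v) if v is not None else None for v in values]


def flag_outliers(values, k=1.5):
    """Mark points outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR].

    Quartiles are computed from observed values only; missing values
    are never flagged.
    """
    observed = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v is not None and (v < low or v > high) for v in values]


def impute_median(values, exclude=None):
    """Fill None with the median of observed values.

    Points flagged in `exclude` are left out when computing the median,
    but they are kept in the returned series.
    """
    if exclude is None:
        exclude = [False] * len(values)
    observed = [v for v, ex in zip(values, exclude) if v is not None and not ex]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical toy series: two missing values and one extreme price.
prices = [10.0, 11.0, None, 10.5, 250.0, None, 9.8, 10.2, 11.3]

logged = log_transform(prices)                   # step 1: transformation
flags = flag_outliers(logged)                    # step 2: outlier detection
imputed = impute_median(logged, exclude=flags)   # step 3: imputation
```

Calling the three functions in a different order gives the other orderings in the question. Note that the log step requires strictly positive inputs, which holds for prices but not necessarily for returns.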
About outliers: https://stats.stackexchange.com/questions/200534/is-it-ok-to-remove-outliers-from-data – mkt Aug 29 '22 at 05:04
Thanks for the post you shared; it is really helpful. I think I will not remove any outliers for now; my question is about the broader context, in which I may encounter a situation where removing outliers is justified. – MINH NHỰT NGUYỄN TRẦN Aug 29 '22 at 05:14