I am currently working on a big dataset. A few of its columns hold ordinal categorical data, and to simplify the dataset I decided to convert them to numeric values. However, these columns contain missing values. I would like to drop the rows that have missing values in these columns, because they make up only 0.9% of the total number of rows. I have also checked the target variable that I want to predict: these 0.9% of rows do not contain any of its boundary values. However, I can't find any reference to support my approach.
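To make the approach concrete, here is a minimal sketch of the steps described above in plain Python. The column names, the level ordering in `LEVELS`, and the toy rows are all hypothetical placeholders, not the real data; the sketch only shows the encode, drop, and check logic.

```python
# Hypothetical ordinal levels and toy rows; placeholders, not the real dataset.
LEVELS = {"low": 1, "medium": 2, "high": 3}
ORDINAL_COLS = ["quality", "condition"]

rows = [
    {"quality": "low", "condition": "high", "target": 10.0},
    {"quality": "medium", "condition": None, "target": 12.0},
    {"quality": "high", "condition": "medium", "target": 15.0},
    {"quality": "medium", "condition": "low", "target": 11.0},
]


def is_complete(row):
    """True if no ordinal column is missing in this row."""
    return all(row[col] is not None for col in ORDINAL_COLS)


def encode(row):
    """Return a copy of the row with ordinal labels mapped to integers."""
    encoded = dict(row)
    for col in ORDINAL_COLS:
        encoded[col] = LEVELS[row[col]]
    return encoded


# Listwise deletion: keep only rows that are complete in the ordinal columns.
kept = [encode(r) for r in rows if is_complete(r)]
dropped = [r for r in rows if not is_complete(r)]

# Share of rows lost by dropping.
dropped_fraction = len(dropped) / len(rows)

# Check whether any dropped row carries the minimum or maximum target value.
target_min = min(r["target"] for r in rows)
target_max = max(r["target"] for r in rows)
dropped_has_extreme = any(
    r["target"] in (target_min, target_max) for r in dropped
)

print(f"dropped {dropped_fraction:.1%} of rows")
print("dropped rows contain a target extreme:", dropped_has_extreme)
```

On the real data the same check would be run over the full set of rows, with the actual level ordering for each ordinal column.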
Is it safe to drop these rows? Imputation could be quite complex, since many columns contain missing values. The dataset is also quite large for automated imputation, such as with the mice package.
Could you please suggest some references that I could use to support my approach?
Many thanks!