I'm currently working on a machine learning model for a classification task in an engineering application. While working on this project, I realized that the provided data is insufficient to train a robust classifier.
Now I'm planning to collect more data using design of experiments (DoE) methods, such as a fractional factorial design, to cover the whole plausible range of levels for the factors while keeping the number of experimental runs at a reasonable level.
While researching this, I found no evidence confirming that these methods are suitable for gathering training data for ML models. So I'm worried that I'm missing something and will end up with just another batch of insufficient or biased data.
Some figures: the DoE I'm thinking of consists of three continuous factors and five discrete factors with two or three levels each.
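
To make the setup concrete, here is a minimal sketch of one such design. It rests on a simplifying assumption that is not part of the actual experiment: all eight factors are reduced to two levels (coded -1/+1), with the continuous factors set to the low and high ends of their plausible range. The factor names A to H are purely illustrative, and the generators are a commonly tabulated choice for a 16-run two-level design with eight factors. Any three-level factor would need a mixed-level design instead.

```python
from itertools import combinations, product

# Hypothetical factor labels: A-C stand in for the three continuous factors,
# D-H for the five discrete factors. All are coded to two levels (-1, +1),
# which is an assumption made only for this sketch.
BASE = ["A", "B", "C", "D"]
# Each generated factor is the product of the listed base factors.
GENERATORS = {"E": "BCD", "F": "ACD", "G": "ABC", "H": "ABD"}


def fractional_factorial():
    """Build the runs: full factorial in BASE, remaining columns from GENERATORS."""
    runs = []
    for levels in product((-1, 1), repeat=len(BASE)):
        run = dict(zip(BASE, levels))
        for name, word in GENERATORS.items():
            value = 1
            for base_factor in word:
                value *= run[base_factor]
            run[name] = value
        runs.append(run)
    return runs


def is_orthogonal(design, factors):
    """True if every pair of factor columns has a zero dot product."""
    return all(
        sum(run[a] * run[b] for run in design) == 0
        for a, b in combinations(factors, 2)
    )


factors = BASE + list(GENERATORS)
design = fractional_factorial()

print(len(design), "runs instead of", 2 ** len(factors))
print("main-effect columns orthogonal:", is_orthogonal(design, factors))
```

With these assumptions, the design needs 16 runs where the full two-level factorial would need 256. The price is that some interactions are aliased with each other, which is exactly the kind of gap in coverage I'm unsure about when the data is meant for training a classifier.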