My classification model (a 10-layer MLP) has to overfit the training data to perform well on it (measured by loss and other metrics), but it then performs badly on the validation data. Without overfitting (a smaller model, regularization, etc.), it fits both the training and the validation data badly.
The data were collected using the same process and randomly split into training and validation sets at a 5:1 ratio. I have verified that their distributions are the same.
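
For concreteness, a 5:1 random split and a coarse distribution check could be sketched as below. This is only an illustration under assumptions, not the actual pipeline: the data is synthetic, the function names `split_5_to_1` and `class_proportions` are hypothetical, and comparing class proportions is just one simple way to compare the two sets.

```python
import random
from collections import Counter


def split_5_to_1(samples, seed=0):
    """Shuffle a copy of the samples and split it 5:1 into (train, val)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_val = len(shuffled) // 6  # one part in six goes to validation
    return shuffled[n_val:], shuffled[:n_val]


def class_proportions(samples):
    """Return the fraction of samples per label; samples are (features, label) pairs."""
    counts = Counter(label for _, label in samples)
    total = len(samples)
    return {label: count / total for label, count in counts.items()}


# Synthetic stand-in data (assumption): 600 samples, 4 features, 3 classes.
data_rng = random.Random(42)
data = [
    ([data_rng.gauss(0.0, 1.0) for _ in range(4)], data_rng.randrange(3))
    for _ in range(600)
]

train, val = split_5_to_1(data)
train_props = class_proportions(train)
val_props = class_proportions(val)

print(len(train), len(val))
print("train:", train_props)
print("val:  ", val_props)
```

Matching class proportions is a coarse check; comparing each feature's distribution between the two sets would be a stricter one.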
Why does this happen, and what should I do in this situation? Thank you.