
While fine-tuning a deep neural network I ran into the following situation:

  1. My training and validation loss are both decreasing and have very similar values throughout training. In particular, the training loss is not significantly lower than the validation loss. Still, both loss values are rather high. Thus, I would argue the model is underfitting.
  2. After training is complete, I calculate various metrics such as precision and recall on the training set and the validation set. Here, the metrics on the training set look sound, but the metrics on the validation set show very poor performance. Thus, I would argue the model is overfitting.

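To see the contradiction in one place, it helps to compute the loss and the threshold-based metrics side by side on both splits. Below is a minimal sketch for a binary classifier, assuming a scikit-learn-style `model` with `predict_proba`; `model`, `X_train`, `y_train`, `X_val`, and `y_val` are hypothetical placeholder names, not my actual code:

```python
from sklearn.metrics import log_loss, precision_score, recall_score

def evaluate(model, X, y, threshold=0.5):
    """Loss and threshold-based metrics for one data split."""
    p = model.predict_proba(X)[:, 1]        # predicted P(class 1) per sample
    y_hat = (p >= threshold).astype(int)    # hard labels for precision/recall
    return {
        "log_loss": log_loss(y, p),              # compares probabilities
        "precision": precision_score(y, y_hat),  # compares hard 0/1 labels
        "recall": recall_score(y, y_hat),
    }

train_report = evaluate(model, X_train, y_train)
val_report = evaluate(model, X_val, y_val)

# The situation described above: the log_loss row agrees between the two
# splits, while the precision and recall rows diverge sharply.
for metric in ("log_loss", "precision", "recall"):
    print(f"{metric:9s} train={train_report[metric]:.4f} val={val_report[metric]:.4f}")
```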
This seems very contradictory to me. Usually, I would argue that the loss function I am using is not suited to the task I want the model to learn; however, I am not able to change it.

What can I do now? And: is my model underfitting or overfitting?

asked by Lipa; edited by User1865345
    Precision and recall are highly problematic metrics and suffer from the exact same weaknesses as accuracy. Better not to trust them. Also, how do you figure out that your loss values (which loss are you using?) are "rather high"? "High" compared to what? There is no absolute benchmark for losses, and the only comparison that makes sense is to a benchmark model, e.g., a naive or climatological one. Some kinds of data are just harder to predict or classify than others. Just because your loss has a certain value does not indicate underfitting. – Stephan Kolassa Mar 14 '23 at 07:43
  • Thanks for the link - that is indeed a very good and interesting read! However, even though precision and recall may be improper scoring rules, I would still argue that the effect should be the same on the training and validation sets, hence there shouldn't be a huge difference between them. – Lipa Mar 14 '23 at 08:09
  • You are right; saying that the loss is "rather high" without having a benchmark is questionable. Still, the training and validation loss have very similar values (up to the fourth decimal place). For a model that is overfitting, I would usually expect the validation loss to be higher than the training loss, and maybe even to start increasing. However, this is not the case for me, as both loss curves are decreasing. – Lipa Mar 14 '23 at 08:09
  • This may be helpful: https://stats.stackexchange.com/q/352036/1352 – Stephan Kolassa Mar 14 '23 at 08:19
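Following up on the benchmark suggestion in the comments, here is a minimal sketch of such a comparison for binary cross-entropy, assuming a naive model that always predicts the training base rate; `y_train`, `y_val`, and `val_loss` are hypothetical placeholders:

```python
import numpy as np
from sklearn.metrics import log_loss

base_rate = y_train.mean()                    # class-1 frequency in the training data
naive_probs = np.full(len(y_val), base_rate)  # naive model: constant prediction
naive_loss = log_loss(y_val, naive_probs)

# The model's loss is only "high" or "low" relative to this data-free baseline.
print(f"naive benchmark loss:  {naive_loss:.4f}")
print(f"model validation loss: {val_loss:.4f}")
```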

0 Answers