I'm training an LSTM model with the following hyperparameters: hidden_size = 128, num_layers = 1, batch_size = 64, learning_rate = 0.001.

The learning curve is shown below:

[image: learning curve]

I'm wondering why the training loss is decreasing while the validation loss oscillates around some mean value. Does this mean that the model is overfitting, since it can't perform well on the validation set?
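
To make the pattern I'm describing concrete, here is a minimal sketch of how one might summarize two loss histories numerically. The function name `summarize_curves`, the `window` parameter, and the loss values are all hypothetical; the numbers are synthetic placeholders chosen to mimic the shape of the curves, not my actual training results.

```python
from statistics import mean, pstdev


def summarize_curves(train_loss, val_loss, window=5):
    """Compare the first and last `window` epochs of two loss histories.

    Hypothetical helper for illustration only.
    """
    if len(train_loss) != len(val_loss):
        raise ValueError("loss histories must have the same length")
    if len(train_loss) < 2 * window:
        raise ValueError("need at least 2 * window epochs")

    train_early = mean(train_loss[:window])
    train_late = mean(train_loss[-window:])
    val_early = mean(val_loss[:window])
    val_late = mean(val_loss[-window:])

    return {
        # Negative means the loss went down over training.
        "train_change": train_late - train_early,
        "val_change": val_late - val_early,
        # Spread of the validation loss over the last epochs (oscillation size).
        "val_spread": pstdev(val_loss[-window:]),
        # Gap between validation and training loss at the end of training.
        "gap": val_late - train_late,
    }


# Synthetic placeholder data (NOT real results): training loss falls steadily,
# validation loss alternates around 0.8.
train = [1.0 - 0.04 * i for i in range(20)]
val = [0.8 + (0.05 if i % 2 else -0.05) for i in range(20)]

stats = summarize_curves(train, val)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these placeholder values, `train_change` is clearly negative while `val_change` stays close to zero and `gap` is positive, which is the situation I'm asking about. As far as I understand, a validation loss that clearly rises while the training loss keeps falling would be the more typical sign of overfitting, so I'm unsure how to interpret a flat, oscillating one.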
