
As a project, I've built a graphical neural network sandbox that can plot a loss graph to show how well the network is learning during training. I'm getting the really odd results shown below, though, and I can't figure them out.

[Loss function graph]

The loss decreases over time, telling me the network is learning, but eventually the trend reverses and the loss increases as I feed in more samples. These are all samples used in training, by the way, so it's not just overfitting. The samples are also well shuffled.

Is there a reason a pattern like this can occur? Or could it be caused by something wrong with my software, such as my implementation of backpropagation?
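
If it is my backpropagation, one standard way to catch it is a numerical gradient check: perturb each parameter, recompute the loss, and compare the finite-difference slope to the analytical gradient. Below is a minimal sketch (in Python/NumPy purely for illustration; my sandbox isn't written against this API, and `your_backprop` is a hypothetical stand-in for the gradient routine being tested):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mse_loss(params, x, y):
    """Forward pass of a tiny 2-3-1 all-sigmoid network; params is a flat vector of 13 values."""
    W1 = params[:6].reshape(3, 2)    # hidden-layer weights
    b1 = params[6:9]                 # hidden-layer biases
    W2 = params[9:12].reshape(1, 3)  # output weights
    b2 = params[12:]                 # output bias
    h = sigmoid(W1 @ x + b1)
    z = sigmoid(W2 @ h + b2)
    return 0.5 * np.sum((z - y) ** 2)

def numerical_gradient(f, params, eps=1e-5):
    """Central finite differences: (f(p + eps) - f(p - eps)) / (2 * eps), one parameter at a time."""
    grad = np.zeros_like(params)
    for i in range(len(params)):
        p_plus, p_minus = params.copy(), params.copy()
        p_plus[i] += eps
        p_minus[i] -= eps
        grad[i] = (f(p_plus) - f(p_minus)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
params = rng.normal(scale=0.5, size=13)
x, y = rng.normal(size=2), rng.normal(size=1)
num_grad = numerical_gradient(lambda p: mse_loss(p, x, y), params)
# analytic_grad = your_backprop(params, x, y)  # hypothetical: the implementation under test
# A correct backprop should give a relative error around 1e-7 or smaller:
# print(np.linalg.norm(analytic_grad - num_grad) / np.linalg.norm(analytic_grad + num_grad))
```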

  • What statistic is being measured by "Error"? Is this the proportion of samples with incorrect predictions, or something else (e.g. binary cross-entropy loss, or squared error)? – Sycorax Jun 06 '22 at 15:30
  • @Sycorax In this case I'm using MSE, however a similar pattern occurs with MAE as well. – Lemniscate Jun 06 '22 at 15:36
  • Does the pattern persist if you reduce the learning rate? I recommend trying a variety of learning rates on a logarithmic scale (e.g. 0.1, 0.01, 0.001, etc.) – Sycorax Jun 06 '22 at 15:44
  • @Sycorax Well, this is interesting: it does persist, but the learning rate does seem to affect when the reversal happens.

    A learning rate of 0.1 causes it to happen later (around the 80-sample mark), and oddly a learning rate of 0.01 causes it to happen sooner (around the 50-sample mark)

    – Lemniscate Jun 06 '22 at 16:01
  • Diagnosing exactly what's happening and why will require more detail about your data and your model. – Sycorax Jun 06 '22 at 16:40
  • My model is fully connected and has 2 inputs, a single hidden layer with 3 neurons, and a final output neuron. The activation of all neurons is sigmoid.

    I'm feeding it a 3-dimensional regression problem: I give it X and Y as inputs for it to learn to predict Z. (A runnable sketch of this setup appears after these comments.)

    – Lemniscate Jun 06 '22 at 16:56
  • I believe the duplicate answers your question. If it does not, please [edit] to clarify what you still want to know. Also, please include the model details in the question, along with: whether and how the data are scaled, how the network is initialized, how the model is trained (the optimizer, the learning rate, whether you're using mini-batches and, if so, the mini-batch size), the number of observations you have, and any regularization you're applying (what it is and its configuration). – Sycorax Jun 06 '22 at 16:58
  • It certainly sounds more like a coding issue; I'll have a look through my program to see what might be going wrong. – Lemniscate Jun 06 '22 at 18:08
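
For reference, here is a minimal runnable sketch of the setup described in the comments: a fully connected 2-3-1 network, sigmoid activations throughout, trained one sample at a time with MSE and plain gradient descent, swept over learning rates on a logarithmic scale as Sycorax suggested. The data-generating surface, initialization scale, and sample count are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(lr, n_samples=200, seed=0):
    """Per-sample gradient descent on a 2-3-1 all-sigmoid network with MSE loss."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(3, 2)); b1 = np.zeros(3)
    W2 = rng.normal(scale=0.5, size=(1, 3)); b2 = np.zeros(1)
    losses = []
    for _ in range(n_samples):
        x = rng.uniform(-1, 1, size=2)
        # Assumed target surface, rescaled into (0, 1) so a sigmoid output can reach it:
        z_true = np.array([(np.sin(np.pi * x[0]) * np.cos(np.pi * x[1]) + 1) / 2])
        # Forward pass
        h = sigmoid(W1 @ x + b1)
        z = sigmoid(W2 @ h + b2)
        err = z - z_true
        losses.append(float(0.5 * err @ err))
        # Backward pass through the two sigmoid layers
        dz = err * z * (1 - z)           # gradient at the output pre-activation
        dh = (W2.T @ dz) * h * (1 - h)   # gradient at the hidden pre-activation
        W2 -= lr * np.outer(dz, h); b2 -= lr * dz
        W1 -= lr * np.outer(dh, x); b1 -= lr * dh
    return losses

# Sweep learning rates on a log scale and compare early vs. late loss:
for lr in (0.1, 0.01, 0.001):
    losses = train(lr)
    print(f"lr={lr}: mean of first 10 losses {np.mean(losses[:10]):.4f}, "
          f"mean of last 10 losses {np.mean(losses[-10:]):.4f}")
```

One design note: with a sigmoid output the network can only ever predict values in (0, 1), so if the real Z targets aren't scaled into that range the loss can plateau or behave oddly regardless of the learning rate, which is why the assumed surface above is rescaled.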

0 Answers