
Here is my model:

model = Sequential()
model.add(LSTM(4, input_shape=(look_back, trainX.shape[2])))
model.add(Dense(1))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(trainX, trainY, validation_split=0.3, epochs=100, batch_size=50, verbose=1)

and here is what I get... I see that the accuracy suddenly falls for some epochs and remains low until the end, while the validation accuracy stays quite good.

What's wrong? Is it under- or overfitting (I guess not overfitting, otherwise the validation score would be low)? How can I prevent such a thing a priori (i.e. by initializing a parameter or something like that)? Would reducing the number of epochs be a solution?

What is quite strange is that the accuracy first grows and reaches a very high value, and then suddenly falls, for no apparent reason...

Thanks for the help

2 Answers


I think the reason for the instability of your network is the missing activation function on the last layer.

As you are performing a binary classification, you should add a sigmoid activation function right after the fully-connected layer. You can do it like this:

model.add(Dense(1, activation="sigmoid"))
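Putting the answer together with the question's model, the corrected version looks like the sketch below. The shapes are hypothetical stand-ins (the question's `look_back` and `trainX.shape[2]` are not given), and the example uses the `tensorflow.keras` API:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Hypothetical shapes standing in for look_back and trainX.shape[2].
look_back, n_features = 3, 3

model = Sequential()
model.add(LSTM(4, input_shape=(look_back, n_features)))
# The sigmoid goes inside the Dense layer, not as an argument to add().
model.add(Dense(1, activation="sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
```

With the sigmoid, the single output is squashed into (0, 1), so `binary_crossentropy` receives a proper probability instead of an unbounded logit.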
Tonca
  • I'll test it and give you feedback :) . Do you think I should change my number of epochs? Indeed, it's only from the 68th epoch that it falls dramatically – MysteryGuy Aug 17 '18 at 09:49
  • I put model.add(LSTM(50, return_sequences=True, input_shape=(look_back, trainX.shape[2]))) model.add(LSTM(50)) model.add(Dense(1), activation='sigmoid') and I got an "add() got an unexpected keyword argument 'activation'" error for the last layer... Why? Should I set return_sequences=True for the second LSTM layer? – MysteryGuy Aug 17 '18 at 10:06
  • 1
    I've corrected the answer, I made a typo earlier – Tonca Aug 17 '18 at 10:07
  • So return_sequences is not needed (why?)? – MysteryGuy Aug 17 '18 at 10:09
  • 1
    No, you don't need any information about the hidden states. – Tonca Aug 17 '18 at 10:14
  • But between the two LSTM layers, it seems it's necessary ? – MysteryGuy Aug 17 '18 at 10:17
  • 1
    You're right I'm sorry, it's necessary for the Dense layer as well. – Tonca Aug 17 '18 at 10:19
  • That's okay, I don't have the problem anymore BUT get another problem. Please look at https://stats.stackexchange.com/questions/362676/how-to-interpret-that-my-model-gives-no-negative-class-prediction-on-test-set – MysteryGuy Aug 17 '18 at 13:32
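To summarize the comment thread: when stacking LSTM layers, every LSTM except the last needs return_sequences=True so that the next LSTM receives the full sequence rather than only the final hidden state, and the activation belongs inside the Dense constructor. A sketch with the same hypothetical shapes as above:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

look_back, n_features = 3, 3  # hypothetical shapes

model = Sequential()
# First LSTM returns the full sequence so the second LSTM has a 3-D input.
model.add(LSTM(50, return_sequences=True, input_shape=(look_back, n_features)))
# Last LSTM returns only its final hidden state, which the Dense layer expects.
model.add(LSTM(50))
# activation is a keyword of Dense, not of model.add().
model.add(Dense(1, activation="sigmoid"))
```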

I had a similar situation.

What worked for me was to decrease the learning rate as the epochs increase.

What probably happens is that the model arrives close to some local maximum and then jumps out of it, since the learning rate is too high.

For example, you can run a few epochs with a high learning rate, optimizer = optimizers.SGD(0.02), then switch to a smaller value such as SGD(0.015), and after that to SGD(0.01).

The actual numbers might be different.
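One way to do this without restarting training by hand is a step-decay schedule. The epoch boundaries and rates below are illustrative, not the answer's exact values:

```python
# Hypothetical step decay: drop the learning rate at fixed epoch boundaries.
def step_decay(epoch):
    if epoch < 30:
        return 0.02
    elif epoch < 60:
        return 0.015
    return 0.01
```

In Keras this can be plugged into training via the LearningRateScheduler callback, e.g. `model.fit(trainX, trainY, epochs=100, callbacks=[LearningRateScheduler(step_decay)])`.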

Michael D