
I have a very difficult binary text classification problem for which I am trying to fine-tune a BERT model (bert_en_uncased_L-12).

By "difficult" I mean, the text data contains a large amount of features that are shared between classes and the differences between my classes are probably more subtle and often require real world knowledge. Since I have a very large dataset, I am trying out different ways of selecting observations for my two classes to try and increase inter-class distinctiveness.

Whenever I start training, my binary accuracy is stuck at around 0.5 with little improvement (accuracy sometimes goes up by 0.1 and then drops back down; the loss does improve continuously, but very slowly, by maybe 0.5 over the first epoch).

```
Epoch 1/4
13850/41441 [=========>....................] - ETA: 20:28 - loss: 7.7321 - binary_accuracy: 0.4984
```
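For context, a minimal sketch of a typical Keras fine-tuning setup for this model (this is not my exact code; the TF Hub handles, the classification head, and the learning rate are assumptions based on the standard bert_en_uncased_L-12 workflow):

```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers the ops the BERT preprocessor needs

# Assumed TF Hub handles; substitute whichever bert_en_uncased_L-12 variant you use.
PREPROCESS_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4"

def build_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
    encoder_inputs = hub.KerasLayer(PREPROCESS_URL)(text_input)
    outputs = hub.KerasLayer(ENCODER_URL, trainable=True)(encoder_inputs)
    pooled = outputs["pooled_output"]              # [CLS]-based sentence embedding
    pooled = tf.keras.layers.Dropout(0.1)(pooled)
    logits = tf.keras.layers.Dense(1)(pooled)      # single logit for the binary task
    model = tf.keras.Model(text_input, logits)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=[tf.keras.metrics.BinaryAccuracy()],
    )
    return model
```

One detail worth checking in a setup like this: binary cross-entropy for near-chance predictions sits around 0.69, so a loss of ~7.7 may hint at a configuration mismatch, e.g. a loss expecting logits while the head outputs probabilities (or vice versa).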

Question: How quickly can you tell that a model is not going to improve meaningfully in this kind of situation? Should I let it run for multiple epochs, or is one epoch of essentially no improvement in accuracy (and only very slow improvement in loss) enough to scrap the experiment?
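A sketch of how such a stopping rule could be encoded with a standard Keras callback (the min_delta and patience values are placeholders, not a recommendation):

```python
import tensorflow as tf

# Stop training if validation binary accuracy fails to improve by at least
# 0.01 over one full epoch, and roll back to the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_binary_accuracy",
    mode="max",
    min_delta=0.01,
    patience=1,
    restore_best_weights=True,
)
# model.fit(train_ds, validation_data=val_ds, epochs=4, callbacks=[early_stop])
```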

Any help would be much appreciated.

Sigmund
Lots of suggestions here: https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn/352037?r=SearchResults&s=1%7C0.0000#352037; the best place to start is probably adjusting the learning rate. – Sycorax Jun 02 '22 at 13:43
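To make that suggestion concrete: a minimal sketch of re-compiling with a lower learning rate (the specific value is just a common starting point for BERT fine-tuning, not anything from the comment):

```python
import tensorflow as tf

model = build_model()  # from the sketch above

# BERT fine-tuning typically uses learning rates around 1e-5 to 5e-5;
# if accuracy stays at chance, the low end (or lower) is a cheap first thing to try.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.BinaryAccuracy()],
)
```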

0 Answers