6

I have a fairly small dataset: 12 classes with 100 examples per class. Across all the CNN models I have tried, the pattern is the same: training accuracy plateaus at 97%, but validation accuracy is only 7-8%, which is about what random guessing would give with 12 classes (1/12 ≈ 8.3%).

So where is the problem?

  1. Is my dataset too small?
  2. Is my code wrong? (I am not asking for code advice; this is a conceptual question.)
  3. Something else?
Ferdi

1 Answer

13

It sounds like you are severely overfitting. You need either a simpler model than the one you are currently using or (a lot) more data. Generally, the more data you have, the more complex a model you can fit without overfitting.
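To make the symptom concrete, here is a toy sketch (not the asker's CNN) of a model that only memorizes. It assumes 12 classes with 100 training examples each, as in the question, but the features are pure noise, so there is nothing real to learn. The names `make_noise`, `predict_1nn` and `accuracy` are made up for this illustration, and the 1-nearest-neighbour rule stands in for any model flexible enough to memorize its training set.

```python
import math
import random

N_CLASSES = 12

def make_noise(n_per_class, rng, dim=4):
    """Features are pure Gaussian noise, so they carry no information about the label."""
    return [(tuple(rng.gauss(0, 1) for _ in range(dim)), label)
            for label in range(N_CLASSES) for _ in range(n_per_class)]

def predict_1nn(x, train):
    """Memorizing 'model': return the label of the closest training point."""
    _, label = min(train, key=lambda item: math.dist(item[0], x))
    return label

def accuracy(rows, train):
    """Fraction of rows whose 1-NN prediction matches the true label."""
    return sum(predict_1nn(x, train) == y for x, y in rows) / len(rows)

rng = random.Random(0)
train = make_noise(100, rng)  # 100 examples per class, as in the question
val = make_noise(25, rng)     # held-out points drawn the same way (size is an assumption)

train_acc = accuracy(train, train)  # each point is its own nearest neighbour
val_acc = accuracy(val, train)      # nothing to generalize from, so roughly chance
print(f"train={train_acc:.3f}  val={val_acc:.3f}  chance={1 / N_CLASSES:.3f}")
```

Training accuracy is perfect by construction, while validation accuracy stays near the 1/12 chance level. High training accuracy on its own therefore says little about whether a model has learned anything that transfers.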

I do not think you are going to get meaningful results by training a CNN on such a small dataset. Start with a simple decision tree with 1 to 3 levels to establish a benchmark, and maybe try linear models with strong regularization. At this stage you are looking for modest (but better than random) performance on the training set and similar performance on the validation set. Then you can move on to more complex models that fit the training set better and may generalize to the validation set a bit better, too.
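A minimal sketch of that benchmark step, using only the standard library. The data is synthetic and is an assumption of this example, not the asker's images: feature 0 carries a class signal and feature 1 is noise. The helper names (`build_tree`, `predict`, `make_data`) and the quantile-style threshold search are also choices made for this illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, max_depth, n_candidates=20):
    """Grow a tree with at most max_depth levels of splits on (features, label) rows."""
    labels = [y for _, y in rows]
    leaf = {"label": Counter(labels).most_common(1)[0][0]}
    if max_depth == 0 or len(set(labels)) == 1:
        return leaf
    best = None  # (weighted impurity, feature index, threshold)
    for f in range(len(rows[0][0])):
        values = sorted(x[f] for x, _ in rows)
        step = max(1, len(values) // n_candidates)
        for t in values[step::step]:  # a handful of quantile-like thresholds
            left = [y for x, y in rows if x[f] < t]
            right = [y for x, y in rows if x[f] >= t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return leaf
    _, f, t = best
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r in rows if r[0][f] < t], max_depth - 1, n_candidates),
        "right": build_tree([r for r in rows if r[0][f] >= t], max_depth - 1, n_candidates),
    }

def predict(tree, x):
    """Walk from the root to a leaf and return that leaf's majority label."""
    while "label" not in tree:
        tree = tree["left"] if x[tree["feature"]] < tree["threshold"] else tree["right"]
    return tree["label"]

def accuracy(tree, rows):
    return sum(predict(tree, x) == y for x, y in rows) / len(rows)

def make_data(n_per_class, n_classes, rng):
    """Synthetic stand-in data: feature 0 carries class signal, feature 1 is noise."""
    return [((label + rng.gauss(0, 0.5), rng.gauss(0, 1)), label)
            for label in range(n_classes) for _ in range(n_per_class)]

rng = random.Random(0)
train = make_data(100, 12, rng)
val = make_data(50, 12, rng)
chance = 1 / 12

results = {}
for depth in (1, 2, 3):
    tree = build_tree(train, depth)
    results[depth] = (accuracy(tree, train), accuracy(tree, val))
    print(f"depth={depth}  train={results[depth][0]:.3f}  val={results[depth][1]:.3f}")
```

The thing to look for is the pattern described above: accuracy that is far from perfect but above chance, with training and validation numbers close to each other. In practice a library implementation, for example scikit-learn's `DecisionTreeClassifier` with `max_depth` set, would replace the hand-written tree.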

rinspy
  • 5
    I disagree that a CNN could not be used in this case, though I agree that training one from scratch would overfit. This could be a good candidate for transfer learning: using a CNN pretrained on ImageNet to extract image features, then classifying those features with a linear SVM, can work well for small datasets. – J_Heads Aug 03 '18 at 17:41
  • 5
    Never underestimate the ability of a ninth order polynomial to fit your data. – Joshua Aug 03 '18 at 23:29