Cannot underfit/overfit on the IRIS dataset

Question

I am playing with the IRIS dataset and want to see underfitting and overfitting in action. I am using a multilayer perceptron (2 layers).

The problem is that I cannot underfit or overfit the data (see the plot below). I understand why I might not be able to underfit: that can happen when the data is easily separable. But why can't I overfit? The dataset capacity is 600 (number of samples (150) times number of features (4)), so I should be able to overfit with a network whose capacity is larger than that. I am using multilayer perceptrons with a total number of parameters ranging from 15 to ~32000, but neither underfitting nor overfitting happens. What is going on? Could overfitting fail to appear for the same reason, because the data is easily separable? Thank you!
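The arithmetic in the question can be sketched as follows. This is only an illustration: the exact architecture is not given, so the sketch assumes that "2 layers" means one fully connected hidden layer plus an output layer with bias terms. The helper `mlp_param_count` and the hidden sizes are hypothetical.

```python
def mlp_param_count(layer_sizes, bias=True):
    """Count the trainable parameters of a fully connected network.

    layer_sizes lists the units per layer, input first and output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # weight matrix between two layers
        if bias:
            total += n_out         # one bias per unit in the next layer
    return total


# Iris: 150 samples, 4 features, 3 classes (figures from the question).
n_samples, n_features, n_classes = 150, 4, 3
dataset_size = n_samples * n_features  # the "capacity" of 600 in the question

# Hidden sizes are arbitrary choices for illustration (assumption).
for hidden in (1, 8, 64, 512, 4096):
    n_params = mlp_param_count([n_features, hidden, n_classes])
    print(f"hidden={hidden:5d}  params={n_params:6d}  "
          f"exceeds 600: {n_params > dataset_size}")
```

Under these assumptions, the parameter count is `8 * hidden + 3`, so it passes 600 once the hidden layer has 75 or more units.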

The Iris dataset doesn't seem well suited to neural network techniques, as it has a very limited number of examples, even more so once the original dataset is divided into training/test/validation sets. – tagoma, Aug 27 '17 at 07:20

score 1 · Answer 1 · answered Aug 27 '17 at 07:14

IRIS is a very small data set without many data points. I have found that, for experimenting with a data set, MNIST does a much better job, and you can easily overfit and underfit there.

Juan Antonio Gomez Moriano

Yes, I also tried to overfit on MNIST and it worked; I was just wondering why it does not work for IRIS. Interestingly, I have the same problem with other datasets that are pretty big. Probably the data is not very representative. It could also be that the IRIS problem is too easy, so any algorithm can solve it and it is impossible to overfit. – Yuri Aug 28 '17 at 17:51
Note that in order to overfit, your model must "memorize" the problem; for that to occur you need a lot of data, which is not the case for the Iris dataset. – Juan Antonio Gomez Moriano Aug 29 '17 at 01:43
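One way to probe the hypotheses in this thread is to measure the gap between training and test accuracy while the network grows. The following is a minimal sketch using scikit-learn, not the asker's setup: the 70/30 split, the hidden sizes, `alpha=0.0` (no L2 penalty), and the helper name `train_test_gap` are all assumptions chosen for illustration.

```python
import warnings

from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def train_test_gap(hidden_units, seed=0):
    """Fit a one-hidden-layer MLP on Iris; return (train_acc, test_acc)."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    # Standardize using training statistics only, to avoid leaking test data.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    model = MLPClassifier(
        hidden_layer_sizes=(hidden_units,),
        alpha=0.0,        # disable L2 regularization so overfitting is not damped
        max_iter=500,
        random_state=seed,
    )
    with warnings.catch_warnings():
        # Very small networks may stop before converging; that is acceptable here.
        warnings.simplefilter("ignore", ConvergenceWarning)
        model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)


results = {h: train_test_gap(h) for h in (1, 8, 64, 256)}
for h, (train_acc, test_acc) in results.items():
    print(f"hidden={h:4d}  train={train_acc:.3f}  test={test_acc:.3f}  "
          f"gap={train_acc - test_acc:+.3f}")
```

A training accuracy well above the test accuracy would indicate overfitting, and low accuracy on both would indicate underfitting. The exact numbers depend on the split and the seed, so it is worth repeating the run with several seeds before drawing conclusions.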
