
I am relatively new to neural networks. To understand the basics, I converted the MNIST database into a format I liked and wrote a single-layer NN with 784 neurons from scratch, without using any NN-related library.

I trained the NN on 600 samples and tested it on 10000 samples. (I accept that the reverse would be much better.) I can see that as the NN trains on more and more samples, the error decreases almost exponentially with the training sample size. However, in the end, **the test error was 90%**.

Is this normal for such a simple NN? Where can I find the performance of different types of NNs with different parameters to compare with my own NN?

Of course, this is a very simple NN, but before making it more complex, I would like to know whether the results I am getting are in line with what others get, and to get an idea of what other methods and schemes are used in different situations.
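For reference, here is a minimal sketch of the kind of single-layer (softmax) model I mean — not my actual code, just a NumPy illustration that assumes the images are flattened to 784 values in $[0, 1]$ and the labels are one-hot encoded:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, Y, lr=0.1, epochs=100):
    """Single layer: 784 inputs -> 10 softmax outputs, plain gradient descent.
    X: (n, 784) floats in [0, 1]; Y: (n, 10) one-hot labels."""
    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.01, size=(784, 10))
    b = np.zeros(10)
    for _ in range(epochs):
        P = softmax(X @ W + b)                 # predicted class probabilities
        grad = (P - Y) / len(X)                # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def accuracy(W, b, X, Y):
    return (softmax(X @ W + b).argmax(axis=1) == Y.argmax(axis=1)).mean()
```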

– Our
  • Why aren’t you training on all 60000 training samples? – Dave Mar 14 '20 at 11:29
  • @Dave current computational constraints. – Our Mar 14 '20 at 11:30
  • You should set up the same model with an established framework and see what the performance is. 10% correct is very near the random-guess rate. – Jon Nordby Mar 14 '20 at 12:19
  • MNIST has 10 classes, so 90% error rate is no better than a random guess. Check for bugs. This problem is pretty easy, so any error rate higher than 10% suggests that whatever method you're using is deficient, has a bug, or both. – Sycorax Oct 17 '23 at 19:26
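Following Jon Nordby's suggestion, here is a minimal sketch of the same single-layer model in an established framework (Keras/TensorFlow, purely as one possible choice) for a sanity check. With 10 balanced classes, an accuracy near 10% (90% error) is just the chance rate, so a working implementation should land well above it even with only 600 training samples:

```python
import tensorflow as tf

# Load MNIST and flatten the 28x28 images to 784-dimensional vectors in [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# Single layer: 784 inputs -> 10 softmax outputs, trained with plain SGD.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",   # integer labels, 10 classes
              metrics=["accuracy"])

model.fit(x_train[:600], y_train[:600], epochs=50, verbose=0)  # same 600-sample budget
print(model.evaluate(x_test, y_test, verbose=0))               # [loss, accuracy] on the 10000 test samples
```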

2 Answers


MNIST is a pretty simple, "toy" dataset, and you can get high performance ($>90\%$ accuracy) even with simple classifiers like logistic regression, a single-layer neural network, or $k$-NN. You can find a summary of the results on the page maintained by Yann LeCun.
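For concreteness, here is a minimal sketch of such a baseline (an illustration, not taken from the LeCun page; it assumes scikit-learn and its OpenML copy of MNIST): plain multinomial logistic regression already reaches roughly 92% test accuracy.

```python
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression

# Fetch MNIST (70000 x 784) from OpenML; the first 60000 rows are the usual training split.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

# Multinomial logistic regression; takes a few minutes and may warn about convergence.
clf = LogisticRegression(max_iter=200)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))   # typically around 0.92
```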

– Tim
  • I am not sure how this answers the question, given that the $90\%$ in the OP refers to the error rate and not the accuracy. Nonetheless, this has been accepted, so perhaps I am missing something. – Dave Oct 17 '23 at 19:13
  • @Dave This appears to directly address the bold text in OP – Sycorax Oct 18 '23 at 01:53

**This makes me think there has been a coding error.**

I remember this happening to me once. I had managed to scramble the true labels, so the expected accuracy really was just one in ten, a $90\%$ error rate (because there are ten classes with roughly even representation). Once I fixed that coding mistake, my accuracy jumped up to the kind of values expected for MNIST digit classification.

MNIST is pretty easy to solve with quite high accuracy, and while overfitting could happen, your model is so simple that I would suspect a coding error that has scrambled your test set categories before I would suspect a major issue with the model.
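As a quick illustration of how such a scrambling shows up (a toy sketch with simulated labels, not your code): even perfect predictions collapse to about the 10% chance rate once the labels are misaligned.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=10_000)   # true digit labels, 10 roughly balanced classes
y_pred = y_true.copy()                      # pretend the classifier is perfect

perm = rng.permutation(len(y_true))         # e.g. test rows read back in the wrong order
print((y_pred == y_true).mean())            # 1.0 with correctly aligned labels
print((y_pred == y_true[perm]).mean())      # ~0.10, i.e. ~90% error
```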

– Dave
  • The wrong style of training data or form of loss can do the same. Binary cross-entropy is not the same as categorical cross-entropy. – EngrStudent Oct 18 '23 at 02:18
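A small numeric sketch of that distinction, using made-up probabilities: for a 10-class one-hot target, categorical cross-entropy scores only the probability assigned to the true class, while binary cross-entropy treats each of the ten outputs as a separate yes/no problem, so the two losses (and their gradients) differ.

```python
import numpy as np

y = np.eye(10)[3]                      # one-hot target: the true class is 3
p = np.full(10, 0.05)                  # example predicted probabilities...
p[3] = 0.55                            # ...that sum to 1

cce = -np.sum(y * np.log(p))                                # categorical cross-entropy
bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))     # binary CE averaged over the 10 outputs
print(cce, bce)                        # ~0.60 vs ~0.11: not interchangeable
```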