
I am training a 6-layer deep neural network with the following architecture:

Model2(
  (layer1): Linear(in_features=4800, out_features=8000, bias=True)
  (layer2): Linear(in_features=8000, out_features=5000, bias=True)
  (layer3): Linear(in_features=5000, out_features=2000, bias=True)
  (layer4): Linear(in_features=2000, out_features=200, bias=True)
  (layer5): Linear(in_features=200, out_features=20, bias=True)
  (layer6): Linear(in_features=20, out_features=52, bias=True)
)
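
(For reference, a minimal sketch of how an architecture matching this printout might be defined in PyTorch; the ReLU forward pass and the flattening of the 60 × 80 inputs are assumptions based on the description below.)

import torch.nn as nn
import torch.nn.functional as F

class Model2(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4800, 8000)
        self.layer2 = nn.Linear(8000, 5000)
        self.layer3 = nn.Linear(5000, 2000)
        self.layer4 = nn.Linear(2000, 200)
        self.layer5 = nn.Linear(200, 20)
        self.layer6 = nn.Linear(20, 52)

    def forward(self, x):
        x = x.view(x.size(0), -1)        # flatten 60 x 80 images to 4800 features
        x = F.relu(self.layer1(x))
        x = F.relu(self.layer2(x))
        x = F.relu(self.layer3(x))
        x = F.relu(self.layer4(x))
        x = F.relu(self.layer5(x))
        return self.layer6(x)            # raw logits; CrossEntropyLoss applies log-softmax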

Inputs are images of size 60 × 80 (flattened to 4800 features). I am using the ReLU activation function, cross-entropy for the loss function, and stochastic gradient descent. I used the weight-decay parameter (i.e., the L2 regularization method). I don't face overfitting, but the accuracy was 61% before I added weight decay and is now 26%!
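
(A minimal sketch of this setup in PyTorch; the learning rate shown is a placeholder, not a value from the question, and weight_decay=0.01 is the value mentioned in the note below.)

import torch.nn as nn
import torch.optim as optim

model = Model2()                    # architecture from the printout above
criterion = nn.CrossEntropyLoss()   # cross-entropy loss on raw logits
# weight_decay is PyTorch's L2 penalty; lr=0.1 is a placeholder assumption
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.01)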

Can anyone explain the reason?

(I set the regularization hyperparameter to 0.01. I tried changing it, but there was no improvement in accuracy!)

Maryam
  • 1) How does your loss function perform with and without regularization? Accuracy has some issues as a performance metric. 2) In-sample or out-of-sample accuracy? – Dave Jul 06 '20 at 11:29
  • @Dave I didn't get it. What do you mean by loss function performance? I am just comparing the model's accuracy on the train and test data (to check overfitting), as well as the model's overall accuracy. Also, I checked the parameters of my network: the weights decreased, but they were not near zero. – Maryam Jul 06 '20 at 11:33
  • @Dave It is the out-of-sample accuracy that I mentioned in the post. – Maryam Jul 06 '20 at 11:35
  • Evaluate the cross-entropy loss on your out-of-sample data. It is possible for loss to decrease while accuracy also decreases. Accuracy turns out to be a surprisingly bad performance metric, despite how common it is. (A sketch of such an evaluation appears after these comments.) – Dave Jul 06 '20 at 11:37
  • @Dave For 10 epochs my losses are: epoch 1: 3.109, epoch 2: 2.585, epoch 3: 2.469, epoch 4: 2.306, epoch 5: 2.267, epoch 6: 2.161, epoch 7: 2.103, epoch 8: 2.043, epoch 9: 2.047, epoch 10: 1.996. – Maryam Jul 06 '20 at 11:49
  • Losses on what data? – Dave Jul 06 '20 at 12:16
  • @Dave On training data. For the out-of-sample data, I need to wait for the run to finish. – Maryam Jul 06 '20 at 12:19
  • 0.01 is generally too high for weight decay -- I usually use 5e-4. – shimao Jul 06 '20 at 14:17
  • @shimao I get acceptable accuracy and loss with 0.0001, but the training accuracy is 3 percentage points higher. – Maryam Jul 06 '20 at 14:44
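
(Following Dave's suggestion above, a minimal sketch of computing both cross-entropy loss and accuracy on held-out data, so the two metrics can be compared directly; it assumes a standard PyTorch DataLoader yielding (input, label) batches.)

import torch

@torch.no_grad()
def evaluate(model, loader, criterion, device="cpu"):
    """Return mean cross-entropy loss and accuracy on held-out data."""
    model.eval()
    total_loss, correct, n = 0.0, 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        logits = model(x)
        total_loss += criterion(logits, y).item() * y.size(0)  # sum of per-sample losses
        correct += (logits.argmax(dim=1) == y).sum().item()
        n += y.size(0)
    return total_loss / n, correct / n

# Hypothetical usage; test_loader is an assumed name, not from the question:
# val_loss, val_acc = evaluate(model, test_loader, criterion)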