
When it comes to tuning neural networks, there are lots of design options:

  • Optimiser
  • Learning rate
  • Activation function
  • Number of hidden layers
  • Number of nodes/neurons in each hidden layer
  • Regularisation, e.g. early stopping
  • Batch size

I know the general principle is to start simple and incrementally add complexity. However, I have no idea what order to go in.

Say I start with a small fully connected network with a single hidden layer, SGD optimiser, batch size of 32 and ReLU activations. What should I change/explore first? And what after that? All the design choices seem to affect each other, so I'm struggling to figure out a systematic way.
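For concreteness, that starting point could be sketched directly in NumPy (the hidden width of 32, learning rate of 0.01, and the synthetic data are my assumptions, added only to make the sketch runnable):

```python
import numpy as np

# Baseline from the question: one hidden layer, ReLU, plain SGD,
# batch size 32, 6 inputs and 6 outputs for regression.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 6, 32, 6  # hidden width 32 is an assumption

W1 = rng.normal(0, np.sqrt(2 / n_in), (n_in, n_hidden))       # He init for ReLU
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, np.sqrt(2 / n_hidden), (n_hidden, n_out))
b2 = np.zeros(n_out)

def forward(X):
    h = np.maximum(X @ W1 + b1, 0.0)   # ReLU hidden layer
    return h, h @ W2 + b2              # linear output for regression

def sgd_step(X, Y, lr=1e-2):
    """One mini-batch SGD step on (half) mean-squared error."""
    global W1, b1, W2, b2
    h, pred = forward(X)
    err = (pred - Y) / len(X)          # gradient of 0.5 * MSE w.r.t. pred
    gW2, gb2 = h.T @ err, err.sum(0)
    dh = (err @ W2.T) * (h > 0)        # backprop through ReLU
    gW1, gb1 = X.T @ dh, dh.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

# Synthetic data standing in for the real regression set.
X = rng.normal(size=(256, n_in))
Y = X @ rng.normal(size=(n_in, n_out))

init_mse = np.mean((forward(X)[1] - Y) ** 2)
for epoch in range(200):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), 32):  # batch size 32
        batch = idx[start:start + 32]
        sgd_step(X[batch], Y[batch])
mse = np.mean((forward(X)[1] - Y) ** 2)
```

Each of the knobs in the list above corresponds to one line here (the `lr` argument, `n_hidden`, the ReLU in `forward`, the batch slice of 32), which is what makes the interactions between them visible.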

Could someone provide a step-by-step guide, and also a justification for the order proposed?

EDIT: In response to Sycorax's comment:

Essentially I am training a simple fully connected network for a regression problem, with 6 inputs and 6 outputs. I wish to find the configuration -- i.e. the set of design choices/hyperparameters -- that yields the lowest validation loss for this problem.

EDIT 2: my question isn't to do with generalisation. I'm not having trouble getting my model to generalise. I just want to know whether there's a systematic way of tuning neural network parameters such that the best performance is achieved.
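One systematic recipe (a sketch, not the only defensible order) is staged, coordinate-wise search: tune the learning rate first because it interacts with everything else, then capacity (width, then depth), then finish with a small random search around the incumbent to catch interactions. Here `train_and_eval` is a placeholder for your own training routine returning validation loss; the quadratic toy inside it is invented purely to make the sketch runnable:

```python
import math
import random

def train_and_eval(config):
    # Placeholder for "train the network with this config, return val loss".
    # Toy stand-in: pretends loss is minimised at lr=1e-2, hidden=64.
    return (math.log10(config["lr"]) + 2) ** 2 + (config["hidden"] / 64 - 1) ** 2

# Start simple, as in the question.
best = {"lr": 1e-3, "hidden": 16, "layers": 1, "batch": 32}

# Stage 1: learning rate on a log grid, everything else fixed.
best["lr"] = min([1e-4, 1e-3, 1e-2, 1e-1],
                 key=lambda lr: train_and_eval({**best, "lr": lr}))

# Stage 2: capacity, re-using the tuned learning rate.
best["hidden"] = min([16, 32, 64, 128],
                     key=lambda h: train_and_eval({**best, "hidden": h}))

# Stage 3: small random search around the incumbent to catch interactions.
rng = random.Random(0)
for _ in range(20):
    cand = {**best,
            "lr": best["lr"] * 10 ** rng.uniform(-0.5, 0.5),
            "hidden": rng.choice([best["hidden"] // 2,
                                  best["hidden"],
                                  best["hidden"] * 2])}
    if train_and_eval(cand) < train_and_eval(best):
        best = cand
```

The justification for this order: if the learning rate is badly wrong, changes to any other knob are uninterpretable, so it goes first; capacity comes next because it shifts the bias/variance trade-off; regularisation and batch size are then tuned against the capacity you settled on.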

  • It depends on what your motive is for changing any component. Perhaps you could [edit] to explain what problem you're trying to solve, and how changing your neural network configuration helps you to solve it. – Sycorax Feb 10 '24 at 18:12
  • @Sycorax Thanks for the feedback, I've edited my question now. Does that help? – Blahblahblacksheep Feb 10 '24 at 22:19
  • I'd say it was more of a duplicate of this question, actually: https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn/352037#352037 – Adrian Keister Feb 11 '24 at 20:54

0 Answers