
I'm using the R h2o package to build a deep net with three hidden layers. When inspecting the model object, I notice that the training RMSE fluctuates as a function of the number of epochs. I had assumed that, with a stable gradient, the training RMSE should decrease monotonically with epochs until convergence.

Are there parameters I should vary to stabilize learning as a function of epochs?
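
For context, a minimal sketch of how the per-epoch training RMSE can be pulled from the fitted model (the file name, response column, and layer sizes here are hypothetical, and the scoring-history column names vary between H2O versions):

    library(h2o)
    h2o.init()

    train <- h2o.importFile("train.csv")   # hypothetical training data
    y <- "response"                        # hypothetical response column
    x <- setdiff(names(train), y)

    fit <- h2o.deeplearning(x = x, y = y, training_frame = train,
                            hidden = c(50, 50, 50),  # three hidden layers
                            epochs = 100)

    # The scoring history is where the per-epoch metric (and its fluctuation)
    # shows up; exact column names depend on the H2O version.
    sh <- h2o.scoreHistory(fit)
    plot(sh$epochs, sh$training_rmse, type = "l",
         xlab = "epochs", ylab = "training RMSE")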

  • It is a neural network - a big, messy, sloppy neural network. Any of the "normal" neural network issues - it will have those, and more. I found in my nearly trivial experiments with deep networks that multiplying the number of nodes on a layer by 3x can reduce oscillations, at the cost of slower learning. – EngrStudent Sep 23 '15 at 21:08
  • what happens if you decrease the learning rate by a factor of 10? – John Madden Jan 16 '24 at 19:14

2 Answers


H2O deep learning is likely bouncing around local minima or saddle points; that's probably why you are seeing the fluctuations.

Try varying the l1 and l2 regularization, the input-layer dropout, and the per-layer dropout. Those are usually a good starting point for building models that eventually converge. You will likely want to do some sort of search (grid or random) across these hyperparameters, as sketched below.
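
As a rough sketch (parameter values are illustrative, not tuned; `x`, `y`, and `train` stand in for your own data), those knobs map onto h2o.deeplearning and h2o.grid like this:

    # Illustrative settings for the regularization and dropout knobs.
    fit <- h2o.deeplearning(
      x = x, y = y, training_frame = train,
      hidden = c(50, 50, 50),
      activation = "RectifierWithDropout",  # needed for hidden_dropout_ratios
      l1 = 1e-5, l2 = 1e-5,                 # weight penalties
      input_dropout_ratio = 0.1,            # dropout on the input layer
      hidden_dropout_ratios = c(0.3, 0.3, 0.3),
      epochs = 100
    )

    # Random search over the same hyperparameters.
    grid <- h2o.grid(
      algorithm = "deeplearning",
      x = x, y = y, training_frame = train,
      hidden = c(50, 50, 50),
      activation = "RectifierWithDropout",
      hyper_params = list(
        l1 = c(0, 1e-5, 1e-3),
        l2 = c(0, 1e-5, 1e-3),
        input_dropout_ratio = c(0, 0.1, 0.2),
        hidden_dropout_ratios = list(c(0.2, 0.2, 0.2), c(0.5, 0.5, 0.5))
      ),
      search_criteria = list(strategy = "RandomDiscrete", max_models = 20, seed = 1)
    )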

The H2O Deep Learning booklet covers most of these items: http://h2o-release.s3.amazonaws.com/h2o/rel-slater/3/docs-website/h2o-docs/booklets/DeepLearning_Vignette.pdf


The details are sparse, but it's very plausible that the learning rate is too large: the model overshoots a minimum and the loss starts to increase, until the updates reverse direction again.

Here's an example of how that can happen: How can change in cost function be positive?

More tips and tricks for troubleshooting a neural network when the loss does not decrease: What should I do when my neural network doesn't learn?
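
If you want to try a smaller step size, here is a hedged sketch (H2O deep learning uses the adaptive ADADELTA rate by default, so a manual rate only takes effect once adaptive_rate is switched off; the values are illustrative, and `x`, `y`, and `train` stand in for your own data):

    # Sketch: disable the adaptive rate and use a fixed, smaller learning rate
    # (0.0005 is 10x below the 0.005 default).
    fit_small_lr <- h2o.deeplearning(
      x = x, y = y, training_frame = train,
      hidden = c(50, 50, 50),
      adaptive_rate = FALSE,   # turn off ADADELTA
      rate = 0.0005,           # fixed learning rate
      rate_annealing = 1e-6,   # decay the rate as training progresses
      epochs = 100
    )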

Sycorax