The Coursera course Machine Learning in the Enterprise (Science of Machine Learning and Custom Training) says that a large batch size requires a smaller learning rate (LR).

However, an answer to "How should the learning rate change as the batch size change?" suggests otherwise:

"However, recent experiments with large mini-batches suggest a simpler linear scaling rule, i.e., multiply your learning rate by k when using a mini-batch size of kN. See P. Goyal et al., Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, https://arxiv.org/abs/1706.02677"
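
To make sure I am reading the quoted rule correctly, here is a minimal sketch of how I understand it. The function name `linear_scaled_lr` and the reference values (LR 0.1 at batch size 256) are my own illustrative assumptions, not taken from the course, and the sketch covers only the scaling itself.

```python
def linear_scaled_lr(base_lr: float, base_batch_size: int, batch_size: int) -> float:
    """Scale the learning rate in proportion to the batch size.

    Hypothetical helper: if the batch grows from N to kN, the LR is multiplied by k.
    """
    if base_batch_size <= 0 or batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    k = batch_size / base_batch_size  # scaling factor k from the quoted rule
    return base_lr * k


# Assumed reference setting for illustration: LR 0.1 at batch size 256.
for batch_size in (256, 1024, 8192):
    print(batch_size, linear_scaled_lr(0.1, 256, batch_size))
```

Under this reading, a larger batch always gets a proportionally larger LR, which is the opposite of what the course slide states.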

Which is correct: does a large batch size require a larger LR or a smaller LR?

mon