
What are some recommended ways to tune hyperparameters and/or develop domain-specific architectures for a large neural network model? That is, how can one further tune a large neural network that already takes a long time to train?

My question is two-fold (on hyperparameters and architecture):

The standard seems to be Bayesian hyperparameter optimization, but even this can take a long time if one wishes to take advantage of the sequential nature of the method, in which each new evaluation is chosen based on the results of previous ones.

There are also evolutionary methods for developing novel architectures, but how can the average data scientist execute this approach while avoiding expensive computational resources? (I am not too familiar with evolutionary algorithms.)

Are there any clever approaches/techniques to further tune hyperparameters/develop domain-specific architectures for neural networks that already take a long time to train?


1 Answer


The challenge of estimating the performance of a neural network without running the entire training procedure is addressed by the multi-fidelity optimization literature, where "fidelity" refers to the quality of the performance estimate. Some of these techniques are:

  1. Training on a subset of the data (see the sketch after this list)
  2. Training on lower-resolution images, if one is working on image datasets
  3. Extrapolating learning curves (see the sketch after the references)
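
Below is a minimal sketch of technique 1 combined with successive halving: many candidate configurations are evaluated cheaply on small random subsets of the training data, and only the better half survives to the next, larger budget. Everything in it (the synthetic dataset, the `SGDClassifier` surrogate model, the single hyperparameter being tuned, and the budget schedule) is an illustrative assumption, not a reference implementation.

```python
# Sketch of successive halving with data subsets as the cheap fidelity.
# Dataset, model, and budget schedule are all illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=20000, n_features=50, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

def evaluate(config, n_samples):
    """Train on a random subset of n_samples rows; return validation accuracy."""
    idx = rng.choice(len(X_tr), size=n_samples, replace=False)
    model = SGDClassifier(alpha=config["alpha"], random_state=0)
    model.fit(X_tr[idx], y_tr[idx])
    return model.score(X_val, y_val)

# Start with many random configurations and a small data budget; after each
# round, keep the better half and double the budget for the survivors.
configs = [{"alpha": 10 ** rng.uniform(-6, -1)} for _ in range(16)]
budget = len(X_tr) // 16
while len(configs) > 1:
    scores = [evaluate(c, budget) for c in configs]
    order = np.argsort(scores)[::-1]  # indices of configs, best score first
    configs = [configs[i] for i in order[: len(configs) // 2]]
    budget = min(budget * 2, len(X_tr))

print("surviving config:", configs[0])
```

The key design choice is the schedule: each round discards the weaker half of the pool while doubling the data budget for the survivors, so the per-round cost stays roughly constant even as the performance estimates become more faithful.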

Two good references for this literature are below:

https://www.ml4aad.org/wp-content/uploads/2018/07/automl_book_draft_neural_architecture_search.pdf

https://github.com/automl/HpBandSter
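
For technique 3, a common (but by no means unique) approach is to fit a simple parametric curve to the first few epochs of validation loss and extrapolate it to decide whether a run is worth finishing. The sketch below uses a power law, `loss(t) = a * t**(-b) + c`, on a synthetic loss trace; the functional form, the stopping threshold, and the horizon of 100 epochs are all illustrative assumptions.

```python
# Sketch of learning-curve extrapolation: fit a power law to early epochs of
# validation loss, then extrapolate to a target epoch. The observed loss
# trace is synthetic and the stopping threshold is an illustrative assumption.
import numpy as np
from scipy.optimize import curve_fit

def power_law(t, a, b, c):
    # A common parametric form for loss curves: decays toward an asymptote c.
    return a * t ** (-b) + c

# Pretend we observed 10 epochs of validation loss for a candidate run.
epochs = np.arange(1, 11)
observed = 2.0 * epochs ** -0.5 + 0.3 + np.random.default_rng(1).normal(0, 0.01, 10)

params, _ = curve_fit(power_law, epochs, observed, p0=(1.0, 0.5, 0.1), maxfev=10000)
predicted_final = power_law(100, *params)  # extrapolate to epoch 100

# Terminate the run early if its extrapolated loss cannot beat the best
# fully trained model seen so far.
best_loss_so_far = 0.35
print(f"predicted loss at epoch 100: {predicted_final:.3f}")
print("continue training" if predicted_final < best_loss_so_far else "stop early")
```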
