
I feel it is somewhat circular to use a Gaussian process (GP) for hyperparameter tuning, since the GP has its own hyperparameters. Or is it the case that a GP typically has fewer hyperparameters than the model we want to tune (say, a neural network), which mitigates the issue somewhat?

Sam

1 Answer


You are correct that a Gaussian process has its own hyperparameters. But the same is true for every other hyperparameter tuning algorithm you could use: if you use grid search, you need to decide on the grid of points; if you use random search, you need to decide on the distributions used for sampling the parameters; and so on (see the sketch below).
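To make that concrete, here is a minimal sketch in plain Python of the meta-choices that grid search and random search themselves require; the parameter name and all numeric ranges are made up for illustration:

```python
import random

# Grid search: the grid of candidate values is itself a choice we must make.
lr_grid = [1e-4, 1e-3, 1e-2, 1e-1]

# Random search: the sampling distribution (and its bounds) is also our choice.
def sample_lr(rng):
    # Log-uniform over [1e-5, 1e-1]; both the shape and the range are decisions.
    return 10 ** rng.uniform(-5, -1)

rng = random.Random(0)
lr_candidates = [sample_lr(rng) for _ in range(10)]
print(lr_grid, lr_candidates)
```

So every tuner carries its own configuration; the GP is not special in this respect.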

The good news is that in many cases the models we use are not that sensitive to their hyperparameters (it doesn't matter much whether the learning rate is 0.01, 0.013, or 0.014, but it does matter if it's closer to 0.0001), so the algorithms we use for hyperparameter tuning do not need to be that precise either.
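For illustration, here is a minimal sketch of GP-based tuning with out-of-the-box defaults, using scikit-optimize as one possible tool; the objective function and the search range below are synthetic stand-ins for "train a model, return the validation loss":

```python
import math

from skopt import gp_minimize
from skopt.space import Real

def objective(params):
    (lr,) = params
    # Made-up loss with a broad minimum around lr = 1e-3: wide ranges of
    # learning rates score similarly, mirroring the insensitivity above.
    return (math.log10(lr) + 3.0) ** 2

result = gp_minimize(
    objective,
    [Real(1e-5, 1e-1, prior="log-uniform", name="lr")],  # search space is our choice
    n_calls=20,       # evaluation budget; GP settings left at library defaults
    random_state=0,
)
print(result.x, result.fun)  # best learning rate found and its loss
```

The point is that the GP's own hyperparameters are left at the library defaults here, and because the objective is flat near its minimum, the tuner still lands in a good region.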

Tim
  • Just to clarify: are you trying to say that the GP is not super sensitive to its hyperparameter settings, or that ML models in general are not super sensitive to hyperparameter settings? If it's the former, I guess it has to do with some of the GP's properties; but if it's the latter, isn't using a GP for hyperparameter search not that meaningful anymore? – Sam Oct 27 '21 at 10:18
  • @Zzy1130 ML models in general. What I'm trying to say is that you don't need perfectly tuned hyperparameters for the GP when using it for hyperparameter optimization. Many out-of-the-box tools have reasonable defaults. Of course, there may be cases where some tuning would improve the results. – Tim Oct 27 '21 at 11:11