
I am interested in comparing the computational cost of the grid search vs. random search algorithms for hyperparameter tuning.

Is random search computationally less expensive than grid search? Is there an article/paper that talks about this topic?

Thank you,

HDB
  • That depends. Is the search space the same? Obviously, the larger the search space, the more expensive. – astel Nov 22 '20 at 20:49
  • There's a pathology in the difference. A classic grid assumes something about the characteristic phenomenology of the system. You can think of it as the wavenumber analog of the Nyquist sampling frequency: if you sample too sparsely on your grid, you miss it. This is one of the great advantages that uniform random sampling gives. One of the things that comes with it is a higher cost. If you knew ahead of time the characteristic frequency, then the grid is going to require fewer samples, all else being equal. If you do not know your characteristic frequency, what do you do then? That's the real question. – EngrStudent Nov 22 '20 at 21:04

2 Answers

1

Computational cost of what exactly? In both cases you evaluate a list of parameter settings, so it's $O(n)$ in the number of candidates. The two methods differ only in how the list is created. For grid search, the list is deterministic: you try every combination of the listed parameter values. In random search, the candidates are drawn at random from the specified distributions. Both have the same computational cost per candidate. The difference is in the quality of the results, where usually random search is more efficient, hence it might need fewer samples to find a comparably good configuration.
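
The canonical reference for this comparison is Bergstra and Bengio, "Random Search for Hyper-Parameter Optimization" (JMLR, 2012). To make the $O(n)$ point concrete, here is a minimal sketch in plain Python; the hyperparameter names and ranges are made up purely for illustration:

```python
import itertools
import random

# Hypothetical search space, for illustration only.
learning_rates = [0.001, 0.01, 0.1, 1.0]
max_depths = [2, 4, 8, 16]

def evaluate(params):
    # Stand-in for the expensive step (fitting and scoring a model);
    # its cost per candidate is identical for both search strategies.
    return random.random()

# Grid search: the candidate list is the deterministic Cartesian product.
grid_candidates = [
    {"learning_rate": lr, "max_depth": d}
    for lr, d in itertools.product(learning_rates, max_depths)
]

# Random search: the candidates are drawn from distributions, and the
# budget n is chosen independently of the grid size.
n = 16
random_candidates = [
    {"learning_rate": 10 ** random.uniform(-3, 0),  # log-uniform draw
     "max_depth": random.randint(2, 16)}
    for _ in range(n)
]

# Either way, the search itself is just O(n) evaluations of the list.
best_grid = max(grid_candidates, key=evaluate)
best_random = max(random_candidates, key=evaluate)
print(best_grid, best_random)
```

The loop over candidates (and the expensive model fit inside it) is the same in both cases; only the way the candidate list is built changes.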

Tim
1

Answering what I think you're asking, and to follow up on @Tim's answer, think about a cube. The side length of the cube is smaller than the diagonal of a face, which in turn is smaller than the distance between opposite corners. In higher dimensions the problem worsens, so you can end up with systematically very sparse regions despite having many sampled points. From what I understand, random search is less sensitive to this; you still run into sparse regions, just not systematically. In my mind, quasi-random Sobol sequences take the best of both worlds (grid search and random search) and are generally considered better than random search. To guarantee the nice statistical properties of a Sobol search, you have to sample in base-2 increments, i.e. numbers of points that are powers of two.
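
If it helps to see both points concretely, here is a minimal sketch using SciPy's scipy.stats.qmc module (SciPy 1.7+); the 3-dimensional unit cube stands in for a space of three hyperparameters and is purely illustrative:

```python
import numpy as np
from scipy.stats import qmc

# The cube intuition: the longest distance inside a d-dimensional unit
# cube is sqrt(d), so corners drift apart as d grows and a fixed budget
# of points leaves ever-larger empty regions.
for d in (1, 2, 3, 10):
    print(f"dimension {d}: corner-to-corner distance = {np.sqrt(d):.3f}")

# Sobol (quasi-random) sampling: draw 2**m points so the sequence keeps
# its balance properties; m = 5 gives 32 points in 3 dimensions.
sobol_points = qmc.Sobol(d=3, scramble=True).random_base2(m=5)

# Plain uniform random points with the same budget, for comparison.
random_points = np.random.default_rng(0).random((32, 3))

# Lower discrepancy means the points cover the cube more evenly.
print("Sobol discrepancy :", qmc.discrepancy(sobol_points))
print("Random discrepancy:", qmc.discrepancy(random_points))
```

The Sobol set typically shows noticeably lower discrepancy than the uniform random set of the same size, which is the sense in which it combines the evenness of a grid with the flexibility of random sampling.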