Learn Before
Concept
Appropriate Scale for Randomly Tuning Hyperparameters in Deep Learning
When randomly choosing values for hyperparameters to find the most optimal ones, sometimes, it's better to choose the random numbers based on specific distributions. For example, instead of choosing random numbers from a uniform distribution, it may be more reasonable to choose from a normal or logarithmic distribution. For choosing a value for:
- $0.0001 < \alpha < 1, it's more reasonable to only try powers of 10 like $10^{-4}, ..., 10^0.
- $0.9 < \beta < 0.9999$, it's more reasonable to only try numbers like 0.9, 0.99, 0.999, ..., 0.99999 because comparing 0.9000 and 0.9005, the latter results in ~10 more samples. On the other hand, 0.999 results in ~1000 samples but 0.9995 results in ~2000 samples.
0
1
Updated 2021-12-09
Tags
Data Science