Hyperparameter optimization: Difference between revisions

==Definition==

'''Hyperparameter optimization''' (sometimes also called '''model selection''', though not to be confused with [[model selection]] in the narrower sense) is the process of choosing good values for the [[hyperparameter]]s that control the [[learning algorithm]] in machine learning problems whose goal is to determine the [[parameter]]s of a functional form that predicts outputs from inputs.
 
==Typical hyperparameter choices==
 
* A discrete hyperparameter, such as the degree of the polynomial being fit
* [[Regularization hyperparameter]]: A hyperparameter that acts as a coefficient on the regularization term, used to enforce simplicity and reduce [[overfitting]]. The larger this hyperparameter, the stronger the push toward simplicity; however, a very large value may cause the model to underfit.
* [[Learning rate]]: Controls how quickly the learning algorithm converges toward its optimum. Strictly speaking, it does not change which value is optimal, but it does affect the rate of convergence, and therefore the value at which we finally stop.
* Number of iterations and/or other stopping-criterion parameters
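A minimal sketch of how optimizing the first two choices above might look: a grid search over polynomial degree and a ridge-style regularization coefficient, scoring each pair by error on a held-out validation set. The toy data, the NumPy-only normal-equations fit, and the particular grids are illustrative assumptions, not prescribed by the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a noisy cubic, split into training and validation sets.
x = rng.uniform(-1, 1, 200)
y = x**3 - x + rng.normal(scale=0.1, size=x.shape)
x_tr, y_tr = x[:150], y[:150]
x_va, y_va = x[150:], y[150:]

def fit_ridge_poly(x, y, degree, lam):
    """Fit a polynomial of the given degree with L2 regularization
    (strength lam) via the regularized normal equations."""
    X = np.vander(x, degree + 1)
    return np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y)

def val_mse(w, degree, x, y):
    """Mean squared error of the fitted polynomial on (x, y)."""
    X = np.vander(x, degree + 1)
    return np.mean((X @ w - y) ** 2)

# Grid search: try every (degree, lambda) pair and keep the one
# with the lowest validation error.
best = None
for degree in [1, 2, 3, 5, 9]:
    for lam in [0.0, 1e-4, 1e-2, 1.0]:
        w = fit_ridge_poly(x_tr, y_tr, degree, lam)
        err = val_mse(w, degree, x_va, y_va)
        if best is None or err < best[0]:
            best = (err, degree, lam)

print("best (validation MSE, degree, lambda):", best)
```

On this data the search should favor a degree of at least 3, since the underlying function is cubic; the regularization coefficient then trades off fit against simplicity as described above.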

Revision as of 16:11, 7 June 2014