Supervised learning - Revision history

Vipul: /* Steps of supervised learning */

2021-06-12T01:50:47Z

Steps of supervised learning

← Older revision		Revision as of 01:50, 12 June 2021
Line 16:		Line 16:
	\| [[Model class selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.		\| [[Model class selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.
	\|-		\|-
	\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model type. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the property that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.		\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model type. Often, the error function selection is bundled with the model class selection, because part of the model class selection process also includes identifying the nature of distribution of errors or anomalies. This has the property that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model class, so we can essentially combine any permissible model class with any permissible error function.
	\|-		\|-
	\| Regularization-type choices \|\| The choice of regularization function to add to the cost function when using on the training data. Requires choosing [[regularization hyperparameter]](s). \|\|		\| Regularization-type choices \|\| The choice of regularization function to add to the cost function when using on the training data. Requires choosing [[regularization hyperparameter]](s). \|\|

Sebastian at 18:05, 1 February 2020

2020-02-01T18:05:30Z

← Older revision		Revision as of 18:05, 1 February 2020
Line 1:		Line 1:
	==Definition==		==Definition==

	The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of labeled examples for training and can use that data to determine a function that would take any new example and predict the label for that.		The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of labeled examples for training and can use that data to determine a function that would take any new example and predict the label for that. There are two types of supervised learning techniques, [[classification]] and [[regression]].

	Here, the term "example" refers to the input part of the function (that can be used to make the prediction) and the term "label" refers to the output of the function (that needs to be predicted).		Here, the term "example" refers to the input part of the function (that can be used to make the prediction) and the term "label" refers to the output of the function (that needs to be predicted).

IssaRice: /* Steps of supervised learning */

2016-05-08T23:50:18Z

Steps of supervised learning

← Older revision		Revision as of 23:50, 8 May 2016
Line 6:		Line 6:

	==Steps of supervised learning==		==Steps of supervised learning==

			Supervised learning is a process that goes through several steps, which are presented here in a table.

	{\| class="sortable" border="1"		{\| class="sortable" border="1"

Vipul: /* Steps of supervised learning */

2016-05-07T19:00:24Z

Steps of supervised learning

← Older revision		Revision as of 19:00, 7 May 2016
Line 12:		Line 12:
	\| [[Feature selection]] \|\| The set of features that the model depends on. \|\| Based on the problem domain, we come up with a list of relevant features that affect the output function. If we choose too few features, then the task might be theoretically impossible. For instance, if the only feature we have for a house is its area, and we need to predict the price, we cannot do the prediction too well. The more the features, the better our ability to predict in principle. However, too many features means more effort spent collecting their values, and there are also dangers of [[overfitting]].		\| [[Feature selection]] \|\| The set of features that the model depends on. \|\| Based on the problem domain, we come up with a list of relevant features that affect the output function. If we choose too few features, then the task might be theoretically impossible. For instance, if the only feature we have for a house is its area, and we need to predict the price, we cannot do the prediction too well. The more the features, the better our ability to predict in principle. However, too many features means more effort spent collecting their values, and there are also dangers of [[overfitting]].
	\|-		\|-
	\| [[Model ~~type~~ selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.		\| [[Model class selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.
	\|-		\|-
	\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model type. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the property that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.		\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model type. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the property that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.

Vipul: /* Steps of supervised learning */

2014-08-15T23:17:24Z

Steps of supervised learning

← Older revision		Revision as of 23:17, 15 August 2014
Line 14:		Line 14:
	\| [[Model type selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.		\| [[Model type selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.
	\|-		\|-
	\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model type. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the ~~properly~~ that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.		\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model type. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the property that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.
	\|-		\|-
	\| Regularization-type choices \|\| The choice of regularization function to add to the cost function when using on the training data. Requires choosing [[regularization hyperparameter]](s). \|\|		\| Regularization-type choices \|\| The choice of regularization function to add to the cost function when using on the training data. Requires choosing [[regularization hyperparameter]](s). \|\|

Vipul: /* Steps of supervised learning */

2014-08-15T23:17:06Z

Steps of supervised learning

← Older revision		Revision as of 23:17, 15 August 2014
Line 14:		Line 14:
	\| [[Model type selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.		\| [[Model type selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.
	\|-		\|-
	\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the properly that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.		\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model type. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the properly that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.
	\|-		\|-
	\| Regularization-type choices \|\| The choice of regularization function to add to the cost function when using on the training data. Requires choosing [[regularization hyperparameter]](s). \|\|		\| Regularization-type choices \|\| The choice of regularization function to add to the cost function when using on the training data. Requires choosing [[regularization hyperparameter]](s). \|\|

Vipul: /* Steps of supervised learning */

2014-08-15T18:26:12Z

Steps of supervised learning

← Older revision		Revision as of 18:26, 15 August 2014
Line 12:		Line 12:
	\| [[Feature selection]] \|\| The set of features that the model depends on. \|\| Based on the problem domain, we come up with a list of relevant features that affect the output function. If we choose too few features, then the task might be theoretically impossible. For instance, if the only feature we have for a house is its area, and we need to predict the price, we cannot do the prediction too well. The more the features, the better our ability to predict in principle. However, too many features means more effort spent collecting their values, and there are also dangers of [[overfitting]].		\| [[Feature selection]] \|\| The set of features that the model depends on. \|\| Based on the problem domain, we come up with a list of relevant features that affect the output function. If we choose too few features, then the task might be theoretically impossible. For instance, if the only feature we have for a house is its area, and we need to predict the price, we cannot do the prediction too well. The more the features, the better our ability to predict in principle. However, too many features means more effort spent collecting their values, and there are also dangers of [[overfitting]].
	\|-		\|-
	\| [[Model selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.		\| [[Model type selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.
	\|-		\|-
	\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the properly that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.		\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the properly that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.

Vipul at 17:59, 15 August 2014

2014-08-15T17:59:55Z

← Older revision		Revision as of 17:59, 15 August 2014
Line 1:		Line 1:
	==Definition==		==Definition==

	The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of ~~input-output pairs~~ for training ~~data~~ and can use that data to determine a function that would take any new input and ~~predict~~ the output ~~from~~ that ~~input~~.		The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of labeled examples for training and can use that data to determine a function that would take any new example and predict the label for that.

			Here, the term "example" refers to the input part of the function (that can be used to make the prediction) and the term "label" refers to the output of the function (that needs to be predicted).

	==Steps of supervised learning==		==Steps of supervised learning==

Vipul at 01:14, 26 July 2014

2014-07-26T01:14:54Z

← Older revision		Revision as of 01:14, 26 July 2014
Line 2:		Line 2:

	The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of input-output pairs for training data and can use that data to determine a function that would take any new input and predict the output from that input.		The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of input-output pairs for training data and can use that data to determine a function that would take any new input and predict the output from that input.

			==Steps of supervised learning==

			{\| class="sortable" border="1"
			! Aspect !! What gets chosen here !! Description
			\|-
			\| [[Feature selection]] \|\| The set of features that the model depends on. \|\| Based on the problem domain, we come up with a list of relevant features that affect the output function. If we choose too few features, then the task might be theoretically impossible. For instance, if the only feature we have for a house is its area, and we need to predict the price, we cannot do the prediction too well. The more the features, the better our ability to predict in principle. However, too many features means more effort spent collecting their values, and there are also dangers of [[overfitting]].
			\|-
			\| [[Model selection]] (not to be confused with [[hyperparameter optimization]]) \|\| The functional form (with [[parameter]]s) describing how the output depends on the features. \|\| This again depends on theoretical knowledge based on the problem domain, as well as empirical exploration of the data gathered.
			\|-
			\| [[Cost function selection]] \|\| The cost function (or error function) used to measure error on new data. \|\| This again depends on theoretical knowledge based on the problem domain, and also on the choice of model. Often, the error function selection is bundled with the model selection, because part of the model selection process also includes identifying the nature of distribution of errors or anomalies. This has the properly that if we choose parameters so that our predicted function matches the actual function precisely, the error is zero. In principle, however, the error function is independent of the model, so we can essentially combine any permissible model with any permissible error function.
			\|-
			\| Regularization-type choices \|\| The choice of regularization function to add to the cost function when using on the training data. Requires choosing [[regularization hyperparameter]](s). \|\|
			\|-
			\| [[Learning algorithm]] applied on the training data \|\| The values of the parameters (not uniquely determined, we might get a portfolio of choices for different hyperparameter choices)\|\| This is the algorithm that tries to solve the optimization problem of choosing values of the parameters (for our chosen model) so that the error function is minimized (or close to minimized). Note that we are trying to minimize the error function for unknown inputs, but our algorithm is being trained on known inputs. Thus, there are issues of [[overfitting]]. This problem is addressed through a number of techniques, including [[regularization]] and [[early stopping]].
			\|-
			\| [[Cross-validation]] (relates to hyperparameter optimization) \|\| The actual values of the hyperparameters and parameters \|\| This includes techniques that tweak the [[hyperparameter]]s that control the performance of the learning algorithm. These could include learning rate parameters for gradient descent, or regularization parameters introduced to avoid overfitting.
			\|}
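
The steps in the table can be sketched end to end on the house-price example. This is only an illustrative sketch under stated assumptions, not a prescribed method: the data values are made up, the model class is assumed to be linear in a single feature (area), the cost function is assumed to be mean squared error, the regularization is assumed to be an L2 penalty on the slope, the learning algorithm is assumed to be plain gradient descent, and a single held-out validation split stands in for full cross-validation. All names (<code>predict</code>, <code>cost</code>, <code>fit</code>, <code>candidates</code>) are hypothetical.

```python
# Feature selection: one feature (area). Toy, made-up data:
# (area in units of 100 m^2, price in units of 100k).
data = [(0.5, 1.1), (0.8, 1.7), (1.0, 2.1), (1.2, 2.3), (1.5, 3.1), (2.0, 3.9)]
train, valid = data[:4], data[4:]

# Model class selection: price = w * area + b, with parameters (w, b).
def predict(params, x):
    w, b = params
    return w * x + b

# Cost function selection: mean squared error (zero for a perfect match).
def cost(params, examples):
    return sum((predict(params, x) - y) ** 2 for x, y in examples) / len(examples)

# Regularization choice plus learning algorithm: gradient descent on
# (mean squared error on the training data) + lam * w**2.
def fit(examples, lam, lr=0.05, steps=5000):
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n + 2 * lam * w
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hyperparameter optimization: fit once per candidate regularization
# strength, then keep the one with the lowest unregularized cost on
# the held-out validation examples.
candidates = [0.0, 0.01, 0.1, 1.0]
fits = {lam: fit(train, lam) for lam in candidates}
best_lam = min(candidates, key=lambda lam: cost(fits[lam], valid))
best_params = fits[best_lam]
print(best_lam, best_params, cost(best_params, valid))
```

Note that the regularization term is used only during training. The candidates are compared on the plain cost function, since that is what measures error on new data.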

Vipul: Created page with "==Definition== The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of input-output pairs for train..."

2014-06-07T15:03:46Z

Created page with "==Definition== The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of input-output pairs for train..."

New page

==Definition==

The term '''supervised learning''' is used to describe a subclass of machine learning problems where we are provided with a set of input-output pairs for training data and can use that data to determine a function that would take any new input and predict the output from that input.
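
As a minimal illustration of this definition, the sketch below takes a handful of input-output pairs and returns a function that predicts an output for any new input. The data and the nearest-neighbour rule are assumptions chosen only for brevity, and the names <code>training_pairs</code> and <code>learn</code> are hypothetical.

```python
# Hypothetical training data: (input, output) pairs.
training_pairs = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

def learn(pairs):
    """Use the training pairs to build a prediction function."""
    def predict(new_input):
        # Predict the output attached to the closest training input.
        _, nearest_output = min(pairs, key=lambda p: abs(p[0] - new_input))
        return nearest_output
    return predict

predict = learn(training_pairs)
print(predict(1.5), predict(7.0))  # prints "small large"
```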