Machine learning can be defined as the problem of using partial data about a function or relation to make predictions about how the function or relation would perform on new data.
The case of learning a function
Although this particular formulation is not the most natural formulation of all machine learning problems, any machine learning problem can be converted to this formulation.
An output function depends on some inputs (called features) (and possibly a random noise component). We have access to some training data where we have the value of the output function, and we either already have or can get the values of the features. The machine learning problem asks for an explicit description of a function that would perform well on new inputs that we provide to it.
Types of machine learning
Classification based on nature of training data
|Type of machine learning||Description|
|supervised learning||Explicit training data (input-output pairs) are provided. These can be used to learn the parameters for the functional form that can then be used to make predictions.|
|unsupervised learning||The training data as provided does not provide explicit outputs for explicit inputs. For instance, in the context of classification problems, an unsupervised learning problem would simply provide a lot of inputs without specifying the output values for those inputs. The job of the machine learning algorithm is to use the distribution of the inputs to figure out the output values.|
|semi-supervised learning||The training data is a mix of explicit input-output pairs and input-only data. Semi-supervised learning combines some of the aspects of supervised learning and unsupervised learning.|
Classification based on nature of prediction
|Type of prediction problem||Description|
|binary classification problem||Here, the output has only two permissible values. Therefore, the prediction problem can be viewed as a yes/no or a true/false prediction problem. Models for binary classification can be probabilistic (such as logistic regression or artificial neural networks) or yes/no (such as support vector machines). Note that probabilistic binary classification models structurally resemble regression models.|
|discrete variable prediction problem, or multi-class classification problem||Here, the output can take one of a finite number of values. We can also think of this as a classification problem where each input case needs to be sorted into one of finitely many classes.|
|regression problem, or continuous variable prediction problem||Here, the output can take a value over a continuous range of values.|
Classification based on the stage of learning
|Stage of learning||Description|
|eager learning||This is the more common form of learning. Here, the training data is pre-processed and an explicit and compact representation of the function is learned from it (usually, in the form of a paramater vector). The learning phase that uses the training data to learn the compact representation takes substantially greater time, but once this phase is done, the training data can be thrown away. The compact representation takes much less memory than the training data and can be used to quickly compute the output for any input.|
|lazy learning||With this form of learning, the training data is not completely pre-processed. Every time a new prediction needs to be made, the training data is used to make the prediction. Lazy learning algorithms are useful in cases where the prediction of function values is based on nearby values.|
|online learning||Here, the input data is streamed to the algorithm, one instance at a time. For the one-instance case (each data point) three steps occur: (a) The algorithm reads the data (the input only), (b) the algorithm makes a prediction of the output, (c) the algorithm learns the actual output and updates the parameters according to that.|