Cost function

Definition

For a single piece of data

The cost function associated with a given machine learning problem takes as input the value predicted by a guess for the function, together with the observed output, and returns a number measuring how far the observed output is from the predicted value.

For prediction problems associated with continuous variables, both the predicted value and the actual value are continuous variables. The cost function is a function $C (u, v)$ of two variables $u, v$ (the predicted value and actual value) satisfying the following conditions:

- $C (u, u) = 0$ for all $u \in \mathbb{R}$
- For $u \leq v \leq w$ , $C (u, v) \leq C (u, w)$ and $C (v, w) \leq C (u, w)$

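The cost function need not satisfy the triangle inequality; in fact, typical cost functions, such as the squared error $C (u, v) = (u - v)^2$, penalize bigger errors superlinearly, so one large error can cost more than the sum of the costs of the smaller errors it splits into.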
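As a minimal sketch, the two conditions above and the failure of the triangle inequality can be checked numerically for the squared error. The function name `squared_error` and the sample points are chosen here purely for illustration; they are not part of the definition.

```python
def squared_error(u: float, v: float) -> float:
    """Squared-error cost between a predicted value u and an actual value v."""
    return (u - v) ** 2


# Illustrative sample points with u <= v <= w (an assumption for this sketch).
u, v, w = 0.0, 1.0, 2.0

# Condition 1: the cost is zero when the prediction equals the actual value.
zero_cost = squared_error(3.0, 3.0)

# Condition 2: for u <= v <= w, widening the gap never lowers the cost.
monotone = (
    squared_error(u, v) <= squared_error(u, w)
    and squared_error(v, w) <= squared_error(u, w)
)

# Triangle inequality: compare the direct cost with the cost routed through v.
direct = squared_error(u, w)
via_v = squared_error(u, v) + squared_error(v, w)
violates_triangle = direct > via_v

print(zero_cost, monotone, direct, via_v, violates_triangle)
```

With these sample points the direct cost is 4 while the two-step cost is 2, which is the superlinear penalty at work. The absolute error $|u - v|$ would satisfy both conditions and the triangle inequality as well.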

For prediction problems associated with discrete variables (here, binary ones), the predicted value is a probability $p \in [0, 1]$ that the actual value is 1, and the actual value $v$ is simply a discrete value (0 or 1). The cost function is a function $C (p, v)$ of two variables (the predicted probability and the actual value) satisfying the following conditions:

- $C (1, 1) = 0$
- $C (0, 0) = 0$
- For $p \leq q$ , $C (q, 1) \leq C (p, 1)$
- For $p \leq q$ , $C (p, 0) \leq C (q, 0)$
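One common cost function that appears to meet all four conditions is the logarithmic cost, $C (p, 1) = -\log p$ and $C (p, 0) = -\log (1 - p)$. The sketch below checks the conditions on a small grid of probabilities. The choice of the logarithmic cost, the function name `log_loss`, and the grid are illustrative assumptions rather than something the definition prescribes.

```python
import math


def log_loss(p: float, v: int) -> float:
    """Logarithmic cost for a predicted probability p and an actual value v (0 or 1)."""
    if v not in (0, 1):
        raise ValueError("v must be 0 or 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Probability that the prediction assigned to the outcome that occurred.
    likelihood = p if v == 1 else 1.0 - p
    # A fully confident wrong prediction is given infinite cost.
    if likelihood == 0.0:
        return math.inf
    return -math.log(likelihood)


# Illustrative grid of predicted probabilities, in increasing order.
probs = [0.0, 0.25, 0.5, 0.75, 1.0]
costs_if_1 = [log_loss(p, 1) for p in probs]
costs_if_0 = [log_loss(p, 0) for p in probs]

# Conditions 1 and 2: a fully confident correct prediction costs nothing.
perfect_costs_zero = log_loss(1.0, 1) == 0 and log_loss(0.0, 0) == 0

# Condition 3: when the actual value is 1, cost does not increase as p increases.
non_increasing_if_1 = all(a >= b for a, b in zip(costs_if_1, costs_if_1[1:]))

# Condition 4: when the actual value is 0, cost does not decrease as p increases.
non_decreasing_if_0 = all(a <= b for a, b in zip(costs_if_0, costs_if_0[1:]))

print(perfect_costs_zero, non_increasing_if_1, non_decreasing_if_0)
```

Returning `math.inf` at the boundary is a design choice for this sketch; practical implementations often clip $p$ away from 0 and 1 instead, so that the cost stays finite.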