Backpropagation

Backpropagation has several related meanings in the context of neural networks. On this page, it means the process of computing the gradient of a cost function.

==Terminology confusion==

* Some people use "backpropagation" to mean just the process of computing the gradient of a cost function. This usage leads people to say things like "backpropagation is just the multivariable chain rule".
* Some people use "backpropagation" (or "backpropagation algorithm") to mean the entire gradient descent algorithm, in which the gradient is computed using the multivariable chain rule (see the sketch below this list).
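
The two usages can be made concrete in code. Below is a minimal, self-contained sketch (plain Python on a deliberately trivial one-parameter model; all names are placeholders, not from any library) in which the narrow sense of backpropagation is the gradient computation alone, while the broad sense also covers the surrounding gradient descent loop.

<syntaxhighlight lang="python">
# Trivial model: one weight w, one training pair (x, y), cost C(w) = (w*x - y)^2.
def compute_gradient(w, x, y):
    # Narrow sense of "backpropagation": computing dC/dw by the chain rule.
    # C = (w*x - y)^2, so dC/dw = 2*(w*x - y)*x.
    return 2 * (w * x - y) * x

def train(w, x, y, learning_rate=0.01, steps=100):
    # Broad sense of "the backpropagation algorithm": the whole gradient
    # descent loop, of which the gradient computation is only one step.
    for _ in range(steps):
        grad = compute_gradient(w, x, y)  # backpropagation in the narrow sense
        w -= learning_rate * grad         # the update is gradient descent,
                                          # not gradient computation
    return w

print(train(w=0.0, x=2.0, y=6.0))  # approaches w = 3, which makes C(w) = 0
</syntaxhighlight>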

==The usual neural network graph==

Neural networks are usually illustrated as a graph in which each node <math>a^\ell_j</math> is the activation of neuron <math>j</math> in layer <math>\ell</math>, with the weights labeled along the edges.
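
For reference, in this style of diagram the activations in adjacent layers are related by the standard feedforward equation. The weight <math>w^\ell_{jk}</math> (on the edge from neuron <math>k</math> in layer <math>\ell-1</math> to neuron <math>j</math> in layer <math>\ell</math>), the bias <math>b^\ell_j</math>, and the activation function <math>\sigma</math> are not defined elsewhere on this page; this is one common notational convention:

<math>a^\ell_j = \sigma\!\left( \sum_k w^\ell_{jk} a^{\ell-1}_k + b^\ell_j \right)</math>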

==Computational graphs==

The multivariable chain rule can be represented as a [[computational graph]] whose nodes are variables that store the values of intermediate computations. Each node can use the values passed to it along its incoming edges, and it sends its output along its outgoing edges.
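
As an illustrative sketch (not part of the original page), the following plain-Python computational graph computes <math>z = (x + y)y</math>: the forward pass stores a value at each node, and the backward pass applies the multivariable chain rule by sending each node's gradient to the nodes on its incoming edges. The <code>Node</code> class and helper functions here are hypothetical, not from any library.

<syntaxhighlight lang="python">
# Each node stores the value it computed in the forward pass; backward()
# pushes d(output)/d(node) to its input nodes via the chain rule.
class Node:
    def __init__(self, value):
        self.value = value    # stored result of the forward computation
        self.grad = 0.0       # accumulated gradient of the output w.r.t. this node
        self.inputs = []      # (input_node, local_gradient) pairs

    def backward(self, upstream=1.0):
        self.grad += upstream
        for node, local_grad in self.inputs:
            # Chain rule: gradient sent to an input = upstream * local gradient.
            node.backward(upstream * local_grad)

def add(a, b):
    out = Node(a.value + b.value)
    out.inputs = [(a, 1.0), (b, 1.0)]           # d(a+b)/da = d(a+b)/db = 1
    return out

def mul(a, b):
    out = Node(a.value * b.value)
    out.inputs = [(a, b.value), (b, a.value)]   # d(ab)/da = b, d(ab)/db = a
    return out

x, y = Node(3.0), Node(4.0)
z = mul(add(x, y), y)            # forward pass: z = (x + y) * y = 28
z.backward()                     # backward pass through the graph
print(z.value, x.grad, y.grad)   # 28.0, dz/dx = 4.0, dz/dy = 11.0
</syntaxhighlight>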

==A different neural network graph==

==See also==

==References==

==External links==