Backpropagation derivation using Leibniz notation

The cost function <math>C</math> depends on <math>w^l_{jk}</math> only through the activation of the <math>j</math>th neuron in the <math>l</math>th layer, i.e. on the value of <math>a^l_j</math>. Thus we can use the chain rule to expand:

<math>\frac{\partial C}{\partial w^l_{jk}} = \frac{\partial C}{\partial a^l_j} \frac{\partial a^l_j}{\partial w^l_{jk}}</math>
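For reference, the steps below assume the usual feedforward definitions of the weighted input and activation; the symbols <math>z^l_j</math>, <math>b^l_j</math>, and <math>\sigma</math> are standard notation assumed here rather than quoted from this revision:

<math>z^l_j = \sum_k w^l_{jk} a^{l-1}_k + b^l_j, \qquad a^l_j = \sigma(z^l_j).</math>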

We know that <math>\frac{\partial a^l_j}{\partial w^l_{jk}} = a^{l-1}_k \sigma'(z^l_j)</math> because <math>a^l_j = \sigma(z^l_j)</math> and the weight <math>w^l_{jk}</math> appears in <math>z^l_j</math> only in the single term <math>w^l_{jk} a^{l-1}_k</math>.
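A minimal numerical sanity check of this partial derivative is sketched below, assuming a sigmoid activation; the layer sizes, random values, and helper names such as <code>sigma_prime</code> are illustrative assumptions, not part of the derivation.

<syntaxhighlight lang="python">
import numpy as np

# Check da^l_j/dw^l_{jk} = a^{l-1}_k * sigma'(z^l_j) against a
# finite-difference estimate. The sigmoid activation and the
# 3-to-2 layer sizes are illustrative assumptions.

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigma_prime(z):
    s = sigma(z)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
a_prev = rng.normal(size=3)      # activations a^{l-1}_k of the previous layer
W = rng.normal(size=(2, 3))      # weights w^l_{jk}
b = rng.normal(size=2)           # biases b^l_j
j, k = 1, 2                      # an arbitrary weight to probe

def activation_j(Wmat):
    z = Wmat @ a_prev + b        # weighted inputs z^l
    return sigma(z)[j]           # activation a^l_j

# Analytic value from the formula above
z = W @ a_prev + b
analytic = a_prev[k] * sigma_prime(z[j])

# Central finite-difference estimate of the same partial derivative
eps = 1e-6
Wp, Wm = W.copy(), W.copy()
Wp[j, k] += eps
Wm[j, k] -= eps
numeric = (activation_j(Wp) - activation_j(Wm)) / (2 * eps)

print(analytic, numeric)         # the two values should agree closely
</syntaxhighlight>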

In turn, <math>C</math> depends on <math>a^l_j</math> only through the activations of the <math>(l+1)</math>th layer.
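Because <math>a^l_j</math> feeds into every neuron of layer <math>l+1</math>, applying the chain rule again now produces a sum over those neurons. The continuation below is a sketch based on the standard derivation and the definitions assumed above, not text taken from this revision:

<math>\frac{\partial C}{\partial a^l_j} = \sum_i \frac{\partial C}{\partial a^{l+1}_i} \frac{\partial a^{l+1}_i}{\partial a^l_j} = \sum_i \frac{\partial C}{\partial a^{l+1}_i} \sigma'(z^{l+1}_i) w^{l+1}_{ij}.</math>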