The basic equations

Given the following notation:

- $w_{jk}^l$ is the weight from the $k^{th}$ neuron in the $(l-1)^{th}$ layer to the $j^{th}$ neuron in the $l^{th}$ layer
- $a_j^l$ is the activation of the $j^{th}$ neuron in the $l^{th}$ layer
- $b_j^l$ is the bias of the $j^{th}$ neuron in the $l^{th}$ layer
- $z^l$ is $w^l a^{l-1} + b^l$ for the $l^{th}$ layer
- $z_j^l$ is $\sum_k w_{jk}^l a_k^{l-1} + b_j^l$ for the $j^{th}$ neuron in the $l^{th}$ layer
- $\sigma$ is the activation function used
- $C$ is t...