
Batch Norm Implementation in Deep Learning

In each layer, before we feed the input $z^{(i)}$ into the activation function, we first normalize it with Batch Norm. That is, we first calculate

$$\mu = \frac{1}{n}\sum_{i} z^{(i)}$$

$$\sigma^2 = \frac{1}{n}\sum_{i}\left(z^{(i)} - \mu\right)^2$$

$$z_{norm}^{(i)} = \frac{z^{(i)} - \mu}{\sqrt{\sigma^2 + \epsilon}}$$

$$\tilde{z}^{(i)} = \gamma\, z_{norm}^{(i)} + \beta$$

We introduce the learnable parameters $\gamma$ and $\beta$ because we don't want the inputs to neurons in hidden layers to always have mean 0 and variance 1.

$\Rightarrow a^{(i)} = g(\tilde{z}^{(i)})$, where $g$ is some activation function.

Recall that in each layer, $z^{(i)} = W^{(i)} a^{(i-1)} + b^{(i)}$ (here the superscript $(i)$ indexes the layer rather than a training example). When we calculate $\tilde{z}^{(i)}$, we first normalize $z^{(i)}$ by subtracting the mean, so the value of $b^{(i)}$ has no influence on the result and can be dropped.

$\Rightarrow$ In each layer, we have three parameters: $W^{(i)}$, $\gamma^{(i)}$, $\beta^{(i)}$.

Frequently, we combine this with mini-batch gradient descent; i.e., we train $W^{(i)}$, $\gamma^{(i)}$, $\beta^{(i)}$ with respect to $X^{\{1\}}, \dots, X^{\{n\}}$, where $X^{\{i\}}$ is the $i^{th}$ mini-batch of the training data.
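To make these steps concrete, here is a minimal NumPy sketch of one forward pass through a layer with Batch Norm. The function name `batchnorm_forward`, the layer sizes, and the choice of ReLU for $g$ are illustrative assumptions, not part of the original card.

```python
import numpy as np

def batchnorm_forward(z, gamma, beta, eps=1e-5):
    """Batch Norm over pre-activations z of shape (n_units, batch_size)."""
    mu = z.mean(axis=1, keepdims=True)        # per-unit mean over the mini-batch
    var = z.var(axis=1, keepdims=True)        # per-unit variance over the mini-batch
    z_norm = (z - mu) / np.sqrt(var + eps)    # normalize to mean 0, variance 1
    return gamma * z_norm + beta              # learnable scale and shift

# One layer with Batch Norm. Note there is no bias b: subtracting the
# batch mean would cancel it exactly, and beta plays its role instead.
rng = np.random.default_rng(0)
a_prev = rng.normal(size=(4, 32))             # activations from previous layer, batch of 32
W = rng.normal(size=(3, 4))                   # weights for a layer with 3 units
gamma = np.ones((3, 1))                       # one scale per unit
beta = np.zeros((3, 1))                       # one shift per unit

z = W @ a_prev                                # z = W a^{(l-1)}  (no + b term)
a = np.maximum(0.0, batchnorm_forward(z, gamma, beta))  # ReLU as g
```

Because the batch mean is subtracted from every example's $z$, any constant bias added before normalization would cancel out, which is why the sketch computes $z$ without a $+\,b$ term and leaves the shifting to $\beta$.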


Updated 2020-11-30

Tags

Data Science