Batch Renormalization for Minibatch Size Independence
When choosing a minibatch size for computational efficiency, keep in mind how it interacts with batch normalization. As the minibatch grows, the variance of the batch-computed mean and standard deviation estimates decreases, and with it the noise injection that gives batch normalization its regularizing effect. To reduce this dependence on minibatch size, Ioffe (2017) proposed batch renormalization, which corrects each minibatch's normalized activations with a scale term and a shift term derived from moving-average statistics, so that the layer's output depends less on which examples happen to share a minibatch. The aim is to let practitioners choose the minibatch size mainly on computational grounds while giving up less of what batch normalization provides.
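The correction terms can be illustrated with a minimal NumPy sketch of the training-time forward pass. It is not a full layer: the function name `batch_renorm_forward`, its argument names, and the default clipping bounds `r_max` and `d_max` are assumptions made for illustration, and the learnable scale and shift, the moving-average update, and the backward pass (where the correction terms are treated as constants) are left out.

```python
import numpy as np

def batch_renorm_forward(x, running_mean, running_std,
                         r_max=3.0, d_max=5.0, eps=1e-5):
    """Normalize x of shape (batch, features) with batch renormalization.

    running_mean and running_std are moving-average statistics of shape
    (features,). r_max and d_max are illustrative clipping bounds.
    """
    # Statistics of the current minibatch, as in plain batch normalization.
    mu_b = x.mean(axis=0)
    sigma_b = np.sqrt(x.var(axis=0) + eps)

    # Correction terms relating minibatch statistics to the moving averages.
    # Clipping limits how far the output may move away from plain batch norm.
    r = np.clip(sigma_b / running_std, 1.0 / r_max, r_max)
    d = np.clip((mu_b - running_mean) / running_std, -d_max, d_max)

    # Batch-normalized activations, then rescaled by r and shifted by d.
    x_hat = (x - mu_b) / sigma_b * r + d
    return x_hat, r, d

# Small usage example with synthetic data.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(8, 4))
x_hat, r, d = batch_renorm_forward(x, np.zeros(4), np.ones(4))
```

With `r_max=1` and `d_max=0` the corrections vanish and the function reduces to plain batch normalization. With very loose bounds, the output equals `(x - running_mean) / running_std`, so it no longer depends on the other examples in the minibatch.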
Tags
D2L
Dive into Deep Learning @ D2L