Concept
Inverted Dropout Technique
In modern implementations of dropout (often called inverted dropout), the activation tensor is multiplied by a binary mask and the surviving values are then rescaled. If each element is dropped independently with probability $p$, the surviving elements are divided by $1 - p$. Because each element is kept with probability $1 - p$, this rescaling, performed during training, keeps the expected value of every activation equal to its original value, so no scaling adjustment is needed at test time.
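The mask-and-rescale step can be sketched in a few lines of NumPy. This is a minimal illustration rather than any library's actual implementation: the function name `inverted_dropout`, its `training` flag, and the restriction of `p` to the range [0, 1) are assumptions made for this example.

```python
import numpy as np


def inverted_dropout(x, p, rng, training=True):
    """Illustrative inverted dropout (hypothetical helper, not a library API).

    x        : array of activations
    p        : probability of dropping each element, assumed to lie in [0, 1)
    rng      : a numpy.random.Generator used to sample the mask
    training : if False, the input is returned unchanged
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # Test time: no mask and no rescaling are needed.
        return x
    # rng.random draws from [0, 1), so each element is kept with probability 1 - p.
    mask = rng.random(x.shape) >= p
    # Zero the dropped elements and divide the survivors by 1 - p.
    return x * mask / (1.0 - p)


rng = np.random.default_rng(0)
x = np.ones((1000, 100))
y = inverted_dropout(x, p=0.5, rng=rng)
print(y.mean())  # should be close to the input mean of 1.0
```

With `p = 0.5`, every surviving element of `x` becomes 2.0 and every dropped element becomes 0.0, so the average over many elements stays near 1.0. Calling the function with `training=False` returns the input untouched, which is the test-time behavior described above.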
Updated 2026-05-07
Tags
Data Science
D2L
Dive into Deep Learning @ D2L