Learn Before
  • ReLU (Rectified Linear Unit)

Pros and Cons of ReLU

Pros:

  • Computationally efficient: just an element-wise max(0, x), which helps networks converge quickly
  • Non-linear: although it is built from two linear pieces, ReLU is a non-linear function with a usable derivative, so it supports backpropagation (see the sketch after this list)
  • If you're not sure which activation function to use for the hidden layers, ReLU is a good default
  • Geoffrey Hinton: allows a neuron to express a strong opinion
  • Gradient doesn't saturate (on the high end)
  • Less sensitive to random initialization
  • Runs great on low-precision hardware
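
As a quick illustration of the first two points, here is a minimal NumPy sketch (function names and values are my own, not from the original card) of ReLU and its gradient; note that the gradient stays at 1 for arbitrarily large inputs, which is the sense in which it does not saturate on the high end.

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise: one comparison per element.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative is 1 for x > 0 and 0 for x < 0; at exactly 0 it is
    # undefined, and implementations conventionally pick 0 (as here) or 1.
    return (x > 0).astype(x.dtype)

x = np.array([-2.0, -0.5, 0.0, 1.5, 100.0])
print(relu(x))       # [  0.    0.    0.    1.5 100. ]  (no cap on large values)
print(relu_grad(x))  # [0. 0. 0. 1. 1.]                 (gradient does not shrink for large x)
```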

Cons:

  • The dying ReLU problem (dead neurons): for negative inputs the output is 0 and the gradient is also 0, so a neuron whose pre-activations stay negative receives no gradient during backpropagation and stops learning. => Solution: Leaky ReLU (see the sketch after this list)
  • Gradient discontinuous at the origin: at x = 0 the derivative is undefined, because it is the corner where the flat segment (slope 0) meets the linear segment (slope 1); implementations pick 0 or 1 there by convention, but the kink remains. => Solution: GELU, which is smooth everywhere
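
Below is a minimal NumPy sketch (my own illustrative code, not from the original card) of the two fixes named above: Leaky ReLU keeps a small non-zero slope for negative inputs so the gradient never vanishes there, and GELU (shown via its common tanh approximation) is smooth everywhere, including at 0.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Small slope alpha for x < 0, so negative-input neurons still get a gradient.
    return np.where(x > 0, x, alpha * x)

def gelu(x):
    # Widely used tanh approximation of GELU; differentiable everywhere.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(leaky_relu(x))  # [-0.03 -0.01  0.    1.    3.  ]
print(gelu(x))        # approx [-0.0036 -0.1588  0.      0.8412  2.9964]
```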

Tags

Data Science

Related
  • Pros and Cons of ReLU

  • Leaky ReLU

  • Parametric ReLU

  • Derivative of ReLU (Rectified Linear Unit) function

  • A common non-linear activation function is defined by the operation f(x) = max(0, x). If this function is applied element-wise to the input vector h = [2.7, -1.3, 0, -4.5, 8.1], what is the resulting output vector?

  • A neuron in a neural network computes a pre-activation value (the weighted sum of its inputs plus bias) of -2.8. The neuron then applies an activation function defined by the formula f(z) = max(0, z). Based on this, what will be the neuron's output, and what is the direct consequence for this neuron's learning process during backpropagation for this specific input?

  • A hidden layer in a neural network produces the following vector of pre-activation values for a single neuron across five different training examples: [-3.1, -0.5, 0.8, 2.4, 5.0]. An activation function defined as f(x) = max(0, x) is then applied to this vector. Which statement best analyzes the effect of this function on the information passed to the next layer?

Learn After
  • Why is it better to use ReLU by default?