GELU (Gaussian Error Linear Unit) Formula
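The Gaussian Error Linear Unit (GELU) activation function is defined by the following formula, which is applied element-wise to an input vector $\mathbf{x}$:
$$\mathrm{GELU}(\mathbf{x}) = \mathbf{x} \odot \Pr(X \le \mathbf{x})$$
Here, $X$ is a random variable following the standard normal distribution, $X \sim \mathcal{N}(0, 1)$, and $\odot$ denotes the element-wise product. The term $\Pr(X \le \mathbf{x})$ is an informal notation representing the cumulative distribution function (CDF) of the standard normal distribution, commonly denoted by $\Phi$. When applied to the input vector $\mathbf{x}$, this term results in a new vector where each entry is the CDF value (the percentile, expressed as a probability between 0 and 1) corresponding to the respective entry in $\mathbf{x}$. Therefore, the formula can be simplified to:
$$\mathrm{GELU}(\mathbf{x}) = \mathbf{x} \odot \Phi(\mathbf{x})$$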
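As a minimal sketch of this definition, GELU multiplies each input entry by the standard normal CDF evaluated at that entry. The example below relies on the standard identity $\Phi(x) = \tfrac{1}{2}\left(1 + \operatorname{erf}(x / \sqrt{2})\right)$ and uses only Python's standard library. The function names `standard_normal_cdf` and `gelu` are illustrative choices, not names taken from the course material.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Phi(x) for the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(values):
    """Apply GELU element-wise: each entry x is mapped to x * Phi(x)."""
    return [x * standard_normal_cdf(x) for x in values]


if __name__ == "__main__":
    inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    for x, y in zip(inputs, gelu(inputs)):
        # Negative inputs are scaled toward zero; large positive inputs pass almost unchanged.
        print(f"{x:+.1f} -> {y:+.4f}")
```

Because $\Phi(x)$ approaches 0 for very negative inputs and 1 for very positive inputs, the output is close to 0 on the far left and close to the input itself on the far right, with a smooth transition in between.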

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences