
GELU (Gaussian Error Linear Unit) Formula

The Gaussian Error Linear Unit (GELU) function is defined by the formula:

$$\sigma_{\mathrm{gelu}}(\mathbf{h}) = \mathbf{h} \Pr(h \le \mathbf{h}) = \mathbf{h} \Phi(\mathbf{h})$$

Here, $\mathbf{h}$ is the input vector, and $h$ is a $d$-dimensional vector with entries drawn from the standard normal distribution $\mathrm{Gaussian}(0,1)$. The informal notation $\Pr(h \le \mathbf{h})$ denotes a vector in which each entry is the probability that the corresponding entry of $h$ is at most that entry of $\mathbf{h}$, that is, the percentile of that entry under the standard normal distribution. This is equivalent to applying the standard normal cumulative distribution function elementwise, $\Phi(\mathbf{h})$, and the product $\mathbf{h} \Phi(\mathbf{h})$ is likewise taken elementwise.
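The formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the standard identity $\Phi(x) = \tfrac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$ for the standard normal CDF, and the function names `standard_normal_cdf` and `gelu` are chosen here for illustration only.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Phi(x): probability that a Gaussian(0, 1) variable is <= x.

    Uses the identity Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    """
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(h: list[float]) -> list[float]:
    """Apply GELU elementwise: each entry x of h becomes x * Phi(x)."""
    return [x * standard_normal_cdf(x) for x in h]


if __name__ == "__main__":
    # Negative inputs are scaled towards zero, large positive inputs pass almost unchanged.
    print(gelu([-3.0, -1.0, 0.0, 1.0, 3.0]))
```

Because $\Phi(0) = 0.5$, an input of 0 maps to 0. For large positive entries $\Phi$ approaches 1, so the output approaches the input. For large negative entries $\Phi$ approaches 0, so the output approaches 0.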


Updated 2026-04-21


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
