Learn Before
GeGLU (GELU-based Gated Linear Unit) Formula
The GeGLU (GELU-based Gated Linear Unit) activation function is defined by the following formula:
σ_geglu(h) = σ_gelu(hW₁ + b₁) ⊙ (hW₂ + b₂)
In this equation, h represents the input, while W₁, W₂, b₁, and b₂ are learnable model parameters (weights and biases). The function σ_gelu is the Gaussian Error Linear Unit (GELU) activation, and ⊙ signifies the element-wise product.
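A minimal sketch of this computation in NumPy, assuming the tanh approximation of GELU; the function and variable names mirror the formula above, but the dimensions (input size 4, hidden size 8) and random parameter values are purely illustrative:

import numpy as np

def gelu(x):
    # Tanh approximation of the Gaussian Error Linear Unit.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def geglu(h, W1, b1, W2, b2):
    # Gate path: linear transform followed by the GELU non-linearity.
    gate = gelu(h @ W1 + b1)
    # Value path: a plain linear transform with no non-linearity.
    value = h @ W2 + b2
    # Element-wise product of the two paths gives the GeGLU output.
    return gate * value

# Illustrative usage with random parameters (shapes are assumptions).
rng = np.random.default_rng(0)
h = rng.normal(size=(2, 4))                # batch of 2 input vectors
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(4, 8)), np.zeros(8)
print(geglu(h, W1, b1, W2, b2).shape)      # (2, 8)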
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
GeGLU (GELU-based Gated Linear Unit) Formula
Applications of GeGLU in Large Language Models
An activation function is constructed by taking an input, applying two separate linear transformations to it, and then combining the results. One transformed output is passed through a non-linear 'gating' function, and the result is then multiplied element-wise with the other transformed output. For this entire structure to be correctly identified as a GeGLU, what must be true about the gating function?
Analyzing a Gating Mechanism
Analysis of a Custom Activation Unit
Learn After
An activation function is defined by the formula:
f(x) = GELU(xW₁ + b₁) ⊙ (xW₂ + b₂), where x is the input, W and b are learnable parameters, GELU is an activation function, and ⊙ denotes an element-wise product. Based on this structure, what is the primary purpose of the (xW₂ + b₂) component?
GeGLU Activation Calculation
In the GeGLU activation function, defined as
σ_geglu(h) = σ_gelu(hW₁ + b₁) ⊙ (hW₂ + b₂), both of the linear transformations (hW₁ + b₁) and (hW₂ + b₂) are passed through the GELU activation function before the element-wise product is computed.