Concept

Gated Linear Unit (GLU)

The Gated Linear Unit (GLU) is a family of activation functions that has gained popularity through its use in the feed-forward layers of Large Language Models (LLMs). A GLU computes the elementwise product of two linear projections of the input, one of which is passed through a non-linear activation σ(·): GLU_σ(x) = σ(xW + b) ⊗ (xV + c). The specific variant is named after the choice of σ(·): using the GELU function yields GeGLU, and using the Swish function yields SwiGLU.
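The gating described above can be sketched as follows. This is a minimal NumPy illustration, not any particular library's implementation; the weight shapes, the omission of bias terms, and the tanh approximation of GELU are assumptions made for brevity.

```python
import numpy as np

def swish(x, beta=1.0):
    # Swish: x * sigmoid(beta * x); equals SiLU when beta = 1
    return x / (1.0 + np.exp(-beta * x))

def gelu(x):
    # Common tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def glu(x, W, V, activation):
    # Generic GLU: elementwise product of a gated branch
    # sigma(xW) and an ungated linear branch xV
    return activation(x @ W) * (x @ V)

rng = np.random.default_rng(0)
d_model, d_ff = 8, 16                      # illustrative dimensions
x = rng.normal(size=(4, d_model))          # batch of 4 input vectors
W = rng.normal(size=(d_model, d_ff))       # gate projection
V = rng.normal(size=(d_model, d_ff))       # value projection

swiglu_out = glu(x, W, V, swish)  # SwiGLU variant
geglu_out = glu(x, W, V, gelu)    # GeGLU variant
print(swiglu_out.shape)  # (4, 16)
```

Note that a GLU layer needs two weight matrices (W and V) where a plain activation needs one, which is why LLMs using SwiGLU often shrink the hidden dimension d_ff to keep the parameter count comparable.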

Updated 2026-04-21

Tags: Ch.2 Generative Models - Foundations of Large Language Models; Computing Sciences