SwiGLU (Swish-based Gated Linear Unit)

SwiGLU is a variant of the Gated Linear Unit (GLU). It is obtained by using the Swish function as the internal non-linear activation, which is generally denoted $\sigma(\cdot)$ in the GLU architecture.
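
As a rough illustration, the sketch below assumes the common GLU form $\sigma(hW_1 + b_1) \odot (hW_2 + b_2)$, so that SwiGLU becomes $\mathrm{Swish}(hW_1 + b_1) \odot (hW_2 + b_2)$ with $\mathrm{Swish}(x) = x \cdot \mathrm{sigmoid}(\beta x)$. The parameter names (`W1`, `b1`, `W2`, `b2`), the default `beta = 1.0`, and the helper functions are assumptions made for this example and are not taken from the text above.

```python
import math


def sigmoid(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x). beta = 1.0 is an assumed default."""
    return x * sigmoid(beta * x)


def linear(h, W, b):
    """Compute h W + b for a vector h (length d_in) and a d_in x d_out matrix W."""
    return [
        sum(h[i] * W[i][j] for i in range(len(h))) + b[j]
        for j in range(len(b))
    ]


def swiglu(h, W1, b1, W2, b2, beta=1.0):
    """SwiGLU(h) = Swish(h W1 + b1) * (h W2 + b2), multiplied element-wise."""
    gate = [swish(z, beta) for z in linear(h, W1, b1)]  # Swish plays the role of sigma
    value = linear(h, W2, b2)                           # plain linear branch
    return [g * v for g, v in zip(gate, value)]


if __name__ == "__main__":
    # Toy example: identity weights and zero biases (hypothetical values).
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [0.0, 0.0]
    print(swiglu([1.0, 2.0], identity, zeros, identity, zeros))
```

With identity weights and zero biases, each output component reduces to `swish(x) * x`, which makes the gating effect easy to inspect by hand.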

Updated 2026-04-21

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences