
Applications of GELU in Large Language Models

The Gaussian Error Linear Unit (GELU) activation function has been widely adopted in the architecture of several influential Large Language Models. Notable examples of models that utilize GELU include BERT, GPT-3, and BLOOM.
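To make the activation concrete, here is a minimal sketch of GELU in plain Python. The exact form is x · Φ(x), where Φ is the standard normal CDF; the tanh-based approximation shown alongside it is the variant commonly used in BERT- and GPT-style implementations. Function names here are illustrative, not taken from any specific library.

```python
import math

def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), with the standard normal CDF Phi
    # written in terms of the error function erf.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Tanh approximation of GELU, as popularized by BERT/GPT codebases:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    ))
```

Unlike ReLU, GELU is smooth everywhere and weights inputs by how likely they are under a standard normal distribution: large positive inputs pass through almost unchanged, large negative inputs are driven toward zero, and values near zero are attenuated softly rather than clipped.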

Updated 2026-04-21
