
Hendrycks and Gimpel [2016] on GELU

The 2016 paper by Hendrycks and Gimpel introduced the Gaussian Error Linear Unit (GELU) activation function, defined as GELU(x) = xΦ(x), where Φ is the cumulative distribution function of the standard normal distribution. It lays out the theoretical motivation for the function and discusses approximations that make it convenient to implement.
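As a rough illustration of what implementing GELU involves, the sketch below computes the exact form x·Φ(x) using the error function, alongside the two approximations commonly attributed to the paper (a tanh-based one and a sigmoid-based one). The function names `gelu_exact`, `gelu_tanh`, and `gelu_sigmoid` are illustrative choices, not names from the paper, and the code handles scalar inputs only.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi written via the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation: 0.5x(1 + tanh(sqrt(2/pi)(x + 0.044715x^3)))."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid-based approximation: x * sigmoid(1.702x). Coarser but cheaper."""
    return x / (1.0 + math.exp(-1.702 * x))


if __name__ == "__main__":
    # Compare the three variants at a few sample points.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"{x:+.1f}  exact={gelu_exact(x):+.5f}  "
              f"tanh={gelu_tanh(x):+.5f}  sigmoid={gelu_sigmoid(x):+.5f}")
```

The tanh form tracks the exact function closely, while the sigmoid form trades some accuracy for a simpler expression. Which one to use depends on how much precision the application needs.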


Updated 2025-09-29


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences
