Concept

Omission of Bias Terms in LLM Affine Transformations

A common design choice in Large Language Models (LLMs) is to omit the bias terms from affine transformations, so that each transformation reduces to a purely linear map (y = Wx rather than y = Wx + b). This choice can be applied to several components, including layer normalization, the query, key, and value (QKV) projections of the attention mechanism, and feed-forward networks (FFNs).
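To make the difference concrete, here is a minimal sketch in plain Python of one transformation with and without its bias term. The helper names `linear` and `param_count` and the toy values are hypothetical, chosen only for illustration; they are not taken from any particular model or library.

```python
def linear(x, weight, bias=None):
    """Apply y = x @ weight (+ bias) to a single input vector.

    x:      list of d_in floats
    weight: d_in rows, each a list of d_out floats
    bias:   optional list of d_out floats; None means the bias is omitted
    """
    d_out = len(weight[0])
    y = [sum(x[i] * weight[i][j] for i in range(len(x))) for j in range(d_out)]
    if bias is not None:
        y = [y_j + b_j for y_j, b_j in zip(y, bias)]
    return y


def param_count(d_in, d_out, use_bias):
    """Number of learnable parameters in one d_in -> d_out transformation."""
    return d_in * d_out + (d_out if use_bias else 0)


# Toy values (illustrative only).
x = [1.0, 2.0]
W = [[1.0, 0.0, -1.0],
     [0.5, 2.0, 1.0]]
b = [0.1, 0.2, 0.3]

with_bias = linear(x, W, b)      # affine: x @ W + b
without_bias = linear(x, W)      # bias-free: x @ W

# Dropping the bias removes exactly d_out parameters per transformation.
saved = param_count(4096, 4096, True) - param_count(4096, 4096, False)
```

In this sketch the bias-free version always maps the zero vector to the zero vector, and each transformation carries `d_out` fewer parameters, which is small next to the `d_in * d_out` weights.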

Updated 2026-04-21

Tags

Foundations of Large Language Models

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences