
Left Padding in LLM Batching

When batching input sequences of varying lengths for Large Language Model inference, their lengths must be standardized before the prefilling stage. Left padding adds dummy (pad) tokens to the beginning of each shorter sequence so that every sequence in the batch reaches the same length. Padding on the left, rather than the right, keeps each prompt's real tokens flush with the end of the sequence, so newly generated tokens follow the prompt directly; the pad tokens themselves are excluded from computation via the attention mask.
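
A minimal sketch of left padding for batched generation, assuming the Hugging Face transformers library; the model name "gpt2" and the example prompts are illustrative choices, not from the source:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # GPT-2 defines no pad token by default
tokenizer.padding_side = "left"             # pad at the beginning of shorter sequences

prompts = ["Hello", "The quick brown fox jumps over"]

# Shorter prompts get pad tokens prepended, so all real sequences end at the
# same position and generation continues directly after each prompt.
batch = tokenizer(prompts, padding=True, return_tensors="pt")

model = AutoModelForCausalLM.from_pretrained("gpt2")
outputs = model.generate(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],  # masks out the pad tokens
    pad_token_id=tokenizer.pad_token_id,
    max_new_tokens=10,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

Because the pad token is reused from the end-of-sequence token here, the attention mask is what distinguishes padding from real input during prefilling.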

Tags

Foundations of Large Language Models

Ch.5 Inference - Foundations of Large Language Models

Computing Sciences