Learn Before
Left Padding in LLM Batching
When input sequences of varying lengths are batched for Large Language Model inference, they must first be brought to a common length. Left padding does this by prepending dummy pad tokens to the shorter sequences, so every sequence in the batch has the identical length required for the prefilling stage. Padding on the left also keeps the real tokens right-aligned, so the last real token of every sequence sits at the end of the batch, which is where decoding continues.
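A minimal sketch of this idea in plain Python is shown below. The pad token ID (0) and the helper name left_pad_batch are illustrative assumptions, not tied to any particular tokenizer; the point is simply that pads go at the front of each shorter sequence and are masked out.

def left_pad_batch(sequences, pad_token_id=0):
    """Left-pad each token-ID sequence to the batch's maximum length.

    Returns the padded batch and an attention mask in which real tokens
    are 1 and padding tokens are 0.
    """
    max_len = max(len(seq) for seq in sequences)
    padded, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append([pad_token_id] * n_pad + seq)          # pads go at the front
        attention_mask.append([0] * n_pad + [1] * len(seq))  # mask out the pads
    return padded, attention_mask

batch = [
    [101, 202, 303],            # a short prompt, 3 tokens
    [101, 404, 505, 606, 707],  # a longer prompt, 5 tokens
]
padded, mask = left_pad_batch(batch)
print(padded)  # [[0, 0, 101, 202, 303], [101, 404, 505, 606, 707]]
print(mask)    # [[0, 0, 1, 1, 1], [1, 1, 1, 1, 1]]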
Tags
Foundations of Large Language Models
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Padded Sequences in Fragmented Memory
A deep learning model is being prepared to process the following three text sequences together in a single batch: ['The', 'cat', 'sat'], ['A', 'quick', 'brown', 'fox'], and ['On', 'the', 'mat']. To ensure all sequences have a uniform length for efficient computation, a special ⟨pad⟩ token is added to the end of the shorter sequences. Which of the following options correctly represents the batch after this process is applied?
Debugging a Batch Processing Error
Consequences of Non-Uniform Sequence Lengths
Efficiency of Batching Sequences with Similar Lengths
Left Padding in LLM Batching