Learn Before
Greedy Decoding in Language Models
In language model inference, a common method for generating text is greedy decoding: at each position, select the token with the maximum probability from the model's predicted next-token distribution. The strategy is applied sequentially, so the model's output at each step is the single most likely next token given the preceding sequence, and that token is appended to the context before the next prediction is made.
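A minimal sketch of this loop in Python is shown below. The `greedy_decode` function, the `step_probs` interface, and the `toy_model` example are illustrative assumptions, not part of any particular library; any model that maps a token sequence to a next-token distribution would fit.

```python
from typing import Callable, List, Sequence

def greedy_decode(
    step_probs: Callable[[Sequence[int]], Sequence[float]],
    prompt: List[int],
    max_new_tokens: int,
    eos_id: int,
) -> List[int]:
    """Greedily extend `prompt` by repeatedly taking the argmax token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = step_probs(tokens)  # next-token distribution given the context
        # Pick the single most likely next token given the preceding sequence.
        next_id = max(range(len(probs)), key=probs.__getitem__)
        tokens.append(next_id)      # the choice becomes part of the context
        if next_id == eos_id:       # stop at end-of-sequence
            break
    return tokens

# Toy usage with a hard-coded 4-token vocabulary {0: 'A', 1: 'B', 2: 'C', 3: <eos>}:
def toy_model(context: Sequence[int]) -> List[float]:
    return [0.1, 0.2, 0.6, 0.1] if len(context) < 3 else [0.05, 0.05, 0.1, 0.8]

print(greedy_decode(toy_model, prompt=[0], max_new_tokens=10, eos_id=3))
# -> [0, 2, 2, 3]
```

Because the argmax at each step is deterministic, this procedure always produces the same continuation for a given prompt.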
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Training Decoder-Only Language Models with Cross-Entropy Loss
Output Probability Calculation in Transformer Language Models
Global Nature of Standard Transformer LLMs
Processing Flow of Autoregressive Generation in a Decoder-Only Transformer
Initial Input Representation for Transformer Layers
Structure of a Transformer Block
A generative language model is designed to produce text by predicting the next token based solely on the sequence of tokens that came before it. If you were to adapt a standard Transformer decoder block for this specific auto-regressive task, which of its sub-layers would you remove, and why is this modification functionally necessary?
A language model is constructed using a stack of modified Transformer decoder blocks. Each block contains a self-attention sub-layer and a feed-forward network sub-layer, but lacks the sub-layer that would process information from a separate, secondary input sequence. This model is capable of performing a machine translation task, such as translating a German sentence into English, without any further architectural changes.
Function of Self-Attention in Auto-regressive Generation
Neural Network-Based Next-Token Probability Distribution
Learn After
A language model is generating a two-token sequence. At the first step, it assigns a probability of 0.5 to token 'A' and 0.4 to token 'B'. At the second step, if 'A' was chosen, the model assigns a probability of 0.5 to token 'C'. If 'B' was chosen, it assigns a probability of 0.9 to token 'D'. All other tokens have lower probabilities at each step. Based on this information, which statement accurately analyzes the outcome of a purely sequential, maximum-probability selection strategy? (A worked computation for this scenario appears after the list below.)
Evaluating a Text Generation Strategy
Applying a Text Generation Strategy
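To make the arithmetic in the two-token scenario above concrete, here is a short worked computation using only the probabilities stated in the question (the token names 'A' through 'D' are taken from the question itself):

```python
# Joint probabilities of the two candidate sequences from the scenario above.
greedy_path = 0.5 * 0.5  # choose 'A' (0.5), then 'C' (0.5) -> 0.25
alt_path = 0.4 * 0.9     # choose 'B' (0.4), then 'D' (0.9) -> 0.36

# Greedy decoding commits to 'A' at step 1 because 0.5 > 0.4, yet the
# sequence 'B', 'D' has the higher overall probability (0.36 > 0.25).
assert alt_path > greedy_path
print(f"greedy sequence  P(A, C) = {greedy_path:.2f}")
print(f"better sequence  P(B, D) = {alt_path:.2f}")
```

Because greedy selection maximizes each step in isolation, it can miss the sequence with the highest joint probability, which is the trade-off this exercise is probing.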