1Cademy - Example of K-Best Selection with a Beam Width of 3

Learn Before

Ranking and K-Best Selection Process

Example

This example illustrates the K-best selection process with a beam width of K = 3. There are five candidate words with the following probabilities: 'cute' (Pr=0.34), 'on' (Pr=0.32), 'sick' (Pr=0.21), 'are' (Pr=0.12), and '.' (Pr=0.01). The process has two steps. First, the candidates are ranked in descending order of probability. Second, the top K=3 candidates ('cute', 'on', 'sick') are selected as the output, forming the beam. The remaining, lower-scoring candidates ('are', '.') are discarded, or 'pruned'.
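
The two steps can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `k_best` and the dictionary representation of the candidates are assumptions made for this sketch, and a real beam search would usually score whole sequences (often with log-probabilities) rather than single words.

```python
from typing import Dict, List, Tuple

Scored = Tuple[str, float]


def k_best(candidates: Dict[str, float], k: int) -> Tuple[List[Scored], List[Scored]]:
    """Hypothetical helper: split candidates into (beam, pruned)."""
    # Step 1: rank the candidates by probability, highest first.
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    # Step 2: keep the top k as the beam; everything else is pruned.
    return ranked[:k], ranked[k:]


# Candidate words and probabilities from the example above.
candidates = {"cute": 0.34, "on": 0.32, "sick": 0.21, "are": 0.12, ".": 0.01}

beam, pruned = k_best(candidates, k=3)
beam_words = [word for word, _ in beam]
pruned_words = [word for word, _ in pruned]

print(beam_words)    # ['cute', 'on', 'sick']
print(pruned_words)  # ['are', '.']
```

Sorting the full candidate list keeps the sketch close to the "rank, then select" description. For a large vocabulary, a partial selection such as `heapq.nlargest(k, ...)` would avoid sorting candidates that are going to be pruned anyway.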

Updated 2025-10-10

References

Reference of Foundations of Large Language Models Course

Learn After

A text generation system is deciding on the next word in a sentence. It has calculated the probabilities for five potential words. If the system is configured to keep only the top 3 most probable options for the next step, which set of words will it select? The potential words and their probabilities are: 'the' (0.45), 'a' (0.25), 'his' (0.15), 'her' (0.10), and 'its' (0.05).
A language generation model is configured with a beam width of 3. At one step, it considers the following five words and their associated probabilities: 'happy' (0.40), 'joyful' (0.25), 'glad' (0.18), 'content' (0.12), 'pleased' (0.05). Which words will be pruned (discarded) during this selection step?
A text generation system is configured to keep the top 3 most probable words at each step. Given a list of five candidate words with their probabilities, arrange the following actions in the correct sequence to determine which words to keep.
