Learn Before
Ranking Stage in 1-Best Selection
The ranking stage, labeled as step ②, is the second of the three stages in the 1-best selection process, following expansion and preceding output. In this stage, the candidate tokens generated during expansion are sorted in descending order of probability. For example, a list of candidates might be ranked as follows: 'cute' (Pr = 0.34), 'on' (Pr = 0.32), 'sick' (Pr = 0.21), 'are' (Pr = 0.12), and '.' (Pr = 0.01). This ordering sets up the final output stage, in which only the top-ranked candidate is selected.
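The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `rank_candidates` and the dictionary layout are assumptions made for the example, and the unsorted input order is invented to stand in for whatever order the expansion stage happens to produce. The tokens and probabilities are the ones from the example above.

```python
# Hypothetical output of the expansion stage: token -> probability.
# The (unsorted) order here is an assumption made for illustration.
candidates = {"are": 0.12, "cute": 0.34, ".": 0.01, "sick": 0.21, "on": 0.32}


def rank_candidates(candidates):
    """Return (token, probability) pairs sorted by descending probability."""
    return sorted(candidates.items(), key=lambda pair: pair[1], reverse=True)


ranked = rank_candidates(candidates)

# The output stage would then keep only the first entry of the ranked list.
best_token, best_prob = ranked[0]

for token, prob in ranked:
    print(f"{token!r}: Pr = {prob:.2f}")
```

Because 1-best selection keeps only the top candidate, a full sort is more work than strictly needed; `max(candidates, key=candidates.get)` would pick the same token. The full sort is shown here because it mirrors the ranking stage as described.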

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of 1-Best Selection
Ranking Stage in 1-Best Selection
Expansion Stage in 1-Best Selection
Output Stage in 1-Best Selection
Predicting the Next Word
A language model is determining the next word in a sequence. It follows a process where it first creates a list of possible words, then organizes them by likelihood, and finally chooses the most probable one. Arrange the formal stages of this process in the correct chronological order.
A language model is using a three-stage process (Expansion, Ranking, Output) to select the next word for the phrase 'The cat is...'. The model first expands the possibilities to a set of candidates with their probabilities: 'sleeping' (0.5), 'cute' (0.3), 'on' (0.15), and 'blue' (0.05). However, the model's final output is the word 'on'. Which stage of the process is the most direct point of failure?
Learn After
A language model has generated a set of candidate tokens to follow a sequence, each with an assigned probability. Arrange these candidates in the correct descending order based on their probability, as would occur during the ranking stage of a selection process.
A language model generates the following five candidate tokens and their associated probabilities to complete a sentence: 'sky' (Pr=0.45), 'moon' (Pr=0.15), 'clouds' (Pr=0.35), 'stars' (Pr=0.04), 'blue' (Pr=0.01). What is the primary purpose of sorting these candidates by their probability in a process designed to select only the single most likely token?
Interpreting Model Certainty from Ranked Probabilities