Learn Before
The simplification of the greedy search objective relies on a specific mathematical derivation. Arrange the following steps to correctly represent the logical flow of this derivation, which shows how maximizing the log-probability of the entire sequence up to the current step (log Pr(y_1...y_i | x)) is equivalent to maximizing the log-probability of just the current token (log Pr(y_i | x, y_{<i})).
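The derivation in question can be sketched compactly. By the chain rule of probability, the sequence log-probability factors into a prefix term and a current-token term:

```latex
\begin{align}
\log \Pr(y_1 \ldots y_i \mid x)
  &= \log \bigl[ \Pr(y_{<i} \mid x) \cdot \Pr(y_i \mid x, y_{<i}) \bigr] \\
  &= \log \Pr(y_{<i} \mid x) + \log \Pr(y_i \mid x, y_{<i})
\end{align}
```

At step i the prefix y_{<i} has already been chosen, so the first term is a constant with respect to y_i; the arg max over y_i of the full objective therefore equals the arg max of log Pr(y_i | x, y_{<i}) alone.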
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Construction of the Optimal Sequence in Greedy Search
When generating text one token at a time, a greedy algorithm aims to select the token y_i at step i that maximizes the log-probability of the entire sequence up to that point, log Pr(y_1...y_i | x). This optimization problem can be simplified to choosing the token that maximizes only the conditional log-probability of the current token, log Pr(y_i | x, y_{<i}). Why is this simplification mathematically valid for finding the best current token y_i?
Explaining Greedy Search Optimization
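The equivalence of the two objectives can be checked numerically. The sketch below uses an assumed toy distribution over next tokens and an assumed prefix log-probability (both hypothetical values, not from the source) to show that the token maximizing the full-sequence log-probability is the same one that maximizes the current-token log-probability:

```python
import math

# Hypothetical next-token probabilities Pr(y_i | x, y_{<i}) at step i
# (illustrative values, not from any real model).
next_token_probs = {"the": 0.5, "a": 0.3, "cat": 0.2}

# Assumed log-probability of the already-fixed prefix, log Pr(y_{<i} | x).
log_prob_prefix = math.log(0.1)

# Objective 1: maximize the full-sequence log-probability,
# log Pr(y_1...y_i | x) = log Pr(y_{<i} | x) + log Pr(y_i | x, y_{<i}).
best_full = max(next_token_probs,
                key=lambda y: log_prob_prefix + math.log(next_token_probs[y]))

# Objective 2: maximize only the current-token log-probability.
best_token = max(next_token_probs,
                 key=lambda y: math.log(next_token_probs[y]))

print(best_full, best_token)  # both objectives select the same token
```

Because the prefix term is an additive constant over all candidate tokens, it shifts every candidate's score equally and cannot change which token attains the maximum.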