Learn Before
Incremental Calculation of Sequence Log-Probability
The log-probability of a generated sequence can be calculated incrementally at each step of the decoding process. For any sequence $y_{\le t}$, the total log-probability is evaluated as the sum of two components. The first component is the accumulated log-probability of the preceding sequence $y_{<t}$, representing the sum of the log-probabilities on the path from the root to the parent node, computed in previous steps. The second component is the conditional log-probability of the current token $y_t$, which is newly computed by the large language model at the current step. The calculation is: $\log P(y_{\le t} \mid x) = \log P(y_{<t} \mid x) + \log P(y_t \mid y_{<t}, x)$.
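The incremental update above can be sketched in a few lines of Python. The per-token conditional log-probabilities here are illustrative placeholders, not outputs of a real model; the point is only that the running total at each step equals the previous total plus the new token's log-probability, and that summing log-probabilities matches multiplying raw probabilities without the numerical underflow risk.

```python
import math

# Hypothetical conditional log-probs log P(y_t | y_<t, x), one per decoding step.
step_logprobs = [-0.9, -1.5, -1.1]

total = 0.0  # log-prob of the empty prefix: log(1) = 0
running_totals = []
for lp in step_logprobs:
    # Incremental update: new total = accumulated total + current token's log-prob
    total += lp
    running_totals.append(total)

print([round(t, 6) for t in running_totals])

# Sanity check: summing log-probs equals the log of the product of raw probs.
product_of_probs = math.prod(math.exp(lp) for lp in step_logprobs)
print(math.isclose(math.log(product_of_probs), total))
```

This is also why decoders accumulate a single scalar per hypothesis: extending a candidate sequence costs one addition rather than recomputing the whole product.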

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Autoregressive Generation and Log-Probability Calculation
A language model is generating a continuation for the input 'The best way to learn a new skill is'. It has produced two candidate sequences and calculated their total log-probabilities as follows:
- Sequence A: '...by practicing consistently.' (Total log-probability = -1.15)
- Sequence B: '...through osmotic absorption.' (Total log-probability = -7.82)
Based on these values, which sequence is considered more plausible by the model, and why?
When a language model evaluates different possible output sequences, why is it standard practice to sum their log-probabilities instead of multiplying their raw probabilities?
A language model has generated the sequence 'The sun is' with a cumulative log-probability of -2.5. The model is now considering the next token. Given the following conditional log-probabilities for the next token, which choice would result in the most probable three-word sequence?
Learn After
Greedy Search (Greedy Decoding)
Beam search
A language model is generating a sequence of tokens. The total log-probability for the partially generated sequence 'The quick brown' has been calculated as -3.5. In the very next step, the model computes the conditional log-probability for the token 'fox' as -1.2. What is the new total log-probability for the complete sequence 'The quick brown fox'?
A language model is generating a sequence. The table below shows the conditional log-probability for each new token and the claimed total accumulated log-probability for the sequence up to that point. Analyze the table to identify the first step where the total accumulated log-probability is calculated incorrectly based on the principle of incremental summation.
Step | Token | Conditional log-prob | Total Accumulated log-prob
1    | 'The' | -0.9                | -0.9
2    | 'cat' | -1.5                | -2.4
3    | 'sat' | -1.1                | -2.6
Comparing Generation Paths