Learn Before
Analyzing Model Output Selection
A language model is tasked with completing the sentence 'The capital of France is...'. It evaluates two potential output sequences: Sequence A, which is the single token 'Paris', and Sequence B, which is the single token 'Berlin'. The model calculates that, given the input context, the probability of generating Sequence A is 0.98, while the probability of generating Sequence B is 0.01. According to the formal objective of the text generation process, which sequence should the model select as its output? Justify your answer based on the core principle of this process.
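The selection rule this question refers to can be sketched in a few lines, assuming the formal objective is to pick the output sequence with the highest conditional probability given the input, i.e. the argmax of Pr(y | x) over the candidates. The function name `select_output` and the dictionary layout are illustrative assumptions, not part of the course material. The probabilities are the ones stated in the question.

```python
# Candidate output sequences and their conditional probabilities P(y | x),
# as stated in the question. The dictionary layout is an illustrative assumption.
candidate_probs = {"Paris": 0.98, "Berlin": 0.01}


def select_output(probs):
    """Return the candidate sequence with the highest probability (argmax)."""
    # max() with key=probs.get compares the candidates by their probability.
    return max(probs, key=probs.get)


best = select_output(candidate_probs)
print(best)
```

Under this assumption the choice depends only on the probabilities the model assigns, not on any external judgment of the candidates.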
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Mathematical Formulation of LLM Inference
Single-Round Prediction Problem
Token-Level Representation of Input and Output Sequences for a Forward Pass
Multi-Round Prediction Problem
Notation for Concatenated Token Sequences
A language model is given an input sequence of tokens representing the phrase 'The best way to learn a new skill is'. The model then calculates the likelihood for several possible completing sequences. Based on the formal objective of the text generation process, which of the following sequences should the model select to output?
Analyzing Model Output Selection
A language model is given an input context x. It then evaluates two potential output sequences, y_1 and y_2. The model's internal calculations determine that y_1 has a higher probability of occurring after x than y_2. However, a human evaluator finds y_2 to be more creative and detailed. According to the formal objective of the text generation process, what should the model do?