Analyzing Language Model State Evolution
A developer is debugging a text-generation model. The model is given an initial input sequence x = "The recipe calls for two cups of". The developer logs the model's state, which is defined as a pair containing the initial input and the sequence of tokens generated so far, at two consecutive time steps, t=1 and t=2.
- State at t=1: (x, y_1) where y_1 = "flour"
- State at t=2: (x, y_2) where y_2 = "flour and"
Based on the provided logs, analyze the change in the model's state from time step t=1 to t=2. Explain which component of the state was updated and why this update is essential for the model to predict the next token at t=3.
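As a starting point for the analysis, the state pair from the logs can be sketched in Python. This is a minimal illustration, not the model's actual implementation: the names `State` and `step` are hypothetical, and tokens are assumed to be whole words for readability.

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    """Hypothetical container for the state pair (x, y_t)."""
    x: str               # initial input sequence
    y: Tuple[str, ...]   # tokens generated so far


def step(state: State, token: str) -> State:
    """Build the next state by appending one generated token to y."""
    return State(state.x, state.y + (token,))


x = "The recipe calls for two cups of"
s1 = step(State(x, ()), "flour")  # state at t=1
s2 = step(s1, "and")              # state at t=2

print(" ".join(s1.y))  # flour
print(" ".join(s2.y))  # flour and
```

Comparing `s1` and `s2` field by field reproduces the two logged states, so the difference between them can be inspected directly.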
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is provided with the initial input sequence x = "The sun is shining and the". The model then generates a sequence of three tokens, y_t = "sky is blue". According to the formal state definition, where the state is a pair of the initial input and the generated sequence, what is the correct representation of the model's state at this point?
State Evolution in Token Generation