Learn Before
State in the Context of LLMs
In language modeling, a state at a specific time step $t$, denoted as $s_t$, is defined as the sequence of tokens observed up to that point. This sequence serves as the context the model uses to predict the next token. For instance, when predicting the next token at time step $t$, the state can be mathematically defined as $s_t = \{x, y_1, \dots, y_{t-1}\}$, where $x$ represents the initial input and $y_1, \dots, y_{t-1}$ represent the tokens generated so far.
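As a minimal sketch (not from the course material), the Python snippet below represents the state as a plain list of tokens that starts as the input $x$ and grows by one generated token per decoding step. The helpers `make_toy_lm` and `generate` are hypothetical, and the canned continuation stands in for a real model's next-token prediction.

```python
def make_toy_lm(continuation):
    # Hypothetical stand-in for a real language model's next-token prediction:
    # it emits a canned continuation one token per call, ignoring the state.
    tokens = iter(continuation)
    return lambda state: next(tokens, None)


def generate(prompt_tokens, lm_step, max_new_tokens=8):
    # The state s_t is simply the initial input x plus every token generated so far.
    state = list(prompt_tokens)          # s_1 = {x}
    for _ in range(max_new_tokens):
        y_t = lm_step(state)             # predict y_t conditioned on the full state s_t
        if y_t is None:                  # toy model ran out of tokens (stand-in for end-of-sequence)
            break
        state.append(y_t)                # s_{t+1} = {x, y_1, ..., y_t}
    return state


toy_lm = make_toy_lm(["the", "sky", "is", "blue"])
final_state = generate(["The", "sun", "is", "shining", "and"], toy_lm)
print(final_state)
# ['The', 'sun', 'is', 'shining', 'and', 'the', 'sky', 'is', 'blue']
```

Note how the state is never reset: each newly generated token is appended to the existing context, so every prediction conditions on the initial input together with all tokens produced so far.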
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
State in the Context of LLMs
An autonomous agent is designed to navigate a maze to find a piece of cheese. At any given moment, the agent knows its current coordinates (e.g., row 3, column 5), whether the adjacent squares contain walls or open paths, and the location of the cheese. Based on this information, the agent must decide whether to move up, down, left, or right. Which of the following best describes the agent's 'state' in this scenario?
Defining the State for a Chess-Playing Agent
Designing a State Representation for a Self-Driving Car
Sum of Future Rewards Notation
Learn After
Example of State Definition for Next-Token Prediction
A language model is given the initial text 'The sun is shining and the sky is'. The model then generates the word 'blue'. At this point, before it attempts to generate the next word, what sequence of tokens represents the model's current 'state' that it will use as context?
The Role of State in Language Models
When a language model generates a new token, the 'state' it uses for the next prediction is updated to include only the token it just produced, discarding all previously seen tokens.