Learn Before
Policy Notation for Autoregressive Models (π_θ)
The notation π_θ(y_t | X, y_<t) is often used to represent the policy of an autoregressive model. It denotes the conditional probability of selecting output y_t at time step t, given a context X and the sequence of previously generated outputs y_<t. This policy is governed by the model's parameters θ. This notation is functionally equivalent to the standard probability notation Pr_θ(y_t | X, y_<t).
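
As a rough illustration of how π_θ(y_t | X, y_<t) can be read, the sketch below builds a toy policy in Python. Everything in it is an assumption made for the example: the four-token vocabulary `VOCAB`, the hand-written parameter table `THETA`, and the helper names `policy` and `sequence_log_prob` are hypothetical and stand in for a real neural network. The point is only that the policy maps a context and a prefix to a probability distribution over the next token, and that the log-probability of a whole output is the sum of the per-step log-probabilities.

```python
import math

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["allez", "vas", "ça", "<eos>"]

# Hypothetical parameters θ: hand-picked scores, not learned weights.
THETA = {
    "bias": {"allez": 0.0, "vas": 0.0, "ça": 0.0, "<eos>": -1.0},
    # Bonus for a token given the previously generated token (part of y_<t).
    "after": {"comment": {"allez": 2.0, "vas": 1.0, "ça": 1.5}},
    # Bonus for a token given a word that appears in the context X.
    "source": {"you": {"allez": 0.5, "vas": 0.5}},
}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(theta, x, y_prev):
    """Return the distribution π_θ(· | x, y_prev) as a token -> probability dict."""
    last = y_prev[-1] if y_prev else "<bos>"
    logits = []
    for tok in VOCAB:
        score = theta["bias"][tok]
        score += theta["after"].get(last, {}).get(tok, 0.0)
        score += sum(theta["source"].get(w, {}).get(tok, 0.0) for w in x)
        logits.append(score)
    return dict(zip(VOCAB, softmax(logits)))


def sequence_log_prob(theta, x, y):
    """Sum log π_θ(y_t | x, y_<t) over every position t of the output y."""
    return sum(math.log(policy(theta, x, y[:t])[y[t]]) for t in range(len(y)))


if __name__ == "__main__":
    x = ["Hello", "how", "are", "you"]
    dist = policy(THETA, x, ["Bonjour", "comment"])
    for tok, p in dist.items():
        print(f"pi_theta({tok!r} | x, y_<t) = {p:.3f}")
```

Changing the values in `THETA` changes the distribution returned by `policy`, which mirrors the statement that the policy is governed by the model's parameters θ.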

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A researcher is comparing two language models. Model A is defined by a set of parameters θ. Model B is a version of Model A that has been fine-tuned on a new dataset, resulting in a new set of parameters. The researcher wants to compare the probability of each model generating the word 'innovative' given the same input context and using the same sampling strategy. Which of the following mathematical expressions accurately represents this comparison?
An AI engineer is working with a pre-trained Large Language Model, whose probability distribution is represented by . The engineer decides to change the method used to select the next word from the model's output probabilities, switching from a greedy approach to a top-k sampling approach. The model's underlying weights and biases are not modified. Which component of the notation would need to be updated to reflect this change?
Analyzing Model Update Notation
Learn After
Objective Function for Policy Optimization
A language model, with parameters represented by θ, is translating the English sentence 'Hello, how are you?' into French. It has already generated the partial translation 'Bonjour, comment'. The model is now deciding the next word. What does the expression π_θ('allez' | 'Hello, how are you?', 'Bonjour, comment') represent in this context?
Match each component of the policy notation π_θ(y_t | X, y_<t) to its correct description in the context of an autoregressive language model.
Appropriateness of Autoregressive Notation