Learn Before
Match each component of the policy notation π_θ(y_t | X, y_<t) to its correct description in the context of an autoregressive language model.
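As a rough illustration of the notation, the sketch below maps each symbol to a function argument: θ is the parameter set, X is the input sequence, y_<t is the list of tokens generated so far, and y_t is one candidate next token whose probability is read off the returned distribution. The vocabulary, the scoring rule, and the names `policy`, `THETA`, `X`, and `Y_PREV` are all hypothetical and chosen only for illustration; a real autoregressive model would compute the scores with a neural network.

```python
import math

# Hypothetical toy vocabulary of candidate next tokens (an assumption for illustration).
VOCAB = ["allez", "vas", "est", "<eos>"]

def policy(theta, x, y_prev):
    """Return pi_theta(. | X, y_<t): a dict mapping each candidate y_t to its probability.

    theta  -- model parameters (here a toy dict of scores, not a real network)
    x      -- the input sequence X
    y_prev -- the tokens generated so far, y_<t
    """
    context = (x, tuple(y_prev))
    # Score each candidate token: a base score plus an optional context-specific bonus.
    logits = [
        theta["base"][tok] + theta["bonus"].get((context, tok), 0.0)
        for tok in VOCAB
    ]
    # Softmax turns the scores into a probability distribution over the vocabulary.
    top = max(logits)
    exps = [math.exp(score - top) for score in logits]
    total = sum(exps)
    return {tok: e / total for tok, e in zip(VOCAB, exps)}

X = "Hello, how are you?"                # input sequence X
Y_PREV = ["Bonjour", ",", "comment"]     # previously generated tokens y_<t (toy tokenization)
THETA = {                                # toy parameters theta (made-up values)
    "base": {"allez": 0.0, "vas": 0.0, "est": 0.0, "<eos>": -1.0},
    "bonus": {((X, tuple(Y_PREV)), "allez"): 2.0},
}

probs = policy(THETA, X, Y_PREV)
# pi_theta(y_t = 'allez' | X, y_<t): probability of 'allez' as the next token.
print(probs["allez"])
```

The whole expression evaluates to a single probability for one candidate token; changing θ, X, or y_<t changes the distribution that `policy` returns.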
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Objective Function for Policy Optimization
A language model, with parameters represented by θ, is translating the English sentence 'Hello, how are you?' into French. It has already generated the partial translation 'Bonjour, comment'. The model is now deciding the next word. What does the expression π_θ('allez' | 'Hello, how are you?', 'Bonjour, comment') represent in this context?
Match each component of the policy notation π_θ(y_t | X, y_<t) to its correct description in the context of an autoregressive language model.
Appropriateness of Autoregressive Notation