Learn Before
Ranking Chatbot Responses
Based on the scenario described below, use the notation Pr(y_k1 ≻ y_k2 | x) to write the specific conditional probability that the training process is trying to model. Clearly define what x, y_k1, y_k2, and the ≻ symbol represent in this context.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Bradley-Terry Model for Pairwise Preference Probability
Ranking Chatbot Responses
A user gives a language model the prompt x: 'Translate the phrase "hello world" into French.' The model generates two responses: Response A (y_A), 'Bonjour le monde', and Response B (y_B), 'Salut monde'. A human evaluator judges Response A to be the better translation. Which of the following expressions correctly represents the probability of this specific preference, given the user's prompt?
Modeling Pairwise Preference Probability with a Reward Function
Interpreting Preference Probability Notation
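
The related items above connect the notation Pr(y_A ≻ y_B | x) to the Bradley-Terry model and a reward function. As a minimal sketch, assuming a scalar reward r(x, y) is available for each response, the preference probability can be written as the sigmoid of the reward difference r(x, y_A) - r(x, y_B). The helper name preference_probability and the reward values below are hypothetical and chosen only for illustration; they do not come from the course material.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry probability that response A is preferred over response B.

    Equivalent to exp(r_a) / (exp(r_a) + exp(r_b)), written as a sigmoid
    of the reward difference r_a - r_b.
    """
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


# Hypothetical reward scores for the two translations (made-up numbers).
reward_y_a = 2.0  # assumed r(x, y_A) for 'Bonjour le monde'
reward_y_b = 0.5  # assumed r(x, y_B) for 'Salut monde'

p_a_over_b = preference_probability(reward_y_a, reward_y_b)  # Pr(y_A > y_B | x)
p_b_over_a = preference_probability(reward_y_b, reward_y_a)  # Pr(y_B > y_A | x)

print(f"Pr(y_A preferred over y_B | x) = {p_a_over_b:.4f}")
print(f"Pr(y_B preferred over y_A | x) = {p_b_over_a:.4f}")
```

Because only the reward difference matters in this form, the two probabilities sum to 1, and equal rewards give a probability of 0.5.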