Learn Before
Interpreting Preference Probability Notation
A research team is developing a model to summarize news articles. For a given article, the model generates two different summaries. The team uses human feedback to determine which summary is better. The probability that one summary is preferred over another is represented by the expression: .
In the context of this specific scenario, explain what each of the following components represents:
- The symbol
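As a hedged illustration of how such a preference probability could be computed, here is a minimal sketch that assumes the Bradley-Terry model (named under Related) with a scalar score per summary. The function name and the reward values are hypothetical and do not come from the scenario.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry probability that summary A is preferred over summary B.

    Equivalent to exp(r_a) / (exp(r_a) + exp(r_b)), written here as a
    logistic function of the reward difference.
    """
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


# Hypothetical reward scores for two summaries of the same article
# (illustrative assumptions, not values from the source).
reward_summary_a = 1.2
reward_summary_b = 0.4

p_a_over_b = preference_probability(reward_summary_a, reward_summary_b)
p_b_over_a = preference_probability(reward_summary_b, reward_summary_a)

print(f"P(A preferred over B | article) = {p_a_over_b:.3f}")
print(f"P(B preferred over A | article) = {p_b_over_a:.3f}")
```

Under this assumption the two probabilities sum to 1, and the summary with the higher score is preferred with probability above 0.5.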
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Bradley-Terry Model for Pairwise Preference Probability
Ranking Chatbot Responses
A user gives a language model the prompt 'Translate the phrase "hello world" into French.', denoted 'x'. The model generates two responses: Response A ('y_A'), 'Bonjour le monde', and Response B ('y_B'), 'Salut monde'. A human evaluator judges Response A to be the better translation. Which of the following expressions correctly represents the probability of this specific preference, given the user's prompt?
Modeling Pairwise Preference Probability with a Reward Function