Alternative Ranking Methods (RankNet and ListNet)
Beyond the ranking models commonly described in the context of RLHF, the learning-to-rank literature offers other established methods. Two prominent examples are RankNet, a pairwise method that learns from comparisons between two items, and ListNet, a listwise method that learns from an entire ranked list at once. Both can serve as alternative techniques for training a model on human preference data.
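To make the pairwise versus listwise distinction concrete, here is a minimal sketch of the two losses in plain Python. It assumes the usual textbook formulations: the RankNet loss as the negative log-sigmoid of the score difference between the preferred and the rejected item, and the ListNet loss in its top-one form, as a cross-entropy between softmax distributions over target and predicted scores. The function names and example scores are illustrative assumptions, not part of the source.

```python
import math


def ranknet_loss(score_winner, score_loser):
    """Pairwise loss: -log(sigmoid(score_winner - score_loser))."""
    diff = score_winner - score_loser
    # Numerically stable form of log(1 + exp(-diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def softmax(scores):
    """Convert a list of scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(pred_scores, target_scores):
    """Listwise loss (top-one form): cross-entropy between the softmax of
    the target scores and the softmax of the predicted scores."""
    p_target = softmax(target_scores)
    p_pred = softmax(pred_scores)
    return -sum(t * math.log(p) for t, p in zip(p_target, p_pred))


# Hypothetical scores for illustration only.
pair_good = ranknet_loss(2.0, 0.0)   # model agrees with the human choice
pair_bad = ranknet_loss(0.0, 2.0)    # model disagrees with the human choice

targets = [3.0, 2.0, 1.0]            # assumed graded preferences for 3 responses
list_good = listnet_loss([3.0, 2.0, 1.0], targets)
list_bad = listnet_loss([1.0, 2.0, 3.0], targets)
```

In this sketch the pairwise loss needs only a 'winner' and a 'loser' score per example, whereas the listwise loss needs target scores for every item in the list, which is why the choice between the two depends on how the preference data was collected.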
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Related
Analysis of Preference Modeling Strategies
Analysis of Preference Modeling Approaches
A development team is training a model to score chatbot responses based on human feedback. Their data collection method involves presenting two responses to a user and asking them to select the better one. The dataset consists of millions of these 'winner' and 'loser' pairs for various prompts. Which learning-to-rank strategy is most directly aligned with this data structure?