Consider the process of selecting the best output from a set of N candidates, where a reward model r scores each candidate ŷ_i based on an input x. The selection is represented by the formula: ŷ_best = max{r(x, ŷ_1), ..., r(x, ŷ_N)}. This formula implies that the final output, ŷ_best, is a numerical value representing the highest score.
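To check the claim about what kind of object the formula returns, a minimal Python sketch can help. The reward function `toy_reward` below is a hypothetical stand-in, not something from the source: it just counts the words a candidate shares with the input. Its only purpose is to show what type of value each expression produces.

```python
def toy_reward(x: str, y: str) -> float:
    """Hypothetical reward r(x, y): number of distinct words y shares with x."""
    return float(len(set(x.lower().split()) & set(y.lower().split())))


x = "what is the capital of france"
candidates = [
    "paris",
    "the capital of france is paris",
    "france is a country",
]

# max over the scores, as the formula is written: the result is a number.
best_score = max(toy_reward(x, y) for y in candidates)

# max over the candidates, keyed by their score (an argmax):
# the result is one of the candidate strings.
best_candidate = max(candidates, key=lambda y: toy_reward(x, y))

print(type(best_score).__name__, type(best_candidate).__name__)
```

Under these assumptions, `best_score` is a float while `best_candidate` is a string, which is the distinction the statement asks the reader to analyze.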
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Argmax Formula for Best Candidate Selection in BoN Sampling
A system generates four candidate outputs in response to a user's prompt. A separate evaluation model then assigns a quality score to each candidate, where a higher score indicates a better response. The system's selection rule is to choose the candidate that receives the maximum score. Given the scores below, which candidate will be selected as the final output?
- Candidate A: Score = 0.85
- Candidate B: Score = 0.91
- Candidate C: Score = 0.74
- Candidate D: Score = 0.23
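The selection rule for the four scores above can be sketched in a few lines of Python. The dictionary `scores` is an illustrative encoding of the listed values, not part of the original question.

```python
# Scores assigned by the evaluation model, keyed by candidate label.
scores = {"A": 0.85, "B": 0.91, "C": 0.74, "D": 0.23}

# The maximum score itself (a number).
highest_score = max(scores.values())

# The candidate that attains the maximum score (a label).
selected = max(scores, key=scores.get)

print(selected, highest_score)  # prints: B 0.91
```

Choosing by `key=scores.get` returns the candidate rather than its score, which is what a selection rule needs to output.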
Diagnosing a Mismatch in Automated Selection