A language model is prompted to solve the math problem 'What is 7 + 8?'. To improve reliability, the model generates five candidate outputs using a sampling strategy: [15, 14, 15, 15, 16]. A selection process then chooses the final answer by identifying the candidate that minimizes the expected disagreement with the other generated candidates. Which output will be selected?
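The selection rule described above can be sketched as Minimum Bayes Risk decoding with a 0/1 disagreement loss: each candidate's risk is its average mismatch against the sampled pool, and the lowest-risk candidate wins. The function name and the exact-match loss below are illustrative assumptions.

```python
def select_mbr(candidates):
    """Return the candidate minimizing expected disagreement with the samples."""
    def risk(c):
        # Average 0/1 loss of candidate c against every sampled candidate
        # (illustrative exact-match loss; other losses, e.g. BLEU, also fit MBR).
        return sum(c != other for other in candidates) / len(candidates)
    return min(candidates, key=risk)

samples = [15, 14, 15, 15, 16]
print(select_mbr(samples))  # 15: it matches three of five samples, so its risk is lowest
```

Here 15 has risk 2/5 while 14 and 16 each have risk 4/5, so 15 is selected; this is why MBR with a 0/1 loss reduces to majority voting, i.e. self-consistency.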
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Risk Function in Minimum Bayes Risk Decoding
Risk of an Output in Minimum Bayes Risk Decoding
Connecting Self-Consistency to a Formal Framework
Evaluating a Text Generation Strategy