Learn Before
Generating N-Best Candidates in BoN Sampling
In Best-of-N (BoN) sampling, the first step is to generate a set of N candidate outputs that aim to maximize the conditional probability of the output given the input. How these candidates are produced depends on the search algorithm the model uses. Common choices include stochastic methods such as sampling and more deterministic approaches such as beam search.
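The two families of candidate generation mentioned above can be contrasted with a minimal sketch. Everything in it is an assumption made for illustration: `TOY_MODEL` is a hypothetical hand-written next-token table standing in for a real language model, and `sample_candidates` and `beam_search_candidates` are hypothetical helper names, not part of any library. Sampling draws each candidate independently, so it may return duplicates or low-probability outputs, while beam search deterministically keeps the N highest-scoring partial outputs at each step.

```python
import math
import random

END = "</s>"

# Hypothetical toy model: next-token probabilities conditioned only on the
# previous token. A real LLM would condition on the whole prefix and the input.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.7, "dog": 0.3},
    "cat": {"sleeps": 0.8, "runs": 0.2},
    "dog": {"sleeps": 0.4, "runs": 0.6},
    "sleeps": {END: 1.0},
    "runs": {END: 1.0},
}


def sample_candidates(n, rng, max_len=10):
    """Stochastic generation: draw n outputs independently, token by token."""
    candidates = []
    for _ in range(n):
        tokens, prev = [], "<s>"
        while len(tokens) < max_len:
            dist = TOY_MODEL[prev]
            prev = rng.choices(list(dist), weights=list(dist.values()))[0]
            if prev == END:
                break
            tokens.append(prev)
        candidates.append(tokens)
    return candidates


def beam_search_candidates(n, max_len=10):
    """Deterministic generation: keep the n best partial outputs at each step."""
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens so far)
    finished = []
    for _ in range(max_len):
        expanded = []
        for logp, tokens in beams:
            for tok, p in TOY_MODEL[tokens[-1]].items():
                expanded.append((logp + math.log(p), tokens + [tok]))
        expanded.sort(key=lambda h: h[0], reverse=True)
        beams = []
        for logp, tokens in expanded[:n]:
            if tokens[-1] == END:
                # Strip the start and end markers before storing the output.
                finished.append((logp, tokens[1:-1]))
            else:
                beams.append((logp, tokens))
        if not beams:
            break
    finished.sort(key=lambda h: h[0], reverse=True)
    return [(tokens, math.exp(logp)) for logp, tokens in finished[:n]]


rng = random.Random(0)  # fixed seed so the sampled candidates are repeatable
print(sample_candidates(3, rng))
print(beam_search_candidates(3))
```

In a full BoN pipeline, either candidate list would then be passed to a reward model that selects the single best output. Sampling tends to suit cases where diversity among candidates matters, whereas beam search favours outputs the model itself rates as most probable.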
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Related
Input and Output Formulation in BoN Sampling
Generating N-Best Candidates in BoN Sampling
Reward Model Selection in BoN Sampling
Rejection Sampling for LLM Fine-Tuning
A company wants to improve the safety and helpfulness of its AI assistant without the high cost and time of retraining the entire base model. They propose a new system for handling user queries: for each query, the system will first generate 10 different potential responses. Then, a separate, fast-acting 'quality-scoring' model will evaluate all 10 responses based on pre-defined criteria. Finally, the system will present only the single response that received the highest score to the user. What is the most significant trade-off of this approach compared to simply using the first response the base model generates?
A system is designed to improve the quality of its generated text by producing multiple options and then picking the best one. Arrange the following steps of this process in the correct logical order.
Chatbot Response Quality Improvement
Learn After
Improving Output Diversity in a Multi-Candidate Generation System
Comparing Candidate Generation Methods for BoN Sampling
A development team is implementing a Best-of-N (BoN) sampling strategy for a creative storytelling application. Their primary goal is to generate a wide variety of imaginative and non-repetitive story continuations to present to the reranking model. Which of the following methods for generating the initial N candidates would best serve this goal?