Using Scoring Systems for Inference-Time Rescoring
This method uses a scoring system, which works much like a reward model, to approximate human feedback on LLM outputs at inference time. The system assigns a score to each candidate response, so that the responses with the most favorable scores can be prioritized for the final output.
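The selection step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `rescore`, `select_best`, and `toy_score` are hypothetical names, and `toy_score` is only a stand-in heuristic (it counts distinct words) where a real system would presumably call a learned scoring or reward model.

```python
from typing import Callable, List, Sequence, Tuple


def rescore(
    candidates: Sequence[str], score_fn: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score every candidate and return (candidate, score) pairs, best first."""
    scored = [(text, score_fn(text)) for text in candidates]
    # sorted() is stable, so tied candidates keep their generation order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def select_best(candidates: Sequence[str], score_fn: Callable[[str], float]) -> str:
    """Return the single highest-scoring candidate."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return rescore(candidates, score_fn)[0][0]


def toy_score(response: str) -> float:
    """Hypothetical stand-in for a scoring model: number of distinct words."""
    return float(len(set(response.lower().split())))


# Assumed candidates; in practice these would be sampled from the LLM.
candidates = [
    "The cat sat.",
    "The cat sat on the mat near the door.",
    "Cat.",
]

ranked = rescore(candidates, toy_score)
best = select_best(candidates, toy_score)
print(best)
```

The base model is never modified here. Only the choice among its outputs changes, so the quality of the result depends on how well the scoring function tracks human preferences and on how varied the candidates are.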
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Best-of-N Sampling (BoN Sampling)
Use of Reranking to Explore Model and Search Errors
The Challenge of Candidate Diversity in Reranking Methods
A development team uses a large, pre-trained language model to generate summaries of news articles. To improve the factual accuracy of the final output, their system first generates five different summary candidates. Then, a separate, specialized scoring model evaluates each of the five summaries for factual consistency with the original article and selects the one with the highest score. Which statement best analyzes the trade-offs of this approach?
Improving Chatbot Responses on a Budget
A system is designed to improve the quality of its generated responses at inference time without altering the base model's parameters. It does this by producing several options and then choosing the best one. Arrange the following actions into the correct operational sequence.
Learn After
Improving Chatbot Response Quality without Retraining
An AI development team is using an inference-time rescoring process to select the best summary for a news article. The model first generates three candidate summaries. A separate scoring system then evaluates each candidate and assigns a single quality score from 0.0 to 1.0, where a higher score indicates a better summary. Given the following scores, which summary will be selected as the final output?
An AI development team is using an inference-time technique to improve the quality of its model's responses. The process involves generating multiple candidate responses and then using a separate system to evaluate and select the best one. Arrange the following steps of this process in the correct chronological order.