A company is developing a model to generate one-sentence summaries of news articles. They have access to a dataset of millions of articles and plan to use a large, crowdsourced workforce to rate each model-generated summary on a simple 1-5 scale for clarity and relevance. The rating task is designed to be straightforward and quick. Based on these project characteristics, which statement best analyzes the suitability of this independent, absolute scoring approach for collecting human feedback?
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating a Feedback Collection Strategy for Code Quality
A company is developing a sophisticated AI assistant to generate creative and poetic marketing slogans. To gather human feedback, they plan to have a large group of crowdsourced workers rate each slogan on an absolute scale of 1 (poor) to 10 (excellent) for 'creativity'. What is the most significant weakness of using this independent, absolute scoring approach for this specific task?