LLM Training Data Strategy
Based on the plan described in the case study, evaluate the startup's data strategy. Identify the most significant potential flaw in their reasoning and explain at least two specific, negative outcomes that could result from training their chatbot on this type of unfiltered data.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Critique of Unfiltered Data Training Strategy
A development team decides to train a new large language model using a vast, unfiltered corpus of text scraped directly from the public internet. Which of the following is the most significant and direct risk associated with this data collection strategy?