Learn Before
Refining a Customer Service Chatbot Dataset
Based on the issues identified in the case study, describe two distinct data filtering or cleaning procedures you would implement to improve the quality of this training dataset. For each procedure, explain what specific problem it solves and why it is important for the final model's performance and safety.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A data scientist is preparing a large text corpus scraped from public internet forums to train a general-purpose chatbot. To improve data quality, they apply a filter that automatically deletes any text segment containing words from a predefined list of profanities. Which statement provides the most accurate evaluation of this data cleaning strategy?
Refining a Customer Service Chatbot Dataset
You are tasked with creating a data processing pipeline to clean a large, raw text corpus for training a language model. Arrange the following cleaning steps into the most logical and efficient order.
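To make the profanity filter from the first related question concrete, here is a minimal sketch of a blocklist-based segment filter. The blocklist contents, the function names, and the sample segments are all hypothetical and chosen only for illustration; a real pipeline would likely use a much longer list and more careful tokenization.

```python
import re

# Hypothetical placeholder blocklist (assumption, not from the source).
BLOCKLIST = {"darn", "heck"}

def contains_blocked_word(segment: str, blocklist: set[str]) -> bool:
    """Return True if any whole word in the segment is on the blocklist."""
    words = re.findall(r"[a-z']+", segment.lower())
    return any(word in blocklist for word in words)

def filter_segments(segments: list[str], blocklist: set[str]) -> list[str]:
    """Keep only the segments that contain no blocklisted word."""
    return [s for s in segments if not contains_blocked_word(s, blocklist)]

# Made-up sample segments for illustration only.
segments = [
    "Thanks, that fixed my router.",
    "What the heck is wrong with this firmware?",
    "Heckling the support staff is not helpful.",
]
kept = filter_segments(segments, BLOCKLIST)
for s in kept:
    print(s)
```

In this sketch the second segment is deleted outright even though it may describe a genuine problem, while the third survives because matching is done on whole words rather than substrings. Both behaviors are worth weighing when evaluating a strategy of this kind.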