1Cademy - Unintended Consequences of Data Filtering

Learn Before

Heuristics-Based Data Filtering for Fine-Tuning

Case Study

Unintended Consequences of Data Filtering

A development team is fine-tuning a language model to act as a programming assistant. After applying a set of predefined filtering rules to their dataset, they notice the fine-tuned model struggles to generate simple, concise code solutions (e.g., 'one-liners') and fails to explain basic programming concepts effectively. Based on the filtering rules listed in the case study, identify which rule is the most likely cause of this performance degradation and explain your reasoning.
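To make the failure mode concrete, here is a minimal sketch of how a single heuristic can remove an entire category of examples. The rule shown (a minimum response length) and every name in it (`passes_min_length`, `min_tokens`, the sample records) are hypothetical and used only for illustration; they are not the rules from the case study, which are not reproduced here.

```python
# Hypothetical heuristic: keep a sample only if its response contains at
# least `min_tokens` whitespace-separated tokens. The threshold of 20 is an
# assumption chosen for illustration.
def passes_min_length(sample: dict, min_tokens: int = 20) -> bool:
    return len(sample["response"].split()) >= min_tokens


# Made-up samples for a programming assistant.
samples = [
    # A concise one-liner solution.
    {"prompt": "Reverse a string in Python.", "response": "s[::-1]"},
    # A short explanation of a basic concept.
    {"prompt": "What is a variable?",
     "response": "A variable is a name that refers to a value."},
    # Stand-in for a long, multi-step answer (50 placeholder tokens).
    {"prompt": "Implement a retrying HTTP client.",
     "response": " ".join(["token"] * 50)},
]

kept = [s for s in samples if passes_min_length(s)]
dropped = [s for s in samples if not passes_min_length(s)]

for s in dropped:
    print("dropped:", s["prompt"])
```

Under these assumptions, both the one-liner and the short concept explanation are dropped, so a model fine-tuned on `kept` would see only long answers. Checking which kinds of samples a rule removes, and not just how many, is one way to spot this sort of side effect before training.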

Updated 2025-10-05
