Concept

Data Quality as a Key Issue in LLM Training

The quality of training data is a fundamental concern in developing data-driven NLP systems, and it is especially critical for Large Language Models. Using raw text taken directly from various sources is generally undesirable: research has shown that training on unfiltered data can harm a model's performance and reliability.

Updated 2026-05-02

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences