Short Answer

Interpreting Cross-Entropy for Data Curation

A data curation team uses a small language model to pre-process a large text corpus. The model assigns a cross-entropy score to each document. They find two documents with the following scores:

  • Document A: Cross-entropy = 1.8
  • Document B: Cross-entropy = 9.5

Based on the goal of creating a high-quality, coherent training set, which document is more likely to be included, and why? Explain the relationship between the cross-entropy score and how well a document aligns with the small model's learned patterns.
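The intuition behind the question can be sketched in code. This is a minimal illustration, assuming a smoothed character-unigram model as a stand-in for the small language model; the documents, variable names, and helper functions are all hypothetical. Text that matches the model's learned distribution gets a low average cross-entropy (few bits of surprise per character), while out-of-distribution text gets a high one.

```python
import math
from collections import Counter

def train_unigram(corpus, vocab):
    # Add-one smoothed character-unigram model: P(ch) over the vocabulary.
    counts = Counter(corpus)
    total = len(corpus) + len(vocab)
    return {ch: (counts.get(ch, 0) + 1) / total for ch in vocab}

def cross_entropy(doc, model):
    # Average negative log2-probability per character (bits/char).
    # Characters outside the vocabulary fall back to the smoothed floor.
    floor = min(model.values())
    bits = [-math.log2(model.get(ch, floor)) for ch in doc]
    return sum(bits) / len(bits)

# Illustrative "training corpus" for the small model.
reference = "the model assigns a score to each document in the corpus " * 50
vocab = set(reference) | set("qxzjk#@%^&")
model = train_unigram(reference, vocab)

doc_a = "the model scores each document"  # fluent, in-distribution
doc_b = "zxq#jk@%qx^z&kj#xqz%@jkx"        # incoherent, out-of-distribution

print(cross_entropy(doc_a, model))  # low: aligns with learned patterns
print(cross_entropy(doc_b, model))  # high: surprising under the model
```

The same logic applies at corpus scale: a curation pipeline keeps documents whose cross-entropy under the scoring model falls below some threshold, treating high-score documents (like Document B) as noisy or out-of-distribution.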


Updated 2025-10-06

Tags: Ch.4 Alignment - Foundations of Large Language Models, Foundations of Large Language Models, Foundations of Large Language Models Course, Computing Sciences, Analysis in Bloom's Taxonomy, Cognitive Psychology, Psychology, Social Science, Empirical Science, Science