1Cademy - Explaining Perplexity's Limitation in Long-Context Evaluation

Learn Before

Limitation of Perplexity for Evaluating Long-Context LLMs

Short Answer

Explaining Perplexity's Limitation in Long-Context Evaluation

A language model is evaluated on a 100,000-token document. Explain why a low (good) overall perplexity score for the entire document does not guarantee that the model successfully used information from the beginning of the document to inform its predictions at the end.
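As a rough illustration of the arithmetic behind this question, the sketch below treats perplexity as the exponential of the mean per-token negative log-likelihood and compares two hypothetical models on a 100,000-token document. Every number in it is an assumption chosen for illustration, not a measurement: the count of 100 tokens near the end that depend on the beginning of the document, and the per-token loss values, are invented.

```python
import math


def perplexity(nlls):
    """Perplexity as exp of the mean per-token negative log-likelihood (in nats)."""
    return math.exp(sum(nlls) / len(nlls))


# Hypothetical setup (all values are illustrative assumptions).
N_TOTAL = 100_000      # document length in tokens, as in the question
N_LONG_RANGE = 100     # assumed number of final tokens that need early-document info
LOCAL_NLL = 2.0        # assumed loss on tokens predictable from nearby context


def document_nlls(long_range_nll):
    """Per-token losses: locally predictable tokens, then the long-range tokens."""
    local = [LOCAL_NLL] * (N_TOTAL - N_LONG_RANGE)
    distant = [long_range_nll] * N_LONG_RANGE
    return local + distant


# Model A is assumed to use the early context; model B is assumed to ignore it.
uses_context = document_nlls(0.5)
ignores_context = document_nlls(6.0)

ppl_uses = perplexity(uses_context)
ppl_ignores = perplexity(ignores_context)

# Perplexity restricted to the tokens that actually require long-range information.
tail_uses = perplexity(uses_context[-N_LONG_RANGE:])
tail_ignores = perplexity(ignores_context[-N_LONG_RANGE:])

print(f"whole document:    {ppl_uses:.3f} vs {ppl_ignores:.3f}")
print(f"long-range tokens: {tail_uses:.3f} vs {tail_ignores:.3f}")
```

Under these assumed numbers, the whole-document scores differ by well under one percent, while the scores on the long-range tokens alone differ by orders of magnitude. Because the average is dominated by tokens that local context already predicts well, a low overall score says little about whether distant information was used.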

Updated 2025-10-06

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences