Concept

Limitation of Perplexity for Evaluating Long-Context LLMs

While perplexity is a straightforward metric for evaluating language models, it has a significant drawback when assessing long-context capabilities. Perplexity averages per-token predictive quality over the whole sequence, and most tokens are well predicted from their immediate neighbors. As a result, it primarily measures a model's performance on local context and fails to adequately capture whether the model understands and uses the broader, global context of a long sequence: a model that ignores distant information can still achieve low perplexity.
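A small numerical sketch can make this concrete. The probabilities below are hypothetical, not taken from any real model: most tokens are assigned high probability because they are predictable from local context, while a few tokens would require long-range information. The averaging in the perplexity formula lets the many easy local tokens dominate.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Hypothetical per-token probabilities for a 100-token sequence:
# 98 tokens are easy to predict from nearby context, 2 tokens
# depend on information far back in the sequence.
local_probs = [0.9] * 98
global_probs = [0.1] * 2

ppl_local_only = perplexity(local_probs)                  # model nails local tokens
ppl_mixed = perplexity(local_probs + global_probs)        # and fails global ones

print(ppl_local_only)  # ~1.11
print(ppl_mixed)       # ~1.16
```

Even though the model completely fails on the globally-dependent tokens (probability 0.1 instead of 0.9), overall perplexity barely moves, which is why perplexity alone is a weak signal of long-context ability.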


Updated 2026-04-29


Tags: Ch.3 Prompting - Foundations of Large Language Models; Ch.2 Generative Models - Foundations of Large Language Models; Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences