
Perplexity Evaluation Scenarios

The perplexity of a language model, the exponential of its average cross-entropy loss over the target tokens, can be evaluated in three reference scenarios. In the best case, the model always assigns probability 1 to the target token, giving a perplexity of 1. In the worst case, the model always assigns probability 0 to the target token, giving a perplexity of positive infinity. As a baseline, if the model predicts a uniform distribution over all tokens in the vocabulary, the perplexity equals the number of unique tokens in the vocabulary. This baseline is a nontrivial upper bound that any useful model must beat.
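The three scenarios can be checked numerically. The following is a minimal sketch, assuming perplexity is computed as the exponential of the mean negative log-probability assigned to the target tokens; the helper name `perplexity`, the vocabulary size of 28, and the probability lists are illustrative assumptions, not part of the original text.

```python
import math


def perplexity(target_probs):
    """Return exp(mean negative log-probability) of the target tokens.

    target_probs holds the probability the model assigned to each
    correct (target) token in a sequence.
    """
    # log(0) is undefined; a zero probability drives perplexity to infinity.
    if any(p == 0.0 for p in target_probs):
        return math.inf
    mean_nll = -sum(math.log(p) for p in target_probs) / len(target_probs)
    return math.exp(mean_nll)


vocab_size = 28  # hypothetical vocabulary size, chosen only for illustration

# Best case: every target token gets probability 1.
best = perplexity([1.0, 1.0, 1.0])

# Worst case: at least one target token gets probability 0.
worst = perplexity([0.5, 0.0, 0.9])

# Baseline: uniform distribution over the vocabulary.
uniform = perplexity([1.0 / vocab_size] * 5)

print(best, worst, uniform)
```

Under these assumptions, the best case evaluates to 1, the worst case to infinity, and the uniform baseline to approximately the vocabulary size (up to floating-point rounding).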


Updated 2026-05-13


Tags

D2L

Dive into Deep Learning @ D2L