
Perplexity

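Perplexity is a probability-based metric for evaluating language models: it is the inverse probability that the model assigns to a test set, normalized by the number of words. It can also be read as the weighted average branching factor, i.e. the weighted average number of possible next words that can follow any word. A lower perplexity means the model predicts the test set better.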

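Consider a mini-language of 10 words, "zero, one, ..., nine", in which each word occurs with probability 1/10 under a unigram model. For any test sequence W of N words, the sequence probability is (1/10)^N, so the perplexity is the inverse of the per-word probability, which is 10: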

$$\begin{aligned} \mathrm{PP}(W) &= P\left(w_{1} w_{2} \ldots w_{N}\right)^{-\frac{1}{N}} \\ &= \left(\left(\frac{1}{10}\right)^{N}\right)^{-\frac{1}{N}} \\ &= \left(\frac{1}{10}\right)^{-1} \\ &= 10 \end{aligned}$$
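
The same calculation can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `perplexity` and the list-of-probabilities input are assumptions made for illustration, and the sum is taken in log space only to avoid underflow on long sequences.

```python
import math

def perplexity(word_probs):
    """Perplexity of a sequence, given the probability the model
    assigned to each of its N words (hypothetical helper)."""
    n = len(word_probs)
    # log P(w_1 ... w_N) as a sum of per-word log probabilities
    log_prob = sum(math.log(p) for p in word_probs)
    # P(W)^(-1/N) = exp(-log P(W) / N)
    return math.exp(-log_prob / n)

# Uniform ten-word mini-language: every word has probability 1/10,
# so the result is approximately 10 whatever the sequence length.
uniform_sequence = [0.1] * 25
print(perplexity(uniform_sequence))
```

For a non-uniform model the per-word probabilities differ, and the result is the inverse of their geometric mean, which is why perplexity is described as a weighted branching factor.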


Updated 2021-09-24

Tags

Natural language processing

Data Science
