Concept
Perplexity
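Perplexity is a probability-based metric for evaluating language models: the inverse probability a model assigns to a test sequence, normalized by the number of words. It can be interpreted as the weighted average number of possible next words that can follow any word, also known as the weighted average branching factor.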
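Consider a mini-language of 10 words, "zero, one, ..., nine", in which each word occurs with probability 1/10 under a unigram model. For a test sequence of N words, the perplexity is the inverse of the sequence probability normalized by its length, which is 10: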
$$\begin{aligned}
\mathrm{PP}(W) &= P\left(w_{1} w_{2} \ldots w_{N}\right)^{-\frac{1}{N}} \\
&= \left(\left(\frac{1}{10}\right)^{N}\right)^{-\frac{1}{N}} \\
&= \left(\frac{1}{10}\right)^{-1} \\
&= 10
\end{aligned}$$
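The calculation can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `perplexity` and the sequence length of 25 are illustrative assumptions, and the input is assumed to be the list of probabilities the model assigned to each word of the test sequence.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the probability assigned to each word.

    `token_probs` is a hypothetical input format chosen for illustration:
    one probability per word in the test sequence.
    """
    n = len(token_probs)
    # Sum log-probabilities rather than multiplying raw probabilities,
    # which would underflow for long sequences.
    log_prob = sum(math.log(p) for p in token_probs)
    # PP(W) = P(w_1 ... w_N)^(-1/N) = exp(-(1/N) * sum_i log p_i)
    return math.exp(-log_prob / n)


# Uniform unigram model over the 10 digit words: every word has probability 1/10.
# The sequence length (25 here) is arbitrary; it cancels out in the formula.
uniform_probs = [1 / 10] * 25
print(perplexity(uniform_probs))  # approximately 10, up to floating-point rounding
```

Because the length N cancels for a uniform model, any sequence length gives the same result, which matches the branching-factor interpretation: at every position the model is choosing among 10 equally likely words.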
Updated 2021-09-24
Tags
Natural language processing
Data Science