1Cademy - Laplace (Add-One) Smoothing Formulas


Laplace smoothing adds one to the count $c_i$ of each word $w_i$ before normalizing. Because each of the $V$ words in the vocabulary receives one extra count, the denominator grows from the total number of word tokens $N$ to $N + V$, giving the smoothed unigram probability:

$P_{Laplace}(w_i) = \frac{c_i+1}{N+V}$

It is often more convenient to define an adjusted count $c_i^*$, which is turned into a probability $P_i^*$ by normalizing by $N$ as usual:

$c_i^* = (c_i + 1)\frac{N}{N+V}$

The add-one smoothed bigram probability and adjusted count are given by the following formulas, where the sum in the denominator runs over every word $w$ in the vocabulary:

$P^*_{Laplace}(w_n|w_{n-1}) = \frac{C(w_{n-1}w_n) + 1}{\sum_w (C(w_{n-1}w) + 1)} = \frac{C(w_{n-1}w_n) + 1}{C(w_{n-1}) + V}$

$c^*(w_{n-1}w_n) = \frac{(C(w_{n-1}w_n) + 1) \times C(w_{n-1})}{C(w_{n-1}) + V}$
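The formulas above can be checked on a toy corpus. The following is a minimal Python sketch, not a reference implementation: the function names `laplace_unigram` and `laplace_bigram` and the six-token corpus are made up for illustration, and the vocabulary is assumed to be exactly the set of word types seen in the corpus.

```python
from collections import Counter


def laplace_unigram(tokens):
    """Return (smoothed probabilities, adjusted counts) for every word type."""
    counts = Counter(tokens)
    n_tokens = len(tokens)      # N: total number of word tokens
    vocab_size = len(counts)    # V: assumed to be the observed word types
    probs = {w: (c + 1) / (n_tokens + vocab_size) for w, c in counts.items()}
    adjusted = {w: (c + 1) * n_tokens / (n_tokens + vocab_size)
                for w, c in counts.items()}
    return probs, adjusted


def laplace_bigram(tokens, prev, word):
    """Return (smoothed probability, adjusted count) for the bigram (prev, word)."""
    vocab_size = len(set(tokens))
    # C(w_{n-1}) is counted over positions that start a bigram, so it equals
    # the sum over w of C(w_{n-1} w); the final token starts no bigram.
    context_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    numerator = bigram_counts[(prev, word)] + 1
    denominator = context_counts[prev] + vocab_size
    prob = numerator / denominator
    adjusted = numerator * context_counts[prev] / denominator
    return prob, adjusted


# Hypothetical toy corpus: N = 6 tokens, V = 5 word types.
tokens = "the cat sat on the mat".split()
unigram_probs, unigram_adjusted = laplace_unigram(tokens)
print(unigram_probs["the"], unigram_adjusted["the"])
print(laplace_bigram(tokens, "the", "cat"))   # seen bigram
print(laplace_bigram(tokens, "the", "sat"))   # unseen bigram
```

On this corpus, "the" occurs twice, so its smoothed unigram probability is $(2+1)/(6+5) = 3/11$. The unseen bigram "the sat" receives probability $(0+1)/(2+5) = 1/7$ rather than zero, which is the purpose of the smoothing.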