Learn Before
Bigram Model
The Bigram Model approximates the probability of a word given all the previous words by using only the conditional probability given the single preceding word.
The bigram probability of a word $w_n$ given a previous word $w_{n-1}$ is computed by dividing the count of the bigram by the count of all bigrams that share the same first word (which is equivalent to the unigram count for the word $w_{n-1}$):

$$P(w_n \mid w_{n-1}) = \frac{C(w_{n-1}\, w_n)}{C(w_{n-1})}$$
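The count-and-divide estimate above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the toy corpus, the `<s>`/`</s>` boundary markers, and the `bigram_probs` helper are assumptions for the example, and bigrams that cross sentence boundaries are not filtered out.

```python
from collections import Counter

def bigram_probs(tokens):
    """Estimate P(w_n | w_{n-1}) = C(w_{n-1} w_n) / C(w_{n-1})."""
    # Count each word's occurrences as the *first* element of a bigram,
    # which equals its unigram count over all positions but the last.
    first_word_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return {
        (w1, w2): count / first_word_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }

tokens = "<s> i am sam </s> <s> sam i am </s>".split()
probs = bigram_probs(tokens)
print(probs[("<s>", "i")])  # C(<s> i) / C(<s>) = 1/2 = 0.5
print(probs[("i", "am")])   # C(i am) / C(i) = 2/2 = 1.0
```

Because the estimate conditions on only the previous word, every probability is a simple ratio of two counts taken from the training corpus.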
Tags
Data Science
Related
Huge Language Models
N-Gram Representation
Bigram Model
N-Gram Model
Sentence Generation from Unigram Model
Unknown Words and Problem of Sparsity
Historical Significance and Applications of N-gram Models
A statistical language model is built to predict the next word in a sentence based on the probability of it occurring after the preceding sequence of words. This model is trained exclusively on a massive corpus of texts written in the 19th century. When this model is prompted with the partial sentence, 'To save the file, the user clicked the...', which outcome is the most probable explanation for its behavior?
Curse of Dimensionality in Traditional Language Models
Analyzing Zero Probability in an N-gram Model
Evaluating N-gram Model Complexity