Learn Before
Concept

Bigram Model

The Bigram Model approximates the probability of a word given all previous words, P(w_n|w_{1:n-1}), by using only the conditional probability given the preceding word, P(w_n|w_{n-1}).

The bigram probability of a word w_n given a previous word w_{n-1} is computed by dividing the count of the bigram w_{n-1}w_n by the count of all bigrams that share the same first word w_{n-1} (which is equivalent to the unigram count for the word w_{n-1}): P(w_n|w_{n-1}) = \frac{C(w_{n-1}w_n)}{C(w_{n-1})}
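This count-based estimate can be sketched in a few lines of Python. The toy corpus, the `<s>`/`</s>` sentence-boundary markers, and the function name `bigram_prob` are illustrative assumptions, not part of the original text:

```python
from collections import Counter

# Toy corpus (assumed for illustration); <s> and </s> mark sentence boundaries
corpus = [
    ["<s>", "i", "am", "sam", "</s>"],
    ["<s>", "sam", "i", "am", "</s>"],
    ["<s>", "i", "do", "not", "like", "green", "eggs", "and", "ham", "</s>"],
]

# Unigram counts C(w) and bigram counts C(w_{n-1} w_n)
unigram_counts = Counter(w for sent in corpus for w in sent)
bigram_counts = Counter(
    (sent[i], sent[i + 1]) for sent in corpus for i in range(len(sent) - 1)
)

def bigram_prob(w_prev, w):
    """Maximum-likelihood estimate P(w | w_prev) = C(w_prev w) / C(w_prev)."""
    if unigram_counts[w_prev] == 0:
        return 0.0
    return bigram_counts[(w_prev, w)] / unigram_counts[w_prev]

# C("i am") = 2 and C("i") = 3, so P(am | i) = 2/3
print(bigram_prob("i", "am"))
```

Dividing by the unigram count of w_{n-1} works because every occurrence of w_{n-1} (except as the last token of a sentence) starts exactly one bigram, so the bigram probabilities for a given history sum to 1.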


Updated 2022-06-28

Tags

Data Science