N-gram Language Modeling
For a long period, particularly before 2010, the dominant approach to language modeling was the n-gram approach. In n-gram language modeling, we estimate the probability of a word given its preceding n−1 words, so the probability of a sequence can be approximated by the product of a series of n-gram probabilities. These probabilities are typically estimated from smoothed relative counts of n-grams in text. This straightforward approach has been used extensively in NLP, and the success of modern statistical speech recognition and machine translation systems has depended largely on n-gram language models.
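Concretely, the sequence probability factors as P(w_1 … w_m) ≈ ∏_{i=1}^{m} P(w_i | w_{i−n+1} … w_{i−1}). Below is a minimal sketch of the bigram (n = 2) case in Python, assuming a whitespace-tokenized corpus and add-one (Laplace) smoothing; production systems typically use stronger smoothing such as Kneser-Ney, and the function names here (train_bigram_model, bigram_prob, sentence_prob) are illustrative, not from the original text.

```python
from collections import Counter

def train_bigram_model(tokens):
    """Collect unigram and bigram counts from a token sequence."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(w_prev, w, unigrams, bigrams, vocab_size):
    """P(w | w_prev) estimated from smoothed relative counts
    (add-one smoothing over a vocabulary of vocab_size words)."""
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + vocab_size)

def sentence_prob(tokens, unigrams, bigrams, vocab_size):
    """Approximate the sentence probability as a product of
    bigram probabilities, one per adjacent word pair."""
    p = 1.0
    for w_prev, w in zip(tokens, tokens[1:]):
        p *= bigram_prob(w_prev, w, unigrams, bigrams, vocab_size)
    return p

# Usage on a toy corpus (hypothetical example data)
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
unigrams, bigrams = train_bigram_model(corpus)
V = len(unigrams)
print(sentence_prob("the cat sat".split(), unigrams, bigrams, V))
```

Note that a product of many small probabilities underflows floating-point arithmetic on long sequences, so real implementations sum log-probabilities instead; the direct product is kept here only for readability.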