Learn Before
Concept

N-Gram Model

The bigram model can be generalized to the N-gram model, which approximates the probability of a word by looking only N-1 words into the past: $P(w_n \mid w_{1:n-1}) \approx P(w_n \mid w_{n-N+1:n-1})$.

In the general case, the n-gram probability of a word $w_n$ is estimated from relative counts: $P(w_n \mid w_{n-N+1:n-1}) = \dfrac{C(w_{n-N+1:n-1}\,w_n)}{C(w_{n-N+1:n-1})}$
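As a minimal sketch of this count-based estimate, the helper below (a hypothetical `ngram_prob`, not from the source) counts n-grams and their (n-1)-word contexts in a toy corpus and returns their ratio:

```python
from collections import Counter

def ngram_prob(tokens, context, word):
    """Relative-frequency estimate P(word | context) = C(context word) / C(context).

    `context` is a tuple of the N-1 preceding words.
    """
    n = len(context) + 1
    # Count every n-gram and every (n-1)-gram (context) in the corpus.
    ngrams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    contexts = Counter(tuple(tokens[i:i + n - 1]) for i in range(len(tokens) - n + 2))
    if contexts[context] == 0:
        return 0.0  # unseen context: probability undefined, return 0 by convention
    return ngrams[context + (word,)] / contexts[context]

# Toy corpus: trigram (N=3) probability of "sat" given the context "the cat"
corpus = "the cat sat on the mat the cat sat down".split()
print(ngram_prob(corpus, ("the", "cat"), "sat"))  # → 1.0, since "the cat" is always followed by "sat"
```

Because "the cat" occurs twice and is followed by "sat" both times, the estimate is 2/2 = 1.0.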


Updated 2022-06-28

Tags

Data Science