Learn Before
Deep Learning Approach to Language Modeling
In the era of deep learning, a typical approach to language modeling is to estimate token probabilities using a deep neural network. Neural networks trained to accomplish this task receive a sequence of context tokens, $x_0, x_1, \dots, x_{i-1}$, as input and produce a distribution over the vocabulary $V$, which is denoted by $\Pr(\cdot \mid x_0, x_1, \dots, x_{i-1})$. The probability of the specific token $x_i$, denoted as $\Pr(x_i \mid x_0, x_1, \dots, x_{i-1})$, is the value of the $x_i$-th entry of this output distribution.
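As a concrete illustration, here is a minimal sketch of the final step of such a network: hypothetical logits for a toy vocabulary are converted into a probability distribution with a softmax, and a specific token's probability is read off at that token's vocabulary index. The vocabulary and the logit values are made up for illustration; in a real model the logits come from the network's forward pass over the context.

```python
import math

# Toy vocabulary and made-up logits; in a real model the logits are
# produced by a deep network applied to the context tokens x_0, ..., x_{i-1}.
vocab = ["the", "cat", "sat", "mat"]
logits = [1.2, 0.3, 0.1, 0.7]

def softmax(scores):
    """Convert raw scores into a probability distribution over the vocabulary."""
    m = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

dist = softmax(logits)                       # Pr(. | context): one entry per vocab token
assert abs(sum(dist) - 1.0) < 1e-9          # a valid distribution sums to 1

# Pr(x_i = "cat" | context) is the entry of the distribution at "cat"'s index.
p_cat = dist[vocab.index("cat")]
```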
References
Reference of Foundations of Large Language Models Course
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Types of Language Models
Evaluating language models
Shannon's Foundational Work on Language Modeling
Generalization of the Language Modeling Concept
Chain Rule for Sequence Probability
Deep Learning Approach to Language Modeling
Output Token Sequence in LLMs
Start of Sentence (SOS) Token
[CLS] Token as a Start Symbol
A system is designed to predict the probability of a sequence of words. For the sequence 'The dog ran', the system provides the following conditional probabilities:
- The probability of 'The' occurring at the start of a sequence is 0.2.
- The probability of 'dog' occurring after 'The' is 0.3.
- The probability of 'ran' occurring after 'The dog' is 0.7.
Based on the fundamental principle used by such systems to determine the likelihood of a full sequence, what is the overall probability of the sequence 'The dog ran'?
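The principle such systems rely on is the chain rule: the probability of the whole sequence is the product of the conditional probabilities of its tokens, taken in order. A minimal sketch using the numbers stated above:

```python
import math

# Conditional probabilities given in the question, in sequence order:
#   Pr(The) = 0.2, Pr(dog | The) = 0.3, Pr(ran | The dog) = 0.7
conditionals = [0.2, 0.3, 0.7]

# Chain rule: Pr(The dog ran) = Pr(The) * Pr(dog | The) * Pr(ran | The dog)
sequence_probability = math.prod(conditionals)
```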
Analyzing Language Model Probability Assignments
A system's primary goal is to predict the probability of a sequence of tokens. To calculate the total probability for the sequence 'The quick brown fox', it breaks the problem down into a series of conditional probability calculations. Arrange the following calculations in the correct order that the system would use to find the total probability of the sequence.
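The decomposition order can be sketched generically: each token is conditioned on all of the tokens that precede it, left to right. The helper below is hypothetical and only builds labels for the factors, but it makes the ordering explicit:

```python
def chain_rule_factors(tokens):
    """List the conditional probabilities, in order, whose product
    gives the total sequence probability (the chain rule)."""
    return [
        f"Pr({tok} | {' '.join(tokens[:i])})" if i else f"Pr({tok})"
        for i, tok in enumerate(tokens)
    ]

# For 'The quick brown fox' the system evaluates, in order:
#   Pr(The), Pr(quick | The), Pr(brown | The quick), Pr(fox | The quick brown)
factors = chain_rule_factors(["The", "quick", "brown", "fox"])
```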
Evaluating a Language Model's Probabilistic Output
Learn After
Probability Distribution Formula for an Encoder-Softmax Language Model
Auto-Regressive Generation Process
Formal Definition of LLM Inference
Model Parameterization by θ
A language model built with a deep neural network is given the input sequence 'The cat sat on the'. The model's vocabulary consists of the following tokens: {a, cat, hat, mat, on, sat, the}. What does the model produce as its immediate, direct output to predict the very next token?
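The key point behind this question is that the model's immediate output is not a single token but a probability distribution with one entry per vocabulary item; choosing a next token (by argmax or sampling) is a separate step applied on top of that distribution. A sketch with the question's seven-token vocabulary and arbitrary, made-up logits:

```python
import math

vocab = ["a", "cat", "hat", "mat", "on", "sat", "the"]
logits = [0.2, 0.9, 0.1, 2.3, 0.0, 0.4, 0.7]   # hypothetical network scores

# Immediate, direct output: a probability distribution over the vocabulary.
exps = [math.exp(s) for s in logits]
total = sum(exps)
dist = [e / total for e in exps]               # one probability per vocab token

assert len(dist) == len(vocab)
assert abs(sum(dist) - 1.0) < 1e-9

# Token selection happens afterwards, on top of the distribution
# (here, greedy argmax decoding).
next_token = vocab[dist.index(max(dist))]
```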
Analyzing Language Model Outputs
Explaining Language Model Output Behavior