Learn Before
Concept

Deep Learning Approach to Language Modeling

In the era of deep learning, a typical approach to language modeling is to estimate token probabilities using a deep neural network. Neural networks trained to accomplish this task receive a sequence of context tokens, x0,...,xi1x_0,...,x_{i-1}, as input and produce a distribution over the vocabulary V\mathcal{V}, which is denoted by Pr(x0,...,xi1)\Pr(\cdot|x_0,...,x_{i-1}). The probability of the specific token xix_i, denoted as Pr(xix0,...,xi1)\Pr(x_{i}|x_0,...,x_{i-1}), is the value of the ii-th entry of this output distribution.

0

1

Updated 2026-05-02

Contributors are:

Who are from:

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences