Learn Before

MLM Training Objective as Maximum Likelihood Estimation

Multiple Choice

In the context of training a language model, the objective is often to find parameters that maximize the likelihood of the training data. Consider the following mathematical expression for this objective:

Objective = ∑_{x ∈ D} ∑_{i ∈ A(x)} log Pr(xᵢ | x̄)

Here, D is the dataset, x is an original text sequence, x̄ is a version of x with some tokens masked, A(x) is the set of indices that were masked in x, and xᵢ is the original token at a masked position i.
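As a rough illustration of how the two nested sums in the expression above are evaluated, here is a minimal Python sketch. The names `mlm_log_likelihood`, `predict_proba`, `uniform_model`, the `[MASK]` placeholder string, and the toy dataset are all assumptions made for this example and do not come from the original question. The "model" is a stand-in that assigns equal probability to every vocabulary token, not a trained network.

```python
import math

MASK = "[MASK]"  # hypothetical placeholder symbol for a masked position


def mlm_log_likelihood(dataset, predict_proba):
    """Evaluate sum over x in D, sum over i in A(x), of log Pr(x_i | x_bar).

    dataset: list of (x, A) pairs, where x is a list of tokens and
             A is the set of masked indices for that sequence.
    predict_proba: function (x_bar, i) -> dict mapping token -> probability.
    """
    total = 0.0
    for x, masked in dataset:  # outer sum: over sequences x in D
        # Build x_bar by replacing the tokens at the masked indices.
        x_bar = [MASK if i in masked else tok for i, tok in enumerate(x)]
        for i in sorted(masked):  # inner sum: over masked positions i in A(x)
            probs = predict_proba(x_bar, i)
            total += math.log(probs[x[i]])  # log Pr(x_i | x_bar)
    return total


# Toy data: two sequences with three masked positions in total.
DATASET = [
    (["the", "cat", "sat"], {1}),
    (["a", "dog", "ran", "home"], {0, 2}),
]
VOCAB = sorted({tok for x, _ in DATASET for tok in x})  # 7 distinct tokens


def uniform_model(x_bar, i):
    """Stand-in model (assumption): ignores context, uniform over VOCAB."""
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}


objective = mlm_log_likelihood(DATASET, uniform_model)
print(objective)  # equals 3 * log(1/7), roughly -5.838
```

With this uniform stand-in, each of the three masked positions contributes log(1/7), so the objective is simply their sum. A model that puts more probability on the original tokens would give a larger (less negative) value, which is what maximizing the objective aims for.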

What does the inn

Updated 2025-09-28

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course