Definition

Ground-Truth Distribution as a One-Hot Representation

In language modeling, the ground-truth distribution at a given position, denoted $\mathbf{p}_{i+1}^{\mathrm{gold}}$, is defined as the one-hot representation of the actual next token, $x_{i+1}$. This one-hot vector acts as the exact target for the model's prediction at that step.
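
As an illustration, the definition can be sketched in a few lines of Python. The toy vocabulary, the example model probabilities, and the helper names `one_hot` and `cross_entropy` are assumptions made for this sketch and do not come from the source. The cross-entropy comparison is one common way such a target is used, shown here only to make the role of the one-hot vector concrete.

```python
import math


def one_hot(token_id: int, vocab_size: int) -> list[float]:
    """Return a vector with 1.0 at token_id and 0.0 everywhere else."""
    if not 0 <= token_id < vocab_size:
        raise ValueError("token_id must lie inside the vocabulary")
    vec = [0.0] * vocab_size
    vec[token_id] = 1.0
    return vec


def cross_entropy(p_gold: list[float], p_model: list[float]) -> float:
    """Cross-entropy between a target and a predicted distribution.

    Entries where the target is zero contribute nothing, so with a
    one-hot target only the actual next token's probability matters.
    """
    return -sum(g * math.log(q) for g, q in zip(p_gold, p_model) if g > 0.0)


# Hypothetical toy vocabulary and model output (assumptions for this sketch).
vocab = ["the", "cat", "sat", "mat"]
next_token = "sat"                    # the actual next token x_{i+1}
p_model = [0.1, 0.2, 0.6, 0.1]        # model's predicted distribution

p_gold = one_hot(vocab.index(next_token), len(vocab))
loss = cross_entropy(p_gold, p_model)  # equals -log(p_model[2])

print(p_gold)
print(loss)
```

Because all of the target's probability mass sits on the actual next token, the comparison reduces to the negative log of the probability the model assigned to that single token.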

Updated 2026-04-15

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences