In the context of training a language model, representing the ground-truth distribution as a one-hot vector implies that the training process considers all incorrect tokens to be equally wrong, regardless of their semantic similarity to the correct token.
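A minimal Python sketch can make this concrete (the vocabulary and probability values below are illustrative, not from the source): with a one-hot target, cross-entropy loss reduces to the negative log-probability of the correct token, so two predictions that assign the same probability to the correct token incur the same loss no matter which incorrect tokens absorb the remaining mass.

```python
import math

# Illustrative vocabulary; "day" (index 2) is the correct next token.
vocab = ["a", "bright", "day", "is", "shining"]
target = [0, 0, 1, 0, 0]  # one-hot ground-truth distribution

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i); with a one-hot p this reduces
    # to -log(q_i) at the single index where p_i = 1.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

# Two predicted distributions that both give 0.5 to "day" but spread
# the remaining 0.5 over *different* incorrect tokens.
pred_a = [0.40, 0.05, 0.50, 0.03, 0.02]
pred_b = [0.05, 0.40, 0.50, 0.03, 0.02]

# The losses are identical: the one-hot target makes the loss blind to
# which wrong tokens receive probability, only to the correct token's share.
loss_a = cross_entropy(target, pred_a)
loss_b = cross_entropy(target, pred_b)
assert abs(loss_a - loss_b) < 1e-12
```

Both losses equal -log(0.5), which is exactly the sense in which all incorrect tokens are treated as equally wrong.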
Ch.1 Pre-training - Foundations of Large Language Models
A language model is being trained on a text corpus where it learns to predict the next word in a sequence. The model's entire vocabulary is ordered as follows:
['a', 'bright', 'day', 'is', 'shining']. If the model is given the input context 'a bright' and the actual next word in the training data is 'day', which vector correctly represents the ground-truth target for this specific training step?
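A short Python sketch of how such a ground-truth vector can be built (the helper name `one_hot` is illustrative): the target places 1 at the index of the correct token and 0 everywhere else, so with 'day' at index 2 the vector is [0, 0, 1, 0, 0].

```python
vocab = ["a", "bright", "day", "is", "shining"]

def one_hot(vocab, token):
    # Ground-truth target: 1 at the correct token's index, 0 elsewhere.
    return [1 if w == token else 0 for w in vocab]

print(one_hot(vocab, "day"))  # → [0, 0, 1, 0, 0]
```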
Explaining the Ground-Truth Vector