Concept

Shared Vocabulary for Input and Output in Language Models

In language modeling, the input tokens and the predicted output tokens are drawn from the same vocabulary. As a consequence, the input representation (for example, a one-hot vector per token) and the output layer (one score per candidate next token) have the same dimensionality, which equals the vocabulary size. This property distinguishes language modeling from sequence-to-sequence tasks such as machine translation, where the source and target vocabularies may differ.
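The dimension match can be checked with a small sketch. The toy vocabulary, the hidden size `num_hiddens`, the random weights `W_in` and `W_out`, and the helper functions below are all hypothetical choices made for illustration, not part of any particular library or of the original text. The sketch assumes one-hot encoded inputs and a single tanh hidden layer.

```python
import math
import random

# Hypothetical toy vocabulary; the same list serves both input and output.
vocab = ["<unk>", "the", "cat", "sat", "on", "mat"]
token_to_idx = {tok: i for i, tok in enumerate(vocab)}
vocab_size = len(vocab)
num_hiddens = 4  # arbitrary hidden size, chosen only for illustration


def one_hot(index, size):
    """Return a length-`size` list with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
# Input weights map vocab_size -> num_hiddens.
W_in = [[rng.gauss(0, 0.1) for _ in range(vocab_size)]
        for _ in range(num_hiddens)]
# Output weights map num_hiddens -> vocab_size.
W_out = [[rng.gauss(0, 0.1) for _ in range(num_hiddens)]
         for _ in range(vocab_size)]

x = one_hot(token_to_idx["cat"], vocab_size)     # input: length vocab_size
h = [math.tanh(v) for v in matvec(W_in, x)]      # hidden: length num_hiddens
probs = softmax(matvec(W_out, h))                # output: length vocab_size

print(len(x), len(probs))  # both lengths equal vocab_size
```

In a translation model, by contrast, the length of `x` would follow the source vocabulary and the length of `probs` would follow the target vocabulary, so the two could differ.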

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L