Example

Transformer Encoding of a Masked Bilingual Sentence Pair

This example illustrates the encoding process for a masked bilingual sentence pair. The input sequence, [CLS] [MASK]是 [MASK]动物。 [SEP] Whales [MASK] [MASK] . [SEP], is first converted into a series of token embeddings, denoted as e0 through e11. This embedding sequence is then processed by a Transformer Encoder, which outputs a corresponding sequence of contextualized hidden states, h0 through h11. These hidden states serve as the basis for predicting the original masked tokens: '鲸鱼', '哺乳', 'are', and 'mammals'.
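The pipeline described above (embed tokens, encode them contextually, score masked positions against the vocabulary) can be sketched as follows. This is a toy, untrained stand-in: the vocabulary, embedding table, and single self-attention layer are illustrative assumptions, not the actual model from the chapter.

```python
import numpy as np

# Toy sketch of the masked-prediction pipeline: the tokenizer,
# vocabulary, and "encoder" below are illustrative stand-ins,
# not a trained Transformer.

tokens = ["[CLS]", "[MASK]", "是", "[MASK]", "动物", "。",
          "[SEP]", "Whales", "[MASK]", "[MASK]", ".", "[SEP]"]
vocab = {t: i for i, t in enumerate(
    ["[CLS]", "[SEP]", "[MASK]", "是", "动物", "。", ".",
     "Whales", "鲸鱼", "哺乳", "are", "mammals"])}

d_model = 16
rng = np.random.default_rng(0)
embed = rng.normal(size=(len(vocab), d_model))  # token embedding table

# e_0 .. e_11: embedding lookup for the 12-token input sequence
e = embed[[vocab[t] for t in tokens]]           # shape (12, d_model)

# Stand-in for the Transformer encoder: one self-attention mixing
# step, so each hidden state h_i depends on the whole sequence.
scores = e @ e.T / np.sqrt(d_model)
attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
h = attn @ e                                    # h_0 .. h_11

# Predict the original token at each masked position by scoring
# its hidden state against the embedding table (weight tying).
mask_positions = [i for i, t in enumerate(tokens) if t == "[MASK]"]
logits = h[mask_positions] @ embed.T            # (4, vocab size)
pred_ids = logits.argmax(axis=-1)
```

In a trained model, the four rows of `logits` would place their highest scores on '鲸鱼', '哺乳', 'are', and 'mammals'; here the weights are random, so only the shapes and data flow are meaningful.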

Updated 2026-05-02

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences