A language model is generating a five-token sequence using a permuted, non-sequential order. At a specific step in the generation process, the model computes the probability of one token conditioned on the embeddings of a particular subset of the other tokens, where $\mathbf{e}_i$ denotes the embedding of token $x_i$. Based only on this information, what can be definitively concluded about the generation process?
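For intuition, here is a minimal worked factorization, assuming a hypothetical permuted order $x_3 \to x_1 \to x_4 \to x_5 \to x_2$ (not necessarily the order intended in the question) and assuming that each step conditions on the embeddings $\mathbf{e}_i$ of all previously generated tokens:

$$
\Pr(x_1,\dots,x_5) \;=\; \Pr(x_3)\,\Pr(x_1 \mid \mathbf{e}_3)\,\Pr(x_4 \mid \mathbf{e}_3,\mathbf{e}_1)\,\Pr(x_5 \mid \mathbf{e}_3,\mathbf{e}_1,\mathbf{e}_4)\,\Pr(x_2 \mid \mathbf{e}_3,\mathbf{e}_1,\mathbf{e}_4,\mathbf{e}_5)
$$

Reading a single factor such as $\Pr(x_5 \mid \mathbf{e}_3,\mathbf{e}_1,\mathbf{e}_4)$, all that can be concluded is that $x_3$, $x_1$, and $x_4$ were generated before $x_5$, i.e., $x_5$ is produced at the fourth step of the permutation; the tokens keep their original positions in the sequence regardless of the order in which they are generated.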
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is tasked with generating a five-token sequence in a specific permuted order. At each step, the model predicts the next token in the permutation using the embeddings of all previously generated tokens as context, where $\mathbf{e}_i$ denotes the embedding of token $x_i$. Which of the following correctly represents the conditional probability for the third step of this generation process?
A language model generates a four-token sequence in a specific permuted order. Arrange the following conditional probability expressions to match this generation sequence, where $\mathbf{e}_i$ denotes the embedding of token $x_i$.
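As a rough sketch of the bookkeeping behind these related questions, the following Python snippet lists, for an assumed four-token permuted order, which embeddings form the conditioning context at each generation step; the function name and the chosen order are illustrative, not taken from the questions themselves.

```python
# Minimal sketch (illustrative assumptions): enumerate the conditioning context
# at each step of permuted-order generation. The permutation below is
# hypothetical, not the one referenced in the questions above.

def permuted_generation_contexts(order):
    """For each step, record which tokens' embeddings form the conditioning context."""
    generated = []
    steps = []
    for token in order:
        # The current token is predicted from the embeddings e_i of all
        # tokens generated at earlier steps of the permutation.
        steps.append((token, [f"e_{t}" for t in generated]))
        generated.append(token)
    return steps

# Hypothetical permuted order for a four-token sequence x_1 .. x_4.
for token, context in permuted_generation_contexts([2, 4, 1, 3]):
    print(f"step: predict x_{token} given {context or 'nothing (first step)'}")
```

Each printed step corresponds to one conditional factor; for example, the third step's factor conditions on the embeddings of the two tokens generated at the first and second steps.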