Output Variation in Sequence Models

The output $\mathbf{o}$ of a general sequence model, produced by a neural network $g(\cdot; \theta)$, depends on the specific problem being addressed. For token prediction problems (such as language modeling), $\mathbf{o}$ is typically a probability distribution over a defined vocabulary. Conversely, for sequence encoding problems, $\mathbf{o}$ serves as a representation of the input sequence, commonly expressed as a sequence of real-valued vectors.
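The two output types can be sketched with a toy NumPy example. This is a minimal illustration, not an implementation from the text: the hidden states of $g(\cdot; \theta)$ are stubbed with random values, and the sizes (`seq_len`, `d_model`, `vocab_size`) and projection matrix `W` are assumptions chosen only to show the shapes involved.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, vocab_size = 4, 8, 10  # toy sizes (assumptions)

# Stand-in for the hidden states computed by a sequence model g(.; theta).
h = rng.normal(size=(seq_len, d_model))

# Case 1: token prediction -- project each position onto the vocabulary
# and normalize with a softmax, giving a probability distribution per token.
W = rng.normal(size=(d_model, vocab_size))  # hypothetical output projection
logits = h @ W
o_lm = np.exp(logits - logits.max(axis=-1, keepdims=True))
o_lm /= o_lm.sum(axis=-1, keepdims=True)  # each row sums to 1

# Case 2: sequence encoding -- the output is simply a sequence of
# real-valued vectors representing the input.
o_enc = h

print(o_lm.shape)   # (4, 10): one vocabulary distribution per position
print(o_enc.shape)  # (4, 8): one d_model-dimensional vector per position
```

Note that the distinction is only in how the final layer is interpreted: the same underlying network can feed either a softmax head or be read out directly as an encoding.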

Updated 2026-04-14

Ch.1 Pre-training - Foundations of Large Language Models