Learn Before
Output Variation in Sequence Models
The output of a general sequence model, generated by a neural network, differs depending on the problem being addressed. For token prediction problems (such as language modeling), the output is typically a probability distribution over a defined vocabulary. For sequence encoding problems, by contrast, the output serves as a representation of the input sequence, commonly expressed as a sequence of real-valued vectors.
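The two output types can be illustrated with a minimal sketch. The toy model below (all names and randomly initialized parameters are hypothetical, standing in for the general formulation o = g(x_0, ..., x_m; θ)) returns either a probability distribution over the vocabulary (token prediction) or one real-valued vector per input token (sequence encoding):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_dim = 10, 4

# Hypothetical parameters theta: an embedding table and an output projection.
W_embed = rng.normal(size=(vocab_size, hidden_dim))
W_out = rng.normal(size=(hidden_dim, vocab_size))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def g(tokens, task):
    """Toy stand-in for o = g(x_0, ..., x_m; theta)."""
    h = W_embed[tokens]  # one real-valued vector per token, shape (m+1, hidden_dim)
    if task == "encode":
        # Sequence encoding: output is a sequence of real-valued vectors.
        return h
    if task == "predict":
        # Token prediction: output is a distribution over the vocabulary,
        # here computed from the last position's representation.
        return softmax(h[-1] @ W_out)

x = np.array([1, 5, 2])          # an input sequence of 3 token ids
enc = g(x, "encode")             # shape (3, 4): one vector per input token
dist = g(x, "predict")           # shape (10,): probabilities summing to ~1.0
print(enc.shape, dist.shape, dist.sum())
```

The same underlying network can thus serve both roles; only the interpretation (and final layer) of the output o changes between the two problem types.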
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Foundations of Large Language Models
Related
Output Variation in Sequence Models
Role of the [CLS] Token in Sequence Classification
Masked Language Modeling
Input Formatting with Separator Tokens
Standard Auto-Regressive Probability Factorization using Embeddings
CLS Token as a Start Symbol in Encoder Pre-training
Comparison of Context Usage in Causal vs. Masked Language Modeling
Applying the General Sequence Model Formulation
In the general formulation of a sequence model, o = g(x_0, x_1, ..., x_m; θ), which statement best analyzes the distinct roles of the components?
Match each symbol from the general sequence model formulation, o = g(x_0, x_1, ..., x_m; θ), with its correct description.
Fundamental Issues in Sequence Model Formulation
Neural Network as a Parameterized Function
Learn After
Analysis of Sequence Model Outputs for Different Tasks
A data scientist is working with a sequence model, g(·; θ), for two different projects. In Project A, the goal is to predict the most likely next word in a user's search query. In Project B, the goal is to classify an entire product review as either 'positive' or 'negative'. How would the nature of the model's direct output, o, most likely differ between these two projects?
A data scientist is developing two different systems. Match each system's primary task with the most likely structure of its underlying model's direct output.