Learn Before
Self-Supervised Learning
Prefix Language Modeling (PrefixLM)
Prefix Language Modeling (PrefixLM) is a self-supervised pre-training objective where a model learns to predict a subsequent sequence of text given an initial prefix that serves as context. In an encoder-decoder implementation, the encoder processes the entire prefix non-causally to build a rich contextual representation. The decoder then uses this representation to autoregressively generate the remaining part of the sequence.
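
To make the division of labor concrete, here is a minimal PyTorch sketch of the PrefixLM objective with an encoder-decoder Transformer. The vocabulary size, layer counts, random toy batch, 50/50 prefix split, and the `PrefixLM` class name are illustrative assumptions, not something prescribed by this card, and positional encodings are omitted for brevity.

```python
# Minimal PrefixLM sketch (illustrative assumptions: sizes, toy data, split point).
import torch
import torch.nn as nn

VOCAB, D_MODEL, BOS = 1000, 128, 0

class PrefixLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.lm_head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, prefix_ids, target_ids):
        # Encoder: the whole prefix is visible at once (no causal mask),
        # so every prefix token can attend to every other prefix token.
        prefix_emb = self.embed(prefix_ids)

        # Decoder: shift the continuation right (prepend BOS) and apply a
        # causal mask, so each position sees only earlier continuation
        # tokens plus the encoded prefix via cross-attention.
        bos = torch.full((target_ids.size(0), 1), BOS, dtype=torch.long)
        decoder_in = self.embed(torch.cat([bos, target_ids[:, :-1]], dim=1))
        causal = nn.Transformer.generate_square_subsequent_mask(decoder_in.size(1))

        hidden = self.transformer(prefix_emb, decoder_in, tgt_mask=causal)
        return self.lm_head(hidden)

# Toy training step on a random "sentence": the first half is the prefix
# (context only, no loss); the second half is the prediction target.
model = PrefixLM()
tokens = torch.randint(1, VOCAB, (2, 16))       # batch of 2 sequences
prefix, target = tokens[:, :8], tokens[:, 8:]   # split point chosen arbitrarily here

logits = model(prefix, target)                  # (batch, target_len, vocab)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), target.reshape(-1)  # loss only on the continuation
)
loss.backward()
```

Note how the self-supervision signal comes entirely from the raw text itself: the loss is computed only on the continuation tokens, while the prefix serves purely as bidirectional context.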

Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Comparison of Self-Supervised Pre-training and Self-Training
Architectural Categories of Pre-trained Transformers
Self-Supervised Classification Tasks for Encoder Training
Prefix Language Modeling (PrefixLM)
Mask-Predict Framework
Discriminative Training
Learning World Knowledge from Unlabeled Data
Emergent Linguistic Capabilities from Pre-training
Architectural Approaches to Self-Supervised Pre-training
Self-Supervised Pre-training of Encoders via Masked Language Modeling
Word Prediction as a Core Self-Supervised Task
Learning World Knowledge from Unlabeled Data via Self-Supervision
A research team has a massive collection of unlabeled historical texts. Their goal is to pre-train a language model that understands the specific vocabulary and sentence structures within these documents, but they have no budget for manual data annotation. Which of the following approaches is the most effective and feasible for their pre-training task?
Analysis of Supervision Signal Generation
A team is developing a pre-training strategy for a new language model using a large corpus of unlabeled text. Which of the following proposed tasks best exemplifies the principles of self-supervised learning?
Learn After
Comparison of Prefix and Causal Language Modeling
Example of Prefix Language Modeling Input Format
Training Encoder-Decoder Models with Prefix Language Modeling
Consider a model architecture composed of an encoder and a decoder, trained with a self-supervised objective to complete a text sequence given an initial prefix. Which statement best analyzes the distinct processing methods of the encoder and decoder for this task?
Processing a Text Sequence
In a self-supervised text generation task, a model is given an initial sequence of words (a prefix) and trained to produce the words that follow. For an architecture that uses two distinct components to accomplish this, match each component or data piece with its primary role or characteristic.