Adapting a Bidirectional Model for Generative Tasks
Describe the specific masking strategy a research team must apply to their input sequences during fine-tuning to force a pre-trained bidirectional model to behave like a unidirectional, causal language model for a text generation task. For a given token position i that needs to be predicted, which other tokens in the sequence should be masked?
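As an illustration (a minimal Python/NumPy sketch, not part of the original card; the function name and example sentence are my own), the masking rule in question can be expressed as a per-position visibility mask: when the token at position i is to be predicted, position i itself and every subsequent position are hidden, leaving only the left context visible, which is exactly what a left-to-right model conditions on.

import numpy as np

def visibility_mask(seq_len: int, i: int) -> np.ndarray:
    # mask[j] is True when token j may be seen while predicting position i.
    # Only the strict left context (j < i) stays visible; position i itself
    # and all later positions are hidden.
    return np.arange(seq_len) < i

tokens = ["The", "quick", "brown", "fox", "jumps"]
mask = visibility_mask(len(tokens), i=3)  # predict "fox"
print([t for t, keep in zip(tokens, mask) if keep])
# ['The', 'quick', 'brown']; 'fox' and 'jumps' are masked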
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consider the task of predicting the token 'fox' in the sequence 'The quick brown fox jumps'. To make a bidirectional model's prediction for 'fox' equivalent to that of a unidirectional (left-to-right) model, which set of tokens must be masked (i.e., hidden) from the bidirectional model's view?
Adapting a Bidirectional Model for a Unidirectional Task
A language model trained exclusively for next-token prediction (i.e., predicting a token based only on the tokens that precede it) can be framed as a specific implementation of a masked language model in which, for every prediction, the target token and all subsequent tokens in the sequence are systematically masked (see the sketch below).
Adapting a Bidirectional Model for Generative Tasks
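A companion sketch under the same assumptions as above (hypothetical function name, NumPy used for illustration): stacking the per-position rule for every position yields a strictly lower-triangular attention mask, the matrix a team would apply to the bidirectional model's self-attention during fine-tuning so that every position's prediction becomes causal at once.

import numpy as np

def full_causal_mask(seq_len: int) -> np.ndarray:
    # mask[i, j] is True when position j may be visible while predicting
    # position i. Row i keeps only j < i, so no prediction ever sees its
    # own token or any future token; the masked-LM objective then
    # coincides with next-token prediction.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool), k=-1)

print(full_causal_mask(5).astype(int))
# Row 3 (predicting 'fox' in 'The quick brown fox jumps') reads 1 1 1 0 0:
# only 'The', 'quick', and 'brown' remain visible.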