Case Study

Adapting a Bidirectional Model for Generative Tasks

Describe the specific masking strategy a research team must apply to their input sequences during fine-tuning to force a pre-trained bidirectional model to behave like a unidirectional, causal language model for a text generation task. For a given token position i that needs to be predicted, which other tokens in the sequence should be masked?
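One common way to realize this constraint in practice is a causal (lower-triangular) attention mask: when predicting the token at position i, every future position j > i is masked out, while positions j ≤ i remain visible. The PyTorch sketch below illustrates this idea only; it is not tied to any particular model, and names such as seq_len, causal_mask, and scores are assumptions for the example.

```python
import torch

seq_len = 6

# causal_mask[i, j] == True means "position i may attend to position j".
# torch.tril keeps the lower triangle, so every future token (j > i) is blocked.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

# In attention, blocked positions are typically set to -inf before the softmax,
# which drives their attention weights to zero.
scores = torch.randn(seq_len, seq_len)  # stand-in for Q @ K^T / sqrt(d)
scores = scores.masked_fill(~causal_mask, float("-inf"))
weights = torch.softmax(scores, dim=-1)

print(causal_mask.int())  # 1s on and below the diagonal, 0s above
print(weights)            # row i has nonzero weight only on columns 0..i
```

Applied during fine-tuning, a mask of this shape prevents the bidirectional model from attending to tokens to the right of the position being predicted, which is the defining property of a unidirectional, causal language model.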


Tags

Ch.1 Pre-training - Foundations of Large Language Models

Application in Bloom's Taxonomy