Case Study

Training Strategy for a BERT-based Encoder

A team is building a machine translation model with an encoder-decoder architecture. They use a pre-trained bidirectional language model (e.g., BERT) as the encoder and a randomly initialized model as the decoder. During training on their translation dataset, they 'freeze' all parameters of the pre-trained encoder and update only the parameters of the decoder. Analyze the primary limitation of this training strategy.
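
For concreteness, here is a minimal PyTorch sketch of the setup the case study describes, assuming a Hugging Face `BertModel` as the frozen encoder and a standard `nn.TransformerDecoder` trained from scratch; the vocabulary size and decoder hyperparameters are illustrative, not part of the case study.

```python
import torch
import torch.nn as nn
from transformers import BertModel  # pre-trained bidirectional encoder


class FrozenEncoderTranslator(nn.Module):
    def __init__(self, tgt_vocab_size: int, d_model: int = 768):
        super().__init__()
        # Pre-trained bidirectional encoder (BERT), used as-is.
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        # Freeze ALL encoder parameters: they receive no gradient updates.
        for p in self.encoder.parameters():
            p.requires_grad = False
        # Keep the frozen encoder in eval mode (disables dropout), since
        # it acts as a fixed feature extractor.
        self.encoder.eval()
        # Randomly initialized decoder, trained from scratch.
        self.tgt_embed = nn.Embedding(tgt_vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=6)
        self.out_proj = nn.Linear(d_model, tgt_vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Encoder runs under no_grad: its outputs are fixed features.
        with torch.no_grad():
            memory = self.encoder(input_ids=src_ids,
                                  attention_mask=src_mask).last_hidden_state
        tgt = self.tgt_embed(tgt_ids)
        causal = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.decoder(tgt, memory, tgt_mask=causal)
        return self.out_proj(hidden)


model = FrozenEncoderTranslator(tgt_vocab_size=32000)
# The optimizer sees only decoder-side parameters, as in the case study.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

The `requires_grad = False` loop and the `torch.no_grad()` context are the crux of the question: the optimizer never touches the encoder, so its representations remain exactly as pre-training left them throughout translation training.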

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models Course

Analysis in Bloom's Taxonomy
