1Cademy - Troubleshooting a Pre-trained Model's Performance

Learn Before

Drawback of Masked Language Modeling: The [MASK] Token Discrepancy

Case Study

Troubleshooting a Pre-trained Model's Performance

An engineer is developing a language model. First, they train it on a large corpus of text where 15% of the words in each sentence are replaced with a special [BLANK] symbol; the model's objective is to predict these original words. After this initial training, they adapt the model for a new task: classifying customer reviews as 'positive' or 'negative'. This new task uses complete, unaltered customer reviews. The engineer notices that the model's performance on the classification task is lower than anticipated. Based on this information, identify and explain the fundamental mismatch between the initial training phase and the final task that is likely causing this issue.
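The two input regimes in this scenario can be made concrete with a small sketch. It is illustrative only, not the engineer's actual pipeline: the helper names `corrupt` and `blank_fraction`, the whitespace tokenization, the fixed random seed, and the example sentences are all assumptions introduced here. The sketch replaces roughly 15% of the words in a training sentence with [BLANK], then measures how often that symbol appears in the initial-training input versus an unaltered review.

```python
import random

BLANK = "[BLANK]"  # the special symbol named in the scenario


def corrupt(sentence, rate=0.15, seed=0):
    """Replace roughly `rate` of the words with BLANK.

    Returns the corrupted sentence and a dict mapping each blanked
    position to the original word the model would have to predict.
    Whitespace splitting is a simplifying assumption.
    """
    rng = random.Random(seed)
    words = sentence.split()
    # Blank at least one word when possible, never more than exist.
    k = min(len(words), max(1, round(rate * len(words))))
    positions = sorted(rng.sample(range(len(words)), k))
    targets = {i: words[i] for i in positions}
    corrupted = [BLANK if i in targets else w for i, w in enumerate(words)]
    return " ".join(corrupted), targets


def blank_fraction(text):
    """Fraction of whitespace-separated tokens that are the BLANK symbol."""
    tokens = text.split()
    return tokens.count(BLANK) / len(tokens) if tokens else 0.0


# Hypothetical 20-word training sentence and a hypothetical review.
training_sentence = (
    "the delivery arrived two days late but the support team "
    "answered quickly and replaced the damaged item without any fuss"
)
review = "the product works well and I would happily buy it again"

pretraining_input, targets = corrupt(training_sentence)

print("initial training input:", pretraining_input)
print("words to predict:      ", targets)
print("BLANK fraction (initial training):", blank_fraction(pretraining_input))
print("BLANK fraction (review):          ", blank_fraction(review))
```

Comparing the two printed fractions shows how the inputs seen during initial training differ from those seen during classification, which is the starting point for the analysis the case study asks for.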

Updated 2025-10-06
