True/False

Consider a language-model pre-training process that uses a single input sequence (e.g., a pair of sentences) to perform two tasks: predicting masked words and determining whether the second sentence logically follows the first. In this process, the model first computes the loss for the masked-word task and updates its parameters. Then, using the same input, it computes the loss for the sentence-relationship task and performs a second, separate parameter update.

0

1
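
For reference, below is a minimal PyTorch-style sketch of the procedure the question describes, using a toy stand-in model. The names encoder, mlm_head, and nsp_head (and the random data) are illustrative assumptions, not actual BERT modules; the final comment notes the joint-loss alternative for contrast.

```python
# Toy sketch of the two-loss, two-update procedure described in the question.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, hidden = 100, 32

encoder = nn.Embedding(vocab_size, hidden)   # stand-in for a Transformer encoder
mlm_head = nn.Linear(hidden, vocab_size)     # masked-word prediction head
nsp_head = nn.Linear(hidden, 2)              # sentence-relationship head
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(mlm_head.parameters()) + list(nsp_head.parameters()),
    lr=1e-3,
)

tokens = torch.randint(0, vocab_size, (4, 16))      # one batch of sentence pairs
mlm_labels = torch.randint(0, vocab_size, (4, 16))  # targets at masked positions
nsp_labels = torch.randint(0, 2, (4,))              # 1 = second sentence follows the first

# Update 1: masked-word loss only.
mlm_loss = F.cross_entropy(
    mlm_head(encoder(tokens)).reshape(-1, vocab_size), mlm_labels.reshape(-1)
)
optimizer.zero_grad()
mlm_loss.backward()
optimizer.step()

# Update 2: sentence-relationship loss only, computed on the same input.
nsp_loss = F.cross_entropy(nsp_head(encoder(tokens).mean(dim=1)), nsp_labels)
optimizer.zero_grad()
nsp_loss.backward()
optimizer.step()

# A joint alternative would instead sum both losses and perform a single update:
#   total_loss = mlm_loss + nsp_loss
#   optimizer.zero_grad(); total_loss.backward(); optimizer.step()
```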

Updated 2025-10-02


Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Comprehension in Revised Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science