Learn Before
In a specific pre-training setup for a language model, a single input (composed of one or two sentences) is used to perform two distinct tasks simultaneously: one task involves predicting words that have been intentionally hidden in the text, and the other involves determining the relationship between the two sentences (e.g., whether one follows the other). Which statement accurately describes how the performance on these two tasks is used to update the model?
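For reference, in BERT-style pre-training the two task losses are typically summed into a single joint loss, followed by one backward pass and one parameter update per step. A minimal pure-Python sketch of that idea, with toy quadratic stand-ins for the masked-word and sentence-relationship losses (these placeholders are illustrative assumptions, not the real objectives):

```python
# Toy sketch of a joint-loss training step in the BERT style: both task
# losses are computed from the same input, summed into one objective,
# and followed by ONE parameter update (not two separate updates).
# The quadratic "losses" below are placeholders, not real MLM/NSP losses.

def loss_mlm(w):
    return (w - 2.0) ** 2      # stand-in for the masked-word (MLM) loss

def loss_nsp(w):
    return (w + 1.0) ** 2      # stand-in for the sentence-relationship (NSP) loss

def joint_training_step(w, lr=0.1):
    """One training iteration: sum the losses, take one gradient step."""
    joint = lambda x: loss_mlm(x) + loss_nsp(x)   # single combined objective
    eps = 1e-6
    # finite-difference gradient of the JOINT loss, not of each loss separately
    grad = (joint(w + eps) - joint(w - eps)) / (2 * eps)
    return w - lr * grad                           # one update per input

w = joint_training_step(0.0)
print(round(w, 4))  # 0.2 -- a single step toward the joint minimum at w = 0.5
```

The key point the sketch illustrates: the gradient is taken of the summed loss, so one update reflects performance on both tasks at once.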
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consider a language model pre-training process that uses a single input sequence (e.g., a pair of sentences) to perform two tasks: predicting masked words and determining whether the second sentence logically follows the first. In this process, the model first calculates the loss for the masked word task and updates its internal parameters. Then, using the same input, it calculates the loss for the sentence relationship task and performs a second, separate update to its parameters.
A language model is being pre-trained using a dual-task objective on a single input sequence composed of two sentences. One task is to predict masked words within the sentences, and the other is to predict whether the second sentence is the actual next sentence. Arrange the following steps in the correct computational order for a single training iteration.