A developer is fine-tuning a language model on a dataset where each entry consists of a context and a desired completion. For training, the context and completion are concatenated into a single input sequence. The training objective is configured so that the loss is calculated only on the model's predictions for the completion part of the sequence. Given this setup, which statement accurately describes how the model's parameters are updated during the backward pass for a single training step?
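The setup described above can be sketched numerically. The following is a minimal illustration, not a real language model: it assumes a toy causal model in which the state at each position is the mean of the token embeddings seen so far, and every name in it (`masked_loss`, `numeric_grad_E`, `E`, `U`, `loss_mask`) is hypothetical. It computes the loss only at positions whose target token belongs to the completion, then uses finite differences to check which embedding rows influence that loss.

```python
import math

def masked_loss(E, U, tokens, loss_mask):
    """Mean next-token cross-entropy over positions where loss_mask is 1."""
    dim, vocab = len(U), len(U[0])
    running = [0.0] * dim
    total, count = 0.0, 0
    for t in range(len(tokens) - 1):
        # Causal state: running sum of the embeddings of tokens[0..t].
        running = [r + e for r, e in zip(running, E[tokens[t]])]
        if not loss_mask[t + 1]:
            continue  # the target is a context token, so no loss term here
        h = [r / (t + 1) for r in running]  # mean embedding so far
        logits = [sum(h[d] * U[d][v] for d in range(dim)) for v in range(vocab)]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[tokens[t + 1]]  # -log softmax(logits)[target]
        count += 1
    return total / count

def numeric_grad_E(E, U, tokens, loss_mask, row, col, eps=1e-5):
    """Central finite-difference estimate of d(loss)/d(E[row][col])."""
    original = E[row][col]
    E[row][col] = original + eps
    up = masked_loss(E, U, tokens, loss_mask)
    E[row][col] = original - eps
    down = masked_loss(E, U, tokens, loss_mask)
    E[row][col] = original
    return (up - down) / (2 * eps)

# Toy parameters (assumed values): 5-token vocabulary, 2-dimensional embeddings.
E = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2], [0.3, -0.1], [0.2, 0.5]]
U = [[0.2, -0.1, 0.4, 0.3, -0.3],
     [-0.4, 0.1, 0.2, -0.2, 0.5]]

context, completion = [0, 1], [2, 3]
tokens = context + completion
loss_mask = [0] * len(context) + [1] * len(completion)

# Token 0 appears only in the context; token 4 does not appear at all.
grad_context = numeric_grad_E(E, U, tokens, loss_mask, row=0, col=0)
grad_unused = numeric_grad_E(E, U, tokens, loss_mask, row=4, col=0)
print("gradient w.r.t. a context-token embedding:", grad_context)
print("gradient w.r.t. an unused-token embedding:", grad_unused)
```

In this sketch the error signal originates only at the completion positions, yet the embedding of a context token still receives a nonzero gradient because it feeds the states used to predict the completion. Only parameters that play no part in computing those predictions, such as the embedding of a token absent from the sequence, are left unchanged.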
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of Context and Prediction Sub-sequences
Debugging a Fine-Tuning Gradient Flow
Implications of Selective Gradient Propagation