During supervised fine-tuning, if a model is trained on concatenated [input, output] sequences and the training loss is computed across the entire sequence (both input and output tokens), the model is still being optimized primarily to improve its conditional generation of the output given the input: under the causal attention mask, the output tokens are always predicted conditioned on the input prefix, so the extra loss on the input tokens mainly adds a language-modeling term over the prompt.
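A minimal sketch of the two loss conventions (loss over the full concatenated sequence vs. loss on completion tokens only), assuming a Hugging Face-style causal language model and tokenizer; the names `model`, `tok`, and `sft_loss` are illustrative placeholders, not names from the source:

```python
import torch
import torch.nn.functional as F

def sft_loss(model, tok, prompt, completion, mask_prompt=True):
    """Cross-entropy loss on the concatenated [prompt, completion] sequence.

    mask_prompt=True  -> loss only on completion-token predictions
    mask_prompt=False -> loss over the entire sequence (prompt + completion)
    """
    # Build the concatenated [input, output] sequence.
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    completion_ids = tok(completion, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, completion_ids], dim=1)

    labels = input_ids.clone()
    if mask_prompt:
        # Ignore prompt positions (-100 is the standard ignore index).
        labels[:, : prompt_ids.shape[1]] = -100

    logits = model(input_ids).logits
    # Standard next-token shift: the prediction at position t targets token t+1.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )
```

In either setting the completion tokens are predicted from the input prefix, which is why conditional generation improves in both cases; masking simply concentrates all of the gradient signal on the completion.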
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
SFT Objective as Maximizing Joint Log-Probability of Concatenated Sequences
In a common fine-tuning strategy, a prompt and its desired completion are concatenated into a single sequence (e.g., [prompt_tokens, completion_tokens]). The language model is then trained on this full sequence, but the training loss is calculated only for the model's predictions on the completion tokens. What is the most accurate analysis of the primary purpose of this specific loss calculation method?
Diagnosing a Faulty Fine-Tuning Process
Loss Masking via Forward and Backward Passes in SFT