Learn Before
Rationale for Sub-sequence Loss Calculation
A language model is trained on the text sequence:

Input: What is the capital of France?
Output: The capital of France is Paris.

Explain why the model's training loss is calculated only on the Output portion of the sequence and not on the Input portion.
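In practice, this is implemented by masking the input positions so they are excluded from the loss average. The sketch below is illustrative only (the sentinel value and helper names are assumptions, following the common convention of marking ignored positions with -100 rather than any specific library's API):

```python
# Sketch: loss masking for the input portion of a training sequence.
# Assumption: positions labeled IGNORE_INDEX are skipped when averaging loss.
IGNORE_INDEX = -100

def build_labels(input_ids, output_ids):
    """Labels for a concatenated [input + output] sequence:
    input positions are masked out, output positions are predicted."""
    return [IGNORE_INDEX] * len(input_ids) + list(output_ids)

def masked_loss(per_token_losses, labels):
    """Average loss over unmasked (output) positions only."""
    kept = [l for l, lab in zip(per_token_losses, labels) if lab != IGNORE_INDEX]
    return sum(kept) / len(kept)

# Toy example: 3 input tokens, 2 output tokens
labels = build_labels([11, 12, 13], [21, 22])
print(labels)  # -> [-100, -100, -100, 21, 22]

# Pretend per-token cross-entropy values; input losses do not contribute
losses = [9.0, 9.0, 9.0, 2.0, 4.0]
print(masked_loss(losses, labels))  # -> 3.0
```

Only the output tokens' losses (2.0 and 4.0) are averaged, so gradients push the model to produce the answer, not to reproduce the question.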
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is being trained on the sequence:

⟨s⟩ Translate to Spanish: The cat sat. El gato se sentó. ⟨/s⟩

To effectively teach the model how to perform the translation, on which part of the sequence should the training loss be calculated?

Debugging a Chatbot Training Process