Consequences of Training Data Omissions
A developer fine-tunes a language model on a large dataset of single, complete sentences but forgets to append the special end-of-sequence token to each sentence in the training data. When this model is later used for text generation, what is the most probable problematic behavior it will exhibit, and why does it occur?
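The scenario can be illustrated with a toy sketch. It is not a fine-tuned neural model: it assumes a tiny bigram counter, a made-up three-sentence corpus, a hypothetical token name `<eos>`, and greedy decoding, all chosen only for illustration. The same corpus is trained twice, once with the end marker appended and once without, to show how a model that never saw the marker has no learned signal for stopping and tends to run until the length cap.

```python
from collections import Counter, defaultdict

EOS = "<eos>"  # hypothetical end-of-sequence token name (assumption)


def train_bigram(sentences, append_eos):
    """Count next-token frequencies; optionally append EOS to each sentence."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        if append_eos:
            tokens.append(EOS)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens=10):
    """Greedy decoding: stop on EOS, on an unseen context, or at the cap."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # toy-model dead end: this context was never seen
        nxt = followers.most_common(1)[0][0]
        if nxt == EOS:
            break  # the model itself signals that the sequence is complete
        tokens.append(nxt)
    return tokens


# Made-up corpus of single, complete sentences (illustrative only).
sentences = ["I like tea", "I like tea", "tea is what I like"]

with_eos = generate(train_bigram(sentences, append_eos=True), "I")
without_eos = generate(train_bigram(sentences, append_eos=False), "I")

print(" ".join(with_eos))     # stops once EOS is the most likely next token
print(" ".join(without_eos))  # keeps going until max_new_tokens is reached
```

In the run without the marker, the stop token is never a candidate, so the only thing that ends generation is the external `max_new_tokens` limit. A real fine-tuned model may retain some stopping behavior from pretraining, so this sketch shows the mechanism rather than predicting exact behavior.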
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A developer is using a text-generation model to complete the sentence: 'The capital of France is'. The model produces the single word 'Paris' and then immediately stops. The developer had configured the generation process to allow for a maximum of 100 new words and is surprised by the short output. Based on how these models are trained to signal completeness, what is the most likely reason the generation process terminated after just one word?
Debugging Premature Text Generation Termination