Case Study

Debugging a Generative Language Model

An engineer is debugging a language model that generates text one word at a time. They observe that, when prompted with 'The cat sat on the', the model's prediction for the next word is heavily influenced by the future, unseen word 'rug', which was accidentally exposed in the decoder's input sequence during a training step. Which fundamental principle of sequential data processing is being violated, and how is this principle typically enforced in the model's internal architecture to prevent such 'cheating'?
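The 'cheating' described in the scenario is conventionally prevented with a causal (look-ahead) mask in the decoder's self-attention: scores for future positions are set to negative infinity before the softmax, so they receive exactly zero attention weight. A minimal pure-Python sketch of this idea (the function names `causal_mask` and `masked_attention` are illustrative, not from any particular library):

```python
import math

def causal_mask(seq_len):
    # mask[i][j] is True when position i may attend to position j,
    # i.e. only to itself and earlier positions (j <= i).
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def masked_attention(scores, mask):
    # Masked-out (future) positions get -inf, so softmax assigns them weight 0.
    weights = []
    for i, row in enumerate(scores):
        masked = [s if mask[i][j] else float("-inf") for j, s in enumerate(row)]
        weights.append(softmax(masked))
    return weights

# Toy attention scores for the 5-token prompt "The cat sat on the".
scores = [[0.1 * (i + j) for j in range(5)] for i in range(5)]
weights = masked_attention(scores, causal_mask(5))
```

With this mask in place, the row of attention weights for the token 'the' places zero weight on any later position, so a leaked future word like 'rug' cannot influence the next-word prediction.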


Updated 2025-10-02


Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science