Case Study

Root-Cause Analysis of a T5-Style Model Producing Fluent but Unfaithful Outputs

You are rolling out a single internal NLP service based on an encoder–decoder model in a T5-style text-to-text setup. Every request is formatted as plain text with an instruction prefix (e.g., "summarize:", "translate en->de:", "extract entities:") followed by the user content, and the model always generates a text output.
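The request format described above can be sketched in a few lines. This is a minimal illustration, not the actual service code; the helper name and prefixes are assumptions for the example.

```python
# Minimal sketch of the text-to-text request format: every task is expressed
# as an instruction prefix followed by the user content, plain text in and out.
def format_request(task_prefix: str, content: str) -> str:
    """Prepend the instruction prefix so all tasks share one text-to-text API."""
    return f"{task_prefix} {content}"

# Example requests for three of the tasks mentioned above.
summarize_req = format_request("summarize:", "The outage lasted 17 minutes.")
translate_req = format_request("translate en->de:", "The contract was signed.")
extract_req = format_request("extract entities:", "Acme Corp, 2024-01-12, $2.3M")
```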

After pretraining, the team reports a consistent failure mode across multiple downstream tasks: outputs are fluent and on-topic for the instruction, but they often ignore key facts from the provided input. For example:

  • Input: "summarize: The incident report states the outage lasted 17 minutes and affected only EU customers." Output: "A brief outage impacted customers for about an hour across multiple regions."
  • Input: "extract entities: Contract signed by Acme Corp on 2024-01-12 for $2.3M." Output: "Acme Corp; 2023-12-01; $3.0M"

You inspect the pretraining pipeline and find it uses span-based denoising with sentinel tokens, but the data engineer implemented the decoder target as the entire original, uncorrupted text (i.e., the decoder is trained to regenerate the full original sequence from the corrupted encoder input), rather than the standard T5-style target, which concatenates only the masked-out spans, each preceded by its sentinel token.
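The difference between the two targets can be made concrete with a hand-worked example. The sentinel names (`<extra_id_0>`, ...) follow T5's convention; the span selection here is hard-coded for illustration rather than sampled, as it would be in a real corruption pipeline.

```python
# One training example under T5-style span corruption.
original = "The outage lasted 17 minutes and affected EU customers".split()

# Suppose the spans ["lasted", "17"] and ["EU"] are selected for masking.
# Encoder input: the corrupted text, with each masked span replaced by a sentinel.
corrupted = ["The", "outage", "<extra_id_0>", "minutes", "and",
             "affected", "<extra_id_1>", "customers"]

# Standard T5 decoder target: ONLY the missing spans, each preceded by its
# sentinel, terminated by a final sentinel. The decoder cannot score well
# without reading the encoder states to recover the exact missing tokens.
correct_target = ["<extra_id_0>", "lasted", "17",
                  "<extra_id_1>", "EU", "<extra_id_2>"]

# The buggy target from this case study: the full uncorrupted text. Most of
# these tokens are already predictable from the decoder's own left-to-right
# context, so the decoder is rewarded for fluent regeneration and only weakly
# pressured to copy the specific masked facts from the encoder.
buggy_target = original
```

Note how short the correct target is relative to the buggy one: nearly all of its tokens are exactly the information the encoder must supply, which is what forces tight encoder-decoder grounding.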

As the model owner, analyze how this specific pretraining-target mistake would change what the encoder and decoder learn in an encoder–decoder network, and explain why that would plausibly lead to the observed "instruction-following but input-unfaithful" behavior in a text-to-text system. Provide one concrete correction to the pretraining objective/format that would directly address the issue.

Updated 2026-02-06

Ch.1 Pre-training - Foundations of Large Language Models