Case Study: Root-Cause Analysis of “Correct Source, Wrong Answer” in a RAG + k-NN LM Assistant

You are on call for an internal engineering assistant that developers use to answer questions about your company’s API behavior. The system combines two retrieval mechanisms. A vector database retrieves the top-k document snippets and inserts them into the prompt (RAG). Separately, a k-NN language model datastore, built from last quarter’s resolved support tickets, influences next-token prediction during generation.
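To make the second mechanism concrete, here is a minimal sketch of how a k-NN datastore can influence next-token prediction. It assumes the standard kNN-LM formulation, in which the final distribution is a linear interpolation of the base model's distribution and a distribution built from retrieved neighbors. The case study does not say how this assistant combines the two, so the function name `interpolate`, the weight `lam`, and all probabilities below are illustrative assumptions, not measurements from the system.

```python
# Hypothetical sketch of kNN-LM style interpolation (not the assistant's actual code).
# p_final(token) = lam * p_knn(token) + (1 - lam) * p_lm(token)

def interpolate(p_lm, p_knn, lam):
    """Mix the base LM distribution with the k-NN datastore distribution."""
    vocab = set(p_lm) | set(p_knn)
    return {
        tok: lam * p_knn.get(tok, 0.0) + (1.0 - lam) * p_lm.get(tok, 0.0)
        for tok in vocab
    }

def top_token(dist):
    """Return the most probable token (greedy decoding for one step)."""
    return max(dist, key=dist.get)

# Assumed numbers: the base LM, conditioned on the retrieved docs in the
# prompt, prefers the documented header name.
p_lm = {"Idempotency-Key": 0.8, "X-Idempotency-Token": 0.2}

# Assumed numbers: neighbors drawn from last quarter's tickets mostly
# continue with the other header name.
p_knn = {"Idempotency-Key": 0.1, "X-Idempotency-Token": 0.9}

low = interpolate(p_lm, p_knn, lam=0.25)   # datastore has modest influence
high = interpolate(p_lm, p_knn, lam=0.7)   # datastore dominates

print(top_token(low))    # Idempotency-Key
print(top_token(high))   # X-Idempotency-Token
```

Under these assumed numbers, the prompt content is identical in both runs; only the interpolation weight changes which token wins. That is one way a generation-time datastore could override what the retrieved snippets say.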

A developer asks: “Does endpoint /v2/payments support idempotency keys, and what header name should I use?”

Observed behavior:

  1. The vector database retrieval returns three highly relevant, up-to-date snippets from the current API docs that clearly state: “Idempotency is supported on /v2/payments. Use header: Idempotency-Key.”
  2. The final answer says: “Idempotency is not supported on /v2/payments. Use header: X-Idempotency-Token.”
  3. The answer includes citations pointing to the correct retrieved doc snippets (the ones that say Idempotency-Key), even though the generated text contradicts them.

Assume the retrieved snippets are indeed being inserted into the prompt, and the citations are mechanically attached from the retrieved snippets (not generated by the model).
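The following sketch illustrates why mechanically attached citations say nothing about whether the generated text agrees with its sources, and how a simple post-generation check could flag the mismatch. The helper names (`attach_citations`, `header_names`, `contradicts_sources`), the snippet ids, and the regular expression are hypothetical and tuned to this one example; a production check would likely need something more robust, such as an entailment model.

```python
import re

# Hypothetical pattern: pulls the header name out of "Use header: <name>".
HEADER_PATTERN = re.compile(r"Use header:\s*([A-Za-z0-9-]+)")

def attach_citations(answer, snippets):
    """Append a marker for every retrieved snippet, without reading the answer."""
    markers = "".join(f" [{s['id']}]" for s in snippets)
    return answer + markers

def header_names(text):
    """Return the set of header names a piece of text recommends."""
    return set(HEADER_PATTERN.findall(text))

def contradicts_sources(answer, snippets):
    """True when the answer names a header that no cited snippet names."""
    claimed = header_names(answer)
    supported = set()
    for s in snippets:
        supported |= header_names(s["text"])
    return bool(claimed) and bool(supported) and claimed.isdisjoint(supported)

# Snippet text is taken from the scenario; the ids are made up.
doc_text = "Idempotency is supported on /v2/payments. Use header: Idempotency-Key."
snippets = [{"id": f"doc-{i}", "text": doc_text} for i in (1, 2, 3)]

answer = "Idempotency is not supported on /v2/payments. Use header: X-Idempotency-Token."

cited = attach_citations(answer, snippets)
print(cited)                                   # wrong answer, correct-looking citations
print(contradicts_sources(answer, snippets))   # True
```

Because `attach_citations` never inspects the answer, the citations remain correct even when the text is wrong, which matches observation 3. The check only compares header names, so it would not catch the "not supported" contradiction on its own.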

Updated 2026-02-06
