Essay

Design Review: Choosing Between RAG and k-NN LM for a Regulated Support Assistant

You are leading a design review for an internal LLM assistant used by customer support agents at a regulated company. The assistant must (1) answer questions about frequently changing product policies and pricing, (2) cite the exact source passages used for each answer for auditability, and (3) avoid “confident but wrong” answers when the knowledge base does not contain relevant information.

Your team proposes two options:

A) A Retrieval-Augmented Generation (RAG) pipeline that embeds the user question, retrieves the top-k relevant text snippets from a vector database of approved documents, and inserts those snippets into the prompt so the LLM generates an answer grounded in the retrieved sources.

B) A k-NN Language Model (k-NN LM) approach that, at each generation step, retrieves the k nearest-neighbor hidden-state vectors from an external datastore, derives a next-token distribution from the tokens stored with those neighbors, and interpolates it with the base model's own distribution to improve next-token prediction.
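To make the contrast between the two options concrete, here is a minimal Python sketch of both mechanisms. It is an illustration under stated assumptions, not a reference implementation: the function names, the two-dimensional toy vectors, the document ids and texts, the interpolation weight `lam`, and the distance-to-weight rule (a softmax over negative distances) are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rag_retrieve(query_vec, store, k=2):
    """Option A: return the top-k (score, doc_id, text) snippets by similarity."""
    scored = [(cosine(query_vec, vec), doc_id, text) for doc_id, vec, text in store]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


def build_prompt(question, hits):
    """Insert the retrieved snippets, tagged with their ids, into the prompt."""
    sources = "\n".join(f"[{doc_id}] {text}" for _, doc_id, text in hits)
    return (
        "Answer using only these sources and cite their ids.\n"
        f"{sources}\nQuestion: {question}"
    )


def knn_lm_next_token(p_lm, neighbors, lam=0.25, temperature=1.0):
    """Option B: mix base-model probabilities with a k-NN distribution.

    neighbors is a list of (distance, next_token) pairs from the datastore.
    """
    if not neighbors:
        return dict(p_lm)  # nothing retrieved: fall back to the base model
    weights = [math.exp(-d / temperature) for d, _ in neighbors]
    total = sum(weights)
    p_knn = defaultdict(float)
    for w, (_, tok) in zip(weights, neighbors):
        p_knn[tok] += w / total  # neighbors sharing a token pool their mass
    vocab = set(p_lm) | set(p_knn)
    return {t: lam * p_knn.get(t, 0.0) + (1 - lam) * p_lm.get(t, 0.0) for t in vocab}


# Toy data (made up for illustration only).
store = [
    ("policy-7", [1.0, 0.0], "Refunds are allowed within 30 days."),
    ("pricing-2", [0.0, 1.0], "The Pro plan is billed monthly."),
]
hits = rag_retrieve([0.9, 0.1], store, k=1)
prompt = build_prompt("What is the refund window?", hits)
mixed = knn_lm_next_token({"30": 0.5, "60": 0.5}, [(0.0, "30"), (0.0, "30")], lam=0.5)
```

In the sketch, option A hands the model whole passages with ids that an answer can cite, while option B only shifts token probabilities, so there is no passage-level source to show an auditor.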

Write an essay recommending a primary approach (A, B, or a hybrid) and justify it by explicitly explaining how text retrieval via a vector database and grounding with external sources would (or would not) satisfy the auditability and freshness requirements, and how k-NN LM’s nearest-neighbor next-token mechanism changes the failure modes compared with RAG (e.g., factuality, controllability, and behavior when retrieval is irrelevant). Conclude with two concrete design safeguards you would implement to reduce hallucinations when retrieval returns weak or no matches (e.g., thresholds, abstention, citation rules), and explain why they work in your chosen architecture.
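As one possible illustration of the safeguard types named as examples above (thresholds, abstention, citation rules), the following Python sketch shows how they might look in a RAG-style pipeline. The function names, the `min_score` value, and the bracketed citation format are assumptions made for this example; a real system would need a threshold calibrated on its own data.

```python
import re


def filter_or_abstain(hits, min_score=0.75):
    """Keep only hits at or above the similarity threshold.

    hits is a list of (score, doc_id, text). Returning None signals that the
    assistant should abstain instead of generating an answer.
    """
    strong = [h for h in hits if h[0] >= min_score]
    return strong or None


def citations_valid(answer, allowed_ids):
    """Citation rule: the answer must cite at least one id, and every cited id
    must belong to the set of snippets that were actually retrieved."""
    cited = set(re.findall(r"\[([^\]]+)\]", answer))
    return bool(cited) and cited <= set(allowed_ids)


# Toy usage (values made up for illustration only).
weak = filter_or_abstain([(0.40, "policy-7", "Refunds are allowed within 30 days.")])
strong = filter_or_abstain([(0.92, "policy-7", "Refunds are allowed within 30 days.")])
ok = citations_valid("Refunds are allowed within 30 days [policy-7].", {"policy-7"})
uncited = citations_valid("Refunds are allowed within 30 days.", {"policy-7"})
foreign = citations_valid("Refunds take 60 days [memo-9].", {"policy-7"})
```

Both checks depend on retrieval producing discrete, identifiable passages with scores, which is why they fit a RAG design more naturally than token-level interpolation.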

Updated 2026-02-06

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.3 Prompting - Foundations of Large Language Models

Ch.5 Inference - Foundations of Large Language Models
