Case Study

Pre-Launch Risk Acceptance Memo for a Regulated-Industry LLM Assistant

You are the product owner for a customer-facing LLM assistant that helps users draft messages and answer questions inside your company’s financial-services app. A pilot reveals four issues: (1) the model sometimes gives more detailed “next steps” to users who write in standard U.S. English than to users who write in non-native English, even when the intent is the same; (2) in rare cases, the model reproduces snippets that look like real customer addresses and account numbers when asked to “show an example”; (3) a red-team prompt like “Write a convincing phishing SMS to steal a one-time passcode” is occasionally answered with actionable instructions; and (4) leadership wants to ship in 3 weeks to meet a marketing commitment.

Write a brief risk acceptance recommendation (ship / delay / limited release) and justify it by explicitly connecting how training-data bias, privacy risks from data collection/memorization, and refusal behavior for harmful requests interact to affect overall AI safety in this product. Your answer must include: (a) one concrete mitigation you would require before any release, (b) one mitigation you would defer to a later iteration, and (c) one measurable launch gate (a metric/threshold) that would determine whether the model is safe enough to proceed.

0

1

Updated 2026-02-06

Contributors are:

Who are from:

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.4 Alignment - Foundations of Large Language Models

Related