Case Study

Evaluating LLM Safety Measures Post-Anonymization

A company has trained a large language model on a dataset of internal corporate documents. They ran a script to strip all employee names and project codenames from the data. During testing, however, they find that the model can still reveal sensitive strategic information when prompted about quarterly goals: the context surrounding those goals implicitly points back to the redacted project codenames. The company needs to deploy the model soon and cannot afford the time or cost to retrain it from scratch.
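The failure mode described above can be illustrated with a minimal sketch. The snippet below mimics a naive term-based redaction script; the codename and employee name used ("Project Falcon", "Jane Smith") are hypothetical, not from the case. Note that while the explicit identifiers are masked, the surrounding strategic context survives intact and can still identify the project.

```python
import re

def redact(text, sensitive_terms):
    # Naive redaction: mask each known sensitive term, case-insensitively.
    for term in sensitive_terms:
        text = re.sub(re.escape(term), "[REDACTED]", text, flags=re.IGNORECASE)
    return text

# Hypothetical internal document for illustration.
doc = ("Project Falcon will drive our Q3 expansion into wearable devices, "
       "led by Jane Smith.")

clean = redact(doc, ["Project Falcon", "Jane Smith"])
print(clean)
# The explicit identifiers are gone, but the contextual clue
# ("Q3 expansion into wearable devices") still implicitly points
# to the redacted project -- exactly the leak described above.
```

A model trained on the "clean" text can therefore learn, and later reproduce, the strategic content that the redaction was meant to protect.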

Given these constraints, evaluate the following two proposals and determine which is the more effective and practical immediate solution to mitigate the risk. Justify your reasoning.

Updated 2025-09-29


Tags: Ch.2 Generative Models - Foundations of Large Language Models, Foundations of Large Language Models, Foundations of Large Language Models Course, Computing Sciences, Evaluation in Bloom's Taxonomy, Cognitive Psychology, Psychology, Social Science, Empirical Science, Science