Learn Before
Applying Data Anonymization to a Text Snippet
To prepare a dataset for training a language model, you must remove sensitive details to protect user privacy. Anonymize the following text by replacing each distinct piece of personally identifiable information (PII) with a generic placeholder like [REDACTED].
Original Text: 'User Jane Smith (jane.smith@example.com, (555) 867-5309) reported an issue with her account, customer ID 45-B-198. She mentioned that the delivery to 742 Evergreen Terrace was delayed.'
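The redaction described above can be sketched with simple regular expressions. This is a minimal illustration, not a production approach: the patterns below are assumptions tailored to the PII formats in this exercise, and the name pattern is hard-coded (real pipelines use named-entity recognition for names rather than literal matches).

```python
import re

# Illustrative patterns for the PII categories in the exercise text.
# These are assumptions matching this snippet's formats, not general rules.
PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),   # email addresses
    re.compile(r"\(\d{3}\)\s*\d{3}-\d{4}"),   # US-style phone numbers
    re.compile(r"\b\d{2}-[A-Z]-\d{3}\b"),     # customer IDs like 45-B-198
    re.compile(r"\b\d+\s+(?:[A-Z][a-z]+\s?)+"
               r"(?:Terrace|Street|Avenue|Road)\b"),  # street addresses
    re.compile(r"\bJane Smith\b"),            # name (hard-coded; use NER in practice)
]

def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace each distinct piece of PII with a generic placeholder."""
    for pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

original = ("User Jane Smith (jane.smith@example.com, (555) 867-5309) reported "
            "an issue with her account, customer ID 45-B-198. She mentioned that "
            "the delivery to 742 Evergreen Terrace was delayed.")
print(redact(original))
```

Running this replaces all five distinct PII items (name, email, phone, customer ID, address) with `[REDACTED]`, leaving the non-identifying narrative intact.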
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Limitations and Alternatives to Data Anonymization
Evaluating a Data Anonymization Strategy
A team is preparing a dataset of customer support chats to train a large language model. They apply an automated script designed to remove all personally identifiable information (PII) to protect user privacy. Analyze the following processed text snippet and determine which piece of information represents the most significant failure in the anonymization process.
Text Snippet: "The user, whose account ID is [MASKED], contacted us on Thursday regarding an order. They mentioned they live in the downtown area and that their specific case reference number is CZ-819-224."
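One way to catch the failure described above mechanically is a post-anonymization audit that flags ID-like tokens the masking script missed. The sketch below is a hypothetical check, with a pattern assumed from the case-reference format in this snippet; the function name `find_residual_ids` is illustrative.

```python
import re

# Assumed pattern for reference numbers shaped like CZ-819-224
# (letters, then two digit groups). Adjust per dataset.
ID_PATTERN = re.compile(r"\b[A-Z]{1,3}-\d{2,4}-\d{2,4}\b")

def find_residual_ids(text: str) -> list[str]:
    # A unique reference number can re-identify a user even when direct
    # PII is masked, so any match is a potential anonymization failure.
    return ID_PATTERN.findall(text)

snippet = ("The user, whose account ID is [MASKED], contacted us on Thursday "
           "regarding an order. They mentioned they live in the downtown area "
           "and that their specific case reference number is CZ-819-224.")
print(find_residual_ids(snippet))
```

The audit surfaces `CZ-819-224`, the most significant leak: unlike the coarse quasi-identifiers ("Thursday", "downtown area"), a unique case reference links the anonymized record directly back to one individual in the support system.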
Applying Data Anonymization to a Text Snippet