1Cademy - A research team is training a large language model. They notice that when prompted with a specific user ID, the model sometimes outputs a full name and home address associated with that ID. This users information appeared exactly once in the massive, diverse training dataset. In contrast, a common, publicly available programming code snippet, which appeared thousands of times in the dataset, is never reproduced verbatim by the model. Which statement best analyzes this situation?

Learn Before

Risk of Sensitive Data Memorization by LLMs

Multiple Choice

A research team is training a large language model. They notice that when prompted with a specific user ID, the model sometimes outputs a full name and home address associated with that ID. This user's information appeared exactly once in the massive, diverse training dataset. In contrast, a common, publicly available programming code snippet, which appeared thousands of times in the dataset, is never reproduced verbatim by the model. Which statement best analyzes this situation?

Updated 2025-10-02

Contributors are:

Who are from:

Learn Before

Related